Backend Engineer · UAV Systems Researcher · Builder
Two years shipping Java and Python backend services across production Salesforce ecosystems — real SLAs, real incident response, real enterprise clients. Concurrently researching fault-tolerant UAV swarm coordination with results validated across 10 independent random seeds. I build for environments where failure has consequences.
Systems designed to solve real problems — from production backends to active UAV swarm simulation research.
Fault-Tolerant Multi-Agent Control Architecture · Python · PyBullet · CTDE MAPPO
A full physics-based simulation framework for UAV swarm coordination under realistic fault conditions. The architecture uses a hierarchical control design: a high-frequency inner loop for stability, paired with a lower-frequency agentic supervisor that handles fault classification and mission logic. A CTDE MAPPO policy handles multi-UAV task allocation under degraded communication and sensor noise, reducing post-fault tracking error by up to 16.8% vs. uncoordinated baselines. Localized coordination scales linearly, O(N), vs. O(N²) for centralized approaches, and delivers 28.5% lower tracking error across GPS drift and communication-degraded scenarios. All results validated across 10 independent random seeds with 100% run stability.
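To make the two-rate idea concrete, here is a minimal sketch of a fast inner loop running under a slower supervisor. It is an illustration only, not the framework's code: the 240 Hz and 10 Hz rates, the proportional controller, the error threshold, and every name (`Supervisor`, `inner_loop_step`, `simulate`) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Supervisor:
    """Low-rate layer: classifies the situation and picks a mode."""
    error_threshold: float = 0.5  # hypothetical threshold
    mode: str = "nominal"

    def update(self, tracking_error: float) -> str:
        # Flag the vehicle as degraded when tracking error is large.
        if abs(tracking_error) > self.error_threshold:
            self.mode = "degraded"
        else:
            self.mode = "nominal"
        return self.mode


def inner_loop_step(position: float, target: float, gain: float, dt: float) -> float:
    """High-rate layer: one proportional correction toward the target."""
    return position + gain * (target - position) * dt


def simulate(steps: int, inner_hz: int = 240, supervisor_hz: int = 10,
             target: float = 1.0):
    """Run the inner loop every step and the supervisor every `ratio` steps."""
    ratio = inner_hz // supervisor_hz  # inner steps per supervisor tick
    dt = 1.0 / inner_hz
    supervisor = Supervisor()
    position = 0.0
    supervisor_ticks = 0
    for step in range(steps):
        if step % ratio == 0:
            supervisor.update(target - position)
            supervisor_ticks += 1
        # Assumed policy: use a more aggressive gain while degraded.
        gain = 2.0 if supervisor.mode == "nominal" else 4.0
        position = inner_loop_step(position, target, gain, dt)
    return position, supervisor_ticks, supervisor.mode
```

Calling `simulate(240)` covers one simulated second: the inner loop runs 240 times while the supervisor is consulted only 10 times, which is the separation of concerns the hierarchical design relies on.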
Root-Cause Analysis Engine for Distributed Services
FastAPI backend ingesting logs, metrics, and deployment events across 10+ microservices. Normalized data pipeline standardizes 5+ heterogeneous telemetry formats (Prometheus, structured JSON, raw log streams). Root-cause engine combines temporal correlation, Z-score anomaly strength, and dependency graph traversal to rank multi-factor diagnoses — surfacing the correct fault in the top-3 results 87% of the time.
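A rough sketch of how the three signals could be combined into one ranking, under stated assumptions: the weighting formula, the 600-second correlation window, and all function and parameter names below are hypothetical and are not taken from the actual engine.

```python
from collections import deque
from statistics import mean, pstdev


def z_score(history, latest):
    """Anomaly strength: distance of the latest value from the mean, in std devs."""
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0
    return abs(latest - mean(history)) / sigma


def upstream_distance(deps, start):
    """Breadth-first traversal from the alerting service through its dependencies."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dep in deps.get(node, []):
            if dep not in dist:
                dist[dep] = dist[node] + 1
                queue.append(dep)
    return dist


def rank_candidates(alerting, deps, anomalies, event_lag_s, window_s=600.0):
    """Rank services reachable from `alerting`, highest combined score first.

    anomalies:   service -> z-score of its most anomalous metric
    event_lag_s: service -> seconds between its last deploy event and the alert
    """
    dist = upstream_distance(deps, alerting)
    scores = {}
    for service, z in anomalies.items():
        if service not in dist:
            continue  # not in the dependency chain of the alerting service
        lag = event_lag_s.get(service)
        if lag is None or lag < 0 or lag > window_s:
            temporal = 0.0
        else:
            temporal = 1.0 - lag / window_s  # recent events score higher
        graph = 1.0 / (1 + dist[service])  # closer dependencies score higher
        scores[service] = z * (0.5 + 0.5 * temporal) * graph
    return sorted(scores, key=scores.get, reverse=True)
```

For example, with `deps = {"api": ["auth", "db"], "auth": ["db"]}`, a strong anomaly on `db` plus a deploy one minute before the alert puts `db` ahead of the alerting `api` service, while an unrelated `cache` service is ignored however anomalous it looks.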
Conversational RAG Document Assistant · Full Stack
React/TypeScript frontend + FastAPI backend deployed on GCP Cloud Run, handling real active user sessions. LlamaIndex vector pipeline chunks and indexes PDF/TXT documents for semantic retrieval. Incremental embedding logic skips unchanged chunks on re-index — cutting re-index time by 45%. Ollama integration enables fully offline LLM inference, eliminating cloud API costs for local deployments. Sub-2-second query response on commodity hardware.
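One common way to skip unchanged chunks is to key each embedding by a hash of the chunk text. The sketch below shows that idea in plain Python, without LlamaIndex; the function names, the in-memory dictionary cache, and the `embed` callable are assumptions for illustration rather than the project's actual code.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def chunk_id(text: str) -> str:
    """Stable identifier for a chunk: SHA-256 of its UTF-8 text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def incremental_reindex(
    chunks: List[str],
    cache: Dict[str, List[float]],
    embed: Callable[[str], List[float]],
) -> Tuple[Dict[str, List[float]], int]:
    """Return the refreshed index and the number of chunks actually embedded.

    Chunks whose hash is already in `cache` reuse the stored vector;
    chunks that no longer appear in the document are dropped.
    """
    fresh: Dict[str, List[float]] = {}
    embedded = 0
    for text in chunks:
        key = chunk_id(text)
        if key in fresh:
            continue  # duplicate chunk within the same document
        if key in cache:
            fresh[key] = cache[key]  # unchanged: skip the embedding call
        else:
            fresh[key] = embed(text)
            embedded += 1
    return fresh, embedded
```

Re-indexing a document where only one chunk changed then costs a single embedding call, which is where the time savings on re-index come from.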
I'm an MS Computer Science student at UT Arlington (graduating Dec 2026) with 2+ years of professional backend engineering behind me — not side projects, but production Java and Apex services handling live enterprise workloads across a Salesforce ecosystem. I've resolved production incidents under 2-hour SLA windows, optimized queries cutting latency by 35%, and maintained 99.8% API uptime across 3 client deployments.
In parallel, I'm doing graduate research in UAV swarm coordination and fault-tolerant multi-agent control — building simulation frameworks and training MARL policies that hold up under GPS drift, sensor corruption, and communication degradation. The kind of work where the system has to keep flying even when things break.
I've also shipped two AI products independently: a RAG document assistant with real active users deployed on GCP, and an observability engine for distributed microservice diagnosis built at a hackathon. Both run on infrastructure I provisioned and deployed end-to-end.
I do my best work on hard technical problems with real constraints — bandwidth-limited links, multi-fault injection, production SLAs. If the system can't afford to just restart and retry, that's exactly the environment I want to be in.
Tools I've shipped production code with — grouped by domain.
Working on autonomous systems, distributed infrastructure, or hard problems in defense tech? I want to hear about it.