Seattle, Washington, United States
Built Golang/Python backend services for large-scale distributed systems on Kubernetes across AWS and GCP, supporting AI/ML workloads and large-scale
data processing.
Developed AI-driven automation, IDE/CLI agents, and RAG-based troubleshooting workflows with LLM observability/retrieval systems, reducing on-call
effort by ~70% and accelerating incident response.
Designed and implemented a traffic-splitting and mirroring platform with guardrails and internal UI tooling, enabling 900+ services to safely test
production changes.
Built an automated system to detect and clean up orphaned Kubernetes resources (persistent volumes, deployments), reducing cloud waste by ~20% and
improving cluster efficiency.
Deployed and optimized Karpenter on EKS with custom compute classes, reducing idle capacity by ~25% and improving resource utilization;
added observability and monitoring with Prometheus and Grafana.