Experience
2024 — Now
Remote
Lead engineer for large-scale LLM serving and GPU infrastructure powering tier-1 customers across multi-cloud Kubernetes platforms.
• Designed and operated large-scale LLM serving pipelines using disaggregated prefill/decode on multi-node H100 clusters (vLLM, Go, KubeRay), improving P99 latency by 60% and quadrupling throughput for mission-critical production workloads
• Built fair-share GPU scheduling across shared clusters using Kueue, KEDA, and fractional GPU partitioning, cutting idle GPU cost by ~$200K/month while maintaining 99.9% job success and cluster availability
• Designed and deployed multi-tenant, cloud-native Kubernetes platform (AppHub IDP) with GitOps (ArgoCD, Tekton), service mesh, and disaster recovery; cut deployment time from 2 days to 15 minutes across multiple engineering teams
• Standardized end-to-end observability for AI agents using OpenTelemetry, Prometheus, and high-cardinality metrics, enabling sub-second root cause analysis of distributed failures and improving incident MTTR
• Built MLflow-backed experiment and model infrastructure on Kubernetes (PostgreSQL, MinIO) as a shared platform, automating deployments via Go CLIs and Terraform across AWS, GCP, and Azure
• Contributed open-source patches to NVIDIA Dynamo, improving distributed inference throughput and memory efficiency on A100/H100 GPU clusters
• Led ISO/IEC 27001 and SOC 2 Type II compliance programs, architecting security controls across cloud infrastructure
2023 — 2024
Seattle, Washington, United States
• Developed and benchmarked distributed inference systems using Triton Inference Server, Kubeflow, and Kubernetes, achieving significant throughput improvements for LLM serving workloads in HPC environments
• Built RAG (Retrieval-Augmented Generation) pipelines using LangChain and LLMs, enabling semantic search over enterprise knowledge bases with sub-2-second query latency
• Deployed and managed multi-node GPU clusters using Slurm workload manager and HPCC systems, running MPI-based distributed training jobs with NVIDIA cuDNN and CUDA optimization
• Implemented end-to-end MLOps pipelines using Apache Airflow and MLflow, automating model training, validation, and deployment workflows on AWS and GCP
• Architected Nginx-based reverse proxy and service mesh configurations with Kubernetes, improving API gateway performance and enabling zero-downtime deployments
• Contributed to internal developer platform (IDP) tooling using Helm charts and YAML-based Kubernetes operators, reducing onboarding time for new engineering teams
2019 — 2022
Bengaluru, Karnataka, India
• Designed and maintained distributed microservices architecture using Java, Python, and Scala on Kubernetes, supporting enterprise clients across banking and financial services sectors
• Built and optimized CI/CD pipelines using Jenkins, Docker, and Terraform on AWS and Azure, reducing deployment time by automating infrastructure provisioning and release management
• Implemented container orchestration solutions using Kubernetes and Docker Swarm, migrating legacy monolithic applications to cloud-native microservices architecture
• Developed high-throughput data processing systems using Apache Airflow and SQL, handling large-scale ETL workflows for enterprise analytics platforms
• Engineered reliability improvements including Nginx load balancing, DNS configuration, and Single Sign-On (SSO) integrations, improving system uptime and security posture for client-facing applications
• Collaborated with cross-functional teams to deliver SOC 2-compliant software systems, implementing security controls and audit logging across distributed services running on AWS and Azure infrastructure
2021 — 2022
Bengaluru, Karnataka, India
Co-founded an ed-tech startup building an integrated, single-window application portal that streamlines student access to academic and career resources; contributed to the platform’s design, functionality, and market readiness.
• Drove product adoption by onboarding schools, colleges and universities across regions.
• Built and led a team of 20 across tech, operations, and outreach functions.
• Formed strategic partnerships with resource and software providers and onboarded an advisory board.
• Led core functions including team building, operational strategy, and technology development management.
• Led platform design, market readiness, and go-to-market planning for the student-facing application.
2018 — 2019
Vellore Area, India
Education
Stevens Institute of Technology
Master's degree
Vellore Institute of Technology