I build the infrastructure that powers frontier AI — from large-scale LLM serving on multi-node GPU clusters to cloud-native Kubernetes platforms handling mission-critical production workloads.

Experience

Multiscale AI, Senior Software Engineer

2024 — Now

Remote

Lead engineer for large-scale LLM serving and GPU infrastructure powering tier-1 customers across multi-cloud Kubernetes platforms.

Designed and operated large-scale LLM serving pipelines using disaggregated prefill/decode on multi-node H100 clusters (vLLM, Go, KubeRay), reducing P99 latency by 60% and increasing throughput 4x for mission-critical production workloads

Built fair-share GPU scheduling across shared clusters using Kueue, KEDA, and fractional GPU partitioning, cutting idle GPU cost by ~$200K/month while maintaining 99.9% job success and cluster availability

Designed and deployed a multi-tenant, cloud-native Kubernetes platform (AppHub IDP) with GitOps (ArgoCD, Tekton), service mesh, and disaster recovery; cut deployment time from 2 days to 15 minutes across multiple engineering teams

Standardized end-to-end observability for AI agents using OpenTelemetry, Prometheus, and high-cardinality metrics, enabling sub-second root cause analysis of distributed failures and reducing incident MTTR

Built MLflow-backed experiment and model infrastructure on Kubernetes (PostgreSQL, MinIO) as a shared platform, automating deployments via Go CLIs and Terraform across AWS, GCP, and Azure

Contributed open-source patches to NVIDIA Dynamo, improving distributed inference throughput and memory efficiency on A100/H100 GPU clusters

Led ISO/IEC 27001 and SOC 2 Type II compliance programs, architecting security controls across cloud infrastructure

Multiscale AI, Software Engineer Intern

2023 — 2024

Seattle, Washington, United States

Developed and benchmarked distributed inference systems using Triton Inference Server, Kubeflow, and Kubernetes, achieving significant throughput improvements for LLM serving workloads in HPC environments

Built RAG (Retrieval-Augmented Generation) pipelines using LangChain and LLMs, enabling semantic search over enterprise knowledge bases with sub-2-second query latency

Deployed and managed multi-node GPU clusters using Slurm workload manager and HPCC systems, running MPI-based distributed training jobs with NVIDIA cuDNN and CUDA optimization

Implemented end-to-end MLOps pipelines using Apache Airflow and MLflow, automating model training, validation, and deployment workflows on AWS and GCP

Architected Nginx-based reverse proxy and service mesh configurations with Kubernetes, improving API gateway performance and enabling zero-downtime deployments

Contributed to internal developer platform (IDP) tooling using Helm charts and YAML-based Kubernetes operators, reducing onboarding time for new engineering teams

Wipro Limited, Software Engineer

2019 — 2022

Bengaluru, Karnataka, India

Designed and maintained distributed microservices architecture using Java, Python, and Scala on Kubernetes, supporting enterprise clients across banking and financial services sectors

Built and optimized CI/CD pipelines using Jenkins, Docker, and Terraform on AWS and Azure, reducing deployment time by automating infrastructure provisioning and release management

Implemented container orchestration solutions using Kubernetes and Docker Swarm, migrating legacy monolithic applications to cloud-native microservices architecture

Developed high-throughput data processing systems using Apache Airflow and SQL, handling large-scale ETL workflows for enterprise analytics platforms

Engineered reliability improvements including Nginx load balancing, DNS configuration, and Single Sign-On (SSO) integrations, improving system uptime and security posture for client-facing applications

Collaborated with cross-functional teams to deliver SOC 2-compliant software systems, implementing security controls and audit logging across distributed services running on AWS and Azure infrastructure

Filvelop, Ex-Co-Founder

2021 — 2022

Bengaluru, Karnataka, India

Co-founded an ed-tech startup building an integrated, single-window application portal that streamlines student access to academic and career resources; contributed to the platform's design, functionality, and market readiness.

Drove product adoption by onboarding schools, colleges, and universities across regions.

Built and led a team of 20 across tech, operations, and outreach functions.

Formed strategic partnerships with resource and software providers and onboarded an advisory board.

Led core functions including team building, operational strategy, and technology development management.

Led platform design, market readiness, and go-to-market planning for the student-facing application.

Toastmasters International, Assistant Area Director Administration

2018 — 2019

Vellore Area, India

Education

Stevens Institute of Technology

Master's degree

Vellore Institute of Technology

Bachelor's degree