Experience
2025 — Now
Sunnyvale, CA
Building internal developer tooling on the Performance Infrastructure team at Meta — turning fragmented, manual workflows into self-service platform abstractions.
• Consolidated multiple fragmented benchmarking tools into a unified, self-service platform that abstracts away the underlying tooling complexity. Developers now specify the metrics they care about (latency, throughput, resource utilization), and the platform handles tool selection, orchestration, and execution — eliminating the need to babysit long-running benchmarks or hunt through messy logs for results
• Built persistent execution tracking across all underlying benchmarking tools — storing commands, configs, run metadata, and resulting metrics — making benchmarks fully reproducible and enabling teams to compare runs over time without manual log archaeology
• Designed the platform so the underlying benchmarking tools are transparent to users: a single interface in, clean metrics out, with full traceability of every run
Technologies: Python, distributed systems, internal developer platforms, developer tooling, developer productivity
2021 — 2025
Toronto, ON
• Reduced model serving memory footprint by 600MB per instance across 1,000+ production models by replacing TorchServe with a decoupled serving architecture that separated the serving application from the serving framework, improving modularity and eliminating unnecessary memory overhead.
• Designed and launched a distributed embeddings service for a RAG-based retrieval system by building a high-throughput, batched and asynchronous inference pipeline that transforms chunked documents into embeddings, enabling scalable semantic retrieval in production.
• Reduced infrastructure provisioning time from 10 days to 8 minutes by re-architecting a shared, PR-driven Terraform pipeline into a self-service CLI backed by a serverless control plane, enabling safe, on-demand infrastructure provisioning with guardrails and eliminating cross-team deployment bottlenecks.
• Improved ML development velocity and platform scalability by transitioning infrastructure ownership from research teams to a centralized platform and introducing standardized APIs and observability practices.
• Designed and implemented isolated, GDPR-compliant experimentation environments for ML engineers and data scientists, enabling safe and rapid ML experimentation on production-grade data without impacting production systems or violating data privacy requirements.
Technologies: Python, AWS, Terraform, Docker, ECS, Kubernetes, CI/CD, PyTorch, RAG, embeddings, inference serving, Infrastructure as Code
2020 — 2021
Toronto, Ontario, Canada
2018 — 2020
Montreal, QC
Built and owned the containerized compute platform at Plusgrade (travel industry fintech) that service teams used to deploy and run their applications. Designed the platform end-to-end — from the ECS-based container orchestration layer to the out-of-the-box observability stack — so that product teams could ship independently without managing infrastructure.
• Increased system availability from 99.5% to 99.99% by architecting the containerized compute platform on ECS with gated CI/CD pipelines, replacing monolithic EC2 deployments with a self-service, component-level deployment model
• Built the platform's observability layer from scratch: centralized logging with the EFK stack (Elasticsearch, Fluentd, Kibana) plus monitoring and alerting with Prometheus, Grafana, and Alertmanager, all provided as turnkey capabilities for every service team
• Led large-scale infrastructure modernization, migrating services from EC2-Classic to VPC and containerizing workloads on ECS
• Decomposed monolithic systems into microservices and introduced component-level CI/CD pipelines, enabling independent deployments and faster release cycles
Technologies: AWS (ECS, EC2, VPC), Docker, CI/CD, microservices, EFK Stack, Prometheus, Grafana, AlertManager, observability, distributed systems
2017 — 2018
Education
Arizona State University
Master of Science in Computer Engineering
2012 — 2014
Shiraz University
BS Computer Science and Engineering
2006 — 2011