Experience
2021 — Now
Building Apollo's Data Platform from the ground up for internal analytics.
● Tech lead for the AI Agents platform, a Python API service on Kubernetes (GKE). Built out observability for the service on Grafana with Prometheus, OpenTelemetry, logs, and alerts-as-code. Built out an LLM quality monitoring platform, including LLM tracing and an LLM regression framework; currently working on LLM evaluations
● Tech lead for real-time ingestion pipelines from production MongoDB into the data lake, merging 30M CDC events (720 GB) per hour into analytical storage using Apache Beam, Spark, and Iceberg
● Built and maintained pipelines that ingest 100 TB of production MongoDB data into the Snowflake data warehouse weekly, with no impact on production and at 1/10 the cost of a SaaS alternative, using Airflow and an in-house pipeline design
● Built and maintained infrastructure (Apache Airflow) to orchestrate data pipelines
● Built and maintained dev tools for the Data Platform team, including CI/CD, monitoring, and dev environments
2019 — 2021
San Francisco Bay Area
● Real-time system performance
● Data retention and legal hold on Quip documents
● Encryption performance
● Salesforce Permissions in Quip
● Rust R&D
2018 — 2018
San Francisco Bay Area
● Scaled Quip's Slack integration service horizontally
● Implemented a process for internal handling of customer issues
2016 — 2017
San Francisco Bay Area
Worked on multiple features of Doculus' document processing web app, including:
● A logger for Doculus' server that can log to the console, files, or Loggly
● Diff and merge for documents, showing changes between versions for better document management
● Find and Replace on documents
● Import and export of Microsoft Word documents to and from Doculus' platform
● UI Components
2016 — 2016
University of Michigan
Worked on the research project's Cloud team, responsible for collecting and storing data from manufacturing units. Contributions included:
● Set up a Kafka stream to collect data for the cloud cluster
● Stored the data in an HDFS cluster
● Visualized the Kafka stream in real time using InfluxDB and Grafana
Education
University of Michigan