Building at the intersection of engineering and machine learning.

Experience

2023 — Now

Mountain View, California, United States

Designed and implemented a multi-threaded, asynchronous data pipeline in Rust to replace the legacy C# implementation that processes billions of audit logs per day.

Created a spatial hardware inventory dataset and built a machine learning pipeline for failure detection, reaching an 80% true positive rate on an imbalanced dataset.

Harvard Medical School

Machine Learning Researcher

2022 — 2023

Cambridge, Massachusetts, United States

Developed an end-to-end, multi-modal data preprocessing and ML pipeline for image captioning classification. Published a thesis paper, supervised by Professor Gabriel Kreiman.

Created a custom image-text dataset, generated contextual embeddings from BERT and ResNet-18, and applied PCA dimensionality reduction to improve efficiency. Implemented training and inference with ML classifiers (SVM, Naive Bayes, DNN) on the compressed contextual embeddings, achieving 70% accuracy (competitive with SOTA) with a linear SVM, compared to 40.4% with static embeddings.

Stochastic

Machine Learning Engineer

2021 — 2021

Cambridge, Massachusetts, United States

Implemented post-training attention-head pruning (magnitude- and gradient-based) for a transformer pipeline. Benchmarked inference performance with ONNX Runtime on an AWS instance with Docker. Wrote a PyTorch-to-ONNX converter and backing for the FastT5 transformer pipeline, resulting in a 2x inference speedup.

Education

Harvard John A. Paulson School of Engineering and Applied Sciences

Master of Science - MS

Harvard University

Bachelor of Arts - BA

Massachusetts Institute of Technology