Software engineer with over 7 years of hands-on experience across product platforms, data platforms, machine learning infrastructure, and backend services.
Experience
2021 — Now
Greater Seattle Area
Domain owner of Venmo’s enterprise data platform, overseeing the processing of 2TB+ of data daily across databases, data lakes, and warehouses. As tech lead for data and machine learning infrastructure, I spearheaded the integration of machine learning capabilities into the Venmo app. My work supports the finance, risk, product, marketing, customer service, and AI domains, enabling data-driven decisions at scale and delivering reliable, intelligent product experiences.
Expertise:
• Data Platforms: Large-scale data processing, batch and streaming pipelines, data orchestration, migration, encryption, and access control
• Machine Learning Systems: Infrastructure development, feature pipelines, online/offline model integration, and end-to-end deployment
• Infrastructure & Reliability: Scalable design patterns, database proxy services, Infrastructure as Code (IaC), system optimization, monitoring frameworks, and automated resilience (self-detection and self-recovery error handling)
Skills: Python • AWS (EC2, EMR, EKS, MSK, DMS, Kinesis, S3, Lambda, Glue, SageMaker, Step Functions, SQS, SNS, CloudWatch) • Kafka • Kafka Connect • Change Data Capture (CDC) • ELT • ETL • Spark • Luigi • Snowflake • Databases • Terraform • AWS Cloud Development Kit (AWS CDK) • AWS CloudFormation • CI/CD • GitHub Actions • Helm • Docker • Kubernetes • Datadog • Cross-functional communication and collaboration • Agile project management • Mentoring
2020 — 2021
New York City Metropolitan Area
Designed and deployed a production-grade recommendation system for a food delivery platform, powering personalized bodybuilding meal plans with scalable, low-latency performance. Automated end-to-end machine learning workflows for data ingestion, feature engineering, model training, and deployment, reducing model update time from days to hours and boosting user engagement by 35%. Implemented a two-tier inference system for real-time personalization and continuous optimization through retraining pipelines.
Expertise:
• Recommendation Systems: Content-based and collaborative filtering, offline candidate generation, and real-time re-ranking
• Data Engineering: Automated data ingestion, batch and streaming pipelines, feature engineering, and centralized feature storage
• Machine Learning Infrastructure: Automated end-to-end workflows for full machine learning development cycle
• Machine Learning Operations & Deployment: Model governance, reproducible releases, CI/CD, scalable inference, and system monitoring
Skills: Python • TensorFlow • TensorFlow Recommenders • Amazon SageMaker • Amazon API Gateway • AWS Lambda • AWS Glue • Amazon Kinesis • DynamoDB • AWS Database Migration Service (DMS) • Change Data Capture (CDC) • PySpark • ETL • Feature Store • Databases • CI/CD • GitHub Actions • CloudWatch • Recommender Systems • Inference Systems
2019 — 2020
New York City Metropolitan Area
Designed and deployed an end-to-end machine learning platform to predict legal court decisions and analyze precedent influence from opinion texts. Developed scalable microservices for data ingestion, NLP preprocessing, and model inference, leveraging transformer-based embeddings and retrieval-augmented methods for semantic similarity search. Built and visualized large-scale legal citation networks to identify influential precedent cases and enhance legal research and decision prediction.
Expertise:
• Natural Language Processing & Retrieval: Text normalization, named entity recognition (NER), topic modeling, transformer-based contextual embeddings for semantic search and document clustering
• Machine Learning Infrastructure: Automated training, evaluation, retraining, and scalable inference with feature and vector store integration
• Data Engineering: Near-real-time ingestion, schema management, feature extraction, and workflow orchestration
• Graph Analytics & Visualization: Citation graph construction, reindexing, and centrality-based precedent analysis
Skills: Python • R • PyTorch • TensorFlow • Hugging Face • Tokenization • Lemmatization • Vectorization • Named Entity Recognition (NER) • Regular Expressions • FAISS • Elasticsearch • Neo4j • FastAPI • Docker • Kubernetes • AWS S3 • AWS Lambda • Apache Airflow • React • D3.js
2019 — 2019
New York City Metropolitan Area
Built and scaled a machine learning platform to automatically route customer complaints to the correct departments using natural language processing. Streamed complaint data in near real-time to enable continuous model updates and automated case handling. The system increased routing accuracy, improved customer satisfaction by 80%, and cut manual processing time by 88%.
Expertise:
• Natural Language Processing: Text preprocessing, regex-based extraction, vectorization, and classification
• Machine Learning: Automated end-to-end workflows for feature engineering, model training, evaluation, and deployment
• Data Engineering: Real-time ingestion and synchronization for continuous data availability
Skills: Python • Scikit-learn • TensorFlow • Keras • Naive Bayes • Logistic Regression • SVM • Random Forest • LSTM • AWS Database Migration Service (DMS) • Change Data Capture (CDC) • AWS S3 • Spark • Databases
2018 — 2019
Greater London, England, United Kingdom
Architected and delivered an automated machine learning platform that revolutionized social media branding and marketing analytics. The system identified key influencers, competitors, and events while performing sentiment and emotion analysis on large-scale multilingual data from social and online media. By automating end-to-end workflows for data ingestion, NLP processing, model training, evaluation, and visualization, it reduced manual effort from 2+ weeks to under 8 hours, enabling continuous, data-driven insights that enhanced campaign effectiveness and contributed to 43% global business growth.
Expertise:
• Natural Language Processing: Topic modeling, clustering, named entity recognition (NER), sentiment and emotion analysis across multilingual datasets
• Machine Learning: Automated pipelines for feature extraction, model training, evaluation, and deployment using classical and lexicon-based methods
• Data Infrastructure: Scalable data processing and orchestration supporting continuous analytics and prediction delivery
• Analytics & Visualization: Interactive dashboards showcasing sentiment trends, influencer networks, and event correlations for actionable marketing intelligence
Skills: Python • SpaCy • NLTK • Gensim • Scikit-learn • Spark • TF-IDF • Latent Dirichlet Allocation (LDA) • Named Entity Recognition (NER) • Logistic Regression • SVM • Random Forest • VADER • NRC Emotion Lexicon • Neo4j • Plotly Dash • Apache Airflow • AWS S3 • Amazon EMR • Docker
Education
Columbia University
Master's degree
The University of Manchester