Software engineer with over 7 years of hands-on experience across product platforms, data platforms, machine learning infrastructure, and backend services.

Experience

PayPal

Staff Software Engineer

2021 — Now

Greater Seattle Area

Domain owner of Venmo’s enterprise data platform, overseeing the processing of 2TB+ of data daily across databases, data lakes, and warehouses. As tech lead for data and machine learning infrastructure, spearheaded the integration of machine learning capabilities into the Venmo app. Supports the finance, risk, product, marketing, customer service, and AI domains, enabling data-driven decisions at scale and delivering reliable, intelligent product experiences.

Expertise:

Data Platforms: Large-scale data processing, batch and streaming pipelines, data orchestration, migration, encryption, and access control

Machine Learning Systems: Infrastructure development, feature pipelines, online/offline model integration, and end-to-end deployment

Infrastructure & Reliability: Scalable design patterns, database proxy services, Infrastructure as Code (IaC), system optimization, monitoring frameworks, and automated resilience (self-detection and self-recovery error handling)

Skills: Python • AWS (EC2, EMR, EKS, MSK, DMS, Kinesis, S3, Lambda, Glue, SageMaker, Step Functions, SQS, SNS, CloudWatch) • Kafka • Kafka Connect • Change Data Capture (CDC) • ELT • ETL • Spark • Luigi • Snowflake • Databases • Terraform • AWS Cloud Development Kit (AWS CDK) • AWS CloudFormation • CI/CD • GitHub Actions • Helm • Docker • Kubernetes • Datadog • Cross-functional communication and collaboration • Agile project management • Mentoring

Hauraki

Senior Software Engineer

2020 — 2021

New York City Metropolitan Area

Designed and deployed a production-grade recommendation system for a food delivery platform, powering personalized bodybuilding meal plans with scalable, low-latency performance. Automated end-to-end machine learning workflows for data ingestion, feature engineering, model training, and deployment, reducing model update time from days to hours and boosting user engagement by 35%. Implemented a two-tier inference system for real-time personalization and continuous optimization through retraining pipelines.

Expertise:

Recommendation Systems: Content-based and collaborative filtering, offline candidate generation, and real-time re-ranking

Data Engineering: Automated data ingestion, batch and streaming pipelines, feature engineering, and centralized feature storage

Machine Learning Infrastructure: Automated end-to-end workflows for the full machine learning development cycle

Machine Learning Operations & Deployment: Model governance, reproducible releases, CI/CD, scalable inference, and system monitoring

Skills: Python • TensorFlow • TensorFlow Recommenders • Amazon SageMaker • Amazon API Gateway • AWS Lambda • AWS Glue • Amazon Kinesis • DynamoDB • AWS Database Migration Service (DMS) • Change Data Capture (CDC) • PySpark • ETL • Feature Store • Databases • CI/CD • GitHub Actions • CloudWatch • Recommender Systems • Inference Systems

Columbia University in the City of New York

Software Engineer, Data and Machine Learning at ISERP

2019 — 2020

New York City Metropolitan Area

Designed and deployed an end-to-end machine learning platform to predict legal court decisions and analyze precedent influence from opinion texts. Developed scalable microservices for data ingestion, NLP preprocessing, and model inference, leveraging transformer-based embeddings and retrieval-augmented methods for semantic similarity search. Built and visualized large-scale legal citation networks to identify influential precedent cases and enhance legal research and decision prediction.

Expertise:

Natural Language Processing & Retrieval: Text normalization, named entity recognition (NER), topic modeling, and transformer-based contextual embeddings for semantic search and document clustering

Machine Learning Infrastructure: Automated training, evaluation, retraining, and scalable inference with feature and vector store integration

Data Engineering: Near-real-time ingestion, schema management, feature extraction, and workflow orchestration

Graph Analytics & Visualization: Citation graph construction, reindexing, and centrality-based precedent analysis

Skills: Python • R • PyTorch • TensorFlow • Hugging Face • Tokenization • Lemmatization • Vectorization • Named Entity Recognition (NER) • Regular Expressions • FAISS • Elasticsearch • Neo4j • FastAPI • Docker • Kubernetes • AWS S3 • AWS Lambda • Apache Airflow • React • D3.js

IBM

Data Scientist

2019

New York City Metropolitan Area

Built and scaled a machine learning platform to automatically route customer complaints to the correct departments using natural language processing. Streamed complaint data in near real-time to enable continuous model updates and automated case handling. The system increased routing accuracy, improved customer satisfaction by 80%, and cut manual processing time by 88%.

Expertise:

Natural Language Processing: Text preprocessing, regex-based extraction, vectorization, and classification

Machine Learning: Automated end-to-end workflows for feature engineering, model training, evaluation, and deployment

Data Engineering: Real-time ingestion and synchronization for continuous data availability

Skills: Python • Scikit-learn • TensorFlow • Keras • Naive Bayes • Logistic Regression • SVM • Random Forest • LSTM • AWS Database Migration Service (DMS) • Change Data Capture (CDC) • AWS S3 • Spark • Databases

Interpublic Group (IPG)

Machine Learning Engineer

2018 — 2019

Greater London, England, United Kingdom

Architected and delivered an automated machine learning platform for social media branding and marketing analytics. The system identified key influencers, competitors, and events while performing sentiment and emotion analysis on large-scale multilingual data from social and online media. Automating the end-to-end workflows for data ingestion, NLP processing, model training, evaluation, and visualization reduced manual effort from 2+ weeks to under 8 hours. The resulting continuous, data-driven insights enhanced campaign effectiveness and contributed to 43% global business growth.

Expertise:

Natural Language Processing: Topic modeling, clustering, named entity recognition (NER), and sentiment and emotion analysis across multilingual datasets

Machine Learning: Automated pipelines for feature extraction, model training, evaluation, and deployment using classical and lexicon-based methods

Data Infrastructure: Scalable data processing and orchestration supporting continuous analytics and prediction delivery

Analytics & Visualization: Interactive dashboards showcasing sentiment trends, influencer networks, and event correlations for actionable marketing intelligence

Skills: Python • spaCy • NLTK • Gensim • Scikit-learn • Spark • TF-IDF • Latent Dirichlet Allocation (LDA) • Named Entity Recognition (NER) • Logistic Regression • SVM • Random Forest • VADER • NRC Emotion Lexicon • Neo4j • Plotly Dash • Apache Airflow • AWS S3 • Amazon EMR • Docker

Education

Columbia University

Master's degree

The University of Manchester

Master's degree

Master of Science - MS