New York, New York, United States
• Developed media annotation pipelines that ingested up to 100,000 videos per day using Python, AWS Lambda, Step Functions, SQS, DynamoDB, MySQL, and Snowflake. Built accompanying service integration APIs in TypeScript.
• Architected 2 new API integrations with commodity ML providers and integrated 7 new proprietary models.
• Introduced AWS SageMaker into the media annotation pipeline, reducing time-to-value for new proprietary models from weeks to days.
• Diagnosed and fixed scaling issues in distributed systems, helping meet the goal of doubling capacity every year.
• Launched a serverless analytics pipeline for a new flagship product that handled 10 million rows of data per day.
• Championed a testing culture by introducing end-to-end testing for data pipelines and directing work to increase unit test coverage from 23% to 63%.
• Sped up a key ad analytics ingestion pipeline by 35% through maximizing I/O utilization.