Led the development of a low-latency Django REST API to filter and serve 18M points of interest (POIs).
Designed software to orchestrate tens of millions of requests to external APIs while respecting rate limits. Turned this design into a reusable template and documentation that teammates used for other projects.
Built a system to assign a single logical ID to multiple records of the same place from different sources.
Ingested OpenStreetMap data, transforming billions of nodes and ways into millions of usable POIs.
Created a dashboard with React, MapLibre, and Visx to visualize the state of the company’s datasets.
Spearheaded updates to the POI schema and a transition to Parquet format, reducing S3 usage by 30%.
Implemented schema checks and golden-data tests to detect errors and unexpected changes in data.
Rewrote a Glue job used in most pipelines, making it run 10x faster, reducing costs, and removing a bottleneck.
Implemented a distributed system, using Kafka and AWS Batch, to concatenate millions of S3 objects in parallel.
Refactored the team’s infrastructure-as-code to support the addition of a parallel production AWS account.
Presented multiple times at company forums on infrastructure, developer experience, and global content, to an average audience of 50 staff, including engineers and other stakeholders.