Experience
2022 — Now
New York, New York, United States
Currently working on the iCloud storage platform.
2021 — 2022
New York, United States
Technical lead for a project adding support for user-defined data types to Meta's data warehouse. This work focused on the following areas:
• Design of the abstract type system.
• Implementation of the reference type checker.
• Integration of UDT support into storage and query processing infrastructure.
• Development of supporting metadata infrastructure.
• Cross-functional engagement with data engineering teams to drive adoption.
• Design of a governance system for type registries.
2020 — 2021
Technical lead, data pipeline. Architected and led the implementation of the data pipeline through which the bulk of BuzzFeed's data moves. This system is a message bus together with HTTP endpoints; data enrichment, validation, and routing services; configuration and schema registries; and archival and disaster recovery layers. It serves as the data plane for most ETL workflows, and is the primary IPC mechanism for ~600 microservices. This new pipeline increased capacity by 10x; introduced strong typing of data (requiring a gradual migration of existing data streams); centralized and automated configuration management; and leveraged a new system topology to allow serverless implementations of most services.
2018 — 2020
Designed and led the implementation of a system for low-latency import into a data warehouse. The legacy system it replaced was a batch process that suffered from multi-hour latency and significant reliability issues. The new system approximated real-time inserts through the use of microbatches imported on a rolling basis, achieving a data latency of 2 minutes. Careful use of queueing disciplines, write-ahead logs, and an exhaustively testable state machine representation of the loading process allowed the system to handle 3M loadable artifacts per day with a 99.5% latency SLA.
2017 — 2018
Technical lead, native advertising optimization. This team built recommender systems for ad targeting that leveraged both user browsing histories and content similarity models.