San Diego, California, United States
• On-Prem Druid to SaaS Migration: Spearheaded migration of real-time analytics platform to Imply SaaS, covering infrastructure provisioning, codebase adaptation, observability setup, and stakeholder engagement—ensuring zero downtime and smooth user onboarding.
• Schema Evolution Governance: Drove cross-org initiative to modernize schema evolution with governance guardrails. Led system design, stakeholder alignment, and mentored engineers on schema registry integration and evolution patterns.
• Sub-Second Latency Redesign: Architected and deployed low-latency streaming solutions using Apache Flink, Kafka (MSK), and Druid, reducing average query time from 9s to 0.3s (96% improvement) and cutting costs by 98% through Druid-native optimization.
• Streaming Framework Modernization: Rebuilt core data ingestion platform using Apache Flink, Kafka (MSK), EKS, and S3, deprecating Oracle and enabling open table formats (Iceberg) on Snowflake for cross-platform interoperability. The pattern enabled other services to leverage Snowflake as backend datastore and saved ~$540k on Oracle licensing and ~$360k annual operational cost.
• Databricks Adoption: Led the evaluation and POC of Databricks vs. EMR for batch workloads. Designed migration plan and migrated critical EMR pipelines. Converted 90% of the raw tables from parquet format to interoperable format delta lake. This resulted in a cost savings of ~$350k/year.