Seattle, Washington, United States
Amazon Elastic MapReduce (EMR) Serverless
Founding engineer of Amazon EMR Serverless, a new Amazon EMR offering that lets clients set up an EMR cluster 75% faster and eliminates the need to set up, maintain, or configure EC2 hosts or EKS clusters.
• Architected and built a multi-tenant, distributed backend service that serves as the resource manager for the workers of clients’ distributed applications (i.e., Spark and Hive), with >99.9% availability.
• Designed, implemented, and launched EMR Serverless’ custom images, enabling customers to install and configure packages optimized for their workloads and integrate with current CI/CD best practices. 100+ customers successfully submitted 5,000+ jobs within the first two months of launch.
• Improved container start time by reducing Docker image pull times by 90%. Worked cross-functionally with other AWS service teams and the Amazon Linux team to upgrade a core Docker dependency to achieve this.
• Implemented pre-initialized capacity support, a feature to reduce application start time, on EMR Serverless’ resource manager service.
• Collaborated with downstream services to overcome capacity constraints brought on by heavy EMR Serverless users (those consuming at least 4,000 vCPUs concurrently).
• Led the implementation of operational metrics collection and monitoring dashboards in parts of EMR Serverless to improve debugging and proactively identify issues.