Experience
2019 — Now
I am a DevOps consultant and advise startups on best practices for designing and building data and metadata infrastructure for analytics.
Specialties:
AWS best practices: VPCs, subnetting, security groups, AWS Organizations.
Building batch and streaming data analytics pipelines with Kafka, Samza, Spark, Gobblin, and Flume.
Metadata Frameworks
Clients
LilyAI
Sagetap
NewtonX
2017 — Now
San Jose, California
Defining the charter for ML Infra Foundation teams by working across training and inference teams to build SRE principles across the stack.
Autoscaling inference infrastructure for safely ramping ML models during A/B tests.
Defining ML model health metrics (data drift across features, retraining interval, etc.).
Defining and implementing vendor-agnostic (Nvidia, AMD) GPU observability for training and inference workloads.
SRE tech lead for Apache Gobblin (batch + streaming), LinkedIn's data ingestion framework, which ingests petabytes of data into Hadoop to drive the analytics and ML models powering the LinkedIn site.
Primary Projects:
Real-time analytics querying for infrastructure services.
Apache Gobblin: designed and implemented a real-time monitoring and alerting framework for batch and streaming data ingestion pipelines.
Integrated data ingestion pipelines with DataHub (a generalized metadata search and discovery tool) for auto-remediation systems.
Obfuscated LinkedIn offline member data to comply with GDPR.
Mentored engineers and interns, advising on best practices for deploying at scale.
Lead of the Data Model Review Committee (DMRC), advising on best practices in schema design.
Speaker:
DataWorks Summit - Washington, DC 2019: Obfuscating LinkedIn Member Data
Data Summit - Boston 2019: Tackling Data Ingestion Challenges at LinkedIn Scale
Microsoft/LinkedIn SRE Con - Seattle/San Jose 2019: Data Pipeline Monitoring Using Metrics
Courses: LinkedIn AI 200
2015 — 2017
Mountain View
Elementum SCM: Senior Site Reliability Engineer
As the first SRE in the organization, my accomplishments included:
Built highly scalable, fault-tolerant infrastructure for massive data ingestion/cleansing pipelines based on the Lambda architecture, involving Spark, Kafka, ZooKeeper, Redis, Elasticsearch, and HBase.
Implemented data pipeline orchestration with Airflow for Spark jobs.
Designed and implemented pipeline infrastructure leveraging AWS services: SQS, CloudFormation, and Lambda.
Automated deployment for microservices running under AWS Auto Scaling.
Firefought infrastructure issues on a regular basis via monitoring, alerting, and custom tools.
Developed tools to automate monitoring and troubleshooting, which were leveraged by other teams.
Extended the automation environment through Chef recipes and Jenkins scripts.
Migrated monolithic services to a microservices architecture in Docker containers.
2012 — 2015
• Hands-on experience resolving performance bottlenecks at various levels: the Informatica ETL application, platform resources, open-source drivers, network, and databases (Linux/Unix/Windows).
• As the SaaS cloud team liaison, worked on components of the Informatica Cloud Agent and cloud connectors such as Amazon Redshift, S3, Salesforce, and Microsoft Dynamics.
• B2B Data Exchange (DX) product specialist; debugged core components including the DX server, active message queues, managed file transfers (MFT), ETL, and endpoint integration.
• Developed tools and scripts to automate configuration changes and troubleshoot issues.
• Developed diagnostic tools to optimize troubleshooting and derive faster root-cause analysis and fixes.
• Worked on Hadoop cluster configuration and Informatica HDFS PowerExchange connectors on Hadoop distributions such as Cloudera and Hortonworks.
• Served as a cross-functional point of contact with the core services team, providing recommendations based on product usage in live production environments.
Education
Stony Brook University
MS
2006 — 2007
Anna University Chennai
B.E
2002 — 2006