• Architected a Log Event Completeness Ranking and Replay System (US patent published)
that:
• Measures log event completeness at the application host level and rolls it up to the data center level.
• Provides percentile completeness rates.
• Ranks log files in order of completeness impact and determines the replay candidate to
achieve the target completeness value.
Using Kafka, MapReduce, Python, shell scripts, and Argus.
• Integrated a distributed tracing system (Zipkin) into Salesforce, supporting more than one
billion transactions per day. Using Zipkin, Elasticsearch, and Java.
• Leading the design and implementation of a Kafka Pipeline Fidelity Check System, providing
real-time completeness at the topic and consumer group level. Using Kafka, Postgres, and Java.
• Applied machine learning to anomaly detection during software releases. Using
Elasticsearch and Kibana.
• Worked on the physical delete framework, increasing deletion throughput.
• Created a data replication latency monitoring system.
• Writing detailed test plans and test cases to cover business use cases, error handling, and
boundary conditions as defined in technical specifications.