Experience
2025 — Now
San Francisco Bay Area
Working on bringing AI to media creation
2019 — 2025
San Francisco Bay Area
• Served as tech lead and co-owned the long-term roadmap for the VoD quality-of-experience domain.
• Led the design of the dynamic manifest packager, a foundation for advanced video features including reduced round-trip time for video playback and geo-regional optimization, ultimately lowering time-to-first-frame and stall ratio.
• Spearheaded the investigation of video quality metrics, drove alignment on the video quality observability roadmap, and implemented the initial foundation for accurate video quality measurement.
• Led the design of an in-house end-to-end test framework that reduced the number of high-impact incidents (Sev 2 or above).
• Served as the media-infra point of contact, driving alignment on multiple VoD projects across orgs.
2018 — 2019
San Francisco Bay Area
Designed and developed internal streaming data pipelines using Oracle GoldenGate, Apache Kafka, and Apache Storm. These pipelines serve hundreds of use cases that require low-latency data, such as online machine learning and reporting.
• Designed and developed highly available streaming data pipelines with one-minute latency from site databases to Kafka topics.
• Led the design and development of metrics and alerting for the streaming data pipelines.
• Led the design and development of inline validation that proves zero data loss in the pipelines.
• Developed parallel data-hydration transformations in Storm to enrich data streams, scaling to tens of thousands of messages per second.
• Designed and developed schema evolution using Apache Avro to handle inline schema changes.
• Designed and developed the checkpointing mechanism to ensure data consistency in all cases.
2015 — 2018
San Francisco Bay Area
Worked on multiple data pipeline projects inside PayPal that support data integration across different analytics systems and are used by hundreds of data engineers and analysts.
• Designed and developed a mini-batch replication system that replicates 1 TB of data daily.
• Used Apache Spark to format over 5 TB of data in HDFS based on downstream requests.
• Developed and supported a batch replication system that replicates 15 TB of data daily across several heterogeneous databases.
2014 — 2015
Los Angeles Metropolitan Area
Teaching assistant for ITP-104 (Web Publishing), which covers HTML, CSS, and JavaScript, along with basic Dreamweaver and Photoshop skills.
• Ran a weekly open-lab section to offer technical help to students.
• Helped 100+ students solve coding problems during lectures.
Education
University of Southern California
Bachelor of Science (BS), Computer Science
2011 — 2015