Accelerates product launches in industrial applications and scientific discovery for billions of users, empowered by better infrastructure and data-driven decisions.
Website: https://hcxu.me
Google Scholar: https://scholar.google.com/citations?user=h21r4gwAAAAJ
GitHub: https://github.com/collvey
Experience
2024 — Now
• Built an internal agent-driven experiment analysis system that automatically queries, ranks, and summarizes historical launch proposals, cutting manual analysis time and enabling faster, more consistent decision-making across teams
• Designed and launched a client-to-server ad invalidation framework that filters already-invalid ads (e.g., app already installed) early in retrieval, delivering 0.1% to 0.2% incremental Facebook ads revenue at scale
• Led multiple ads retrieval infrastructure optimizations, including inactive ad filtering and storage redesign, that reduced CPU and storage usage by up to 98%, saving roughly $800K in annual compute cost while improving system efficiency
2023 — 2024
San Bruno, California, United States
• Led YouTube Android client-side video playback and download efforts across YouTube platforms to improve playback latency, app system health, video download reliability, and engagement
• Launched an Android streaming response caching strategy that improved the fast Shorts livestream transition rate by 22.5pp and contributed to an 11.5% increase in Shorts livestream playbacks
• Recognized as a Google Perfy Gold Awardee (66 total Google-wide) in 2024 Q3, an award for exemplary methodology and results on high-value performance and efficiency issues (latency, compute, memory, spindles)
2019 — 2023
San Bruno, California, United States
YouTube:
• Privacy Compliance: Designed and deployed a client-side stream directory migration to fix a security vulnerability for over 2 billion users
• Download Efficiency: Resolved critical issues that caused stalled video downloads, achieving an 83% reduction in video downloads stuck for over 3 minutes by fine-tuning retry strategies
• User Experience: Reduced video download cancellations by 20% by resolving a multi-device casting race condition
• App Stability: Enhanced YouTube Music Android app stability, reducing total crashes by 15.6% by downloading channel thumbnails via HTTPS
• Database-Cache Consistency: Improved download performance with a 2.2% reduction in download failures by ensuring video download database-cache consistency
• Database IO Efficiency: Reduced user ANR (app not responding) events by 5% by eliminating redundant data encryption and decryption processes during database operations
• Data Integrity: Implemented a clean-up strategy that led to a 49% decrease in corrupt downloaded streams
Google Research (20% project):
• Collaborated with interdisciplinary teams of computational chemists, statisticians, and software developers from research institutes and an AI-driven drug discovery company to develop a novel active learning algorithm and improve performance in the small-molecule hit-to-lead conversion flow
2013 — 2018
College Park, MD
SimuΧ – A Computer Model to Simulate Soft Bio-hydrogel Material 2014 – 2018
Molecular Computational Lab, University of Maryland, College Park
• Proposed a state-of-the-art computer model for a biomaterial used in artificial skin synthesis, reducing the computational cost of simulations 20-fold
• Developed analytical applications to identify crosslinking patterns of the chitosan polymer network using a clustering algorithm implemented in Python, revealing 3D molecular structural properties
• Collaborated with BioChip groups to characterize hydrogel deformation behavior based on a 2 TB simulation dataset and improve the understanding of biofilm fabrication
WEPPRO – A Highly Efficient Peptide Model with Optimized Drude Dipoles 2013 – 2018
Molecular Computational Lab, University of Maryland, College Park
• Modeled Water-Explicit Polarizable Proteins (WEPPRO) to obtain unbiased protein structures, a critical prerequisite for the design of drugs targeting Alzheimer’s disease
• Simulated four million frames of molecular trajectories and unraveled the underlying molecular mechanisms of self-assembly processes via statistical algorithms implemented in Python
• Created molecular images featured on the cover of a peer-reviewed scientific journal using the ray tracing system in Visual Molecular Dynamics (VMD)
WEPMEM – An Optimized Lipid Model to Simulate Bilayer Dynamics 2012 – 2015
Molecular Computational Lab, University of Maryland, College Park
• Developed the Water-Explicit Polarizable Membrane Model (WEPMEM) for lipids and cholesterol related to coronary artery disease, enabling highly efficient computational simulations
• Investigated lipid aggregation using machine learning algorithms implemented with Python scikit-learn and automated the analysis of the simulation dataset
• Optimized a job management system to automate WEPMEM simulation setups on high-performance computing platforms using bash/csh
2011 — 2012
Greater Nanjing Area
• Investigated large-scale genome data to gain insights into type-II diabetes using BLAST, ClustalX2, and Python (Biopython)
• Automated data retrieval from a genome database using Python scripts, SQL, and the provided API, improving the efficiency of data collection
Education
University of Maryland
Doctor of Philosophy (PhD)
2012 — 2018
Nanjing Normal University
BS
2008 — 2012