Experience
2025 — 2025
San Francisco, CA
• Designed and deployed a probabilistic ETA prediction model using DoorDash’s NextGen architecture, replacing single-point baselines with non-parametric distribution-based forecasts to capture uncertainty.
• Led multiple iterations of Base Layer model experiments, improving MAE by 3% and CRPS by 7% over production benchmarks.
• Explored and evaluated 20+ modeling strategies and distance-aware loss functions for predicting non-parametric distributions to balance accuracy, long-tail performance, and reliability at scale.
• Optimized Decision Layer model parameters through grid search, raising On Time Accuracy by 0.41% and reducing 20-min Lateness by 3.1% without degrading other business metrics.
• Delivered two superior model candidates launched as online experiments serving 5M+ daily orders, now progressing toward full production rollout.
2024 — 2024
• Led full-stack development for a RAG-based conversational search engine and an LLM-driven Slack Bot, enabling real-time intelligent query handling for enterprise users.
• Developed an LLM-based Slack Bot that delivers context-aware responses drawn from search results, chat history, and file attachments.
• Enhanced Lepton Search by integrating SearXNG as the search backend and incorporating WizardLM and Llama3.
• Optimized query concurrency and data pipelining, reducing average LLM response times while supporting concurrent user queries.
• Containerized Slack Bot server with Docker and deployed both SearXNG backend and Slack App on Kubernetes.
2023 — 2024
Toronto, Ontario, Canada
• Pioneered a real-time BCI signal classification pipeline, enabling rapid detection and visualization of neural responses with minimal latency.
• Extended OpenBCI codebase to handle real-time signal data ingestion and filtering from OpenBCI headsets.
• Crafted a CNN-based model in ONNX to classify P300 wave signals with 82% accuracy.
• Incorporated the classifier into the OpenBCI GUI, delivering live feedback within 100 ms.
2022 — 2023
Markham, Ontario, Canada
• Spearheaded development and integration of advanced Computer Vision pipelines (YOLOv7) and Neural Rendering (Nerfstudio) into full-stack solutions, boosting detection and interactive 3D capabilities.
• Adapted Nerfstudio into a Flask–React web interface.
• Developed and fine-tuned a YOLOv7-based custom hand detector for motion-blurred videos.
• Integrated legacy PyTorch code with MindSpore.
• Led data collection, hyperparameter tuning, and environment setup for projects using YOLO, MANO, ViT, and NeRF.
2021 — 2021
Toronto, Ontario, Canada
• Contributed to a distributed inference framework for deep learning models using PyTorch and Node.js, focusing on partition algorithms, performance optimization, and scalability.
• Designed and implemented a partition algorithm for convolution layers in distributed YOLOv5 inference.
• Applied linear scheduling to distributed inference tasks, cutting overall latency by 41% on various models.
• Collaborated on a Node.js-based orchestration module with Prof. Li and Ph.D. students, ensuring seamless scaling and communication for large-scale AI workloads.
Education
UC San Diego
Master of Science - MS
University of Toronto
Bachelor of Applied Science - BASc
Nanjing Foreign Language School