Sunnyvale, California, United States
• Lead engineer supporting large-scale post-training for frontier LLMs and emerging multimodal models.
• Partnered with research teams to train early-stage multimodal models combining vision, language, and action-based inputs, enabling experiments in grounded reasoning and embodied interaction.
• Developed training pipelines that integrate visual encoders, cross-attention fusion layers, and instruction-tuned language heads, optimizing for sample efficiency and multi-sensor alignment.
• Led reinforcement learning from human feedback (RLHF) workflow development for multimodal systems, including preference data generation for vision-language tasks (image reasoning, spatial queries, action prediction).
• Developed curriculum-based sampling and multimodal reward-model training strategies that improved robustness on tasks involving perception, visual QA, and trajectory evaluation.
• Designed distributed replay buffers and sampling policies for long-horizon action sequences, improving convergence in tasks involving temporal dependencies and sequential decision-making.
• Supported tensor, pipeline, and data parallelism (TP/PP/DP) for large-scale multimodal and LLM training runs, focusing on reliability, reproducibility, and stable multi-view data loading.
• Collaborated on model merging, evaluation pipelines, and safety-alignment experiments for multimodal policy models.