• Led the network repair operating model, shifting network operations from reactive ticket triage to structured diagnosis workflows, increasing actionable repairs from 10% to 50% and establishing the standard across all datacenters
• Designed and deployed an AI-native ticket routing system that replaced 1,200 lines of code and improved accuracy from 67% to 95%, including a self-improving pipeline that proposes new rules and opens its own code reviews
• Delivered repair automation across 68K+ devices, 40K+ optics, and 130K memory replacements, reducing maintenance cycles and avoiding $4.5M in operational labor costs
• Built Model Context Protocol (MCP) tools and a shared diagnostics library powering AI diagnostic agents across platforms
• Established the repair platform health scorecard with concrete SLOs adopted across adjacent teams
• Mentored engineers and expanded team ownership into AI, datacenter, and backbone network domains