I designed and engineered Stats.Fan, a modern MLB analytics platform built on a proprietary data pipeline that makes 125+ years of baseball history fun and interactive through live dashboards, trivia, and player comparisons.
🔧 Highlights
Built a full-stack data infrastructure:
• Created a data lake of 20,000+ player HTML snapshots in Cloudflare R2 object storage.
• Designed a refinery and scraping engine (using Cheerio) to normalize messy historical data.
• Structured a MongoDB-based data warehouse with clean, queryable player documents.
• Augmented player documents with MLB Stats API data (career, season, and live stats).
• Deployed a custom CDN for player headshots tied to normalized player IDs.
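To give a flavor of the normalization work, here is a minimal sketch rather than the production code. The row shape, the slug-style ID scheme, and the `cdn.example.com` host are all assumptions made for illustration; the real refinery reads HTML snapshots with Cheerio, which this sketch leaves out so it stays self-contained.

```typescript
// Hypothetical shape of one raw season row pulled from an HTML snapshot.
interface RawSeasonRow {
  year: string;
  team: string;
  avg: string; // e.g. ".312", "", or "-"
  hr: string;
}

// Hypothetical normalized season stored inside a player document.
interface SeasonStats {
  year: number;
  team: string;
  avg: number | null;
  hr: number | null;
}

// Placeholder CDN host (assumption, not the real one).
const CDN_BASE = "https://cdn.example.com";

// Parse one stat cell; blanks, dashes, and non-numeric text become null.
function parseStat(cell: string): number | null {
  const trimmed = cell.trim().replace(/,/g, "");
  if (trimmed === "" || trimmed === "-" || trimmed === "--") return null;
  const value = Number(trimmed);
  return Number.isFinite(value) ? value : null;
}

// Build a stable slug-style ID from a display name (assumed scheme):
// strip diacritics, lowercase, and collapse other characters to hyphens.
function normalizePlayerId(name: string): string {
  return name
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Convert a raw row into a clean, typed season record.
function normalizeSeason(row: RawSeasonRow): SeasonStats {
  return {
    year: Number(row.year.trim()),
    team: row.team.trim(),
    avg: parseStat(row.avg),
    hr: parseStat(row.hr),
  };
}

// Headshot URL keyed by the normalized player ID.
function headshotUrl(playerId: string): string {
  return `${CDN_BASE}/headshots/${playerId}.jpg`;
}
```

Keying headshots by the same normalized ID as the player documents means the frontend can derive an image URL from any record without a second lookup.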
Developed a modern, interactive frontend:
• Built with React and ECharts.
• Features include player comparison tools, a trivia generator, and interactive stat widgets.
• Optimized for performance and SEO with clean markup and server orchestration.
Designed orchestration and deployment tooling:
• Built CLI and shell scripts for rsync-based deployments to a DigitalOcean droplet.
• Built GUI tools for scraper configuration, ingestion monitoring, and API response testing.
• Designed a modular scraper architecture with GraphQL-style parameterization.
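One way to read "GraphQL-style parameterization" is that the caller names the fields it wants and the scraper runs only the matching extractors. The sketch below illustrates that idea under stated assumptions: the field names, the `ParsedPage` stand-in, and `scrapeFields` are hypothetical and are not taken from the actual codebase.

```typescript
// Stand-in for a parsed player page; a real scraper would read HTML instead.
type ParsedPage = Record<string, string | undefined>;

// An extractor knows how to produce one field from a page.
type Extractor = (page: ParsedPage) => unknown;

// Registry of available fields (all names here are hypothetical).
const extractors = new Map<string, Extractor>([
  ["name", (page) => page["name"]?.trim() ?? null],
  [
    "debutYear",
    (page) => {
      const year = Number(page["debut"]);
      return Number.isInteger(year) ? year : null;
    },
  ],
  [
    "positions",
    (page) =>
      (page["positions"] ?? "")
        .split("/")
        .map((p) => p.trim())
        .filter((p) => p.length > 0),
  ],
]);

// Run only the requested extractors, the way a GraphQL query selects fields.
function scrapeFields(
  page: ParsedPage,
  fields: string[],
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const field of fields) {
    const extractor = extractors.get(field);
    if (extractor === undefined) {
      throw new Error(`Unknown field: ${field}`);
    }
    result[field] = extractor(page);
  }
  return result;
}
```

Calling `scrapeFields(page, ["name", "positions"])` returns just those two keys, so adding a new stat means registering one extractor rather than touching every caller.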
End-to-end engineering ownership:
• Owned every layer, from ingestion pipelines and data normalization to frontend UX and production deployment.