• Designed and deployed AI services that integrate LLMs, retrieval-augmented generation (RAG), embeddings, and vector databases to deliver real-time financial insights, predictive recommendations, and automated client advisory workflows.
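The core of a RAG service like the one above is embedding-based retrieval followed by prompt assembly. A minimal sketch of that loop, using a toy bag-of-words embedding and in-memory cosine search as stand-ins for a real embedding model and vector database (the vocabulary, documents, and function names here are illustrative, not from the original):

```python
import math

# Toy embedding: bag-of-words over a fixed vocabulary. A production
# service would call an embedding model instead; this stand-in exists
# only so the retrieval logic below is runnable.
VOCAB = ["revenue", "risk", "dividend", "growth", "portfolio"]

def embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by cosine similarity to the query embedding."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble retrieved context and the question into an LLM prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "Quarterly revenue grew 12 percent on portfolio gains.",
    "Dividend payout ratio held steady at 40 percent.",
    "Risk exposure in emerging markets increased.",
]
print(build_prompt("How did revenue and portfolio growth look?", docs))
```

In a real deployment the vector search would run against a dedicated store and the assembled prompt would be sent to the LLM; the retrieval-then-prompt shape stays the same.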
• Architected scalable machine learning data pipelines with Apache Spark, AWS Glue, and Kafka, enabling continuous ingestion, feature engineering, and low-latency model inference on large-scale streaming datasets.
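The feature-engineering step in a streaming pipeline typically computes windowed aggregates per event. A pure-Python stand-in for the kind of rolling-window logic the Spark/Kafka pipeline would express (the field names and count-based window are illustrative assumptions):

```python
from collections import deque

def rolling_features(stream, window: int = 3):
    """Yield per-event rolling-window features over a numeric stream.

    Stand-in for the windowed aggregations a Spark Structured Streaming
    job would run; the window here counts events rather than time to
    keep the sketch self-contained.
    """
    buf = deque(maxlen=window)
    for price in stream:
        buf.append(price)
        yield {
            "price": price,
            "rolling_mean": round(sum(buf) / len(buf), 4),
            "rolling_min": min(buf),
            "rolling_max": max(buf),
        }

feats = list(rolling_features([10.0, 12.0, 11.0, 13.0]))
print(feats[-1])  # last event aggregates over the window [12.0, 11.0, 13.0]
```

The same shape (keyed window, aggregate, emit feature row) carries over directly to a time-based window in Spark, with Kafka supplying the input stream.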
• Developed and containerized microservices for model serving and inference with Docker and Kubernetes, backed by CI/CD pipelines, achieving 99.9% system uptime, rapid iteration cycles, and cost-efficient deployment at scale.
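Uptime targets like 99.9% usually rest on replicated pods plus health probes so Kubernetes can route around and restart failing replicas. A minimal Deployment sketch illustrating that setup (the service name, image path, port, and resource figures are placeholders, not from the original):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-serving            # hypothetical service name
spec:
  replicas: 3                    # multiple replicas back the uptime target
  selector:
    matchLabels:
      app: model-serving
  template:
    metadata:
      labels:
        app: model-serving
    spec:
      containers:
        - name: inference
          image: registry.example.com/model-serving:latest  # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:        # keep traffic off replicas that are not ready
            httpGet:
              path: /healthz
              port: 8080
          livenessProbe:         # restart a replica that stops responding
            httpGet:
              path: /healthz
              port: 8080
          resources:
            requests: {cpu: "500m", memory: "1Gi"}
            limits: {cpu: "1", memory: "2Gi"}
```

A CI/CD pipeline would build and push the image, then roll this Deployment forward, giving the rapid iteration cycles described above.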
• Implemented observability, monitoring, and experiment tracking solutions with Prometheus, Grafana, and MLflow to measure key performance indicators (latency, drift, cost), improve model reliability, and support audit readiness.
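Drift monitoring of the kind described above is often reduced to a single scalar per feature or score distribution. A minimal sketch using the Population Stability Index (PSI), a common drift metric; the bin proportions below are illustrative, not real monitoring data:

```python
import math

def psi(expected: list[float], actual: list[float]) -> float:
    """Population Stability Index between two binned distributions.

    `expected` and `actual` are per-bin proportions (each sums to 1).
    Rule of thumb: PSI < 0.1 suggests little drift, > 0.25 significant
    drift. A minimal stand-in for the drift checks a monitoring stack
    would export to Prometheus or log to MLflow.
    """
    eps = 1e-6  # floor proportions to avoid log(0) on empty bins
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total

baseline = [0.25, 0.25, 0.25, 0.25]   # training-time score distribution
today    = [0.40, 0.30, 0.20, 0.10]   # live score distribution (shifted)
print(round(psi(baseline, today), 4))
```

Tracking this value over time (e.g. as a Prometheus gauge, with MLflow recording per-run baselines) turns "drift" from a vague concern into an alertable number.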