Experience
2021 — Now
I led the end-to-end development of scalable, automated pipelines for feature generation, model training, and deployment using Python, TensorFlow, PyTorch, Apache Spark, Kubernetes, AWS, and Google Cloud. These pipelines enabled rapid experimentation and efficient delivery of machine learning models into production. I implemented robust data-processing workflows leveraging BigQuery, Apache Iceberg, and cloud storage to aggregate and transform real-time event-stream data for dynamic feature engineering, and I designed and deployed optimized inference services using TensorFlow Serving and Triton Inference Server to power recommendation systems at production scale.
I established comprehensive monitoring frameworks using TensorBoard, Prometheus, Grafana, and MLflow to track key model performance indicators, detect data and model drift, and trigger automated retraining workflows as needed. I also developed and fine-tuned large language models (LLMs) for AI-driven use cases such as contextual search, content moderation, and personalized recommendations. For generative AI applications, I integrated foundation models (including Claude, Titan, and Jurassic) via Amazon Bedrock to support intelligent summarization, chatbot conversations, and semantic search, and I employed Retrieval-Augmented Generation (RAG) techniques with vector databases, knowledge graphs, and embeddings to improve retrieval accuracy and user interactions.
Additionally, I applied advanced optimization strategies such as model quantization, federated learning, and multi-GPU training to meet stringent latency and scalability requirements. My work followed MLOps best practices, leveraging tools like Kubeflow Pipelines and Vertex AI to establish robust CI/CD pipelines for ML models, including seamless deployment, rollback, and observability. I also built scalable ETL workflows using Apache Beam, Apache Kafka, and Snowflake to enable streaming data processing for downstream machine learning workloads.
2018 — 2021
New Jersey, United States
As a Python AI/ML Engineer, I played a key role in advancing LILT's generative AI platform, which is designed for the rapid translation of extensive content. LILT was selected by a European law enforcement agency to provide a dynamic solution for translating high volumes of content in low-resource languages within tight deadlines. The platform, powered by large language models and leveraging NVIDIA GPUs and the NVIDIA NeMo framework, enabled the swift translation of time-sensitive information at scale.
Key Technologies:
NVIDIA A100 Tensor Core GPUs: These GPUs are tailored for high-performance computing, offering enhanced memory bandwidth and tensor processing capabilities. They enabled efficient handling of large-scale data processing and the complex computations required to train sophisticated language models.
NVIDIA NeMo: NeMo provided a modular and scalable architecture, supporting model customization through transfer learning and fine-tuning. Its integration with other NVIDIA tools and libraries ensured optimal performance and streamlined AI model deployment.
AWS: AWS enabled easy scaling of GPU resources to meet peak workload demands and supported a highly available and resilient architecture. It offered a range of services, including EC2 for compute power, S3 for storage, and SageMaker for managing ML workflows. The cloud infrastructure ensured high availability, disaster recovery, and dynamic resource scaling based on demand.
Extensive AI/ML Libraries: The development process was supported by comprehensive libraries and tools, including NumPy, pandas, and scikit-learn.
TensorFlow and PyTorch: Both TensorFlow and PyTorch provided advanced features for deep learning, such as automatic differentiation, GPU acceleration, and model deployment tools. TensorFlow's TensorBoard offered training metrics visualization, while PyTorch’s dynamic computation graph enabled intuitive model development and debugging.
2016 — 2018
New Jersey, United States
As a software engineer at Dell Technologies, I specialized in developing and optimizing software solutions that drove business innovation and enhanced user experience. My expertise spanned full-stack development, cloud computing, and data analytics, enabling me to deliver scalable and efficient software products. I worked in Java, Python, and JavaScript with frameworks such as Spring Boot and React.js. My work involved deploying applications on AWS and Azure, containerizing services with Docker and Kubernetes, and maintaining continuous integration and delivery (CI/CD) pipelines with Jenkins and Git.
Key Projects:
Cloud-Based Solutions: Developed and deployed scalable cloud-based applications on AWS, enhancing performance and reducing operational costs.
Data Analytics Platform: Engineered a robust data analytics platform using Apache Spark and Hadoop, providing real-time insights and improving data processing efficiency.
Microservices Architecture: Designed and implemented microservices architecture with Spring Boot and Docker, improving system scalability and maintainability.