Experience
2021 — Now
New York City Metropolitan Area
NeMo Framework:
• Leading the NeMo Framework's inference operations.
• Designed and implemented the export module for fast LLM inference on community models such as Llama2, GPT, Falcon, Starcoder, and NeMotron using TensorRT-LLM.
• Designed and implemented the deploy module to deploy any LLM to Triton Inference Server.
GNN Tool:
• Led the API design and development of the GNN Tool project.
• Built GNN-based models for fraud detection, recommender systems, and NLP.
• Worked with product managers to gather requirements and build the roadmap.
• Created the design documents and wrote the initial code for the components.
• Managed the weekly team meetings, and created and assigned tasks.
• Helped engineers to complete tasks when needed and performed code reviews.
• Managed the inter-team communications.
• Presented our work to customers and helped with the technical discussion.
2018 — 2021
New York City Metropolitan Area
GPU Accelerated Recommender Systems, Merlin:
• Worked on GPU-accelerated feature engineering and preprocessing for recommender systems.
• Focused on a GPU-accelerated deep learning library for recommender systems.
GPU Accelerated Machine Learning, cuML:
• Focused on traditional machine learning (ML), deep learning, and HPC problems.
• Developed GPU-accelerated ML algorithms that let data scientists and researchers train ML models quickly on multi-node multi-GPU (MNMG) AI infrastructure.
• Architected and developed a GPU-accelerated ML primitives library for NVIDIA's open source project RAPIDS (http://rapids.ai).
• Developed single-GPU accelerated generalized linear models (OLS, Ridge, Lasso, SGD, etc.), PCA, and tSVD using C++ and CUDA for RAPIDS cuML.
Achieved more than 20x speedup compared to scikit-learn's implementation.
• Implemented MNMG linear models (OLS, Ridge, Lasso, Elastic-net, SGD, etc.), tSVD, and PCA, and achieved 40x-90x speedup.
2017 — 2018
New York City Metropolitan Area
• Guided data scientists and researchers in building innovative solutions based on NVIDIA technology.
• Served as an industry thought leader on integrating NVIDIA technology into deep learning solutions for scientific and engineering applications.
• Gave deep learning workshops on computer vision (CV) and natural language processing (NLP) using deep neural networks (convolutional neural networks, recurrent neural networks, long short-term memory networks, etc.) to help developers, researchers, and data scientists adopt the technology.
• Focused on deep learning applications in the financial services and insurance industries.
• Developed 3 use cases using LSTMs, deep autoencoders, and deep reinforcement learning (policy gradients, deep Q-learning, actor-critic, etc.) for capital markets.
• Focused on deep learning based NLP tasks such as named entity recognition, text summarization, and topic modeling for finance use cases.
2016 — 2017
Norwalk, CT, USA
• Developed web services/APIs using big-data technologies including Kafka and Cassandra.
• Worked on predicting hotel rate trends using simple linear regression.
• Built a proof of concept of machine learning based collaborative filtering to recommend hotels to customers.
• Focused on a weighted alternating least squares (W-ALS) based recommender system.
2013 — 2016
Newark, NJ
• Taught Microprocessor Laboratory to a diverse class of 30 undergraduate students.
• Advised 100 graduate students on GPU and FPGA course projects.
Education
New Jersey Institute of Technology
Doctor of Philosophy (Ph.D.)
2016
Ege University
Master’s Degree
2010
Ege University
Bachelor’s Degree
2007