Cambridge, Massachusetts, United States
Worked on backend systems for web-based research projects built using Django.
Defined schemas and set up PostgreSQL databases to store user-provided speech data collected via Twilio IVR flows. Exposed the data through RESTful APIs to power the frontends.
Developed data processing pipelines for transcription, timestamp generation and annotation of collected speech data.
Set up machine learning infrastructure on AWS SageMaker for training and deploying custom NLP models for real-time inference.
Developed a Python module, built on FFmpeg and image processing libraries, that generated the 24/7 live video feed serving as the projects' main content.