Palo Alto, California, United States
Working on Human Language Technology. Main projects listed below.
Automatic Speech Recognition (ASR) (Amazon Transcribe)
• Wrote the baseline multi-GPU sequence-to-sequence RNN architecture for speech recognition in MXNet for OpenSpeech
• Tailored the existing monotonic attention mechanism to ASR, achieving a 3% relative word-error-rate (WER) improvement over the initial implementation
• Mentored an intern project to improve the performance of the RNN-Transducer for speech recognition by modifying the loss to include sampled paths generated by the model during training
• Experimenting with CNN-based downsampling on the encoder side, minimum risk training, and other enhancements to the baseline seq-to-seq architecture to reduce WER
Technologies: MXNet (deep learning framework), AWS
Neural Machine Translation
• Improved MT models for Indonesian from 18 BLEU to 31 BLEU for an AWS customer
• Achieved the BLEU gain through fine-tuning on domain-specific data, back-translation of monolingual corpora to create synthetic parallel data, and ensembling