Project: Identification of Peruvian languages
• Compile texts in multiple Peruvian languages from the Internet
• Automate text extraction
• Develop Python scripts to clean text
• Prepare the dataset from the cleaned and labeled texts
• Develop, train, and compare several machine learning models and a recurrent neural network
• Develop a web application to present the results
• Write and present two papers at the SIMBig conference
Project: Classifier committee for gene expression data
• Develop a new machine learning algorithm based on classifier committees
• Automate hyperparameter testing of the developed algorithm on various public datasets
• Compare the algorithm's results against those of the Random Forest algorithm
• Write and present a paper at an IEEE BIBM conference