Ville de Paris, Île-de-France, France
NLP research internship focusing on unlabelled data.
• Scraped more than 10,000 working agreements from a French government website (Python: Selenium, BeautifulSoup, and APIs) and used them to study two questions: the impact of a 2018 French reform on workplace equality, and the impact of lockdown on the widespread adoption of teleworking.
• Worked with Datamatics on the labelling process, then applied supervised machine learning algorithms to the labelled data.
• Also considered several other approaches, including named-entity recognition (NER), semi-supervised learning, and keyword search.