Mumbai, Maharashtra, India
• Engineered scalable ETL pipelines in Python using Pandas, Paramiko, and Boto3 to ingest and preprocess high-volume data from seven external vendors, supporting downstream analytics across three business units.
• Standardized data ingestion for XML, CSV, and JSON formats, embedding schema validation and error-logging routines that ensured structural integrity across eighteen active workflows.
• Automated file transfers between on-premise Linux servers and AWS S3 buckets using Cron jobs and SSH-based scripts, saving over 24 hours of manual work weekly.
• Reduced batch job runtimes by integrating multiprocessing, optimizing I/O operations, and implementing asynchronous processing across forty scheduled data pipelines.
• Developed reusable Python modules for data type checks, normalization, and exception handling, enabling five engineering teams to unify data processing standards across projects.
• Collaborated with QA and DevOps on end-to-end UAT, version control with Bitbucket, and continuous deployment via Jenkins, ensuring smooth delivery of production-ready ETL assets.
• Left BNP Paribas to pursue an MS; when Covid-19 delayed those plans, took up this role in the interim.