
I am a Senior Data Engineer with experience optimizing reports, developing ETL pipelines, and leading data team strategy. Skilled in Python, SQL, AWS, Docker, and data visualization tools.
• Redesigned and optimized critical reports by replacing inefficient R scripts with production-grade Python and SQL, reducing run times by up to 96%
• Collaborated with the data analytics team to develop new reports and reporting capabilities, securing additional clients and driving a 60% revenue increase
• Designed and led data team strategy for the large-scale migration of ETL pipelines from disparate in-house solutions to a Docker-based Dagster deployment, improving data visibility, enabling detailed lineage tracking, accelerating metadata discovery for faster troubleshooting, simplifying the development lifecycle, and standardizing coding practices across all environments

Data Engineer | May 2023 - Jan. 2025
• Built and maintained Python and SQL-based ETL pipelines that ingest and clean over 25 million business records for a machine learning application, ensuring data accuracy and scalability
• Deployed Docker-based Airflow to orchestrate Python and SQL scripts that ingest and cleanse data for user search results, and created comprehensive documentation on setup, usage, and troubleshooting
• Partnered with the internal cybercrime research team to scale breach data ingestion by updating legacy data systems for parsing automation, achieving a 2–3x increase in ingest velocity
• Ingested over 100 million breach records across multiple data stores, parsing data with Linux command-line tools, Bash, Python, and SQL
• Improved and maintained large-scale data ingestion pipelines handling hundreds of millions of records per day, and provided on-call support to ensure reliable delivery for both internal analytics and external business products