I am a seasoned Data Engineer with over 12 years of experience, adept at delivering transformative solutions in big data, cloud platforms, and database management. I've successfully worked as a developer and architect for top MNCs and led multiple freelance projects to completion.
I specialize in crafting scalable ETL pipelines, optimizing performance, and mentoring aspiring engineers.
Some of my key achievements include:
Worked as part of the team responsible for developing a framework enabling customers to create their own brand pages. Analyzed clickstream data to derive actionable insights, including the number of attributed orders, page views, and add-to-cart events. Delivered key performance metrics to help customers evaluate the effectiveness of their brand pages and measure their ROI. Contributed to improving customer engagement by providing data-driven insights, leading to a 20% increase in attributed orders and enhanced customer satisfaction.
Supported planogram optimization for Walmart stores, aimed at enhancing Kenvue's sales and Walmart's profitability, by preprocessing and preparing data for consumption by the data science team. Also developed generic, extensible frameworks so that the codebase could be adapted for other retailers.
Led the migration effort for a substantial big data project, transitioning it from an on-premises Cloudera cluster to the AWS cloud. Employed Apache Airflow for job orchestration, executed Spark jobs on Amazon EMR, and facilitated data transfer from Amazon S3 to Redshift for comprehensive analysis.
Developed a comprehensive, extensible framework for Apache Spark application development that lets developers focus exclusively on core business logic. The framework manages data ingestion from diverse sources and data storage across multiple destinations, and offers extensive configuration options for tuning Spark application performance.
Designed and implemented numerous banking reports using Apache Spark, scheduled through Autosys, with Hive external tables used for analysis. The code was tuned for performance to ensure efficient execution in a Spark-on-Kubernetes environment.