Hi,
I am a Senior Data Engineer with over 6 years of experience working with Python. I have worked on knowledge graphs (Neo4j, Spark, NetworkX, d3.js), OS programming, data pipelines (Airflow, PySpark, SQL, AWS), data ingestion APIs (Dask, Azure, FastAPI, Postgres, AsyncIO) and data analysis (Pandas, Seaborn, Matplotlib). I have also created multiple Udemy courses on High Performance Computing in Python, Exploratory Data Analysis, Functional Programming and Scalable Data Analysis. I regularly answer questions on Stack Overflow, where I am currently in the top 5% of answerers.
• Worked as a Data Engineer and helped set up Apache Airflow DAGs for data ingestion and processing (using PySpark).
• Responsible for end-to-end design and optimization of Airflow DAGs.
• Created dashboards for visualizing streaming data using Apache Kafka
• Migrated cloud infrastructure from AWS to Microsoft Azure
• Managing the complete data pipeline for ingestion and processing
• Research new healthcare datasets to be ingested into the pipeline
• Responsible for reducing the turnaround time for new dataset ingestion and processing.
• Optimize the data ingestion process to utilize resources in the most efficient manner possible.
• Identify places where code redundancy can be reduced as much as possible.
• Also part of the team that mentors interns in Python.
Used Python, Apache Airflow, Pandas, NetworkX