
By night, I am a deep learning enthusiast, and by day a curious full-stack developer. I love sorting out the intricate details of a problem to come up with a creative solution, and I often learn by hacking, just to make things fun! Currently, I'm excited to be learning more about convolutional networks and NLP.
Aside from the computer scientist in me, I enjoy working out and learning to play piano in my free time. I continuously strive to push myself forward, both professionally and, more importantly, as a person.
I am always open to new ideas and possible projects, so don't hesitate to contact me!
Developed and implemented scalable data pipelines using Python and Apache Spark, resulting in improved efficiency and reduced data processing time by 40%.
• Ensured a smooth transition by converting existing Python modules into jobs compatible with PySpark’s distributed computing environment.
• Cooperated with data engineers to design and build a robust data lake architecture, ensuring accurate and timely data ingestion and integration from various sources.
• Collaborated with cross‑functional teams to identify and address data quality issues, resulting in a 30% reduction in data errors and improved overall data integrity.
• Optimized database performance by setting up transformations and migrating the servers to a serverless configuration, resulting in a 50% cost reduction.
• Created an internal prompt tool that aids the sales and solutions teams, increasing productivity and reducing the time spent scoping out client solutions by 10%.
• Enriched the scrubber module by incorporating the Java Stanford Named Entity Recognizer (NER), boosting the module’s data processing speed by leveraging cutting-edge natural language processing technology.
• Streamlined troubleshooting of data-related challenges using Datadog logs, keeping system performance at its peak. Employed the K6 tool for load testing model services, and combined Datadog’s forecasting features with K6 results to refine the scaling of model services, underpinning the team’s commitment to robust and responsive data infrastructure management.
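As a rough illustration of the scrubber idea above (not the production module): an NER engine such as the Stanford NER tags entity spans in free text, and the scrubber replaces them with placeholders. The minimal sketch below uses two rule-based regex patterns as a stand-in for the NER; the function name and patterns are hypothetical.

```python
import re

# Hypothetical stand-in for the NER-backed scrubber: in the real module,
# the Stanford NER supplies (entity, label) spans; here two rule-based
# patterns play that role, purely for illustration.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def scrub_entities(text: str) -> str:
    """Replace each detected entity with a [LABEL] placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(scrub_entities("Reach Jane at jane@example.com or 555-123-4567."))
# → Reach Jane at [EMAIL] or [PHONE].
```

Swapping the regex layer for real NER output keeps the same scrub-and-replace shape while adding labels like PERSON and ORGANIZATION.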
• Developed a Twitter sentiment analysis tool that serves as an assistant for market research, providing valuable insights into public opinion and trends. Optimized the PostgreSQL database to ensure efficient data retrieval, resulting in significantly faster SQL query execution.
• Constructed a prediction model in Python with the PyTorch framework, designed to accurately predict E. coli concentrations in water samples. The model’s predictions are presented with confidence intervals, offering a clear and reliable assessment of potential contamination levels.
• Chose Plotly for data visualization, enabling interactive, publication-quality graphs that present data clearly and make the model’s predictions and the sentiment analysis findings intuitive to understand.
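The confidence-interval presentation mentioned above can be sketched with the Python standard library alone. This is a minimal illustration of the idea, not the actual PyTorch model's output: the helper name, the normal-approximation interval, and the sample predictions are all assumptions made up for the example.

```python
from statistics import NormalDist, mean, stdev

def confidence_interval(samples, level=0.95):
    """Normal-approximation CI for the mean of a set of predictions.

    Hypothetical helper: the real intervals come from the PyTorch model,
    but the presentation idea is the same (point estimate +/- margin).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)          # ~1.96 for 95%
    m = mean(samples)
    margin = z * stdev(samples) / len(samples) ** 0.5  # z * standard error
    return m - margin, m + margin

# Illustrative (made-up) E. coli concentration predictions, CFU/100 mL:
preds = [120.0, 135.0, 128.0, 140.0, 122.0]
low, high = confidence_interval(preds)
print(f"Predicted concentration: {mean(preds):.1f} (95% CI {low:.1f}-{high:.1f})")
```

Reporting the range rather than a single number is what lets readers judge how confident the model is about a given contamination level.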