
How and why I built a Data Scraping script in Python3

Published Apr 30, 2020. Last updated May 01, 2020.

About me

I am a professional Python developer at W3sols. I have built complex data scrapers and Python backends for mobile and web apps. These days I am working on an AI and ML based tech stack, and I plan to progress my career as a professional AI and ML engineer.

The problem I wanted to solve

I created this data scraping script for a client who wanted to automate the collection and analysis of betting odds data. Until then the client did it manually, checking different websites and then putting the data into Excel for analysis.
I was tasked with doing this in Python3. As a web developer who worked mostly on frontend technologies like JavaScript, HTML and CSS, I was new to Python. I had to learn it while implementing the scraper script for the client.

What is a Data Scraping script in Python3?

A data scraping script pulls the data that is on a web page. As we all know, data is the new "OIL": everyone is in a race to own a lot of data in their respective industries, and the internet is a treasure trove of it. It holds all kinds of data, but usually in unstructured form. In data scraping we pull that unstructured data from websites and structure it for future study and analysis.

Tech stack

For this Data Scraping web app I used:

  • Python 3 for backend scripting, with the libraries BeautifulSoup4, Requests and Pandas
  • Django for API creation
  • Angular 8 for frontend development

The process of building a Data Scraping script in Python3

The websites I scraped were famous sports betting websites, and a big task was the STRUCTURE of each website. I scraped 2 websites and each had a different structure. It took me some time to analyse the structure of each website and understand how data was populated on each webpage.

After I resolved that, the next task was to get past the login on each website. Both websites made data available only after a login, and only if that login was done with a browser, not a scraping bot 😅. To my surprise it was so easy with Python3: only 6 lines of code and done.
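The six lines themselves are not shown in this post, so the following is only a minimal sketch of one common way to do it with Requests: a session that sends browser-like headers and keeps the login cookies. The URL, the form field names and the User-Agent string are all hypothetical placeholders, not the real sites' values.

```python
import requests

# Hypothetical values: the post does not name the sites or their login forms.
LOGIN_URL = "https://example.com/login"
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0 Safari/537.36"
    ),
}


def build_session():
    """Return a Session that sends browser-like headers on every request."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    return session


def log_in(session, username, password):
    """Post the credentials; the session keeps any cookies the site sets."""
    response = session.post(
        LOGIN_URL,
        data={"username": username, "password": password},  # assumed field names
        timeout=10,
    )
    response.raise_for_status()
    return response
```

After a successful `log_in`, later `session.get(...)` calls reuse the same cookies and headers, which is what lets the data pages load as they would in a browser.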

Next was the task of pulling the data, and that was made easy with the BeautifulSoup4 and Requests libraries available in Python3.
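As a rough illustration of this step, here is a small sketch that turns an odds table into structured rows with BeautifulSoup4. The HTML, the table id and the column layout are made up for the example; each real site needs its own selectors, and in practice the HTML would come from a Requests response rather than a string.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for a downloaded page.
SAMPLE_HTML = """
<table id="odds">
  <tr><th>Match</th><th>Home</th><th>Draw</th><th>Away</th></tr>
  <tr><td>Team A v Team B</td><td>1.80</td><td>3.50</td><td>4.20</td></tr>
  <tr><td>Team C v Team D</td><td>2.45</td><td>3.10</td><td>2.90</td></tr>
</table>
"""


def parse_odds(html):
    """Return one dict per table row, with the odds converted to floats."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="odds")  # hypothetical table id
    rows = []
    for tr in table.find_all("tr")[1:]:  # skip the header row
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        match, home, draw, away = cells
        rows.append({
            "match": match,
            "home": float(home),
            "draw": float(draw),
            "away": float(away),
        })
    return rows
```

Calling `parse_odds(SAMPLE_HTML)` gives a list of plain dictionaries, which is a convenient shape for saving or for loading into Pandas later.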

Now I had a data scraping script. The next task was to save the data into MongoDB, and to my surprise this NoSQL database is very easy to use 😄.
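To give an idea of how little code the saving step needs, here is a hedged sketch. It assumes the scraped rows are plain dictionaries and that `collection` is a pymongo collection (or anything else with an `insert_many` method). The function names and the `source` and `scraped_at` fields are my own illustration, not the project's actual schema.

```python
from datetime import datetime, timezone


def to_documents(rows, source):
    """Copy each scraped row and tag it with its source site and a timestamp."""
    scraped_at = datetime.now(timezone.utc)
    return [{**row, "source": source, "scraped_at": scraped_at} for row in rows]


def save_odds(collection, rows, source):
    """Insert the rows into the collection and return how many were saved."""
    docs = to_documents(rows, source)
    if not docs:
        return 0  # pymongo's insert_many rejects an empty list
    result = collection.insert_many(docs)
    return len(result.inserted_ids)
```

With pymongo installed and a MongoDB server running locally, the collection could be obtained with something like `MongoClient("mongodb://localhost:27017")["betting"]["odds"]`, where the database and collection names are placeholders.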

Data Saved !!!!!
Time to analyse the data. My client provided some formulas to calculate different probabilities and to convert odds data from European to American form. I used those formulas and created APIs that could be called from the frontend to populate the dashboard.
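The client's own formulas are not reproduced here. As an illustration only, these are the standard textbook conversions from European (decimal) odds to American (moneyline) odds and to an implied probability, which may differ from what the client specified.

```python
def decimal_to_american(decimal_odds):
    """Convert European (decimal) odds to American (moneyline) odds."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    if decimal_odds >= 2.0:
        # Underdog or even money: profit on a 100 unit stake.
        return round((decimal_odds - 1) * 100)
    # Favourite: stake needed to win 100 units, shown as a negative number.
    return round(-100 / (decimal_odds - 1))


def implied_probability(decimal_odds):
    """Probability implied by decimal odds, ignoring the bookmaker's margin."""
    return 1 / decimal_odds
```

For example, decimal odds of 2.5 become +150 in American form and odds of 1.5 become -200, while decimal odds of 4.0 imply a probability of 0.25.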

Bam!!!! 🤩
Almost done !!!!!

Now it was time for some frontend. I used ANGULAR 8 to create a dashboard that called the APIs from the backend and populated the data onto the dashboard.

THIS IS IT !!!!! 😄

All the work automated !

Challenges I faced

It was not that tough, since I had experience working with different technologies like JavaScript, Swift 4, Node.js etc., but whenever you are up for something new and big there are some challenges to face.

In this whole process the issue I faced was understanding how to set up a proper environment to run this web app, but in the end it was all set and finished 👍🏻

Key learnings

I learnt a whole new tech stack in this process:

  • Python3
  • Django
  • MongoDB
  • Angular 8

and of course the art of data scraping !!!!

Tips and advice

For those who want to understand what data scraping is and how to do it effectively in Python3, I would advise starting with basic scraping, like scraping Wikipedia URLs, and starting easy with the BeautifulSoup4 and Requests libraries.

DO NOT jump directly to the hard part, such as scraping with frameworks and tools like Scrapy, Selenium etc.
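A first exercise along these lines could look like the sketch below: download a page with Requests and collect its Wikipedia article links with BeautifulSoup4. The function names and the User-Agent value are my own placeholders, and the link filter is a simple assumption about how Wikipedia article paths look.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its HTML text."""
    response = requests.get(
        url,
        headers={"User-Agent": "learning-scraper/0.1"},  # placeholder value
        timeout=10,
    )
    response.raise_for_status()
    return response.text


def wikipedia_article_links(html):
    """Return the unique article URLs linked from a Wikipedia page, in order."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for a in soup.find_all("a", href=True):
        href = a["href"]
        # Keep /wiki/ paths; a colon marks File:, Help: and similar pages.
        if href.startswith("/wiki/") and ":" not in href:
            full_url = "https://en.wikipedia.org" + href
            if full_url not in links:
                links.append(full_url)
    return links
```

Passing the result of `fetch_html("https://en.wikipedia.org/wiki/Web_scraping")` to `wikipedia_article_links` is enough to practise the whole fetch, parse and structure loop before moving on to harder sites.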

Final thoughts and next steps

End thoughts:
Do stuff on your own. It will take time, research and most of all EFFORT, but in the end you will learn something new that you can feel proud of !!!! 😄

Discover and read more posts from Dharvi