Scrape the web with Python and get updates on Telegram

Published Feb 04, 2019. Last updated Feb 09, 2019.

Intro

This is a follow-up to "How and why I built a simple web-scraping script to notify us about our favourite food". In that article I described how to scrape the web for information and get alerts via e-mail if a certain word shows up on a web page.

In this article I will describe what to do if you want to get updated via Telegram instead of e-mail.

If you're new to Python programming you might also want to check out these posts about scheduling:
How to run and schedule Python scripts on iOS
How to run and schedule Python scripts on Raspberry Pi

1. Build the scraping part

In the previous article I gave detailed, step-by-step instructions on this topic in the section "The process of building the web-scraping script", so I won't repeat myself.
I just want to add that in the example I built an alert for our favourite food, but you can scan and track all sorts of information: stock prices, news, trends on social media, keywords on different feeds, etc.
As long as you can do an HTML select on it, it can be scraped and tracked.
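
As a rough illustration of the idea (not the script from the previous article), here is a minimal keyword check that uses only Python's standard library. The names TextCollector and page_mentions, and the sample page, are made up for this sketch; in a real script the HTML would come from the page you download.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the visible text of a page, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a script or style element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def page_mentions(html, keyword):
    """Return True if keyword appears in the page text (case-insensitive)."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return keyword.lower() in text.lower()


# Hypothetical page content, standing in for a downloaded menu page.
sample_html = "<html><body><h1>Today's menu</h1><p>Lasagne, salad</p></body></html>"
print(page_mentions(sample_html, "lasagne"))
```

If page_mentions returns True, that is the moment to send an alert.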

2. Set up Telegram

To post alerts on Telegram, the best and easiest way is to create a Bot profile, and send updates through it.

Creating a bot is super easy: you just need to talk to... a bot called BotFather.
Once you have started a chat with it on Telegram, send the /newbot command.
BotFather will ask you to name the bot and to define a username.
The name can be anything, but the username has to be unique, one that no one else has used before.

Here you can see my attempt when I created and named a bot for my Raspberry Pi:
Screenshot 2019-02-04 at 21.03.38.png

As you can see, BotFather also provides an HTTP API token (which I covered in my screenshot). Note it down, as you will need it later when you implement the alert in Python.

If you want to make changes to your bot, you can do that by sending the /mybots command to BotFather. You will be able to rename it, add a description and profile picture, reset the API token, and much more.
Screenshot 2019-02-04 at 21.17.29.png

If you want to stop talking to BotFather, you can always send the /cancel command, and it will exit whatever action is in progress.

3. Implement sending updates to a channel in Python

Ok, at this point we should have a Python script that scrapes something from the web, and a Telegram bot registered.

The next step is to determine the ID of the channel where we want to send updates via our bot.
So first of all you need to figure out where you want to receive the updates:

  1. Shall the bot send messages just to you?
  2. Or to a channel?
  3. Or another person on Telegram?

Case 1 is the easiest: just open the chat with your Bot, and send any kind of message to it.
After this, open a web browser, paste in this URL, and replace [TOKEN] with your unique bot API token.

https://api.telegram.org/bot[TOKEN]/getUpdates

Once you visit this URL, you will get a JSON response. In this response, find the section where it says: "from":{"id": 123456789, ...

The number after "id": will be your user's unique identifier. Note it down for later use.

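In case 2, you first need to invite the bot to the channel where you want to post updates, and then send a message there that mentions the bot.
After that you can determine the ID of the channel with the same getUpdates URL as in case 1. This time, look at the "chat":{"id": ... entry of the message rather than "from", because "from" identifies the sender, not the channel.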

In the last scenario (case 3), you will need to ask the other person to find your bot and send a private message to it. Then determine this person's unique ID the same way as described above.
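
If you'd rather not read the JSON by eye, the lookup can be sketched in Python. This is only an illustration: extract_chat_ids is a made-up helper, and the sample response below is hand-written with invented IDs and names, trimmed to the fields the sketch needs. In practice you would feed it the text returned by the getUpdates URL above.

```python
import json


def extract_chat_ids(response_text):
    """Map each chat ID found in a getUpdates response to a readable label."""
    payload = json.loads(response_text)
    chats = {}
    for update in payload.get("result", []):
        # Private and group messages arrive under "message",
        # posts made in a channel arrive under "channel_post".
        message = update.get("message") or update.get("channel_post")
        if not message:
            continue
        chat = message["chat"]
        label = chat.get("title") or chat.get("first_name") or chat.get("username") or ""
        chats[chat["id"]] = label
    return chats


# Hand-written sample response (invented IDs and names, trimmed fields).
sample_response = """
{
  "ok": true,
  "result": [
    {"update_id": 1,
     "message": {"from": {"id": 123456789, "first_name": "Alice"},
                 "chat": {"id": 123456789, "first_name": "Alice", "type": "private"},
                 "text": "hello"}},
    {"update_id": 2,
     "channel_post": {"chat": {"id": -1001234567890, "title": "Food alerts", "type": "channel"},
                      "text": "@my_bot hi"}}
  ]
}
"""

for chat_id, label in extract_chat_ids(sample_response).items():
    print(chat_id, label)
```

Note that in a private chat the "chat" ID and the "from" ID are the same number, which is why reading "from" works in case 1.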

Ok, by now you have a Bot (token) ready, and you also know the unique ID of the channel where it should send updates.

The next step is to implement a sending method in your code.
First, scrape the needed information and store it in a string variable.
Again, if you have no clue what I'm talking about, head over to my previous post where I give a detailed explanation on this topic.
In this example I'll call this variable ResultText.

Then add this line of code (it needs import requests at the top of your script):

requests.get("https://api.telegram.org/bot[TOKEN]/sendMessage?chat_id=[CHATID]&text={}".format(ResultText))

Replace [TOKEN] with your bot's unique API token.
Also replace [CHATID] with the unique ID of the channel where you want to send updates.

And you're basically all set! This line of code will send ResultText via your bot to the chosen channel. Simple as that!

One last thing to consider: if your text contains special characters (spaces, &, #, line breaks and so on), the above code might produce a broken URL or an error. To avoid this, I recommend converting your text to a URL-friendly format with the urllib.parse module.

import urllib.parse

ParsedResultText = urllib.parse.quote_plus(ResultText)

And use this instead of ResultText in your code. So the end result would look like this:

requests.get("https://api.telegram.org/bot[TOKEN]/sendMessage?chat_id=[CHATID]&text={}".format(ParsedResultText))
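
To show how the pieces fit together, here is a minimal sketch of the sending part as a small script. It is one possible layout, not the only one. The function names build_send_url and send_update are made up for this example, and it uses urllib.request from the standard library in place of requests, purely so that it runs without third-party packages; requests.get(url) would do the same job. The token and chat ID are still placeholders you have to fill in.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.telegram.org"


def build_send_url(token, chat_id, text):
    """Build the sendMessage URL with the parameters safely URL-encoded."""
    # urlencode escapes spaces, "&", "%" and similar characters,
    # using the same quote_plus rules as shown above.
    query = urllib.parse.urlencode({"chat_id": chat_id, "text": text})
    return "{}/bot{}/sendMessage?{}".format(API_BASE, token, query)


def send_update(token, chat_id, text):
    """Send text to the chat through the bot and return the HTTP status code."""
    url = build_send_url(token, chat_id, text)
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    TOKEN = "[TOKEN]"      # placeholder: your bot's API token
    CHAT_ID = "[CHATID]"   # placeholder: the ID you noted down earlier
    result_text = "Lasagne & salad are on the menu today!"  # stand-in for ResultText

    # Print the URL to inspect the encoding; uncomment the last line
    # once the real token and chat ID are filled in.
    print(build_send_url(TOKEN, CHAT_ID, result_text))
    # send_update(TOKEN, CHAT_ID, result_text)
```

In your own script, result_text would be the string produced by the scraping step, and send_update would be called only when the scraped page contains what you are looking for.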

Final words

I hope this short article gave you an idea of how Telegram bots can extend your web-scraping possibilities. Of course, this was just a super simple example; you can do a lot more complex things with bots, like:

  • listen for keywords and take certain actions based on them
  • display custom keyboards or menu buttons, and take actions based on what is pressed

Be sure to check out Telegram's documentation if you want to build more advanced solutions: https://core.telegram.org/bots

Discover and read more posts from Gergely Kovács
Comments (1)
Harsh jaiswal
4 years ago

Can you please write it’s final script. I am still confused. I just want to get final structure.