Python, JavaScript, and Web automation
In the last few months, I've been trying to compare the languages I've worked with so far. The reason: I often find myself with a task at hand, realize there are multiple ways to do it in multiple languages, and get analysis paralysis.
Anyway, the focus of this post is Python, JavaScript, and their use in web automation. To be fair, the two languages have different histories and evolved very differently, but web automation is one area where I feel both have something to offer. I'll try to compare Python and JavaScript in the context of different usage patterns and ways of performing web automation.
What exactly is web automation?
I've seen a lot of beginners (including my earlier self) misunderstand the scope of web automation. I've got a couple of things to say here. Firstly,
Web scraping is one kind of web automation, but web automation is more generic than that.
Secondly,
If you automate any manual task on the web, the process of doing so can be called web automation.
Some web automation tasks
There are many you can think of, but here are some of the popular ones:
Form filling
Form filling, as the name suggests, means automating manual interaction with forms on the web. This interaction could be entering text, selecting radio buttons, ticking checkboxes, etc. Why would you want to automate it? Well, there could be many reasons. For instance, I once automated the login process of my university's library portal to make a CLI utility that would do things like renewing subscriptions and checking fines for lazy geeks (source code; don't judge me for the code, it was years ago). Another reason could be that a large part of your job is spent filling redundant details into redundant forms, so you may want to automate those workflows. Speaking of workflows,
Creating workflows
This is my favorite use case for web automation. You can combine a bunch of automation pieces to do things like:
- Periodically visit your favorite wallpaper site, see if any new wallpapers have been added, and if yes, download them and add them to your active wallpapers collection. I have a similar script for Quotefancy (source code).
- Frequently track certain news topics on the web, and trigger some action based on the sentiment of the news. Such a setup could be useful in systems like high-frequency trading of stocks, cryptocurrencies, etc.
- Search for all the images (from different image sources), tweets, and trends related to a keyword, and present them in a user-friendly way. This was actually one of my internship projects, where I created a dashboard to speed up the news-video generation process for trending topics.
- There could be more such use cases; I hope you get the gist.
Automation testing
Testing is a very popular use case of web automation. It is almost a necessity when you have a large web application. As a developer, you'd like to write test cases for the functional behavior of the web feature you're developing, so that you can be assured it works as per specification. This is also called Quality Assurance (QA) in the formal world. Another thing you'd like is to accumulate similar tests over time into a test suite, so that every time you add a new feature, you're assured its addition doesn't change any existing "expected" behavior in the bigger scheme of things. Loosely speaking, this process is known as regression testing, and if you simulate the entire user journey, you can call it end-to-end testing. Automation testing for a website attempts to prevent many embarrassing bugs like "button not getting clicked" and broken URLs (unless you want to show off your creative 404 pages) by trying to simulate and test all the scenarios upfront, exhaustively.
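To make the regression idea concrete, here's a toy sketch using Python's built-in unittest module. The link-extraction helper is hypothetical and deliberately naive; the point is that once such checks live in a suite, future changes can't silently break "expected" behavior:

```python
import re
import unittest


def extract_links(html):
    """Hypothetical helper under test: pull href values out of a page (naive on purpose)."""
    return re.findall(r'href="([^"]+)"', html)


class TestLinks(unittest.TestCase):
    """Tiny regression tests: any future change that breaks link
    extraction fails the suite instead of embarrassing us in production."""

    def test_finds_all_links(self):
        html = '<a href="/home">Home</a> <a href="/about">About</a>'
        self.assertEqual(extract_links(html), ["/home", "/about"])

    def test_page_without_links(self):
        self.assertEqual(extract_links("<p>no links here</p>"), [])


result = unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestLinks)
)
```

In a real project you'd let a runner (python -m unittest, pytest, CI) discover and run these instead of invoking the runner inline.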
Web scraping
Web scraping, in simple terms, is the act of extracting data from websites. The data can be used for several purposes. It's often a cat-and-mouse game, operating in a legal gray area, between website owners and developers. On one side, website owners put fences around their content in the form of captchas, authorization mechanisms, etc.; on the other side, developers come up with ways to bypass those fences and extract the data they need. There are different reasons you might need to extract and organize data from other websites, like doing some analysis or using it as input for some business logic (e.g., showcasing Amazon's price for a product on your own site).
The crux of web-automation is that instead of humans, you figure out some way to let the computers do those repetitive and tedious tasks.
Sidenote: Web automation (even when assisted by AI) is not equivalent to computers taking jobs; it's more like freeing humans to focus on more creative tasks (like automating more things).
The past and the present: the landscape of web automation
So you may ask, what are the tools that people use (or have been using) to do web automation?
Well, programmers have relied on simulating the web in their programs to achieve what they want. A typical series of steps: make requests to the server, parse the response, check whether the response is as expected (if testing is the purpose) or otherwise take actions based on it, all via the program. There are now high-level libraries and frameworks available to ease the programmer's life while writing these automation scripts. For example, Selenium is quite a popular library for browser automation, with APIs for most of the popular languages.
For non-technology-focused enterprises, there are companies like import.io, scrapinghub, etc. that provide solutions to manage the entire data-extraction pipeline from the web. Some companies also want to understand consumer sentiment on social media, or the sentiment around their competitors, and there are services that let you set up such a monitoring system in a few clicks.
And for individuals who are not so code-friendly (and probably want to do automation for different reasons), there are SaaS solutions that let one create workflows. For example, check out Zapier, Huginn, and IFTTT (If This Then That). Some of these workflows don't interact directly with the web per se; rather, they depend on REST APIs, but the end objective is still the same: automation! And not to mention, there are a LOT of companies whose lifeline is just web automation (consider product-and-price-comparison services, or cross-browser testing solutions, for example).
Doing web automation with Python and Javascript
Okay, the important question now: "Given that I know either Python or JavaScript, how do I go about doing web automation with them?"
Common usage patterns
Using libraries and frameworks
The motivation behind creating and using libraries is to minimize boilerplate and not reinvent the wheel. Usually, the libraries you'll use will provide one or more of these functionalities:
- Functionality to make network requests for common protocols.
- Common tricks for network requests like retries, using proxies, handling redirects, etc.
- Functionality to parse common response formats including HTML, JSON, XML, etc.
- The capability to search efficiently for the desired information in the parsed response, and extract it into the native language's data containers.
For instance, requests is widely used for handling HTTP requests in Python, and the analogous library in JavaScript is axios. Python has beautifulsoup for parsing and pulling data out of HTML and XML files; JavaScript has cheerio. Frameworks like scrapy take scraping to another level (difference between a library and a framework). The closest alternative I'm aware of in JavaScript is node-crawler.
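To make the parse-and-search capability concrete, here's a dependency-free sketch using only Python's built-in html.parser (beautifulsoup or cheerio would make this far more pleasant, but the mechanics are the same). The HTML and the "price" class are made up:

```python
from html.parser import HTMLParser


class PriceExtractor(HTMLParser):
    """Collect the text of every element whose class attribute is 'price'."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if dict(attrs).get("class") == "price":
            self._in_price = True

    def handle_endtag(self, tag):
        self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            self.prices.append(data.strip())


html = """
<div class="product"><span class="name">Widget</span>
<span class="price">$9.99</span></div>
<div class="product"><span class="name">Gadget</span>
<span class="price">$19.50</span></div>
"""

parser = PriceExtractor()
parser.feed(html)
print(parser.prices)  # ['$9.99', '$19.50']
```

A real scraper would get the HTML from a network request first; here it's inlined to keep the focus on the extraction step.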
Using a web browser automation suite like Selenium
Selenium WebDriver is a web automation framework. It can control the browser and thus lets you simulate user actions programmatically. Selenium is quite useful in scenarios where the content you want to work with is either rendered on the browser side by libraries like Handlebars or React, or fetched by later AJAX calls to the server and then rendered by the browser. A couple of examples of this include:
- Webpages with infinite scrolling (Twitter, Facebook, etc.)
- Webpages with pre-loaders like percentage bars or loading spinners
These scenarios can only be handled if we are able to simulate browser-like behavior (hence Selenium to the rescue). Selenium has client interfaces for most of the popular languages (including Python and JavaScript, of course).
How does Selenium work?
- We use the client library API to write instructions.
- We specify the webdriver we want to use. All the popular browsers (Firefox, Chrome, IE, Safari) have webdrivers, which provide an interface for controlling the actual browsers.
- The client code is then converted according to the protocol understood by these webdrivers (see this), resulting in the desired action on the browser end.
- If any information is asked to be captured, it is sent back to the client in a similar but reverse sequence.
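To demystify the "protocol understood by the webdrivers" a little: under the W3C WebDriver protocol, the client library translates your calls into plain JSON-over-HTTP commands sent to the driver. Here's a rough sketch (not a working client; the session id is invented, and a real driver like chromedriver would hand one back itself) of what "open a page and read its title" turns into:

```python
import json

# Roughly what a client like Selenium emits for:
#   driver.get("http://example.com"); driver.title
# (sketch of the W3C WebDriver protocol; in reality the driver listens
# on localhost and returns the session id in its first response)
session_id = "hypothetical-session-id"

commands = [
    # 1. create a browser session
    ("POST", "/session", {"capabilities": {"alwaysMatch": {"browserName": "chrome"}}}),
    # 2. navigate the browser
    ("POST", f"/session/{session_id}/url", {"url": "http://example.com"}),
    # 3. read information back (the HTTP response carries the data)
    ("GET", f"/session/{session_id}/title", None),
]

for method, path, body in commands:
    print(method, path, json.dumps(body) if body else "")
```

The client library hides all of this; you just call high-level methods and it speaks the protocol for you.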
Usually, triggering the client program launches a browser instance, and we can watch things like clicking and entering data happen on the screen, which is useful while testing. But if we care just about scraping, we can use "headless" browsers that have no UI and are faster in terms of performance. Chrome Headless is a popular choice for a headless webdriver; other options include Headless Firefox and PhantomJS. You can download the latest versions of all the Selenium components from here.
Testing frameworks
The JavaScript testing ecosystem has a lot of tools that overlap with web automation. I'm avoiding discussing them here because they're a subject for an entire post, but you can check this nice post if you'd like to explore.
Complementary tools
There are some tools that you'll use fairly often when doing web automation. Some of them are:
- Browser developer tools: tools to visually inspect the DOM, locate elements and get their selectors, inspect AJAX and other HTTP requests, etc. They're available in the developer toolkit of most popular browsers.
- Crontab: You might often want some tasks to run periodically according to a time pattern; you can use crontab for that.
- tcpdump: You can use tcpdump to compare the headers of two requests (the one your automation script is sending, and the one the actual browser is sending).
- cloudflare-scrape: If you're scraping, you can use a tool like this to get around Cloudflare's anti-bot checks.
- 2captcha: Again, if you're scraping and struggling with captchas, you can use 2captcha. In my experience, websites require a captcha only once in a while, so the cost is not that much.
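As an example of the crontab idea above, the entry below (with hypothetical paths) would run a scraping script every day at 6 AM and append its output to a log; you'd add it via `crontab -e`:

```
# m h dom mon dow  command
0 6 * * * /usr/bin/python3 /home/me/scripts/scrape_wallpapers.py >> /home/me/scrape.log 2>&1
```

The five leading fields are minute, hour, day of month, month, and day of week; `*` means "every".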
Python vs. Javascript
Okay, now it's time to compare these two languages face to face. For reference, I'll share snippets for solving captchas during web automation using the 2captcha service. You can read more about the process here.
The Python snippet
```python
import time

import requests

API_KEY = ''
site_key = 'some_site_key'
url = 'http://example.com'
SESSION = requests.Session()


def solve_captcha(api_key, site_key, url):
    """
    Returns the solved captcha answer that you can use in your automation requests.
    """
    response = SESSION.post(
        f'http://2captcha.com/in.php?key={api_key}&method=userrecaptcha'
        f'&googlekey={site_key}&pageurl={url}'
    )
    if response.status_code == 200:
        captcha_id = response.text.split('|')[1]
        return get_answer_from_captcha_id(api_key, captcha_id)


def get_answer_from_captcha_id(api_key, captcha_id, sleep_delay=5, max_attempts=25):
    for _ in range(max_attempts):
        response = SESSION.get(
            f'http://2captcha.com/res.php?key={api_key}&action=get&id={captcha_id}'
        )
        if 'CAPCHA_NOT_READY' in response.text:
            time.sleep(sleep_delay)
        else:
            return response.text.split('|')[1]
```
The JavaScript snippet
```javascript
const axios = require('axios');

const API_KEY = '';
const siteKey = 'some_site_key';
const url = 'http://example.com';

// Promise-based sleep helper (setTimeout wrapped in a Promise).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function solveCaptcha(apiKey, siteKey, url) {
  const requestUrl = `http://2captcha.com/in.php?key=${apiKey}&method=userrecaptcha&googlekey=${siteKey}&pageurl=${url}`;
  try {
    // axios puts the response body in `response.data`
    const response = await axios({ method: 'post', url: requestUrl });
    if (response.data.substring(0, 3) === 'OK|') {
      const captchaID = response.data.substring(3);
      return await getAnswerFromCaptchaID(apiKey, captchaID);
    }
  } catch (error) {
    console.log(error);
  }
}

async function getAnswerFromCaptchaID(apiKey, captchaID, sleepDelay = 5000, maxAttempts = 25) {
  const requestUrl = `http://2captcha.com/res.php?key=${apiKey}&action=get&id=${captchaID}`;
  for (let currentAttempt = 0; currentAttempt < maxAttempts; currentAttempt++) {
    const response = await axios({ url: requestUrl });
    if (response.data === 'CAPCHA_NOT_READY') {
      await sleep(sleepDelay);
    } else if (response.data.substring(0, 3) === 'OK|') {
      return response.data.substring(3);
    }
  }
}
```
I've used the respective popular libraries mentioned previously (requests and axios) in both languages for handling the HTTP requests. The implementations are not line-by-line equivalent, but the gist is the same. Here are a few things I have to say about these two languages:
Synchronous vs. Asynchronous
This is a preferred-paradigm difference you'll see between Python and JavaScript libraries. JavaScript libraries lean towards promise- and callback-style programming, which means the code is more likely to be complex (check out callback hell), whereas Python libraries are typically synchronous. Of course, it is possible to follow both paradigms in both languages. But you can already see from the snippets that it is easier to follow along with the Python code than the JavaScript code (you could argue the JavaScript could be written better, and that's the thing I want to stress next: I find it slightly more difficult to understand and implement JavaScript's constructs).
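To back up the claim that both paradigms are possible in both languages: the same poll-with-delay pattern from the snippets can be written asynchronously in Python with asyncio. A toy sketch with no real network (the "request" is simulated, and pretends the answer is ready on the third attempt):

```python
import asyncio


async def fake_request(attempt):
    """Stand-in for an HTTP call; pretend the answer is ready on attempt 3."""
    await asyncio.sleep(0.01)  # simulated network latency
    return "OK|answer" if attempt == 3 else "CAPCHA_NOT_READY"


async def poll_for_answer(sleep_delay=0.01, max_attempts=25):
    # Same shape as the synchronous loop, but `await` yields control
    # to the event loop instead of blocking the whole program.
    for attempt in range(1, max_attempts + 1):
        response = await fake_request(attempt)
        if response != "CAPCHA_NOT_READY":
            return response.split("|")[1]
        await asyncio.sleep(sleep_delay)


answer = asyncio.run(poll_for_answer())
print(answer)  # answer
```

The upside is that while one poll is sleeping, the event loop can run other coroutines (say, polling several captchas concurrently via asyncio.gather).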
Maintainability
I feel that JavaScript's leniency demands great responsibility from the developer to get things done "rightly." There are occasions where Python makes a stronger case for web automation simply because of the simplicity of the solution (which can save development time). Things like immutable object types, strict argument matching for functions, and strong typing can prevent bugs (due to reasons like malformed input) from creeping into your automation pipeline. If you've worked with JavaScript, you might be familiar with those annoying bugs due to things like undefined values, implicit typecasting, async-related issues, etc. (check out wtfjs).
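A small illustration of the strict-argument-matching point: Python rejects a call with a missing argument immediately, whereas JavaScript would quietly pass undefined along and fail somewhere downstream. The function here is a made-up automation step:

```python
def renew_subscription(user_id, book_id):
    """Hypothetical automation step that needs both identifiers."""
    return f"renewed {book_id} for {user_id}"


# A correct call works as expected.
print(renew_subscription("u42", "b7"))  # renewed b7 for u42

# A call with a missing argument fails immediately and loudly --
# in JavaScript, book_id would simply be `undefined` until something
# downstream (say, a request URL) silently misbehaved.
try:
    renew_subscription("u42")
except TypeError as exc:
    print(exc)  # renew_subscription() missing 1 required positional argument: 'book_id'
```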
Performance
Node.js beats CPython in terms of execution speed, and this has been the major plus point of using Node.js for web applications. The difference in performance ultimately boils down to the design of the underlying runtime and libraries. The V8 engine is a just-in-time compiler, and Node's I/O model is natively non-blocking. The popular libraries in the JavaScript ecosystem have tried hard to take advantage of this design. Not to mention that V8 is maintained by Google (hence a lot of resources are poured into optimizing it).
Benchmarks can be subtle, so I'm not including them in this post. But if you're interested, do check out the Benchmarks Game's comparison of Node and Python.
Ecosystem
A programming language has more to offer than just grammar and implementation. It has a community of developers, a mechanism to share and distribute code, a collection of reusable distributions, and much more.
- Javascript is much closer to the web; almost every developer involved with the web knows JavaScript to some extent.
- Python's standard library is more versatile and gives you a lot of well-maintained capabilities out of the box.
- When it comes to avoiding reinventing the wheel by reusing code through libraries, Python covers a wider range of domains beyond the web (data analytics, scientific computing, machine learning).
- Python usually has a widely accepted set of tools for given tasks (like beautifulsoup for parsing and extracting, requests for handling HTTP requests, and scrapy for writing crawlers), whereas I personally find the options in JavaScript confusing (there are multiple libraries for doing the same thing, it's very hard to pick one, and oftentimes I've found the quality to be sub-optimal).
- At the time of writing this post, PyPI (Python's package index) has around 195,000 packages, whereas the npm registry has more than a million.
- Both languages have decent options for testing. Python has the built-in unittest module (which is quite good), and JavaScript has a lot of mature testing frameworks (which make it more preferable if you're into automation testing).
- As for popularity, according to Stack Overflow's 2019 developer survey, JavaScript is the most popular language, but Python is catching up (as the fastest-growing programming language).
Conclusion
In this post, I tried to briefly touch on Python, JavaScript, and their differences in the context of web automation. I hope you got some useful information that might help you pick between the two in the future.