All You Need To Know About Selenium WebDriver Architecture
Testing a system thoroughly is a challenging task, and you need a tool that can help you in this process. One tool widely used by automation testers is Selenium. If you are a beginner and wish to know how Selenium functions internally, you have landed at the right place. In this article, I will give you a brief insight into Selenium WebDriver Architecture.
Below are the topics covered in this article:
- What is Selenium?
- Selenium Suite of Tools
- Selenium WebDriver Architecture
- Demo
What is Selenium?
Selenium is an open-source, portable framework used to automate the testing of web applications. It is flexible enough to cover both functional and regression test cases. Selenium test scripts can be written in different programming languages such as Java, Python, C# and many more. These test scripts can run across various browsers like Chrome, Safari, Firefox and Opera, and on various platforms like Windows, Mac OS, Linux and Solaris. Selenium also supports cross-browser testing, where the same test cases run against several browsers and platforms. It also helps in creating robust, browser-based regression automation suites.
I hope you understood the fundamentals of Selenium. Now let’s move further and understand various tools that are available in the Selenium Suite.
Selenium Suite of Tools
Selenium is made up of a suite of tools, which includes:
- Selenium IDE
- Selenium RC
- Selenium WebDriver
- Selenium Grid
Let’s understand the functionalities of each of these tools in more detail.
Selenium IDE
Selenium IDE (Integrated Development Environment) is a Firefox plugin. It is one of the simplest frameworks in the Selenium Suite. It allows us to record and play back scripts. Although you can create scripts using Selenium IDE, you need to use Selenium RC or Selenium WebDriver to write more advanced and robust test cases.
Next, let’s see what Selenium RC is.
Selenium RC
Selenium RC, also known as Selenium 1, was the main Selenium project for a long time before the merge with WebDriver produced Selenium 2. It mainly relies on JavaScript for automation. It supports Java, C#, Ruby, PHP, Python, Perl and JavaScript, and it works with almost every browser out there.
Note: Selenium RC is officially deprecated.
Selenium WebDriver
Selenium WebDriver is a browser automation framework that accepts commands and sends them to a browser. It is implemented through a browser-specific driver, which communicates directly with the browser and controls it. Selenium WebDriver supports various programming languages such as Java, C#, PHP, Python, Perl, Ruby and JavaScript.
Selenium WebDriver supports the following:
- _Operating System Support_ – Windows, Mac OS, Linux, Solaris
- _Browser Support_ – Mozilla Firefox, Internet Explorer, Google Chrome 12.0.712.0 and above, Safari, Opera 11.5 and above, Android, iOS, HtmlUnit 2.9 and above
Selenium Grid
Selenium Grid is a tool which is used together with Selenium RC. It is used to run tests on different machines against different browsers in parallel. In other words, multiple tests run at the same time on different machines running different browsers and operating systems.
So this was all about the Selenium Suite of Tools. Let’s dive deeper into this article and learn the functionalities and various components of Selenium WebDriver Architecture.
Selenium WebDriver Architecture
In order to understand Selenium WebDriver Architecture, we should first know what the WebDriver API is. The Selenium WebDriver API enables communication between programming languages and browsers. Every browser has its own logic for performing actions, and the WebDriver API hides those differences behind a common interface. The image below depicts the various components of Selenium WebDriver Architecture.
It comprises four main components which are:
- Selenium Client Library
- JSON Wire Protocol Over HTTP Client
- Browser Drivers
- Browsers
Let’s understand each of these components in depth.
1. Selenium Client Libraries/Language Bindings
Selenium supports multiple languages such as Java, Ruby, Python, etc. The Selenium developers have built language bindings (client libraries) so that test scripts can be written in each of these languages. If you wish to know more about the libraries, kindly refer to the official Selenium site.
2. JSON Wire Protocol Over HTTP Client
JSON stands for JavaScript Object Notation. It is used to transfer data between a server and a client on the web. JSON Wire Protocol is a REST API that transfers information between the client library and the HTTP server of the browser driver. Each browser driver (such as FirefoxDriver, ChromeDriver, etc.) has its own HTTP server.
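To make this more concrete, here is a minimal sketch of the kind of HTTP request a client library could send for a navigation command such as driver.get(url). It uses only the JDK and does not send anything. The class name, the driver address http://localhost:4444 and the session id abc123 are assumptions made up for illustration, and the JSON escaping is deliberately simplistic; the real language bindings use a proper JSON library.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: shows the shape of the HTTP request behind driver.get(url).
// It builds the request but never sends it.
class NavigateRequestSketch {

    // Builds the JSON body for the "navigate to URL" command.
    // Minimal escaping for illustration; real bindings use a JSON library.
    static String navigateBody(String targetUrl) {
        String escaped = targetUrl.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"url\": \"" + escaped + "\"}";
    }

    // driverBase and sessionId are assumptions: a driver listening locally
    // and a session id handed out earlier when the session was created.
    static HttpRequest navigateRequest(String driverBase, String sessionId, String targetUrl) {
        return HttpRequest.newBuilder()
                .uri(URI.create(driverBase + "/session/" + sessionId + "/url"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(navigateBody(targetUrl)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request =
                navigateRequest("http://localhost:4444", "abc123", "https://www.edureka.co");
        // Print the HTTP method, the endpoint and the JSON payload.
        System.out.println(request.method() + " " + request.uri());
        System.out.println(navigateBody("https://www.edureka.co"));
    }
}
```

The point to notice is that the script statement ends up as an HTTP method, an endpoint on the driver's HTTP server, and a small JSON payload.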
3. Browser Drivers
Each browser has its own browser driver. Browser drivers communicate with the respective browser without revealing the internal logic of the browser’s functionality. When a browser driver receives a command, that command is executed on the respective browser and the result goes back in the form of an HTTP response.
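The command-in, HTTP-response-out cycle can be sketched with a stand-in for a browser driver. The code below is not a real driver and does not control any browser: it starts a tiny HTTP server from the JDK on a free local port, answers one command with a canned reply, and then plays the role of the client library to call it. The class name, the session id "demo" and the simplified reply body are all assumptions for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Sketch only: a stand-in for a browser driver's HTTP server.
class StubDriverSketch {

    // Starts the stub "driver" on a free local port. The session id "demo"
    // and the canned reply are made up for this illustration.
    static HttpServer startStubDriver() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/session/demo/title", exchange -> {
            // A real driver would ask the browser for the page title here.
            byte[] body = "{\"value\": \"Stub page title\"}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    // Plays the client library's role: sends one command, returns status and body.
    static String sendGetTitle(int port) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://127.0.0.1:" + port + "/session/demo/title"))
                .GET()
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    // Starts the stub, sends one command, and always shuts the stub down again.
    static String roundTrip() {
        try {
            HttpServer driver = startStubDriver();
            try {
                return sendGetTitle(driver.getAddress().getPort());
            } finally {
                driver.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

A real browser driver does the same job at the HTTP level, except that between receiving the request and writing the response it actually drives the browser.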
4. Browsers
Selenium supports multiple browsers such as Firefox, Chrome, IE, Safari, etc.
Now let’s move further and see exactly how Selenium functions internally with the help of the example below.
Demo
In practice, you write your code in an IDE (say Eclipse) using any one of the supported Selenium client libraries (say Java).
Example:
WebDriver driver = new FirefoxDriver();
driver.get("https://www.edureka.co");
Once you are ready with your script, you click on Run to execute the program. Based on the above statements, the Firefox browser will be launched and it will navigate to the Edureka website.
Once you click on ‘Run’, every statement in your script is converted into an HTTP request to a URL, with the help of JSON Wire Protocol over HTTP. These requests are passed to the browser driver (in the above code, I have used FirefoxDriver). In this case, the client library (Java) converts the statements of the script into JSON format and communicates with the FirefoxDriver.
Every browser driver uses an HTTP server to receive HTTP requests. Once a request reaches the browser driver, the driver passes it on to the real browser. The commands in your Selenium script are then executed on the browser. In the case of the Chrome browser, you can write your Selenium script as shown below:
WebDriver driver = new ChromeDriver();
driver.get("https://www.edureka.co");
If the request is a POST request, there will be an action on the browser. If the request is a GET request, the corresponding response will be generated at the browser end. The response is then sent to the browser driver, and the browser driver sends it over JSON Wire Protocol back to your script running in the IDE (Eclipse).
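As a rough illustration of the POST and GET split, the sketch below lists a few common WebDriver calls next to the HTTP method and endpoint they map to in the JSON Wire Protocol. The class and method names are hypothetical and the list is far from complete; {sessionId} is a placeholder for the id the driver hands out when a session is created.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a handful of WebDriver calls and the wire-level requests behind them.
class CommandRoutesSketch {

    // Maps a script statement to "<HTTP method> <endpoint template>".
    static Map<String, String> routes() {
        Map<String, String> routes = new LinkedHashMap<>();
        routes.put("driver.get(url)", "POST /session/{sessionId}/url");
        routes.put("driver.getCurrentUrl()", "GET /session/{sessionId}/url");
        routes.put("driver.getTitle()", "GET /session/{sessionId}/title");
        routes.put("driver.findElement(by)", "POST /session/{sessionId}/element");
        routes.put("driver.quit()", "DELETE /session/{sessionId}");
        return routes;
    }

    // Returns only the HTTP method for a given statement, e.g. "GET".
    static String httpMethod(String statement) {
        String route = routes().get(statement);
        return route.substring(0, route.indexOf(' '));
    }

    public static void main(String[] args) {
        // Print each statement next to its wire-level request.
        routes().forEach((statement, route) -> System.out.println(statement + " -> " + route));
    }
}
```

Note that the split is not perfectly clean: findElement is a POST because it carries a JSON payload (the locator), even though it only looks something up, and ending a session uses DELETE.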
So, that was all about Selenium WebDriver Architecture. I hope you understood the concepts and it added value to your knowledge.