Selenium WebDriver: Deep Dive

Selenium is a suite of tools that includes Selenium IDE, Selenium WebDriver, Selenium Grid and Selenium Standalone Server. Selenium automates browsers: it makes a browser execute commands according to your scenario. That makes it a natural fit for web application testing, but you are not limited to just that.

Selenium WebDriver is a free, open-source, portable framework for testing web applications. It provides a common application programming interface (API) for browser automation: a library that you call from your code, which executes your commands on the browser of your choice. Typical browser-level actions include clicking, typing and selecting a value from a dropdown.

In Selenium 2, WebDriver shipped with a built-in Firefox driver; from Selenium 3 onward, Firefox is driven through the separate GeckoDriver executable. For other browsers, you likewise need to plug in a browser-specific driver to communicate with the browser and run the test. The most commonly used drivers include:

  • Google Chrome Driver
  • Internet Explorer Driver
  • Opera Driver
  • Safari Driver
  • HTML Unit Driver (a special headless driver)

Selenium WebDriver - Architecture

The Selenium WebDriver API carries the communication between the language bindings and the browsers.

There are four basic components of WebDriver Architecture:

  • Selenium Client Library
  • JSON Wire Protocol
  • Browser Drivers
  • Real Browsers

JSON Wire Protocol:

Selenium WebDriver uses the JSON Wire Protocol to communicate with the browser in pairs of commands and responses. The wire protocol defines a RESTful web service using JSON over HTTP. The client (your automated script) sends commands to the server (the browser driver, which controls the browser) and receives responses in return.

The main benefit of the JSON Wire Protocol is that you can write your code in any programming language: commands go to the browser as HTTP requests and the responses come back as JSON. Other benefits include running the same scripts against any supported browser without changing the implementation, and running them on cloud service providers such as Sauce Labs, BrowserStack and Perfecto.
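To make the "any language that can speak HTTP" point concrete, here is a minimal sketch of the request a client could build for driver.get(url). It only constructs the request and never sends it. The class name, the session id and the base URL are assumptions for illustration (9515 is ChromeDriver's default port), and the JSON body is assembled naively without escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class WireRequestSketch {

    // Builds (but does not send) the HTTP request that corresponds to driver.get(targetUrl).
    // In the JSON Wire Protocol, navigation is POST /session/:sessionId/url with a {"url": ...} body.
    static HttpRequest navigateRequest(String driverBaseUrl, String sessionId, String targetUrl) {
        // Naive JSON assembly: fine for a plain URL, not safe for arbitrary strings.
        String jsonBody = "{\"url\": \"" + targetUrl + "\"}";
        return HttpRequest.newBuilder(URI.create(driverBaseUrl + "/session/" + sessionId + "/url"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // "hypothetical-session-id" stands in for the id a real driver returns when a session is created.
        HttpRequest request = navigateRequest(
                "http://localhost:9515", "hypothetical-session-id", "https://example.com");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

In practice the language bindings build and send these requests for you; the sketch only shows that nothing language-specific crosses the wire.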

Now that you know about the JSON Wire Protocol, you can relate Selenium commands to HTTP methods and exceptions to status codes. For example, getTitle(), getText() and getPageSource() are GET requests, while click() and findElement() are POST requests. The table below describes some common status codes returned by the server (the browser driver) to the user:

Response Code Response Message

You can find the complete list of HTTP methods and response status codes in the official Selenium documentation.
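The relationship between commands, HTTP methods and status codes can be sketched as two small lookup tables. This is an illustration, not part of the Selenium API: the class and method names are made up, and only a handful of entries from the JSON Wire Protocol specification are included. The status values are the ones carried in the "status" field of the JSON response body.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WireProtocolLookupSketch {

    // Command name -> HTTP method and endpoint (a small subset of the protocol).
    static final Map<String, String> ENDPOINTS = new LinkedHashMap<>();

    // "status" value in the JSON response -> protocol name (a small subset).
    static final Map<Integer, String> STATUS_NAMES = new LinkedHashMap<>();

    static {
        ENDPOINTS.put("getTitle", "GET /session/:sessionId/title");
        ENDPOINTS.put("getPageSource", "GET /session/:sessionId/source");
        ENDPOINTS.put("getText", "GET /session/:sessionId/element/:id/text");
        ENDPOINTS.put("findElement", "POST /session/:sessionId/element");
        ENDPOINTS.put("click", "POST /session/:sessionId/element/:id/click");

        STATUS_NAMES.put(0, "Success");
        STATUS_NAMES.put(7, "NoSuchElement");
        STATUS_NAMES.put(10, "StaleElementReference");
        STATUS_NAMES.put(13, "UnknownError");
        STATUS_NAMES.put(21, "Timeout");
    }

    // Returns "METHOD path" for a known command, or "unknown" otherwise.
    static String endpointFor(String command) {
        return ENDPOINTS.getOrDefault(command, "unknown");
    }

    // Returns the protocol name for a status value, or "unlisted" if it is not in this subset.
    static String statusName(int status) {
        return STATUS_NAMES.getOrDefault(status, "unlisted");
    }

    public static void main(String[] args) {
        System.out.println("click -> " + endpointFor("click"));
        System.out.println("status 7 -> " + statusName(7));
    }
}
```

A client library does the same kind of translation in reverse: a non-zero status in the response is turned into the matching exception in your test code.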

Selenium WebDriver - Features:

  • Multiple Languages Support: WebDriver supports most of the commonly used programming languages, such as Java, C#, JavaScript, PHP, Ruby, Perl and Python. You can choose any supported language you are comfortable with and start building test scripts.
  • Multiple Browser Support: Selenium WebDriver supports a diverse range of web browsers such as Firefox, Chrome, Internet Explorer, Opera and many more. It also supports some non-conventional or rare browsers like HTMLUnit.
  • Speed: WebDriver is faster than the other tools in the Selenium suite. Unlike Selenium RC, it doesn’t require an intermediate server to communicate with the browser; it communicates with the browser directly.
  • Simple Commands: Most of the commands used in Selenium WebDriver are easy to implement. For instance, a browser is launched with one of the following commands:
    WebDriver driver = new FirefoxDriver();           // Firefox browser
    WebDriver driver = new ChromeDriver();            // Chrome browser
    WebDriver driver = new InternetExplorerDriver();  // Internet Explorer browser
  • Methods and Classes: WebDriver provides classes and methods for common challenges in automation testing.
    It also lets testers handle more complex web elements such as checkboxes, dropdowns and alerts through dynamic finders.

Hierarchy of Selenium Classes and Interfaces:

  1. SearchContext is the super interface of WebDriver; it declares only two methods, findElement() and findElements().
  2. WebDriver is an interface that extends SearchContext. It declares methods such as get(), getTitle(), getCurrentUrl(), getPageSource(), close() and quit().
  3. RemoteWebDriver is a concrete class that implements the WebDriver interface.
  4. ChromeDriver, InternetExplorerDriver, FirefoxDriver, OperaDriver, SafariDriver etc. are subclasses of RemoteWebDriver and provide browser-specific implementations of Selenium WebDriver.

Note: The advantage of declaring your variable as the WebDriver interface type rather than a browser-specific class shows when you want to run on a remote machine or on a grid. The same variable can hold a ChromeDriver for local execution and a RemoteWebDriver for remote machine or grid execution.
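The hierarchy and the note above can be illustrated without Selenium on the classpath. The types below are simplified stand-ins that only borrow the real names; they are not the Selenium classes, and createDriver() and openHomePage() are hypothetical helpers. The point is that test code written against the WebDriver type does not care which implementation it receives.

```java
class HierarchySketch {

    // Stand-in for the real SearchContext (which takes a By and returns a WebElement).
    interface SearchContext {
        String findElement(String locator);
    }

    // Stand-in for the real WebDriver interface, reduced to two methods.
    interface WebDriver extends SearchContext {
        void get(String url);
        String getCurrentUrl();
    }

    // Stand-in for the concrete class that implements WebDriver.
    static class RemoteWebDriver implements WebDriver {
        private final String target;
        private String currentUrl = "about:blank";

        RemoteWebDriver(String target) { this.target = target; }

        @Override public void get(String url) { currentUrl = url; }
        @Override public String getCurrentUrl() { return currentUrl; }
        @Override public String findElement(String locator) {
            return "element[" + locator + "] on " + target;
        }
    }

    // Browser-specific subclass, as in item 4 of the list.
    static class ChromeDriver extends RemoteWebDriver {
        ChromeDriver() { super("local Chrome"); }
    }

    // Hypothetical factory: local Chrome by default, a remote driver when running on a grid.
    static WebDriver createDriver(boolean useGrid) {
        return useGrid ? new RemoteWebDriver("grid node") : new ChromeDriver();
    }

    // Test logic depends only on the WebDriver interface.
    static String openHomePage(WebDriver driver) {
        driver.get("https://example.com");
        return driver.getCurrentUrl();
    }

    public static void main(String[] args) {
        System.out.println(openHomePage(createDriver(false)));
        System.out.println(openHomePage(createDriver(true)));
    }
}
```

With the real library the shape is the same: only the line that constructs the driver changes between a local run and a grid run.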

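RemoteWebDriver also implements an interface called JavascriptExecutor, which declares executeScript() and executeAsyncScript(). These run JavaScript in the page, which is useful for actions such as scrolling the page down or calling scrollIntoView() on an element.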
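As a rough sketch of how scrolling is usually done through executeScript(), the helpers below pass small JavaScript snippets to an executor. The JavascriptExecutor interface here is a local stand-in that mirrors the real method signature so the example runs without Selenium; ScrollSketch, the two helper methods and RecordingExecutor are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ScrollSketch {

    // Stand-in mirroring the signature of Selenium's JavascriptExecutor.executeScript.
    interface JavascriptExecutor {
        Object executeScript(String script, Object... args);
    }

    // Scrolls the window to the bottom of the document.
    static void scrollToBottom(JavascriptExecutor js) {
        js.executeScript("window.scrollTo(0, document.body.scrollHeight);");
    }

    // Scrolls the given element into view; it is passed to the script as arguments[0].
    static void scrollIntoView(JavascriptExecutor js, Object element) {
        js.executeScript("arguments[0].scrollIntoView(true);", element);
    }

    // Fake executor that records the scripts instead of running them in a browser.
    static class RecordingExecutor implements JavascriptExecutor {
        final List<String> scripts = new ArrayList<>();

        @Override
        public Object executeScript(String script, Object... args) {
            scripts.add(script);
            return null;
        }
    }

    public static void main(String[] args) {
        RecordingExecutor js = new RecordingExecutor();
        scrollToBottom(js);
        scrollIntoView(js, "hypothetical element");
        js.scripts.forEach(System.out::println);
    }
}
```

With a real driver you would typically cast it, for example (JavascriptExecutor) driver, and pass a WebElement as the script argument.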

