
In this article, we will learn about web automation in Python using Selenium. We will also build a WhatsApp automator in Python.

So, first of all, why do we need automation?

Why Automation?

  1. Increased Productivity – By using automation tools, we can increase the rate at which we build and release our application.
  2. Software Testing – During development, we generally need to test our app in different environments before releasing a stable version. Doing all of this testing manually is often very difficult. Therefore, it is better to use scripts or an automation testing tool for the repetitive tasks.
  3. Time-Saving – Let us understand it with an example. Suppose you want to search for something, collect data, or sort your files. If you do this manually, you will waste time and resources. With automation, you can speed up such repetitive tasks.

So, automation saves time, increases production rates, and reduces costs and labour.
Although there are plenty of automation tools available, we will talk about Selenium here.

What is Selenium?

  • Selenium is a package that is used to automate web browser-related tasks. It also provides Selenium IDE, a tool for recording and replaying browser actions without knowing a scripting language.
  • It provides support for many popular programming languages, such as Python, Java, C#, Ruby, and JavaScript.

Installing Selenium

To use Selenium, you first need to install it. The method differs between programming languages.
In Python, just run pip install selenium in your console and it will install Selenium for you.
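
To confirm that the installation worked, a quick check from Python can help. The following is a minimal sketch that uses only the standard library; the helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("selenium")
    print("selenium", found if found else "is not installed")
```

If the script reports that Selenium is not installed, check that pip and the python command you are running belong to the same interpreter.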

Setting Up Selenium

To use Selenium with your browser, you must first tell Python which browser you are going to use.

For this, you need to install a web driver first. A web driver is the program that receives the commands from your automation script and executes them in a specific browser; each browser has its own driver.
In this example, we will be using the Chrome browser, so download and extract the chrome driver that matches your Chrome version.

Accessing a page using Selenium

#Uses python3
# First of all, import webdriver from selenium
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
# Tell selenium which browser you are going to use.
# Selenium 4 takes the driver path through a Service object.
browser = webdriver.Chrome(service=Service("path to your chrome driver executable"))
# Here browser is a selenium webdriver object. After executing this line, a new chrome window will open up.
browser.get("https://freshlybuilt.com/")  # Enter the link of the web page you want to access.

Quick Tip for Web Driver

Sometimes, it becomes difficult to find where you have kept the chrome driver.
If you don't give a path to a chrome driver in your code, Selenium will look for one in the default locations (the directories on your PATH, which differ between operating systems).
When the Chrome browser is upgraded, your previous chrome driver might stop working. Then you need to download a matching chrome driver again.
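
If you are unsure whether a chrome driver is reachable without an explicit path, you can ask Python where it would find one. This is a rough sketch that assumes the driver executable is named chromedriver and is looked up through the PATH environment variable; locate_driver is a hypothetical helper, not part of Selenium.

```python
import shutil


def locate_driver(executable="chromedriver"):
    """Return the full path of the driver found on PATH, or None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = locate_driver()
    if path is None:
        print("No chromedriver on PATH; pass the path explicitly.")
    else:
        print("chromedriver found at", path)
```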


But you can also automate this by using a package named webdriver_manager.

Using webdriver_manager

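  • First of all, install webdriver_manager using pip install webdriver_manager in your console.
  • Then import ChromeDriverManager from webdriver_manager.chrome and pass the driver it installs to Selenium:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
# install() returns the path of the downloaded driver executable
browser = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
browser.get("https://freshlybuilt.com/")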

Here, ChromeDriverManager().install() first checks whether a suitable chrome driver has already been downloaded. If the driver is not there or needs an upgrade, the package downloads it for you.

Showing that the browser is controlled by Selenium.

How does Selenium work?

  • Selenium creates an HTTP request for each command and sends it to the browser driver, which carries the command out in the web browser. The execution status of the request is then returned to Selenium.
  • Using Selenium, we can interact with different objects on a web page. The way it does so is very similar to JavaScript. Selenium uses the DOM (Document Object Model) to interact with the web page. The DOM represents the whole web page as a tree of nodes and objects.
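
To make the first point concrete, here is a sketch of what such requests look like under the W3C WebDriver protocol. It only builds the requests and sends nothing; the function names, the session id abc123, and the example selector are assumptions made for illustration.

```python
import json


def navigate_request(session_id, url):
    """Build the HTTP request that a 'go to this URL' command turns into."""
    return ("POST", f"/session/{session_id}/url", json.dumps({"url": url}))


def find_element_request(session_id, using, value):
    """Build the HTTP request behind a 'find element' command."""
    body = json.dumps({"using": using, "value": value})
    return ("POST", f"/session/{session_id}/element", body)


if __name__ == "__main__":
    for request in (
        navigate_request("abc123", "https://freshlybuilt.com/"),
        find_element_request("abc123", "css selector", "input[name='email']"),
    ):
        print(*request)
```

Each tuple holds the HTTP method, the path on the driver's local server, and the JSON body. The driver answers with a JSON response that carries the result or an error.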

To interact with the elements, Selenium provides a class named By, which we will use to say how an element should be located. To import it, use from selenium.webdriver.common.by import By in your script.

How to know what to inquire?

If you have made the web page yourself, you may know the contents of the page. But generally, that is not the case. You can look into the source code of the website.
The best way is to open up your browser and use browser developer tools to inspect the web page.

  • To open the developer tools in Chrome, just right-click on the target page and select Inspect. You can also use “Ctrl+Shift+I” to open the developer tools.
  • At the far left of the developer tools toolbar, there is a pointer icon which on hovering says “Select an element in the page to inspect it.” Just click on it and then click the element you want to inspect. You will see the corresponding code for that element.
Using Developer tools

Different Locators in Selenium

Selenium provides various locators to locate elements on a web page. For scripting and accessing web data, locators are very important.
Some of the locators in Selenium are:-

  1. ID Locator:- It is used to locate an element by providing its id. Generally, ids are used in form elements.
    The syntax to use it is:-
    1. Java: driver.findElement(By.id("id of the element"))
    2. Python: driver.find_element(By.ID, 'id of the element') (the older find_element_by_id form was removed in Selenium 4)
  2. Class Locator:- It helps to locate an element by its class attribute. The class attribute is used for similar elements on a web page, like paragraphs or divs.
    The syntax to use it is:-
    1. Java: driver.findElement(By.className("Name of the class"))
    2. Python: driver.find_element(By.CLASS_NAME, 'Name of the class')
  3. Tag Locator:- It is used to locate an element using its tag name.
    The syntax is:-
    1. Java: driver.findElement(By.tagName("name of the tag"));
    2. Python: driver.find_element(By.TAG_NAME, 'name of the tag')
  4. XPath Locator:- It is used to locate elements using XPath expressions, which describe a path through the node tree of the page. The syntax for this is:-
    1. Java: driver.findElement(By.xpath("//input[@name='email']"))
    2. Python: driver.find_element(By.XPATH, "//input[@name='email']")

There are some more useful locators like Name Locator, Link text Locator, CSS Locator, etc. You can always search for them. They are similar to the above locators.
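
To get a feel for what these locators select without launching a browser, the sketch below runs comparable queries against a small made-up page using Python's xml.etree.ElementTree, which supports a limited subset of XPath. The page markup and variable names are invented for this example, and Selenium itself evaluates locators in the live browser DOM rather than with this module.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed stand-in for a web page (invented for this sketch).
PAGE = """
<html><body>
  <form id="login">
    <input name="email" class="field" />
    <input name="password" class="field" />
  </form>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# ID-style lookup: any element whose id attribute is "login"
form = root.find(".//*[@id='login']")
# Class-style lookup: every input whose class attribute is exactly "field"
fields = root.findall(".//input[@class='field']")
# Tag lookup: all input elements
inputs = root.findall(".//input")
# XPath lookup mirroring //input[@name='email']
email = root.find(".//input[@name='email']")

if __name__ == "__main__":
    print(form.tag, len(fields), len(inputs), email.get("name"))
```

One difference to keep in mind: the attribute test used here compares the whole class attribute, whereas Selenium's class locator matches an element that has the given class among possibly several.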

WhatsApp Automation using Selenium

Till now, we have talked about Selenium and its basic use. Now let us use this knowledge to make a WhatsApp automator in Python.

For simplicity and better understanding, I will split the code into small functions and blocks.

List of libraries and modules to be imported

  • Web Driver from Selenium – It will help us to use chrome driver with selenium as mentioned above.
  • Web Driver Wait – It helps us to wait for something to occur before proceeding further in the code.
  • Expected Conditions – These are the conditions that are generally used for providing waits and checking if a particular tag or condition is fulfilled.
  • Keys – These are used to send special keys like Enter, Control and Function keys.
  • By – By is used to locate an interactive element on the web browser.
  • json – It is used to convert Python strings to JSON strings. JSON stands for JavaScript Object Notation.
from webdriver_manager.chrome import ChromeDriverManager
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import json

Opening Browser and WhatsApp Web

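#Uses python3
from selenium.webdriver.chrome.service import Service
#This function when called will open up WhatsApp Web in Chrome and create a wait object.
def call_chrome():
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    driver.get("https://web.whatsapp.com")
    # Wait up to 600 seconds, which leaves time to scan the QR code and log in.
    wait = WebDriverWait(driver, 600)
    return driver, wait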
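
The wait object returned here is used later as wait.until(condition). Conceptually, such a wait polls a condition until it yields a truthy value or the timeout passes. The following is a simplified stand-in written with the standard library to show the idea, not Selenium's actual implementation; wait_until and its parameters are hypothetical names.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once timeout seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within timeout")
        time.sleep(poll)


if __name__ == "__main__":
    start = time.monotonic()
    # The condition becomes truthy roughly 0.2 seconds after start.
    ready = wait_until(lambda: time.monotonic() - start > 0.2, timeout=2, poll=0.05)
    print("condition met:", ready)
```

With a 600 second timeout, a wait of this kind returns as soon as the element appears, so the long limit only matters when the login takes a while.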

Selecting the Receiver

There are two ways to select the target or receiver. The first is to find the name of the receiver in the chat list, and the second is to use the search box to find the name.

Finding receiver using the list of names

Here, as we can see, the name is inside a span tag. Therefore, we will find that span tag.

#Uses python3
#This function when called will open the chat of the receiver you want to message.
def target_naive():
    name = input('Enter the name of person')
    #json.dumps is used here because the XPath needs the name as '"Kapil"'
    #(a string enclosed within quotes)
    quoted_name = json.dumps(name)
    x_arg = '//span[contains(@title, ' + quoted_name + ')]'
    #wait is the WebDriverWait object returned by call_chrome()
    target = wait.until(EC.presence_of_element_located((By.XPATH, x_arg)))
    target.click()
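
To see what json.dumps contributes here, the XPath construction can be pulled out into a small function and tried without a browser. This is only a sketch, and title_xpath is a hypothetical helper name. Note that json.dumps escapes an embedded double quote with a backslash and, by default, turns non-ASCII characters into \u escapes, neither of which XPath understands, so the approach assumes plain ASCII names without double quotes.

```python
import json


def title_xpath(name):
    """Build the XPath used to find a chat by the title of its span."""
    quoted = json.dumps(name)  # 'Kapil' becomes '"Kapil"'
    return "//span[contains(@title, " + quoted + ")]"


if __name__ == "__main__":
    print(title_xpath("Kapil"))
```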

Finding receiver using search box

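Here we can see that the class attribute of the search box is ‘_3F6QL _3xlwb’. We will use this to find the receiver. There are two classes in it, so we can locate the search box with either one of them using the class locator, or with both of them using a CSS selector such as ‘._3F6QL._3xlwb’ (the class locator accepts only a single class name). These class names come from the WhatsApp Web page as inspected for this article and may differ in yours, so check them with the developer tools. This method is generally preferred because the above method can sometimes select the wrong target when the name of the target also appears in one of your recent messages.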

#Uses python3
#This function when called will search for the receiver and then open the chat.
def target_advanced():
    target = input('Enter the name of person')
    #A CSS selector is needed to match both classes at once.
    #Waiting also gives WhatsApp Web time to load after login.
    target_box = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '._3F6QL._3xlwb')))
    target_box.send_keys(target + Keys.ENTER)
#Keys.ENTER will press enter key after writing name of the target.
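
A class attribute such as the one above holds several class names separated by spaces, and a CSS selector that requires all of them joins the names with dots. A small sketch of that conversion follows; class_attr_to_css is a made-up helper name.

```python
def class_attr_to_css(class_attr):
    """Turn a class attribute value into a CSS selector matching all classes."""
    return "".join("." + name for name in class_attr.split())


if __name__ == "__main__":
    print(class_attr_to_css("_3F6QL _3xlwb"))
```

The resulting string is meant to be used with the CSS selector locator, for example after copying the class attribute of an element from the developer tools.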

Selecting the input box and sending message

Input box in WhatsApp Web
#Uses python3
#This will select the input box and send the message the requested number of times.
def send_msg():
    string = input('Enter the message')
    input_box = driver.find_element(By.CLASS_NAME, '_1Plpp')
    n = int(input('No. of times the message should be sent'))
    for i in range(n):
        input_box.send_keys(string + Keys.ENTER)
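
When experimenting, it can help to separate the sending loop from the input() prompts so that the loop can be tried without a browser. The sketch below is one way to do that under stated assumptions: send_repeated and FakeInputBox are hypothetical names, and enter_key stands in for Keys.ENTER, which you would pass when working with a real Selenium element.

```python
def send_repeated(input_box, message, times, enter_key):
    """Type message followed by enter_key into input_box, `times` times."""
    if times < 0:
        raise ValueError("times must not be negative")
    for _ in range(times):
        input_box.send_keys(message + enter_key)


class FakeInputBox:
    """Stand-in for a Selenium element that records what was typed."""

    def __init__(self):
        self.typed = []

    def send_keys(self, text):
        self.typed.append(text)


if __name__ == "__main__":
    box = FakeInputBox()
    send_repeated(box, "Hello", 3, enter_key="\n")
    print(box.typed)
```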

Voila, now just run all of the above functions to send a message.

#Uses python3
#Run this block at the end to send the message.
driver,wait=call_chrome()
target_advanced()
send_msg()


Kapil Bansal

A B.Tech CSE student working on competitive programming and a cybersecurity devotee working towards strengthening the concepts. I am also working on Python modules, Data Structures and Algorithms, etc.
