Get Number of Indexed Pages of Multiple Websites Using Python


A strong online presence is crucial for businesses and individuals alike. One key aspect of online visibility is search engine indexing: search engines such as Google, Bing, or Yahoo crawl web pages and store them so they can be shown in search results. Knowing how many pages of your own website, or of a competitor's, are indexed gives a rough measure of how much of that site can appear in search results and can offer insight into the strategy behind it. In this article, we will explore how to use Python to get the number of indexed pages for multiple websites.

Here is the step-by-step process to get the number of indexed pages for multiple websites:
Step 1:

Step 2:

Step 3:

Step 4: 

Copy the code below:

import re
import urllib.parse

import pandas as pd
import requests
from requests_html import HTMLSession


def get_source(url):
    """Return the source code for the provided URL.

    Args:
        url (string): URL of the page to scrape.

    Returns:
        response (object): HTTP response object from requests_html,
        or None if the request fails.
    """
    try:
        session = HTMLSession()
        response = session.get(url)
        return response
    except requests.exceptions.RequestException as e:
        print(e)
        return None


def get_results(url):
    # URL-encode the whole "site:" query so that characters such as
    # ":" and "/" in the site value are escaped.
    query = urllib.parse.quote_plus("site:" + url)
    response = get_source("https://www.google.co.uk/search?q=" + query)
    return response


def parse_results(response):
    # Return None when the request failed or the page has no
    # #result-stats element (for example, a consent or CAPTCHA page).
    if response is None:
        return None
    stats = response.html.find("#result-stats", first=True)
    if stats is None:
        return None
    # The text looks like "About 1,230 results (0.35 seconds)";
    # take the first number and drop the thousands separators.
    match = re.search(r"\d[\d,]*", stats.text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


def count_indexed_pages(url):
    response = get_results(url)
    return parse_results(response)


# Check a single site first.
print(count_indexed_pages("majestichardwoodfloors.com"))

sites = ['majestichardwoodfloors.com',
         'https://theassetsadvisors.com',
         'https://pelicanmigration.com',
        ]

# Collect one row per site.
data = []
for site in sites:
    site_data = {
        'url': site,
        'indexed_pages': count_indexed_pages(site)
    }
    data.append(site_data)

df = pd.DataFrame.from_records(data)
print(df.sort_values(by='indexed_pages'))
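The loop above sends one request per site in quick succession, and Google may answer rapid automated queries with a CAPTCHA or consent page instead of results. A minimal sketch of a slower, more defensive loop follows. The helper name collect_counts, the delay parameter, and the injected counter argument are assumptions made for illustration and are not part of the original script; in practice you would pass count_indexed_pages as the counter.

```python
import time


def collect_counts(sites, counter, delay=2.0):
    """Call counter(site) for each site, pausing between requests.

    counter is any callable that returns a page count (or None).
    Assumption: with the script above, this would be count_indexed_pages.
    delay is a hypothetical pause in seconds; tune it to your needs.
    """
    rows = []
    for i, site in enumerate(sites):
        if i > 0:
            time.sleep(delay)  # wait before every request after the first
        try:
            count = counter(site)
        except Exception as exc:  # keep going if one site fails
            print(f"Could not count pages for {site}: {exc}")
            count = None
        rows.append({"url": site, "indexed_pages": count})
    return rows
```

The returned list has the same shape as the data list in the script, so it can be passed directly to pd.DataFrame.from_records.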

Replace the URLs in the sites list with the websites you want to check, then run the code.
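The sites list mixes a bare domain with full https:// URLs, so it may help to normalize each entry before it is placed after site: in the query. The sketch below shows one way to do that with the standard library. The helper name normalize_site is hypothetical and is not part of the original script.

```python
from urllib.parse import urlparse


def normalize_site(entry):
    """Return the host (plus any path) without scheme or trailing slash.

    Accepts either 'example.com' or 'https://example.com/'.
    normalize_site is a hypothetical helper, not part of the original code.
    """
    entry = entry.strip()
    # urlparse only fills in netloc when a "//" separator is present,
    # so add one for bare domains.
    if "://" not in entry:
        entry = "//" + entry
    parsed = urlparse(entry)
    return (parsed.netloc + parsed.path).rstrip("/")
```

Under these assumptions, you could apply it to the list before the loop, for example with sites = [normalize_site(s) for s in sites].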

Output:

Knowing the number of indexed pages for your website or your competitors' websites is a valuable piece of information in digital marketing. With Python’s web scraping capabilities, we can collect this data for several sites at once and use it to plan our online presence. By following the steps outlined in this article, you can create a Python script that counts the indexed pages of multiple websites and compares them side by side.