In the digital age, a strong online presence is crucial for businesses and individuals alike. A key part of online visibility is search engine indexing: search engines such as Google, Bing, and Yahoo crawl web pages and store them so they can appear in search results. Knowing how many pages are indexed for your website, or for your competitors' websites, can provide valuable insight into their online strategies. In this article, we will explore how to use Python to get the number of indexed pages for multiple websites and gain a competitive advantage in the digital landscape.
Here is the step-by-step process to get the number of indexed pages for multiple websites:
Step 1:
Step 2:
Step 3:
Step 4:
Copy the code below:
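import urllib.parse

import pandas as pd
import requests
from requests_html import HTMLSession


def get_source(url):
    """Return the source code for the provided URL.

    Args:
        url (string): URL of the page to scrape.

    Returns:
        response (object): HTTP response object from requests_html,
            or None if the request failed.
    """
    try:
        session = HTMLSession()
        response = session.get(url)
        return response
    except requests.exceptions.RequestException as e:
        print(e)
        return None


def get_results(url):
    """Fetch the Google results page for a site: query on the given URL."""
    # URL-encode the whole query so special characters are passed safely.
    query = urllib.parse.quote_plus("site:" + url)
    response = get_source("https://www.google.co.uk/search?q=" + query)
    return response


def parse_results(response):
    """Extract the result count from text such as 'About 1,230 results'."""
    if response is None:
        return None
    stats = response.html.find("#result-stats", first=True)
    if stats is None:
        # The page has no result counter, so there is nothing to parse.
        return None
    # The count is the second word; remove the thousands separators.
    indexed = int(stats.text.split(' ')[1].replace(',', ''))
    return indexed


def count_indexed_pages(url):
    """Return the number of indexed pages Google reports for the URL."""
    response = get_results(url)
    return parse_results(response)


# Check a single site first.
print(count_indexed_pages("majestichardwoodfloors.com"))

# Then check several sites and collect the counts in a DataFrame.
sites = ['majestichardwoodfloors.com',
         'https://theassetsadvisors.com',
         'https://pelicanmigration.com',
         ]

data = []
for site in sites:
    site_data = {
        'url': site,
        'indexed_pages': count_indexed_pages(site)
    }
    data.append(site_data)

df = pd.DataFrame.from_records(data)
print(df.sort_values(by='indexed_pages'))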
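The parse_results function above takes the second space-separated word of the #result-stats text, which only works when the text begins with a single word such as "About". As a minimal sketch of a more tolerant approach, the helper below pulls out the first number with a regular expression instead. The name parse_result_count and the sample strings are assumptions made for illustration, and the sketch assumes English-style thousands separators (commas).

```python
import re


def parse_result_count(stats_text):
    """Return the first number in a result-stats string, or None if there is none.

    Hypothetical helper: assumes commas are used as thousands separators.
    """
    # Match a digit followed by any run of digits and commas, e.g. "1,230".
    match = re.search(r"\d[\d,]*", stats_text)
    if match is None:
        return None
    return int(match.group(0).replace(",", ""))


# Illustrative strings, not captured from a real results page.
print(parse_result_count("About 1,230 results (0.42 seconds)"))
print(parse_result_count("7 results (0.30 seconds)"))
print(parse_result_count("No results found"))
```

Returning None rather than raising an error lets the loop over several sites continue when one page has no usable count.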
Change the URLs in the sites list (and in the single count_indexed_pages call) to the websites you want to check, then run the code.
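The example list mixes a bare domain with full https:// URLs. The article does not say whether the site: query treats both forms the same way, so it may be safer to pass every entry in one consistent form. The sketch below reduces each entry to its hostname using the standard library. The helper name to_hostname is hypothetical and not part of the original script.

```python
from urllib.parse import urlparse


def to_hostname(site):
    """Return the bare hostname for a domain or a full URL (hypothetical helper)."""
    # urlparse only recognises the host when a scheme separator is present,
    # so prepend a placeholder scheme to bare domains.
    if "://" not in site:
        site = "https://" + site
    return urlparse(site).hostname


sites = ['majestichardwoodfloors.com',
         'https://theassetsadvisors.com',
         'https://pelicanmigration.com',
         ]

# Every entry now has the same form before it is used in a site: query.
print([to_hostname(site) for site in sites])
```

Each cleaned hostname could then be passed to count_indexed_pages in place of the raw entry.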
Output:
Knowing the number of indexed pages for your website or your competitors' websites is a valuable piece of information in digital marketing. With Python's web scraping capabilities, we can collect this data and use it to plan our online presence effectively. By following the steps outlined in this article, you can create a Python script that counts the indexed pages of multiple websites and gain a competitive edge in the online world.