Automate Google PSI Data Tracking of Multiple Pages using Python


Why should we automate Google PSI data?

Google provides a tool called PageSpeed Insights (PSI) for measuring a page's speed performance metrics. Most of the time, though, we test only the homepage and then apply changes to the whole website. Does Google look only at homepage performance when it evaluates the user experience? No, absolutely not. Google measures every page, so it is important to track all the important pages (those that provide value to users) as well.

This raises a big question: if a website has thousands of important pages (for example, an e-commerce website), is it practical to check every page by hand in PageSpeed Insights? That would be quite difficult, wouldn't it?

Here is a solution: with a short Python script we can track the speed metrics of any number of pages, which saves plenty of time and effort. Here is a step-by-step guide.

Step 1- Go to the Google developer console and create a new project with a name.

Here I experimented on https://majestichardwoodfloors.com/, so I named the project after this website.

Step 2- Click on "Select project" and then click on "Library" under the APIs and services menu.

Step 3- Search for "PageSpeed Insights API" and click on the result that appears.

Step 5- Click on the "ENABLE" button.

Step 6- Click on the "Create credentials" button (if you don't have any credentials yet).

Step 7- Fill in the details in the form and click on "Done". There is no need to download the client ID.

Step 8- Click on the "Get API Key" button, copy the key, and save it in a notepad.

Step 9- Copy the code below, paste it into a notepad file, and save it as "API_Key.py" or any suitable name with the .py extension. In the "key =" line, paste the API key you copied earlier.

Python Code:

import sys
import time
import urllib.parse

import pandas as pd
import requests

# UI #
import easygui
from easygui import buttonbox, filesavebox, msgbox, textbox

# Paste the API key you copied in Step 8 between the quotes.
key = "YOUR_API_KEY"

# Field the script looks for to decide that a response is usable.
check = "captchaResult"

service_url = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def speed_test_url(url, device):
    """Build the API request URL for one page and one device strategy."""
    params = {
        "url": url,
        "strategy": device,
        "key": key,
    }
    # urlencode escapes the page URL so it is safe inside the query string.
    return service_url + "?" + urllib.parse.urlencode(params)

while True:
    choices = ["Enter URLs Manually", "Upload File", "Close Program"]
    msg = "Welcome to the Site Speed Testing Tool. How would you like to proceed?"
    reply = buttonbox(msg, choices=choices, title="Site Speed Testing Tool")

    if reply == "Enter URLs Manually":
        value = textbox(
            msg="Enter your URLs for speed testing. Each URL should start with "
                "'https://' or 'http://'. Submit one URL per line.",
            title="Enter URLs",
            codebox=True,
        )
        if value is None:
            break
        # One URL per line; ignore blank lines and surrounding whitespace.
        my_list = [line.strip() for line in value.split("\n") if line.strip()]
        speed_test_urls = pd.DataFrame(my_list, columns=["URL"])
    elif reply == "Upload File":
        file = easygui.fileopenbox()
        if file is None:
            continue
        # The spreadsheet must contain a column named "URL".
        speed_test_urls = pd.read_excel(file)
    else:
        # "Close Program" was chosen or the window was closed.
        sys.exit(0)

    msgbox("Your URLs have been loaded. Please wait while the program runs.")
    speed_check = speed_test_urls["URL"].dropna().drop_duplicates()

    msgbox("Mobile Speed Test Running. Please click OK while the program runs in the background")
    data_list_mobile = []
    error_catch_m = []
    for m_check in speed_check:
        call = requests.get(speed_test_url(url=m_check, device="mobile"))
        response = call.json()
        if check not in response:
            y = "Error found with the following URL: %s, remove or revise in your spreadsheet" % m_check
            error_catch_m.append(y)
            continue
        audits = response["lighthouseResult"]["audits"]
        firstContent_mobile = str(audits["first-contentful-paint"]["displayValue"])
        timetoInteractive_mobile = str(audits["interactive"]["displayValue"])
        speedData_mobile = str(audits["speed-index"]["displayValue"])
        # Store the URL with its metrics so rows stay aligned when a URL fails.
        data_list_mobile.append((m_check, firstContent_mobile, timetoInteractive_mobile, speedData_mobile))
        time.sleep(2)

    msgbox("Desktop Speed Test Running. Please click OK while the program runs in the background")
    data_list_desktop = []
    error_catch_dt = []
    for d_check in speed_check:
        call = requests.get(speed_test_url(url=d_check, device="desktop"))
        response = call.json()
        if check not in response:
            x = "Error found with the following URL: %s, remove or revise in your spreadsheet" % d_check
            error_catch_dt.append(x)
            continue
        audits = response["lighthouseResult"]["audits"]
        firstContent_desktop = str(audits["first-contentful-paint"]["displayValue"])
        timetoInteractive_desktop = str(audits["interactive"]["displayValue"])
        speedData_desktop = str(audits["speed-index"]["displayValue"])
        data_list_desktop.append((d_check, firstContent_desktop, timetoInteractive_desktop, speedData_desktop))
        time.sleep(2)

    df_mobile = pd.DataFrame(data_list_mobile, columns=["URL", "First Contentful Paint - Mobile", "Time to Interactive - Mobile", "Speed Index - Mobile"])
    df_desktop = pd.DataFrame(data_list_desktop, columns=["URL", "First Contentful Paint - Desktop", "Time to Interactive - Desktop", "Speed Index - Desktop"])

    # Join on URL so mobile and desktop values always land on the same row.
    pagespeed_report = pd.merge(df_mobile, df_desktop, on="URL", how="outer")

    # Strip the trailing seconds unit ("s") from the metric columns only.
    metric_columns = [column for column in pagespeed_report.columns if column != "URL"]
    pagespeed_report[metric_columns] = pagespeed_report[metric_columns].replace(r"\s*s$", "", regex=True)

    # Report the URLs that could not be tested.
    errors = error_catch_m + error_catch_dt
    if errors:
        msgbox("\n".join(errors))

    msgbox("Your speed test has been completed. Please select an existing Excel file or create a new file ending in .xlsx")
    save_path = filesavebox(default="pagespeed_report.xlsx")
    if save_path:
        pagespeed_report.to_excel(save_path, index=False)
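To make the metric extraction step easier to follow, here is a minimal sketch that isolates it from the GUI and the network calls. It assumes the response contains the same three audit ids the script reads. The function name extract_metrics and the sample_response dictionary are hypothetical, written by hand for illustration, and are not real API output.

```python
import re

# Audit ids read by the script, mapped to readable labels.
AUDITS = {
    "first-contentful-paint": "First Contentful Paint",
    "interactive": "Time to Interactive",
    "speed-index": "Speed Index",
}


def extract_metrics(response):
    """Return {label: seconds as float} for the three audits in AUDITS."""
    audits = response["lighthouseResult"]["audits"]
    metrics = {}
    for audit_id, label in AUDITS.items():
        display = str(audits[audit_id]["displayValue"])
        # Drop the trailing unit ("s") and any whitespace before it,
        # including a non-breaking space.
        number = re.sub(r"\s*s$", "", display)
        metrics[label] = float(number)
    return metrics


# Hand-written sample with only the fields the script reads (assumption).
sample_response = {
    "captchaResult": "CAPTCHA_NOT_NEEDED",
    "lighthouseResult": {
        "audits": {
            "first-contentful-paint": {"displayValue": "1.2\u00a0s"},
            "interactive": {"displayValue": "3.4 s"},
            "speed-index": {"displayValue": "2.0\u00a0s"},
        }
    },
}

print(extract_metrics(sample_response))
```

Converting the values to numbers at this point, rather than keeping them as text, makes it easier to sort or average them later in Excel or in pandas.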

Step 10- Create a blank Excel file with the .xlsx extension.

Step 11- Open the Anaconda PowerShell Prompt.

Step 12- Go to the folder where you saved your Python file and run the file with Python, for example "python API_Key.py" if you kept the file name from Step 9.

Step 13- A simple graphical user interface will pop up on your screen. Insert the website page URLs (you can get the URLs by crawling with Screaming Frog or from the sitemap) using either of the two input options. If you upload a file, it must be an Excel file with a column named "URL".

Step 14- After the speed test finishes, select the blank Excel file to store the data.

Step 15- And there you go! You now have the speed metrics data of each page.
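Step 13 mentions taking the URLs from the sitemap. As a rough sketch of that idea, the snippet below pulls the page addresses out of sitemap XML using only the Python standard library. The function name urls_from_sitemap and the sample sitemap are hypothetical, and example.com is a placeholder domain. It assumes a standard sitemap that uses the sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return the <loc> values of a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content for illustration only.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products/</loc></url>
</urlset>"""

url_list = urls_from_sitemap(sample_sitemap)

# One URL per line, ready to paste into the "Enter URLs Manually" box.
print("\n".join(url_list))
```

The printed list can be pasted into the manual entry box, or saved in the "URL" column of an Excel file for the upload option.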

Bonus: This application can also help you research your competitors' page speed. Just put the competitors' pages in the URL list and follow the same steps.
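Once your own pages and a competitor's pages are in the same report, one possible way to compare them is to average a metric per website. The sketch below is only an illustration: the function name average_by_site, the column label "Speed Index - Mobile", the domains, and the numbers are all made up, so use the column header that appears in your own report.

```python
from collections import defaultdict
from urllib.parse import urlparse


def average_by_site(rows, metric):
    """Average one metric per domain; rows are dicts with "URL" and the metric."""
    values_by_site = defaultdict(list)
    for row in rows:
        site = urlparse(row["URL"]).netloc
        values_by_site[site].append(float(row[metric]))
    return {site: sum(values) / len(values) for site, values in values_by_site.items()}


# Hypothetical report rows (seconds, with the unit already removed).
rows = [
    {"URL": "https://my-site.example/", "Speed Index - Mobile": "4.0"},
    {"URL": "https://my-site.example/shop", "Speed Index - Mobile": "6.0"},
    {"URL": "https://competitor.example/", "Speed Index - Mobile": "3.5"},
]

print(average_by_site(rows, "Speed Index - Mobile"))
```

A lower average means faster pages, so a quick look at the result shows which site is ahead on that metric.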