
Google Trends on chronic pain treatments

I recently became interested in Google Trends because I'm accustomed to analyzing time series, albeit of brain activity.  My introduction to the tool came shortly after the presidential election, when I wanted to see search interest in certain fears.  Here's an example of a few notable spikes in activity I found:


Unable to manually find many other terms that showed such unambiguous changes on election night (spikes in searches for sexism and homophobia are much more evident when plotted alone), I wondered, "Is it possible to extract Google Trends data for thousands of terms?"  The answer is yes and no (kind of).

Of course, Python has an API just for this purpose, PyTrends, so it is possible.  However, after I had decided on the thousands of terms I wanted to search, I ran into "rate limit errors" with Google Trends, which cap the number of searches you can do per hour.  This limit apparently varies, but even at the fastest rate I could attain with PyTrends without hitting the error (roughly one term per 30 seconds), the full search would have taken longer than I was comfortable with.  I am still looking for ways around this, perhaps by searching Wikipedia Trends instead, which I understand may have fewer restrictions.  Of course, I could also simply search for fewer terms -- I will come back to this project once I figure it out.
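PyTrends itself doesn't retry for you, so the simplest approach is a fixed sleep between requests (as in the script at the end of this post).  A slightly smarter alternative is exponential backoff: wait longer after each failure.  Here is a generic sketch of that idea -- the `fetch_with_backoff` wrapper and its parameters are hypothetical, not part of PyTrends:

```python
import random
import time

def fetch_with_backoff(fetch, term, max_retries=5, base_delay=30):
    """Call fetch(term), backing off exponentially on errors.

    `fetch` is any callable that raises an exception when Google
    returns a rate-limit error; this wrapper is a generic sketch,
    not part of any library.  Delays grow as 30s, 60s, 120s, ...
    (for the default base_delay), plus a little random jitter so
    repeated clients don't all retry in lockstep.
    """
    for attempt in range(max_retries):
        try:
            return fetch(term)
        except Exception:
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
    raise RuntimeError("giving up on " + term)
```

With a wrapper like this, the fast requests go through quickly and only the rate-limited ones pay the long waits.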

In the meantime, I decided instead to use PyTrends to extract Google search trends, since 2004 (the beginning of the Google Trends records), for the pain treatments mentioned in the previous blog post.  This was relatively simple, with only slightly over 100 terms.

To increase my confidence that the search terms were pain-related, I tried appending "chronic pain" and "pain" to each term.  Doing this restricted the number of results in both cases, with "chronic pain" producing far more empty searches.  I wonder whether shortening the time period would give better results?  Nonetheless, with "pain" appended to all search terms, only N=51 searches came back with results.

First off, surprisingly, most trends showed a fairly linear increase in searches over time.  This may have something to do with the growing prevalence of Google as a general tool for just about everything.  One way I could test this is to correct for the slopes of general, unrelated terms over time.  I may add that to this post eventually, but I just wanted to put something down for right now.

Below are a few examples, with the trend and its linear fit in the top panel (the slope of the fit is shown as "int./month").  "Interest" on the y-axis is how Google reports searches: it is relative to the most popular time point for that search, which is set to 100.  This is somewhat unfortunate, as it makes it difficult to compare absolute interest across treatments.  That is why I calculated the slope, which at least gives an idea of how quickly interest is changing over time.  Additionally, I looked for periodic patterns in the data, wondering whether there were monthly or yearly cycles in interest, by calculating a Fourier-transform power spectrum of each trend (after linearly de-trending, using "detrend" in Matlab), shown in the bottom panel (there wasn't all that much interesting information in the power spectra):
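I did this processing in Matlab, but the same two steps -- a linear fit for the slope, then a power spectrum of the detrended residual -- can be sketched in NumPy.  This is an equivalent I haven't used for the plots below; it assumes a monthly interest series, so a yearly cycle appears at a frequency of 1/12 cycles per month:

```python
import numpy as np

def trend_slope_and_spectrum(interest):
    """Given a monthly interest series (values 0-100), return the
    linear-fit slope (interest points per month), plus the frequency
    axis and power spectrum of the linearly detrended series.

    NumPy sketch of the Matlab processing described above; frequencies
    are in cycles per month, so a yearly cycle shows up at 1/12.
    """
    y = np.asarray(interest, dtype=float)
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)       # linear fit over months
    detrended = y - (slope * t + intercept)      # remove the linear trend
    power = np.abs(np.fft.rfft(detrended)) ** 2  # FFT power spectrum
    freqs = np.fft.rfftfreq(len(y), d=1.0)       # cycles per month
    return slope, freqs, power
```

A series with a New Year's bump every January, for example, would show a peak in `power` at the frequency bin nearest 1/12.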







Common over-the-counter analgesics: These are common treatments for acute pain, although their efficacy for chronic pain is less clear.  Nonetheless, interest in all of them is increasing similarly, with ibuprofen increasing most quickly over the last 13 years, at about 0.5 interest points per month.





Opioids:  These trends are interesting not only because of the recently publicized, serious problem of opioid addiction, but also because their overall shapes do not fit a linear function, depending on the specific term.  Searching for "opioid" alone, a clear relative increase in interest occurred right around January 2016, which is around the time new guidelines for opioid prescribing were released by the CDC and CMS in order to combat opioid addiction.  Interestingly, searches for specific opioid drugs began to decrease years earlier, with sharper declines in "oxycontin" around 2010 and "vicodin" around 2012.





Non-invasive, physical treatments:  I personally prefer to avoid taking medications when possible, and am more likely to try one of these routes when treating my own pain.  I've never tried acupuncture, but it continues to rise in interest at about 0.13 points / month.  Decompression is used primarily for back pain, and involves stretching the spine with a bracing system (that actually sounds kind of nice?).  After peaking around 2009, it has continued to decrease.  And while the linear trend of "exercise" is similar to many of the other increasing trends in this post, it is a good example of a trend that may be contaminated with non-pain-related searches.  See the peak in the power spectrum at 1/yr?  Right around January of every year there is a huge increase in interest, suggesting much of the search interest may have more to do with New Year's resolutions (similar trends were seen with yoga)!




All treatments:  And finally, here are the slopes (as shown in the plots above) indicating the change in interest over time, with higher values indicating a faster increase in search interest, and negative values an overall decrease.  Again, I may need to add a correction factor for general interest in all searches.  Additionally, many searches may not be as pain-specific as I would like (notice that "heat" and "ice", which I expected to have relatively flat slopes, are near the top of the list).
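This ranking can also be produced directly in Python.  The sketch below assumes a CSV in the format written by the download script at the end of this post (a `searchterm` column followed by one column per month); the function name is my own, not from any library:

```python
import numpy as np
import pandas as pd

def rank_slopes(csv_path):
    """Rank search terms by the slope of their interest trend.

    Assumes a CSV whose first column is 'searchterm' and whose
    remaining columns are monthly interest values.  Returns a list
    of (term, slope) pairs, sorted from fastest-growing downward;
    slopes are in interest points per month.
    """
    df = pd.read_csv(csv_path)
    t = np.arange(df.shape[1] - 1)  # month index, 0, 1, 2, ...
    slopes = []
    for _, row in df.iterrows():
        y = row.values[1:].astype(float)
        slope = np.polyfit(t, y, 1)[0]  # keep only the linear slope
        slopes.append((row.values[0], slope))
    return sorted(slopes, key=lambda pair: -pair[1])
```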

I'm still working on doing all calculations and plots within Python, but for the time being I have been using Python to get the raw data and Matlab to process and plot it.  Below is the code I used to extract the Google Trends data.

# Requires the pytrends library. To install, run "pip install pytrends".
from pytrends.request import TrendReq
import time
import csv
import pandas as pd

"""Add your Gmail username to the google_username variable and your Gmail password 
to the google_password variable."""
google_username = "" # enter Google username here
google_password = "" # enter Google password here
connector = TrendReq(google_username, google_password)

"""Specify the filename of a CSV with a list of keywords in the variable, keywordcsv. 
The CSV should be one column, with header equal to 'Keywords' (case sensitive). """
keywordcsv = "" # csv column file of search terms 
keywords = pd.read_csv(keywordcsv)

Trends = [] # list of lists, holding Google search interest data
n_months = 0 # length of the interest series, set by the first successful search
for index, word in keywords.iterrows():
    
    # print output to keep track
    print("Downloading Keyword #" + str(index) + ", " + word[0])
    
    """This is necessary to avoid 'rate limit' errors from trying
    to extract too much info too quickly from Google.  I played with 
    different sleep times -- much less than this I ended up with errors.
    This can of course be reduced if you're searching for fewer terms"""
    time.sleep(30)
    
    """ The 'try / except' was necessary in cases where searches came up
    empty.  Surprisingly, when adding 'pain' to many of my terms, there was
    no information"""
    searchterm = word[0] + " pain"
    payload = {'geo': 'US', 'q': searchterm}
    try:
        df = connector.trend(payload, return_type='dataframe')
        interest = list(df.values.flatten()) # extract search interest values over time
        n_months = len(interest) # remember the series length for empty searches

    except IndexError:
        print('no data')
        interest = [0] * n_months # if the search is empty, create a list of zeros
        
    interest.insert(0,searchterm) #insert the search term to the beginning of the list    
    Trends.append(interest) #append the interest for this term to the list of lists
    
# save the list of lists Trends as a csv file
with open('output.csv', 'w') as csvfile:
    writer = csv.writer(csvfile)
    
    # first write a top row with the dates corresponding to the Google Trends values
    header = list(df.axes[0])
    header.insert(0, 'searchterm')
    
    # write the sublists in Trends as separate rows in the csv file
    writer.writerow(header)
    writer.writerows(Trends)
