Can Google Trends Help us Predict the Primary Election Results?

Felix Rios, Market Research Technology Manager, Ugam

I’ve known about Google Trends for some time now. I tried to use it in the past, but couldn’t find anything that justified using it in a meaningful way. If you’ve never looked at it, go ahead. It’s very easy to use, and it’s also very easy to get addicted to.
To put it simply, Google Trends is a tool that allows you to explore and quantify what people search for on Google.
Internet searches are the written manifestation of our thoughts, fears, curiosities, aspirations and sometimes our secrets. Internet searches are spontaneous. Each one is an isolated moment between a human and the web. A Google search has become the modern synapse between the human brain and the internet. The internet has become the most important source of information for pretty much everything. But it never would’ve gotten to this state if it weren’t so easy to find the content that we are looking for. We owe the usability of the internet to the search engines.
So if the internet has become the soul of mankind, exploring Google searches is the equivalent of doing a deep scan into the thoughts of a collective human brain. Google Trends allows us to do this.
I found this great article in The Verge that examines why it matters to understand internet search trends in the context of political debates. “Internet search results and "trends" are now as integral to the presentation of debates as old-fashioned polls, even if they might be electorally meaningless.”
Recently, I’ve been trying to determine if there is a relationship between the results in the US primary elections and what people search for on Google in those states.
By now it’s old news that Ted Cruz won in Iowa and that Hillary Clinton won by a very small margin, even though it was initially called a tie.
These two graphs show the Google searches from the week before the Iowa elections, up to January 31st. The first graph shows Bernie Sanders (in blue) and Hillary Clinton (in red). Both candidates show an upward trend.

In this second graph you can see the Google searches for the Republican candidates Ted Cruz (in red) and Donald Trump (in orange). We can see Trump with a downward trend and Ted Cruz with an upward trend.

By looking at these two graphs and knowing what happened, it is obvious that there is no direct relation between “number of searches” and “election results”. If that were the case, Donald Trump and Bernie Sanders should’ve been the winners in Iowa.
If you look at the second graph, there is a slight but clear downward trend line for the term “Donald Trump”. Ted Cruz narrows the gap, but his line never crosses Trump’s, even in the hours that followed, during the day of the primary election. Then Ted Cruz’s searches spike once he is announced as the winner. See the graph below.

There are a few things to observe here. First, you can see almost in real time the impact that news events have and how they are reflected in Google searches. Second, while Google Trends helps us understand the interest that Iowa citizens had in each of the candidates, we don’t yet know what happens next or what decisions people make with the information they find.

When it comes to Donald Trump, according to FiveThirtyEight, the more people research and understand his policies, the more his popularity numbers drop. In Iowa we saw that he was the candidate with the highest number of searches in the week prior to the election. This did translate into votes; it’s just that those votes weren’t in his favor.
On its own, this data doesn’t work to predict election results (yet?). Social networks are better at determining people’s sentiments. Maybe if we mix these two types of data we could find something?
What about Bernie Sanders? 
In the week prior to the Iowa elections, like Donald Trump, he remained on top of the Google searches. The gap between him and Hillary remained considerably narrower, and the lines moved up and down in harmony. Even the trend lines showed a similar inclination, both pointing upwards. The result, as we all know now, was Hillary winning by a very small margin.
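For readers who want to reproduce the trend lines described above, here is a minimal sketch of one way to do it: an ordinary least-squares slope fitted to a series of search-interest values. It assumes the values are evenly spaced in time (for example, one per day). The function name `trend_slope` and the sample numbers are hypothetical and are not the data behind the graphs in this post.

```python
from statistics import mean


def trend_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a trend line")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical daily search-interest values on a 0-100 scale.
# These are illustrative only, NOT the data used in this post.
candidate_a = [40, 42, 45, 47, 52, 55, 60]
candidate_b = [70, 68, 69, 66, 65, 63, 62]

for name, series in [("candidate_a", candidate_a), ("candidate_b", candidate_b)]:
    slope = trend_slope(series)
    direction = "upward" if slope > 0 else "downward" if slope < 0 else "flat"
    print(f"{name}: slope {slope:+.2f} per day ({direction})")
```

A positive slope corresponds to an upward trend line and a negative slope to a downward one. As the Iowa results suggest, the sign of the slope says something about interest, not about votes.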

New Hampshire
As I’m typing this, there are still a few more hours to go until the New Hampshire primary elections finish and a winner is announced.
Just as I did with Iowa, I have analyzed the Google searches for the last week, until midnight on February 8th. In this case I’ve also added Marco Rubio, whom I missed in the first week of this analysis; unfortunately, that data is no longer available.
In Orange - Donald Trump, Purple - Ted Cruz, Red - Marco Rubio.

Just like in Iowa, Donald Trump leads the local Google searches. At the beginning of the week we saw Ted Cruz in a strong second place, but Marco Rubio very quickly closed the gap. On the night of the New Hampshire debate, February 6th, he spiked to the highest point of the week, above the rest of the Republican candidates, and he remained in second position for the rest of the week.
From Iowa we learned that leading the Google searches doesn’t mean more votes in your favor. That was the case for both Trump and Sanders. So Marco Rubio leading in the Google searches doesn’t mean Marco Rubio winning New Hampshire.

Let’s see what the situation is this week with the Democrats.
In Blue - Bernie Sanders, Green - Hillary Clinton.

Both candidates started the week very close. Once again both lines move up and down at the same rhythm, with Bernie Sanders always showing a slight advantage. There are two obvious spikes. The first one happens on the night of February 4th, during the Democratic debate. Bernie Sanders reaches the highest search volume of the two candidates during that evening.

The second spike happened early Sunday morning (Eastern Time). This one is particularly interesting because it is the only moment when the two lines remain so far apart for so long. I’ve been trying to find out what happened that night that could have caused this. The only thing that stands out is Bernie Sanders’ cameo on SNL, which aired on Saturday, February 6th, and was widely shared on social media and picked up by news agencies. This could be an explanation, but I still don’t understand why the spike is so high when it happened so early in the morning, compared to the spike from the Democratic debate, which happened during prime time. Does SNL have the audience reach to cause this spike? Tweet me (@felixafon) if you happen to understand what happened at this time.
As I said above, while this methodology doesn’t have any statistical value, it may help us understand what is on people’s minds. It could be some sort of modern web version of the “TV share”, maybe a “Search Share”? Maybe it could also help you understand where to point your research guns. I will continue looking at the data in the coming weeks and months, until the presidential elections, and I will publish any interesting findings if and when they come up.
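To make the “Search Share” idea a bit more concrete, here is a rough sketch of how it could be computed: for each time period, divide each candidate’s search interest by the total across all candidates being compared. Keep in mind that Google Trends reports relative interest on a 0 to 100 index rather than absolute search counts, so this is a share of the compared terms only. The function name `search_share` and the sample numbers are hypothetical, not the data from this post.

```python
def search_share(interest):
    """Convert per-period interest values into each candidate's share of the total.

    `interest` maps a candidate name to a list of values, one per time period.
    All lists are assumed to have the same length.
    """
    names = list(interest)
    n_periods = len(interest[names[0]])
    shares = {name: [] for name in names}
    for i in range(n_periods):
        total = sum(interest[name][i] for name in names)
        for name in names:
            # Guard against periods with no recorded interest at all.
            shares[name].append(interest[name][i] / total if total else 0.0)
    return shares


# Hypothetical interest values, illustrative only (NOT the data from this post).
example = {
    "candidate_a": [30, 45, 60],
    "candidate_b": [10, 15, 40],
}

for name, values in search_share(example).items():
    print(name, [f"{v:.0%}" for v in values])
```

Like a TV share, this says who is getting the attention at a given moment, not why, and not how that attention turns into votes.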
In the meantime, if you want a copy of the raw data that I used for this, send me a tweet (@felixafon). I have formatted and transformed it so that it is Tableau-friendly. Also tweet me if you are interested in knowing more about this concept, or simply if you have any comments or opinions about it.
Last but not least, I dared to make a prediction based on this analysis, and it looks like I was half right.

The Author:
Felix Rios is a Market Research Technology Manager at Ugam. He is passionate about technology and, beyond the office walls, also enjoys photography.


