# Correlations between Bitcoin price and Google trend data

I recently read about people using Google Trends to time the market for cryptocurrencies or stocks, so I experimented with this data. In this blog post I explain how to retrieve Google trend data and compare it to price and volume. I use a cryptocurrency as the example, but the same approach works for stock market data.

**Additional disclaimer: This is no investment advice or encouragement to buy crypto. Please, do your own research before investing. This tutorial is only for educational purposes.**

### What is Google Trends?

Google Trends is a service that shows how often a keyword is searched on Google. The results are not absolute search counts: they are normalized relative to the total search volume and scaled from 0 to 100, where 100 marks the peak popularity in the chosen period. Thus, we can analyze trends in the popularity of search keywords.
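To make the idea of a relative index concrete, here is a minimal sketch. Google does not publish its exact procedure, so the function `to_relative_index` and the sample shares are hypothetical. The sketch only illustrates scaling a series so that its peak becomes 100.

```python
def to_relative_index(shares):
    """Scale a series so that its largest value becomes 100.

    `shares` is a hypothetical list of search shares, i.e. searches for
    the keyword divided by all searches in the same period.
    """
    peak = max(shares)
    if peak == 0:
        return [0 for _ in shares]
    return [round(100 * value / peak) for value in shares]


# Hypothetical daily shares of one keyword among all searches
daily_shares = [0.002, 0.004, 0.008, 0.006]
relative_index = to_relative_index(daily_shares)
print(relative_index)
```

Because only the ratio to the peak survives, two keywords or two time windows with very different absolute search counts can produce the same curve.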

### Using Pytrends to access Google trend data

You can access the Google Trends data using pytrends, an unofficial API (Application Programming Interface). **I only received data with a delay of 3 days.** I guess that is an advantage Google keeps.
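As a minimal sketch of such a request: the helper names `build_timeframe` and `fetch_interest` are hypothetical, while `TrendReq`, `build_payload` and `interest_over_time` are the pytrends calls used in the script further down. The request itself needs network access, so only the timeframe helper is executed here.

```python
from datetime import date, timedelta


def build_timeframe(end_day, days):
    """Return a 'YYYY-MM-DD YYYY-MM-DD' string covering `days` days up to `end_day`."""
    start_day = end_day - timedelta(days=days)
    return f"{start_day.isoformat()} {end_day.isoformat()}"


def fetch_interest(keyword, days=100):
    """Request the interest over time for one keyword (needs network access)."""
    # Imported here so that the helper above also works without pytrends installed
    from pytrends.request import TrendReq

    pytrend = TrendReq()
    timeframe = build_timeframe(date.today(), days)
    pytrend.build_payload([keyword], cat=0, timeframe=timeframe)
    return pytrend.interest_over_time()


print(build_timeframe(date(2017, 12, 6), 98))
```

Calling `fetch_interest("bitcoin")` should return a pandas data frame indexed by date. Given the delay mentioned above, expect the last rows to end a few days before today.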

### Receiving Bitcoin data

*Bitcoin* is the cryptocurrency chosen for the comparison. I used the CryptoCompare API to receive live data. The closing prices and the traded volume are especially relevant (the script below reads the `volumeto` field, which the API reports in the quote currency, here euros). The chosen exchange is Kraken. You can find more on how to use the CryptoCompare API here.
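The following sketch shows how the request URL can be assembled and how the relevant fields can be pulled out of a response. The helper names and the sample response are hypothetical; the endpoint and its parameters are the ones used in the script further down.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://min-api.cryptocompare.com/data/histoday"


def build_histoday_url(fsym="BTC", tsym="EUR", limit=100, exchange="Kraken"):
    """Assemble the URL for daily historical data."""
    params = {"fsym": fsym, "tsym": tsym, "limit": limit, "e": exchange}
    return BASE_URL + "?" + urlencode(params)


def extract_close_and_volume(payload):
    """Return (time, close, volumeto) tuples from a decoded response."""
    return [(row["time"], row["close"], row["volumeto"]) for row in payload["Data"]]


# Hypothetical, shortened response used instead of a live request
sample = json.loads('{"Data": [{"time": 1512518400, "close": 11000.0, "volumeto": 5.0e7}]}')
print(build_histoday_url())
print(extract_close_and_volume(sample))
```

Building the URL from parameters makes it easy to switch the currency pair, the exchange or the number of days without editing a long string by hand.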

### Writing the Python script

First of all, here is the **code structure**:

- Import dependencies
- Set the time window, e.g. *100 days*, and the moving average window, e.g. *14 days*.
- Choose the keyword(s) for Google trends, e.g. *bitcoin*.
- Manipulate the url as you please, e.g. time window, currency to and from, exchange
- Afterwards, a live json file is read from CryptoCompare
- The data is stored in a dataframe (*df*) using the Pandas data analysis library
- Because averages are used, a starting point *sp* is defined. For example, if 20 data points are used for an average, the first average is available at the 20th data point, so the chart is plotted from there on.
- Afterwards, the *interest over time function* of Pytrends is used. The data is put into a pandas data frame (*dfTrend*).
- Calculate moving averages
- Calculate correlation coefficient with the corrcoef function of the numpy package
- Finally, plot data regarding price, volume and trend

And here is the script:

    # License: MIT License (http://opensource.org/licenses/MIT)
    import json
    import urllib.request
    from datetime import datetime, timedelta

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from pytrends.request import TrendReq

    timeWindow = 100         # number of days to analyze
    timeWindowMovingAvg = 5  # window of the simple moving average in days
    keywords = ["bitcoin"]

    #===Read daily price and volume data from CryptoCompare
    url = ("https://min-api.cryptocompare.com/data/histoday"
           "?fsym=BTC&tsym=EUR&limit=" + str(timeWindow) + "&e=Kraken")
    response = urllib.request.urlopen(url)
    data = json.loads(response.read())
    df = pd.DataFrame(data['Data'])
    # Select the columns by name instead of relying on their order
    df = df[['close', 'high', 'low', 'open', 'time', 'volumefrom', 'volumeto']]
    df['time'] = pd.to_datetime(df['time'], unit='s')
    df = df.set_index('time')
    beginDateWindow = datetime.now().date() - timedelta(days=timeWindow)

    #===Starting point: number of data points that have a moving average
    sp = len(df.index[timeWindowMovingAvg-1:])

    #===Pass data to pytrend and execute it
    pytrend = TrendReq()
    dataWindow = str(beginDateWindow) + " " + str(datetime.now().date())
    pytrend.build_payload(keywords, cat=0, timeframe=dataWindow)
    dfTrend = pytrend.interest_over_time()  # using interest over time function
    # Rename the keyword column; the remaining 'isPartial' column is not used
    dfTrend = dfTrend.rename(columns={keywords[0]: 'keyword'})

    #===Moving average
    maTrendPrice = df.close.rolling(center=False, window=timeWindowMovingAvg).mean()
    maTrendVolume = df.volumeto.rolling(center=False, window=timeWindowMovingAvg).mean()
    maTrendGoogle = dfTrend.keyword.rolling(center=False, window=timeWindowMovingAvg).mean()

    #===Correlation
    # Join on the date index so that only days present in both data sets are compared
    aligned = df[['close', 'volumeto']].join(dfTrend['keyword'], how='inner').dropna()
    ccPriceTrend = np.corrcoef(aligned['keyword'], aligned['close'])[1,0]
    ccVolumeTrend = np.corrcoef(aligned['keyword'], aligned['volumeto'])[1,0]
    print('Correlation price-trend: ' + str(ccPriceTrend))
    print('Correlation volume-trend: ' + str(ccVolumeTrend))

    #===Plot data price and trend
    fig, ax1 = plt.subplots()
    ax1.plot(df.index[-sp:], df.close[-sp:], 'r', label='Price', linewidth=1.5)
    ax1.set_ylabel('Price in Euro', color='r')
    ax1.plot(maTrendPrice[-sp:], 'r', linestyle='--', label='SMA-price'+str(timeWindowMovingAvg), linewidth=1.5)
    ax2 = ax1.twinx()
    # Plot only the keyword column, not the whole data frame
    ax2.plot(dfTrend.keyword[-sp:], 'b', label='Google trend', linewidth=1.5)
    ax2.set_ylabel('Google trend', color='b')
    ax2.plot(maTrendGoogle[-sp:], 'b', linestyle='--', label='SMA-trend'+str(timeWindowMovingAvg), linewidth=1.5)
    for label in ax1.xaxis.get_ticklabels():
        label.set_rotation(45)
    ax1.xaxis.set_major_locator(plt.MaxNLocator(10))
    fig.tight_layout()
    fig.savefig('correlating_price_trend.png')

    #===Plot volume and trend
    fig2, ax3 = plt.subplots()
    ax3.plot(df.index[-sp:], df.volumeto[-sp:], 'g', label='Volume', linewidth=1.5)
    ax3.set_ylabel('Volume', color='g')
    ax3.plot(maTrendVolume[-sp:], 'g', linestyle='--', label='SMA-volume'+str(timeWindowMovingAvg), linewidth=1.5)
    ax4 = ax3.twinx()
    ax4.plot(dfTrend.keyword[-sp:], 'b', label='Google trend', linewidth=1.5)
    ax4.set_ylabel('Google trend', color='b')
    ax4.plot(maTrendGoogle[-sp:], 'b', linestyle='--', label='SMA-trend'+str(timeWindowMovingAvg), linewidth=1.5)
    for label in ax3.xaxis.get_ticklabels():
        label.set_rotation(45)
    ax3.xaxis.set_major_locator(plt.MaxNLocator(10))
    fig2.tight_layout()
    # Save before showing, because the figure may be closed once the window is dismissed
    fig2.savefig('correlating_volume_trend.png')
    plt.show()

Furthermore, you can find an updated and working version in the GitHub repository (01.11.2018).

### Correlation coefficients

A correlation coefficient is a statistical measure of the strength of a relationship. It can measure the linear correlation between vectors (e.g. price, volume) or function curves. A coefficient of 1 means a perfect positive correlation, a coefficient of -1 means a perfect negative correlation, and a coefficient of 0 means no linear correlation. Here I used the Pearson correlation coefficient.
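To show what the Pearson coefficient computes, here is a minimal sketch in plain Python. The function name `pearson` and the sample values are hypothetical. In the script, the corrcoef function of numpy does the same job.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of the products of the deviations from the means
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return covariance / sqrt(variance_x * variance_y)


rising = [1.0, 2.0, 3.0, 4.0]
same_direction = pearson(rising, [2.0, 4.0, 6.0, 8.0])
opposite_direction = pearson(rising, [8.0, 6.0, 4.0, 2.0])
print(same_direction, opposite_direction)
```

The first pair moves perfectly in step and the second perfectly in opposite directions, which corresponds to the two extreme values of the coefficient.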

For further details on that topic you can have a look at my Crypto Portfolio Optimization Part 1: Correlation Matrix article.

### Analysis

**It is important to understand that this analysis covers a time frame of around 100 days. If you change the time frame, the results may differ considerably.**

First of all, we compare the closing price of Bitcoin with the Google trend data. The correlation coefficient is 0.828 (time window: [17-08-30 17-12-06]). This means there is a strong positive linear correlation.

In addition, the volume of traded Bitcoin is compared with the Google trend data. The correlation coefficient is 0.778 (time window: [17-08-30 17-12-06]). This means there is a moderate to strong positive linear correlation.
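Both coefficients are only meaningful if each trend value is paired with the price or volume of the same day. Because the trend data arrives with a delay, the series do not end on the same date. A minimal sketch of aligning two series by date before correlating them, with the hypothetical helper `align_by_date` and made-up values:

```python
from datetime import date


def align_by_date(series_a, series_b):
    """Return the values of both series for the dates they have in common."""
    common_days = sorted(set(series_a) & set(series_b))
    values_a = [series_a[day] for day in common_days]
    values_b = [series_b[day] for day in common_days]
    return values_a, values_b


# Hypothetical values; the trend series ends one day earlier than the prices
prices = {
    date(2017, 12, 4): 9800.0,
    date(2017, 12, 5): 9900.0,
    date(2017, 12, 6): 11000.0,
    date(2017, 12, 7): 14000.0,
}
trend = {
    date(2017, 12, 4): 60,
    date(2017, 12, 5): 62,
    date(2017, 12, 6): 70,
}
aligned_prices, aligned_trend = align_by_date(prices, trend)
print(aligned_prices)
print(aligned_trend)
```

Matching on the date is more robust than cutting a fixed number of rows from the end, because it still works when the delay changes.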

Let us smooth out the data using simple moving averages (SMA) over 5 days. This is applied to the closing price, the volume and the Google trend data.
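A minimal sketch of what a simple moving average does, in plain Python with hypothetical prices. The function name `simple_moving_average` is an assumption for illustration. The script uses the rolling mean of pandas instead, which likewise has no value for the first `window - 1` points.

```python
def simple_moving_average(values, window):
    """Return the SMA of `values`; the first window - 1 entries are None."""
    if window < 1:
        raise ValueError("window must be at least 1")
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough data points yet for a full window
            result.append(None)
        else:
            chunk = values[i + 1 - window:i + 1]
            result.append(sum(chunk) / window)
    return result


# Hypothetical closing prices
closing_prices = [10.0, 12.0, 11.0, 13.0, 14.0, 15.0, 13.0]
sma = simple_moving_average(closing_prices, window=5)
# Number of points that have an average, comparable to sp in the script
valid_points = len(closing_prices) - 5 + 1
print(sma)
print(valid_points)
```

This is why the script defines a starting point: only the last `valid_points` entries of the series can be plotted together with their average.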

### Discussion

First, let's look at the first two figures. At first glance, I would have said that the correlation between volume and trend is stronger. Quite a lot of the large peaks in volume are matched by the trend. The only exception is the negative peak around the 20th of October. Now let's look at the correlation coefficients. The price-trend correlation coefficient is 0.828, while the volume-trend correlation coefficient is 0.778. This contradicts the first impression that the correlation between volume and trend is stronger.

Therefore, a 5-day SMA is applied in the 3rd and 4th figures, which smooths the data out a bit. We can see that the strong peaks in the volume are not met by similarly strong peaks in the trend data.

As a result, in this example the closing price correlated a bit more strongly with the Google trend. **Be careful not to assume that in general. A different time window, a different currency or even a different Google Trends keyword can alter the results significantly.**

### Conclusion

In my opinion, you can use Google trend data to grasp the sentiment in the markets. This is what technical analysis sometimes lacks. Pytrends lets you retrieve the Google Trends data. Furthermore, you can use simple moving averages (SMA) to smooth out data, and correlation coefficients to check interpretations. In conclusion, you can use this to hopefully enhance your purchase decisions.

### What did you think?

I’d like to hear what you think about this post.

Let me know by leaving a comment below.

Hi Johannes,

thanks for the code.

The correlation part is broken, but it can be easily fixed. You have to use to_frame() to convert the series to frames, use merge to put them into one data frame and then use the built-in method corr() to calculate the correlations. Then you get a kind of dictionary, where the values are Pearson correlation coefficients.

    #===Correlation
    merged = pd.merge_asof(maTrend1.to_frame(), maTrend2.to_frame(), left_index=True, right_index=True, direction='nearest')
    ccc = merged.corr()
    c2 = ccc['keyword1']['keyword2']

Hi Teo,

thanks for the reply. I had a look at this issue and committed a new and working version on GitHub: https://github.com/DocCryptastic/crypto_simulations/blob/master/googleTrendPriceVolumeCorrelation.py.

In case you are interested in correlation coefficients, you can have a look at this article: https://numex-blog.com/crypto-portfolio-optimization-part-i-correlation-matrix/

Best regards
