Hi, I'm Eric and I'm a Product Consultant here at Tableau.
I love Tableau, and I am intrigued by Twitter, so I wanted to find a way to get Twitter into Tableau. DataSift is a fantastic option for this but I really enjoy building things with Python (like the Tableau colour palette creator app) and wanted to flex my (not-yet-so-developed) Python muscles, so I decided to have a go at writing a script myself. Editor's note: if you'd like to learn Python check out our post from last month on Learn to be a Python Charmer.
Here was my plan:
- Get tweets
- Save tweets
In other words, I thought that it would be great to be able to find tweets I wanted based on a specific word (for instance a certain hashtag). More than that, I also wanted to be able to get some additional information about the tweeter (like their username, location, etc.) which I could then visualize in Tableau. Something like:
Luckily for me, many very smart people have worked very hard to create amazing modules for Python that allow you to do almost anything, like connect to Twitter. I found a module called “TwitterSearch” which allows you to pretty seamlessly connect to the Twitter API and start finding content. There are many other options (find a list of them here) but I decided to go with TwitterSearch.
Note: To get anything out of Twitter you need to supply TwitterSearch with a bunch of somewhat cryptic-sounding codes (the “consumer key”, “consumer secret”, “access token” and “access token secret”).
These are basically codes that the Twitter API uses to confirm that you are who you say you are when you go asking it for data. Getting these codes is very important: you need to set up a Twitter account (if you do not have one already) and then create a “Twitter App” (less scary than it sounds). This tutorial explains how to get the tokens.
Now, with those keys, let’s look at some code:
All of the code is available on GitHub.
First we import the modules that we are going to use. We need TwitterSearch to search Twitter and then the csv module to output the data in a csv file.
Then we define a function that is going to find things on Twitter and save them into a CSV file. Lines 9-11 create the CSV file with a few headers called user, time, tweet, latitude and longitude. Then lines 13-15 set up a search on Twitter. In lines 17-21 we supply our keys so that we can access Twitter.
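As a rough illustration of the setup described above, here is a minimal sketch in Python 3. It is not the original script: the helper names (open_tweet_csv, build_search) and the keys dictionary are made up for this example, and the TwitterSearch calls reflect my understanding of that package's interface, so check them against its documentation before relying on them.

```python
import csv

# Column headers named in the post.
HEADERS = ["user", "time", "tweet", "latitude", "longitude"]


def open_tweet_csv(path):
    """Create the CSV file, write the header row, and return (file, writer).

    Hypothetical helper for this sketch; the caller closes the file.
    """
    f = open(path, "w", newline="", encoding="utf-8")
    writer = csv.writer(f)
    writer.writerow(HEADERS)
    return f, writer


def build_search(term, keys):
    """Set up a search for `term` using the four Twitter credentials.

    `keys` is an assumed dict holding the consumer key, consumer secret,
    access token and access token secret.
    """
    # Imported here so the CSV helper above works without the package installed.
    from TwitterSearch import TwitterSearch, TwitterSearchOrder

    order = TwitterSearchOrder()
    order.set_keywords([term])          # the word or hashtag to look for
    order.set_include_entities(False)   # skip entity metadata we do not use

    client = TwitterSearch(
        consumer_key=keys["consumer_key"],
        consumer_secret=keys["consumer_secret"],
        access_token=keys["access_token"],
        access_token_secret=keys["access_token_secret"],
    )
    return client, order
```

With valid keys, the returned pair would typically be used as client.search_tweets_iterable(order) to loop over matching tweets.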
Next we need to get some tweets:
In line 31 we are saying: for each tweet that is found, do something. Then in lines 32 to 38 we are capturing the name of the user, the time that they tweeted and the content of their tweet (lines 36 and 37 are removing extra spaces that make the text look garbled). In lines 39-44 we are capturing the geo-location data if we can get it (apparently only about 3% of people share this data). Finally, on line 46, we write the user, time, tweet, latitude and longitude data to the CSV file.
Then, to kick everything off, we ask the user for a term to search Twitter for, set the maximum number of tweets to be returned, and call the function:
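The per-tweet step can be sketched as a small function that turns one tweet into one CSV row. This is an illustration rather than the original code: tweet_to_row is a hypothetical name, the sample tweet is invented, and the field names assume the tweet dictionary layout of the Twitter REST API v1.1, where coordinates is either missing/None or a GeoJSON point stored as [longitude, latitude].

```python
def tweet_to_row(tweet):
    """Return [user, time, tweet, latitude, longitude] for one tweet dict."""
    user = tweet["user"]["screen_name"]
    time = tweet["created_at"]
    # Collapse runs of spaces and line breaks so the text stays on one tidy line.
    text = " ".join(tweet["text"].split())

    coords = tweet.get("coordinates")
    if coords:
        # GeoJSON order is longitude first, then latitude.
        longitude, latitude = coords["coordinates"]
    else:
        # Most tweets carry no location, so leave those cells empty.
        latitude = longitude = ""
    return [user, time, text, latitude, longitude]


# Invented sample tweet, shaped like the assumed API response.
sample = {
    "user": {"screen_name": "example_user"},
    "created_at": "Mon Jan 06 12:00:00 +0000 2014",
    "text": "Loving  #Tableau\n today",
    "coordinates": {"type": "Point", "coordinates": [-122.33, 47.61]},
}
print(tweet_to_row(sample))
```

Each row returned this way can be passed straight to a csv writer's writerow method.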
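A minimal sketch of that kick-off step might look like the following. The helper names (parse_max_tweets, csv_name_for, main) and the default of 100 tweets are assumptions for this example, not part of the original script; the file-naming rule is simply a guess that reproduces a search for Tableau ending up in Tableau.csv.

```python
def parse_max_tweets(raw, default=100):
    """Turn user input such as '2,000' into a positive integer."""
    raw = raw.replace(",", "").strip()
    if not raw:
        return default  # assumed fallback when the user just presses Enter
    value = int(raw)
    if value < 1:
        raise ValueError("maximum number of tweets must be at least 1")
    return value


def csv_name_for(term):
    """Build an output file name from the search term, e.g. Tableau.csv."""
    # Keep letters, digits, underscores and hyphens; drop '#', spaces, etc.
    cleaned = "".join(ch for ch in term if ch.isalnum() or ch in "_-")
    return (cleaned or "tweets") + ".csv"


def main():
    """Ask for the search term and tweet limit, then report the plan."""
    term = input("Search term: ")
    max_tweets = parse_max_tweets(input("Maximum number of tweets: "))
    # At this point the real script would call its search-and-save function.
    print("Saving up to", max_tweets, "tweets about", term, "to", csv_name_for(term))
```

Calling main() from a terminal session starts the prompts; the two helpers can also be used on their own.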
Finally here is our output:
I searched for ‘Tableau’ and set my maximum number of tweets to be returned as 2,000. The output looked like this:
These two tweets (and the other 1,998 tweets) are all saved in a file called Tableau.csv.
Then we can open the CSV in Tableau and take a look at the data (note this is just one of many vizzes you could make!):