This post is featured as part of Election Month here on Tableau Public.
You really want to viz the elections! But you don't know where to find the data, hmm.
Worry no more! We've got you covered. By the time you finish this post, you'll have tons of data on your hands.
1. Let the Data Come to You
What if you could get some new data sets sent to your email every week? Or what if you could collect tweets and news articles on the elections without having to lift a finger? Wouldn't it be wonderful?
Data is Plural, a newsletter by Jeremy Singer-Vine
Jeremy Singer-Vine, data editor at BuzzFeed, created the Data is Plural newsletter about a year ago. Every week, he emails his subscribers links to five data sets.
His Sept. 28 newsletter contained US elections data sets including, among others, county-level results. By signing up, you'll get access to all the data sets shared in the past. They come in a neat Google Sheet, accessible via a link at the bottom of the newsletter.
Subscribe to Data is Plural and get some new data sets emailed to you every week!
Automating Data Collection with IFTTT
If you've been reading the Tableau Public blog for a while, you probably know that we like to use IFTTT to automate data collection. The concept is easy: You set up a recipe containing a trigger (event A) and a consequence (action B).
Event A can be candidate X tweeting about the elections, and action B can be adding a new line to a Google Sheet with the date and time, the content of the tweet, and the URL. You can find the recipe for this action here.
Using this recipe for the Twitter accounts of Donald Trump and Hillary Clinton, you will soon have two massive data sets: since August 18, when I set up my own recipe, Clinton has averaged 24 tweets a day and Trump 10.
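If you want to check a tweets-per-day figure on your own sheet, here is a minimal Python sketch. It assumes you downloaded the Google Sheet as a CSV, that you added a header row with a hypothetical `created_at` column, and that the timestamps look like "August 18, 2016 at 09:15AM". Adjust the column name and the format string to match what your recipe actually writes.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp format, e.g. "August 18, 2016 at 09:15AM".
TIMESTAMP_FORMAT = "%B %d, %Y at %I:%M%p"


def average_tweets_per_day(csv_text, date_column="created_at"):
    """Average rows per day over the span from the first to the last tweet."""
    reader = csv.DictReader(io.StringIO(csv_text))
    days = [
        datetime.strptime(row[date_column], TIMESTAMP_FORMAT).date()
        for row in reader
    ]
    if not days:
        return 0.0
    # Count every calendar day in the span, including days with no tweets.
    span_in_days = (max(days) - min(days)).days + 1
    return len(days) / span_in_days


# Made-up sample rows standing in for a sheet export.
SAMPLE = '''created_at,text,url
"August 18, 2016 at 09:15AM",First sample tweet,https://example.com/1
"August 18, 2016 at 01:30PM",Second sample tweet,https://example.com/2
"August 20, 2016 at 08:00AM",Third sample tweet,https://example.com/3
'''

print(average_tweets_per_day(SAMPLE))  # 3 tweets over 3 days -> 1.0
```

For a real export, read the file's contents into a string and pass it to `average_tweets_per_day()`.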
With the data collected, you can build visualizations that show each candidate's favorite themes.
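One rough way to turn tweet text into themes before visualizing it is simple keyword matching. The sketch below is only an illustration, not the method behind any particular viz: the theme names and keywords are hypothetical, and you would pick your own after reading through the tweets.

```python
from collections import Counter

# Hypothetical themes and keywords; replace them with your own.
THEMES = {
    "economy": ("jobs", "economy", "taxes"),
    "immigration": ("immigration", "border"),
}


def count_themes(tweets, themes=THEMES):
    """Count how many tweets mention at least one keyword of each theme."""
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for theme, keywords in themes.items():
            if any(keyword in lowered for keyword in keywords):
                counts[theme] += 1
    return counts


# Made-up sample tweets.
sample_tweets = [
    "We will create jobs and cut taxes",
    "Secure the border",
    "Good morning, everyone",
]
print(count_themes(sample_tweets))
```

Run it once per candidate, and the resulting counts give you one row per theme to bring into your viz. Substring matching is crude (it will also match keywords inside longer words), so treat the numbers as a first pass.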
Other useful IFTTT recipes for collecting elections data or, more generally, politics data include:
- Save your Feedly articles to a Google Sheet so you can do media-coverage analysis.
- Create a spreadsheet of new laws to keep track of bills signed by the president.
2. Visit Open-Data Portals
Government data is often available on open data portals. A big plus: it is usually clean and formatted for analysis (keyword: usually!).
We here at Tableau Public have been actively looking for elections data and have found the following two portals to be gold mines:
Ballotpedia contains all you need to know about American politics, from the federal to the local level, with both historical and current data.
Enigma.io is a data warehouse website making public data easily searchable and downloadable.
Note that you need to create a free trial account, which lets you download up to 80 data sets a month. To find data sets of interest, you can search by topic (e.g., politics > government spending > lobbying) or by keyword. You will then be presented with a list of data sets, each with a full description. Once you understand what the data is about, click the "export" button to download it as a CSV.
3. Scrape the Internet
In the past, we've mentioned web scrapers like Import.io. The truth is, election data is seldom presented in a way that makes such tools necessary. To collect your elections data, the Google Sheets formula IMPORTHTML() might very well be all you need.
The IMPORTHTML() formula lets you scrape an HTML table from any webpage. You just need to identify the position of your table on the page. Then, create a new Google spreadsheet and enter the formula:
=IMPORTHTML("https://etc...", "table", n)
In this formula, "https://etc..." is the page's URL (between quotation marks), "table" indicates that you want to retrieve a table, and n is the position of your table on the page, counting from 1.
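If you are curious what "the n-th table on the page" means in practice, here is a rough Python sketch of the same idea using only the standard library. It is not how Google Sheets implements IMPORTHTML(); it simply assumes well-formed HTML with no nested tables, and the `TableExtractor` and `nth_table` names are made up for this example.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every <table> as a list of rows."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def nth_table(html, n):
    """Return the n-th table (counting from 1, like IMPORTHTML) as rows."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables[n - 1]


# Made-up page with two tables.
SAMPLE_HTML = """
<table><tr><th>State</th><th>Votes</th></tr><tr><td>A</td><td>10</td></tr></table>
<table><tr><th>County</th><th>Votes</th></tr><tr><td>B</td><td>20</td></tr></table>
"""

print(nth_table(SAMPLE_HTML, 2))  # [['County', 'Votes'], ['B', '20']]
```

To try it on a live page, you could download the HTML first (for example with `urllib.request.urlopen`) and pass the decoded text to `nth_table()`. Real pages are messier than this sample, so expect to adjust the parsing.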
If you want to learn more about Google Sheets formulas, you can read Florian's blog post, "Using the Import Functions in Google Sheets to Easily Scrape Data."
A good place to start scraping tables is obviously Wikipedia. On this topic, I'll simply direct you to Jewel Loree's post, "Protip: Wikipedia Is a Treasure Trove of Data Sets."
There must be another thousand ways of getting your hands on elections data. But finding data is only the first stop of the journey, so make sure you settle on a data set and start visualizing it!