How to Shape Your Data

on April 7, 2011

Unfortunately, data isn’t always in the format that Tableau likes it. That’s okay – we can fix that. The following is a step by step guide to reshaping your data so you can make the most of it in Tableau.

First, make sure there are no blank rows or columns within your data set. That’s usually pretty easy to fix manually but if there’s a systematic problem (like every fifth row is empty) you might prefer to use this nifty tool called Data Wrangler (more on this later) developed by the good people at the Stanford Visualization Group.

The data also needs to be in a data table format where each row contains only one piece of data. What on earth does that mean you ask? Well, it basically means that cross tabulations are bad.

Normalized Data (you want your data shaped this way):

Crosstab Data (you don't want it shaped like this):

While crosstabs make sense to people, it makes our software’s head hurt because each row contains three pieces of data -- in this case the company's net income in 2010, 2009 and 2008. Tableau feels much more comfortable when data is in a normalized format, where each row contains only one net income figure.

It can be a huge pain to do all the copy and pasting necessary to change data into this form. But the Data Wrangler can help with this too. Here’s a step by step guide using the cross-tab data from above:

Start by going to Data Wrangler and watching the brief intro video. Then paste your data into the data box as shown below.

Now promote the top row of data to be the column headers by highlighting it and selecting "promote" from the rows drop down.

Remember to click the "plus" button to confirm the change.

Now we'll reshape the data into a normalized format. Highlight the columns that have the three subject scores. Now "fold" them into one column by selecting the "Fold 2010, 2009, 2008 using header as a key" option from the left.

Delete the underscore that Data Wrangler added to your data. We don't want that. Select the text in one cell and then select "cut from fold."

All that remains is to export the data back to Excel and connect to Tableau: