How to Shape Your Data

on April 7, 2011

Unfortunately, data isn’t always in the format that Tableau likes it. That’s okay – we can fix that. The following is a step by step guide to reshaping your data so you can make the most of it in Tableau.

First, make sure there are no blank rows or columns within your data set. That’s usually pretty easy to fix manually but if there’s a systematic problem (like every fifth row is empty) you might prefer to use this nifty tool called Data Wrangler (more on this later) developed by the good people at the Stanford Visualization Group.

The data also needs to be in a data table format where each row contains only one piece of data. What on earth does that mean you ask? Well, it basically means that cross tabulations are bad.

Normalized Data (you want your data shaped this way):

Crosstab Data (you don't want it shaped like this):

While crosstabs make sense to people, it makes our software’s head hurt because each row contains three pieces of data -- in this case the company's net income in 2010, 2009 and 2008. Tableau feels much more comfortable when data is in a normalized format, where each row contains only one net income figure.

It can be a huge pain to do all the copy and pasting necessary to change data into this form. But the Data Wrangler can help with this too. Here’s a step by step guide using the cross-tab data from above:

Start by going to Data Wrangler and watching the brief intro video. Then paste your data into the data box as shown below.

Now promote the top row of data to be the column headers by highlighting it and selecting "promote" from the rows drop down.

Remember to click the "plus" button to confirm the change.

Now we'll reshape the data into a normalized format. Highlight the columns that have the three subject scores. Now "fold" them into one column by selecting the "Fold 2010, 2009, 2008 using header as a key" option from the left.

Delete the underscore that Data Wrangler added to your data. We don't want that. Select the text in one cell and then select "cut from fold."

All that remains is to export the data back to Excel and connect to Tableau:


Props to the Wrangler team! For the record, DataWrangler is a Stanford/Berkeley collaboration, part of the d^p project.

Here's to collaboration!

Data Wrangler is good, but it's not as intuitive as the reshape tool (especially if all you're doing is reshaping). For the moment, it's also row limited. You can record your transformations, then run them against a larger daatset using a local copy of python, but that's another step to do.

I need some excel help when submitting my excel data into Tableau. I'm not sure if this data is in the format that Tableau likes. My excel spreadsheet only has one piece of data in each row and each cell. It still isn't uploading correctly. Is it better to save my excel in a csv format?

Please read about an automated Excel unpivot utility, similar to data wrangler


The conversion utility can accommodate any number of fixed columns to the left the data and a header row that is not immediately above the data.

Blank or zero cells could be excluded from the output database.

Column A in the database lists the row number of the input table. This information is useful when a pivot table is created from the new database table and rows should be sorted according the original sort order, for example when converting financial statements e.g. a multi-year balance sheet.

Arrays are used in order to minimize computing time by 'hitting' the Excel grid only twice.

The input parameters can be saved as a named template for future use on identical spreadsheets.

They're different. My first plan of attack is the add on, if the data is too crazy I bring in the Wrangler.