Shaping Data with Pivots
When preparing data for analysis, you may have encountered situations similar to these:
- Does your data contain several columns whose names are more useful as data values?
- Are there data values in one column which should be separated into distinct columns?
- Do you need to change the structure of your data to match another table's structure?
Pivoting your data would resolve each of these situations. This module will demonstrate how to use a pivot in Tableau Prep Builder to change the shape of your data into a different structural format in order to answer some analytic questions or to combine with other data.
Objectives
At the end of this module, you will be able to:
- Add and configure a pivot step in a flow.
- Use a wildcard search with a pattern match to pivot columns to rows.
- Differentiate between a rows-to-columns pivot and a columns-to-rows pivot.
- Evaluate the pivot results in the flow and in Tableau Desktop.
Pivot columns to rows
- A columns to rows pivot changes the structure of your data from "wide" to "tall" by turning columns into rows.
- A columns to rows pivot turns columns for years into separate rows for each year and region.
- A columns to rows pivot turns columns for years into separate rows for each year and region.
Use cases
A columns-to-rows-pivot enables you to reshape the data from a crosstab format into a columnar format, turning columns into rows. When would that be useful? Use a columns-to-rows pivot in situations similar to those described below.
"Wide" data sets
Use a columns-to-rows-pivot on"wide" data sets when the information is captured with many columns and the columns contain similar information. This format is possibly more convenient for a user report, but not as useful for analysis. Tableau Desktop works best, analyzing data in a "tall/narrow" or denormalized table structure. If the data is in a normalized data structure, you can reshape the data using a pivot. You will be able to answer more questions during your analysis after a pivot.
If we bring this data structure into Tableau Desktop, we get a field (measure) per column. There are five fields that all represent the cost of parking for that day. This makes it very hard to do analysis across time as the data is stored in separate fields.
After a columns to rows pivot, all of the date information is now within a single column, enabling you to do analysis over time. Before the pivot, there is a column for each date. After the pivot, all the dates are in a single column.
Column names are data values
- Use a columns to rows pivot when your data has several columns where the column name is more useful as a data value.
- Data from spreadsheets or other crosstab formats, e.g., survey data, public government data, and pre-formatted reports in Excel often have this characteristic.
- In this example, each column name is a survey question. After a columns-to-rows pivot, all of the survey questions are now within a single column, enabling you to analyze the survey responses.
To match other data structures
Use a columns-to-rows pivot if you want to combine data with another table that has a different structure that you want to match.
In this example, pivoting columns to rows for the first table will change the structure to match the third table, so you can combine them in a union.
Create a columns-to-rows-pivot
There are a few ways to create a columns to rows pivot in a flow. The columns to rows pivot occurs in a pivot step. When you initiate a pivot from any other type of step, such as in the profile pane of a cleaning step, Tableau Prep automatically adds a pivot step to the flow.
Add or insert a pivot step in a flow
To add a pivot step in a flow, hover over a step and click the plus (+) icon. From the Add context menu, select Pivot. You can insert a pivot step using the Insert context menu when you click the plus (+) icon on a connector.
Pivot selected fields in the profile pane
In the Profile pane, select the fields that you want to pivot, then either:
- Select Pivot Columns to Rows from the toolbar.
- Right-click to open the context menu and select Pivot Columns to Rows.
With both methods, Tableau Prep Builder will automatically add a pivot step to the flow.
Use recommendations in the profile pane
If Tableau Prep Builder detects fields that can be pivoted from columns to rows, a recommendation option will show in the Profile pane toolbar or in the column headers.
- Click the light bulb icon and select the Pivot Fields recommendation.
The recommendation previews which fields it will pivot. - Click Apply to generate the pivot step.
If Tableau Prep Builder detects fields that can be pivoted from columns to rows, a recommendation option will show in the Profile pane toolbar or in the column headers. Click the light bulb icon and select the Pivot Fields recommendation.The recommendation previews which fields it will pivot. Pivot recommendations are shown with light bulb icons in the profile pane toolbar and field headers.
Pivot data from columns to rows
The following example uses the active-duty marital status data that was combined using a wildcard union. If you bring this data structure into Tableau Desktop, there will be a measure for each field, limiting your analysis. We'll perform a columns to row pivot.
Step 1
Look at the fields before pivoting
The profile pane showing separate columns for each marital status type.
Take a look at the fields in the cleaning step. The combined data has 11 fields with 90 rows of data. There are separate columns for each marital status type.
Step 2
Add a pivot step
Add a pivot step in a flow.
Click the plus (+) icon next to the cleaning step and then click Pivot.
Step 3
Note the pane sections
There are three pane sections: a list of fields, the Pivoted Fields, and the Pivot Results.
A new pivot step is added in the flow pane, and several sections are shown, from left to right:
The left pane shows a list of fields
The Pivoted Fields pane specifies which fields to pivot
The Pivot Results pane shows the profile and data grid of the pivoted data
Step 4
Select the values to pivot
Select the values to pivot. Options to search and automatically rename fields are available.
Select the fields to pivot, in this case, all the fields with a data type of Number.
You can also enter a search term in the Search field to search the field list for fields to pivot.
Optionally, select the Automatically rename pivoted fields and values check box to enable Tableau Prep Builder to rename the new pivoted fields using common values in the data. If no common values are found, the default name is used.
Step 5
Pivot the selected fields
Drag the selected fields to the Pivoted Fields pane.
Drag and drop the selected fields to the Pivoted Fields pane.
Step 6
Alternatively, pivot from the profile pane
Or, pivot selected fields from the profile pane toolbar.
Or, you can select fields you want to pivot in the profile pane, then click Pivot Columns to Rows in the toolbar. When you use this method, a pivot step is automatically added and the pivoted fields are already identified for you in the Pivoted Fields section.
Step 7
Rename the pivoted fields
Rename the fields in the Pivoted Fields or the Pivot Results so they are meaningful.
Rename Pivot Names and Pivot Values so they are meaningful, in this case, "Marital Status" and "Number of Members."
You can rename the pivoted fields in the Pivoted Fields section or in the Pivot Results, as shown here. Alternatively, you can rename the pivoted fields in a cleaning step after the pivot.
Note that in addition to renaming Pivot Names and Pivot Values fields, you can make other changes, which will be recorded in the Changes pane. Best practices are to keep the changes performed here directly related to the pivot.
Summary
We've created a pivot in the flow by adding a pivot step or by selecting fields in the profile pane. Once the fields were pivoted, we renamed the pivot names and pivot values, so they were meaningful.
Now, we can evaluate the pivot results.
Evaluate the pivot results in the flow
There are a couple of ways to evaluate the pivot results in a flow. In each case, you can check and compare the number of fields and the number of rows after the pivot with those numbers before the pivot.
Before the pivot
Look at the fields in the cleaning step before the pivot to see the number of fields and the number of rows. In this example, there were 11 fields, including 8 fields representing each of the different marital status types, with 90 rows of data.
Before the pivot there were 11 fields, including 8 fields for each of the different marital statuses, with 90 rows of data.
Before the pivot there were 11 fields, including 8 fields for each of the different marital statuses, with 90 rows of data.
After the pivot
After the pivot, there are 5 fields with 720 rows of data. The pivot resulted in 8 times the original number of rows, since 8 fields were pivoted. The 8 original fields that contained the number of people in each marital status type were replaced with 2 fields, one with the number value and the other identifying which field the value came from.
Look at the results in the pivot step
After the pivot, there are 5 fields with 720 rows of data.
After the pivot, there are 5 fields with 720 rows of data.
Look at the results in a cleaning step after the pivot
After the pivot, there are 5 fields with 720 rows of data.
After the pivot, there are 5 fields with 720 rows of data.