Granularity and Aggregation
When you move dimensions and measures in and out of a view, the view's level of detail changes.
- By default, measures placed in a view are aggregated by SUM, which means that the data for that field in all of the rows is combined.
- Measures can also be aggregated as average, median, count, or count distinct.
- Dimensions break down the aggregated total into smaller totals by category.
The "Aggregation, Granularity, and Ratio Calculations" from Tableau goes into more detail about aggregation and data granularity. It uses the example of looking at profit ratios and can help you understand the behavior of Tableau charts.
Calculating rates or ratios is a common need in Tableau. However, if we were to create these calculated fields without understanding what Tableau is doing to compute our answer it can be very easy to get this wrong. Aggregation and Granularity Let’s step back and go over a couple of crucial concepts, granularity, and aggregation. If we plot profit and sales….we only get one point.
Tableau has aggregated the sum of sales and sum of profit and plotted that. If we want more marks in our view, we need to change the granularity or the extent to which the view is broken down. Increasing granularity is done by adding dimensions to the view. Putting Segment on color breaks that single point into three – a point for sum of sales by sum of profit for each Segment. If we add Market to size, we now have 15 marks – there’s now a point per Segment for each Market. Changing Market to another shelf, such as shape, hasn’t changed the granularity of the view – we still have 15 marks. And those 15 marks are still the sums of profit and sales for that combination of dimensions.
Here, this mark is the sum of sales by the sum of profit for the corporate segment for USCA. The value of profit and sales is the aggregation that depends on the breakdown, or granularity, of the view. The Detail Shelf The Detail shelf is a way of impacting the granularity of the view without encoding a field by color or size or shape – let’s see what it looks like if we put Customer ID onto detail. We don’t have a shape or color per customer, but our view now shows marks aggregated to the granularity of each specific combination of Segment, Market, and Customer ID. Aggregation in Calculations Ok, so why do we care about all that when we’re just trying to make a profit ratio?
The key point for calculations is that the way Tableau calculates depends on the aggregation of the data – therefore it depends on the granularity of the view. Here, Tableau has 3 summed the profits and sales for all orders in each Category and presents a value for each. If we drill down to Sub-Category, we see the summed profits and sales per subcategory. If we expand out to the Row ID, we see the individual records. Profit Ratio Calculation Keeping that in mind, let’s look at how to calculate a Profit Ratio. Here we have two calculated fields, Profit/Sales, and SUM(Profit)/SUM(Sales). In each case, the formula is simply what appears in the name. When we’re at the record level, we see that they give the same answer. However, when we roll back and look at our Sub-Category level, it starts looking really weird. Suddenly our ratios don’t match, and the Profit/Sales one, here in red, has huge numbers. What’s happening here? When Tableau aggregates the profit/sales calculation to the granularity of the bar chart, it’s summing the ratios themselves. A profit ratio has been calculated for each record, and those ratios have been summed.
This is very different from what we want to do. By contrast, the second calculation, SUM(Profit)/SUM(Sales), sums the profits and sales at the granularity of the view, and then takes the ratio of those two numbers. If we roll back further to Category, the green ratio is still correct. By putting the aggregation into the calculation itself, we let Tableau know when to compute the ratio, regardless of the granularity of the view. Summary In summary, SUM(Profit)/SUM(Sales) sums the profits and sales to whatever the granularity of the view is, then computes the ratio at that aggregation. Profit/Sales computes the profit ratio at the lowest level of granularity then sums the ratios to the requested aggregation of the view. Conclusion That’s a quick lesson on aggregation, granularity, and ratio calculations. I hope this helps explain some of why Tableau behaves the way it does