Sampling your Data

How does Tableau Prep Builder sample data?
When you connect to a large data set, Tableau Prep Builder, by default, limits the data included in the flow to a representative sample. Sampling lets you work with a subset of the data set so that you can quickly gain an understanding of your data and prepare it more rapidly. Tableau Prep Builder uses an algorithm to calculate the optimal number of rows for the flow. If your data set contains more rows than that optimal number, Tableau Prep Builder takes a subset of rows and creates a sample. Otherwise, all of the rows of your data set are included in the flow.
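
The decision described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Tableau Prep Builder's actual algorithm, which this lesson does not describe. The function name sample_rows, the max_rows parameter (a stand-in for the optimal row count that Tableau Prep Builder calculates), and the use of simple random sampling are all assumptions made for the example.

```python
import random


def sample_rows(rows, max_rows, seed=0):
    """Return every row if the data fits within max_rows, else a random subset.

    max_rows is a hypothetical stand-in for the "optimal number of rows";
    how Tableau Prep Builder computes that number is not modeled here.
    """
    if len(rows) <= max_rows:
        # Small data set: all rows go into the flow.
        return list(rows)
    # Large data set: take a subset (assumed here to be a random sample).
    rng = random.Random(seed)
    return rng.sample(rows, max_rows)


small = list(range(100))      # fewer rows than the limit
large = list(range(10_000))   # more rows than the limit

small_sample = sample_rows(small, max_rows=1_000)
large_sample = sample_rows(large, max_rows=1_000)

print(len(small_sample), len(large_sample))
```

With these made-up inputs, the small data set passes through unchanged, while the large one is reduced to max_rows rows.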

Configure a sample to explore data distribution
You now understand how Tableau Prep Builder samples data, and we've explored the options for choosing the sampling size and method. Next, you may wonder how to use these options most effectively to sample your data. Until now, we have been working with our traffic data, whose full data set contains information about traffic incidents from the year 1975 through the year 2011. Now we would like to experience the start-to-finish process of configuring and comparing data samples. For this, we will use a data set of telephone calls that were logged to the municipal 311 call center in New York City during 2011.

We want to study trends in heating complaints in particular, but we have been told that any days with more than 2,000 heating complaints should be considered outliers. We would like to experiment with different sampling configurations and filtering techniques to see which would be most effective in creating a useful sample. We want to show trends in heating complaints as well as to find and exclude days with more than 2,000 complaints.
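
To make the outlier rule concrete, here is a minimal Python sketch of the same logic outside Tableau Prep Builder: count heating complaints per day, then separate the days with more than 2,000 complaints from the rest. The records below are made up purely for illustration and are not taken from the 311 data set. The "HEATING" label, the (day, complaint type) record layout, and the function names are assumptions.

```python
from collections import Counter
from datetime import date

OUTLIER_THRESHOLD = 2000  # days with more complaints than this are outliers


def daily_counts(records, complaint_type="HEATING"):
    """Count complaints of one type per day.

    records is an iterable of (day, complaint_type) pairs; this layout is
    hypothetical and chosen only for the example.
    """
    return Counter(day for day, kind in records if kind == complaint_type)


def split_outliers(counts, threshold=OUTLIER_THRESHOLD):
    """Split per-day counts into days to keep and outlier days."""
    kept = {day: n for day, n in counts.items() if n <= threshold}
    outliers = {day: n for day, n in counts.items() if n > threshold}
    return kept, outliers


# Made-up records, NOT real 311 data.
records = (
    [(date(2011, 1, 3), "HEATING")] * 2500
    + [(date(2011, 1, 4), "HEATING")] * 1200
    + [(date(2011, 1, 4), "NOISE")] * 300
)

counts = daily_counts(records)
kept, outliers = split_outliers(counts)

print("kept:", kept)
print("outliers:", outliers)
```

In this toy input, January 3 exceeds the threshold and is set aside as an outlier, while January 4 stays in the trend data. Note that a count computed on a sample is lower than the count in the full data set, which is one reason the sampling configuration matters for this kind of filter.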

https://elearning.tableau.com/prep-course/529614/scorm/3p9xv048z1uo7