Working with Tableau Extracts

by Frederik Egervari

Tableau enables you to use data extracts as a data source in a workbook. Extracts offer a few options that might be helpful.

The idea of an Extract

Extracts are the counterpart of the usual live connections used in Tableau. They are a “snapshot” of the original data and are used to improve the performance of Tableau.

The data is stored differently, in a columnar fashion, which is improving Tableau’s querying processes. Each column is stored separately in a columnar database and read from top to bottom. In regular, row stored databases, every record in a row has to be read, to use one from a particular row, thus performance for querying this is worse. Also, Extracts are specifically designed to work optimally with the Tableau architecture and therefore are better suited.

Setting up an Extract

If you want to create an Extract, you can do so in the Data Source Sheet. At the top right corner, you get the option to either select a live connection to your data, which is the default, or switch to an extract. If you select this option, Tableau automatically creates the extract, with the default settings. You get the opportunity to edit this, to specify what should be included in the extract, or you can refresh the extract.

Advantages

1. Reduce the size of the data.

a.) The size of the data can be reduced by applying filters to certain fields.

b.) The data can be saved at a higher aggregation level. By ticking the box associated with this, all data shown in one of the sheets is aggregated to the maximum level possible. For example, in Fig. 1 this means that all data is aggregated to the level of ‘Region’ and ‘Product Category’. Therefore only twelve different rows need to be saved. This also means that a lot of information is lost, e.g. changing to the average of sales would not work with the extract.

Figure 1:The highest possible aggregation level to show this view needs to include 'Region' and 'Product Category'.

For dates, it is possible to roll them up to defined time periods. Therefore data could be aggregated to the quarterly level for example.

c.) Use a reduced number of rows. Tableau offers the possibility to either use a specified number of rows of a table or lets you take a random sample of your data with a defined number of rows/percentage of rows.

d.) Exclude all hidden fields or hide unused fields. All fields that are hidden in the data pane before doing the extract are excluded from the extract per default. If you click the button to hide all unused fields, these are included in the extract, but if you open a workbook connected to the extract, they won’t be visible per default. Hiding unused fields is a quick and easy way to improve Tableau’s performance.

Fig. 2 shows how the options a.)-d.) can be leveraged to reduce the size of a dataset.

Figure 2: Options in the 'Extract Data' Window. The marked elements are useful to reduce the size of the extract.

2. Convert the data to a hyper file.

3. Store the data offline. The Extract will be saved locally but may be updated from another source like a database while refreshing.

Date Format Issues

Extracts can cause issues when working with date fields. Since the local date format is stored, opening the extract with a Tableau Desktop version with a different workbook locale can cause the program to read the date the wrong way. This can, for example, be a problem when an American workbook with the date format ‘MM-dd-yyyy’ is opened in a Tableau version that uses German locale settings and therefore has the date format ‘dd-MM-yyyy’. In this case, Tableau won’t be able to read the dates and will return Nulls.

Physical Tables vs. Logical Tables

The default setting for working with Extracts is to use logical tables and often this is the only possible option. Most of the time this is the best option as well. But if you have many different joins and can’t configure additional filters, using a physical table Extract might be the better option to boost the performance.

Sun 10 Sep 2023

Sat 24 Dec 2022

4 mins read

Wed 31 Aug 2022