The joys and perils of data extracts

by Alex Fridriksson

Why should I care about data extracts? If you have ever asked yourself that question then this blog post is for you!

Data extracts are not just an annoying step you need to take before uploading your viz to Tableau Public. They can actually be hugely beneficial to you and your workplace!


Why go slow when you can go fast?

If you are like most people, you hate slow computers and slow programs, and you dislike having to wait for things to load. You have better things to do than that!

That’s how data extracts can make your life better: they make your workflow and dashboards faster! Given the opportunity, wouldn’t you prefer to be racing down the street in a Porsche Carrera GT rather than slowly making your way down it in a Ford Focus? I know I would.

Why then do so many people use the generally much slower Ford Focus? One key reason is security.

To better understand the potential tradeoff between speed and security, let’s explore a bit about how data extracts work.

When you connect to a data source, whether it is on your computer or in the cloud, Tableau defaults to a live connection.

Despite what you might think, with a live connection Tableau does not import your Excel spreadsheet. It establishes a link to it and sends a new request every time it needs data. That process can take some time, especially if you have a large spreadsheet. This is also true of any file or cloud connection you might have.

Having a live connection often means you will be at the mercy of your internet connection, and who really wants that?

When Tableau creates a data extract, it imports a complete snapshot of your data at that point in time and stores it locally on your computer. It also indexes it and does some other magic to make it run faster, even if the original data is already stored locally.

From then on, Tableau works against that snapshot instead of the original source, which is typically faster than a live connection. The catch is that the snapshot only picks up new data when the extract is refreshed. You can think of it as having a provincial road (live connection) vs the German autobahn (data extract).
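If you like to see ideas in code, here is a rough Python analogy of the difference. It is only a sketch of the concept, not how Tableau is actually built: the LiveConnection and Extract classes, the sample data and the total_sales method are all made up for illustration. The "live" version goes back to the source on every question, while the "extract" version copies the data once, indexes it, and answers from that copy.

```python
import csv
import io

# Hypothetical source data standing in for a spreadsheet or a remote table.
source = {"text": "region,sales\nNorth,100\nSouth,250\nNorth,50\n"}


def read_source():
    """Pretend this is the (possibly slow) trip to the original data."""
    return source["text"]


class LiveConnection:
    """Illustrative only: re-reads and re-parses the source on every query."""

    def __init__(self, reader):
        self.reader = reader
        self.reads = 0  # counts how many trips to the source were made

    def total_sales(self, region):
        self.reads += 1
        rows = csv.DictReader(io.StringIO(self.reader()))
        return sum(int(row["sales"]) for row in rows if row["region"] == region)


class Extract:
    """Illustrative only: takes one snapshot and indexes it by region."""

    def __init__(self, reader):
        self.index = {}
        for row in csv.DictReader(io.StringIO(reader())):
            self.index.setdefault(row["region"], []).append(int(row["sales"]))

    def total_sales(self, region):
        # Answered from the local snapshot; the source is never touched again.
        return sum(self.index.get(region, []))


live = LiveConnection(read_source)
extract = Extract(read_source)

# The source changes after the extract was taken.
source["text"] += "North,25\n"

live_total = live.total_sales("North")        # sees the new row
extract_total = extract.total_sales("North")  # still the old snapshot

print(live_total, extract_total)
```

The two totals differ because the extract was taken before the last row was added. That is the same tradeoff as in Tableau: the snapshot is quick to query, but it is only as fresh as its last refresh.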


Why be insecure when you can be secure?

One main reason not to use data extracts, to pick the Ford Focus over the Porsche, is security.
Companies often spend a lot of effort trying to make their data secure. Nobody likes a data leak.

Often you have to connect to files and servers in the cloud because only authorized people can access the data. It might be highly sensitive and therefore kept safely behind the company’s firewall.

When you make a data extract from that data, you effectively take it out of that secure environment and store it unprotected on your local drive. You cannot password-protect it or your workbook.

If you do much of your work on a laptop which you take with you to and from work, maybe even the pub, you risk the data being stolen along with your laptop.

There are many more things that could be said about data extracts. If you are looking for more in-depth information about them, I would recommend that you read Tableau’s three-part series on data extracts.

