Monitoring access to data in Tableau Server - exportcrosstab and viewData

by David Sánchez

This post will help answering the question of who viewed the data (viewData) or downloaded the crosstabs (exportcrosstab) in any of the views. It will also provide a list of workbooks & views whose data is accessed in that way. Read below and then download the workbook and try pointing to your datasources. This answer is based on Mark Jackson’s Customizing the Tableau Server experience,  and Jonathan MacDonald’s great explanation and hands-on suggestions and help.

Outline:

  1. What do I mean by…
  2. Why is this important?
  3. Requirements: access to the Tableau Server Repository
  4. Explanation and how-to
  5. Example and sample workbook

What do I mean by “view data”?

What do I mean by “exportcrosstab”?

Why is that important?

It depends. It could be any of a number of reasons:

  • If the underlying data in a view is downloaded often, maybe the view is not providing the answers the users are looking for (therefore they have to download the data and analyse it elsewhere).
  • Maybe you suspect a leak of confidential data and want to know who downloaded what.
  • Or you just want to have a tool to inform on the most “at risk” workbooks based on how many people have downloaded the data.

In any event, how can we monitor these activities?

You need to be able to collect data from your Tableau Server’s repository (see Tableau’s documentation: https://onlinehelp.tableau.com/current/server/en-us/perf_collect_server_repo.htm)

  1. Activate the readonly user as described in the Tableau doc linked above.
  2. Restart the server.
  3. Connect to the repository:
    1. url: yourserver.yourdomain.com, port: 8060 (unless you have changed it)
    2. database: workgroup
    3. User: readonly
    4. Pwd: whatever you decided in (1).

Now you have access to the Workgroups Database that has a plethora of information (https://onlinehelp.tableau.com/current/server/en-us/data_dictionary.html).

Monitor exportcrosstab and viewData

To be able to determine who accessed the data with viewData and export cross tabs, we are going to use the following tables:

  • _users
  • _http_requests
  • _sessions
  • _sites

In addition, due to the way VIZQL sessions work, we are going to need a few LOD calculations to properly assign VIZQL sessions that performed actions to User ID.

Quite often, VIZQL sessions will have two users: the one that started the session –that correspond to a real user—and the “NULL” user that is the actual session answering some of the requirements of the users. In some cases, the VIZQL session will have multiple simultaneous User IDs. I have decided to assign those to an unknown user with ID 99999.

  1. Assign users to VIZQL sessions: if { FIXED [Vizql Session]:COUNTD([UserID])} = 1 then {FIXED [Vizql Session] : MAX([UserID])} else 99999 END
  2. Caveat: it seems that depending on the server, the “null”user will have “null” user ID (and then the calculation in 1 works ok) or a User ID of ‘0’ (and then the calculation above will not work properly, a way around is to create another calculation that replaces 0 with “null” and then use that for the other calculations (IF [User Id] = 0 then NULL else [User Id] end)

Now that we have User IDs associated to VIZQL sessions, we just need to know if any of the selected actions was requested during the session:

IF CONTAINS([Http Request Uri], [Select Action to Monitor]) then 1 else 0 end

Where [Http Request Uri] is the field in the _http_requests table that contain the information and [Select Action to Monitor] is a parameter that contains the different actions I am interested in – the whole list of options can be obtained from the “Actions” dimension—.

Now we can assign some more values to users, workbooks, etc. using LOD (did they perform the action? Where they downloaded?)

Example and sample workbook

I put together an example using the workbook I just viewed data from & exported the cross tabs at the beginning of this post:

Want to give it a try? Download the .twb workbook, replace the data connection with your server and credentials for the readonly user (it may be slow, so an extract may work better!). Also, keep in mind that the _http_requests table store information of the last 7 days ONLY. If you need more data, you will need to export and maintain the database somewhere else.

Any feedback? Please, leave a comment or find me at dsmd4vid.