Connecting Alteryx to Snowflake

by Pat Lucas

I was tasked with creating a workflow in Alteryx and pushing the data to Snowflake as a data table. There are many resources online to guide you however, none of them worked perfectly for me. I had to take snippets from different resources in order to complete the task, possibly taking hours longer than should have been. Therefore I’ve decided to create a blog that directs you step by fiddley step to set up your connection from Alteryx to Snowflake.

Step 1:

If you haven't got an instance of Snowflake, sign up to the free trial here. Upon signing up, choosing Enterprise and either google (GCP) or AWS was absolutely fine for my circumstances, but you may prefer certain regions or providers depending on your or your organisation’s cloud hosting strategy.

Step 2:

You’ll be taken to your Snowflake platform home page. At the top of the page there are some tabs as shown below:

In both the ‘Databases’ and ‘Warehouses’ tab, create a new database and warehouse (these are optional). The database and warehouse names can be used later when constructing your connection from Alteryx to Snowflake. The create button is highlighted below for Databases and is similar within the Warehouses tab.

This step is optional because when setting up the Alteryx - Snowflake connection (which we will do later in this blog), you can just use the default ‘COMPUTE_WH’ that you will find within the warehouses tab and similar default databases within the Databases tab.

If you do set up your own Warehouse, change the size to be X-Small as seen below – this ensures that you don’t use up so many of your free credits within your free trial. The rest of the configuration isn’t so important. Select Finish.

Step 3:

Next you need to download the Snowflake ODBC Driver which can be done here. You can also access the driver installation by selecting Help > Download > ODBC Driver from the Snowflake Web Interface as seen below. Choose the download option that you require (I required a download for Windows), select the 64bit option and choose the latest version of the download.

Step 4: Installation info

Locate the Snowflake ODBC file in your downloads folder and double click to open. A pop up will appear as seen below. Select Next > Next > Install (images below) and allow the app to make changes to your device if you are comfortable to proceed. On the final pop up, select Finish. The ODBC Driver is now installed.

Step 5:

Open the downloaded ODBC Driver, select the 64-bit version if given the option and you should see the view shown below pop up. You are now going to create your data source (you can see two that I have created in the view below - 'Pat Test' & 'test').

Select “Add…” on the right hand side.

You will get a pop up to Create New Data Source. Scroll down until you find the SnowflakeDSIIDriver as seen below, select it and select Finish.

You will get a Snowflake Configuration pop up looking like this below:

The Configuration

  1. You can name your Data Source as you wish, I’ve chosen “TEST” (I later renamed it to Pat Test).
  2. In user and password, input your credentials you used to sign up to your free trial of snowflake.
  3. Your server name is the beginning of the URL that you see on the web page of your Snowflake home screen. Ignore the https:// but take everything from there up to and including snowflakecomputing.com. You can see what to include in the image below within the thick black brackets (you must also include the identifier that in my case comes before the word ‘Europe’ – I’ve covered it for security).
  4. Be sure to NOT include the / at the end of ‘.com’ in the server box - the configuration box won’t produce an error but it caused various people in my team errors later on which took some time to diagnose.
  5. This next step is optional. The Database and Schema and Warehouse fields can be populated with the names you used when you created them within your Snowflake platform (you can use the default 'PUBLIC' for your Schema. Make sure you use capital letters as snowflake is case sensitive. If you didn’t create a database or warehouse, you should find some default names for these on your snowflake page, for example the default Warehouse name is ‘COMPUTE_WH’ – these can be used instead of your created items. The other options can be left empty.
  6. Select OK and that is now set up!

Step 6:

In Alteryx, you can use your output tool to create a table within your Snowflake platform.

Add an output tool to the end of your workflow and select it. In your Alteryx configuration pane on the left hand side, select the dropdown arrow under “Write to File or Database”.

A Data Connections page will pop up as seen in the image below. Select Data Sources and within that, you should see an option to select ODBC or Bulk under the Snowflake Title. Either can be used but I’m going to use Bulk (highlighted).

You might want to know what the difference between ODBC and Bulk are. Using ODBC will mean your data is processed one row at a time as each row will be “inserted” into the target table. Bulk will process chunks of data in one go, essentially copying and pasting the rows of data from your workflow into the target table (in this case in your Snowflake instance). This means that the Bulk method is much faster than using ODBC and a much better method for datasets with a large number of rows. This does not however mean that Bulk is always the best method. In your output options in alteryx, Bulk allows you to create, delete and overwrite tables as well as appending to tables. ODBC allows you to do all of this but it also has the functionality to update individual rows in a table and also update the table if you want to insert rows only in the case that they are new rows and don’t already exist in the table. Bulk does not work on a row by row basis so in a situation in which you need to update individual rows, ODBC is your answer.

Having selected Bulk, the Snowflake Bulk Connection Window will appear. Within your data source name, you should be able to choose the data source that you set up within the ODBC Driver. If it doesn’t show up in the dropdown, select the ODBC Admin button to the right hand side and re-configure your ODBC Driver using the steps outlined in Step 5.

Input your username and password that you used to set up your Snowflake account and change the Staging Method to be “Local”. I won’t go into the Amazon S3 option in this blog. Click OK.

Lastly, choose a name for the table that you are going to write to and select OK.

Run your workflow in Alteryx to post your data to a table within Snowflake.

Errors:

There were some errors I encountered along the way. The main one I encountered being:
SQL compilation error: error line 1 at position 119 Invalid character length: 1,073,741,823. Must be between 1 and 16,777,216.
The reason for this error was because within my Alteryx workflow, I created a field with datatype V_WString. The size of this data type is 1,073,741,823 characters but this value is too large for Snowflake. To avoid this, ensure you change the size of any field that’s too large using a select tool to something smaller than the 16million value stated above.

Some of my colleagues had trouble setting up the right configuration in the ODBC settings. We learnt that you should;

  • Use the 64bit ODBC settings in Windows, not the 32bit one.
  • Be super careful with spelling when typing in your database name, schema name etc. The configuration window won’t give you any errors if you get the wrong, and the first thing you’ll know about it is when Alteryx gives you HTTP 404 errors.
  • Double check the server name - ensure you do NOT include the https:// bit, nor anything after the snowflakecomputing.com part of the address, including the last forward slash /
  • When using Alteryx to create the new database table in Snowflake, remember to use the ‘Create New Table’ option in the config of the Output tool.

Thank you for reading, I hope this was useful!





Fri 26 Mar 2021

Thu 25 Mar 2021

Wed 24 Mar 2021

Tue 23 Mar 2021