Regular Expressions in Tableau and Alteryx

by Laura Scavino

Today Philip explained how to proficiently use Regular Expressions in both Tableau and Alteryx for extracting and restructuring data. It is possible to use Regular Expressions in Alteryx with both the Formula and the Regex tools.

In Tableau

Regex formula is available in Tableau with 3 different formulas:

Regexp_extract (String, pattern)  –> returns the part matching

 In this case the formula looks at the characters between a-d followed by a space and a digit.

 

 

Regexp_match (String, pattern) –> if it is True it matches

 In this case the formula looks for matches for all characters starting with a capital letter.

 

 

Regexp_replace (String, pattern, replacement) –> replace everything that matches

In this case the formula replaces the space with a dash.

 

In Alteryx

With the Formula tool is possible to use only three different options:

Regex count matches (string, pattern)   –> gives a number as output

Regex match –> true or false value in the -1 or 0 output

Regex replace (string, pattern, replace) –> the output is a string

 

Using the Regex tool in Alteryx gives much more freedom compared to the formula tool:

Replace –> replace everything matched

Match –> Boolean statement

Parse –> extract just what wanted

Tokenize –> returns the value separated in either columns or rows

 

By looking at the Regular expressions in the drop-down, it is possible to have a basic overview of the combinations to be used in the Regex tool.

In addition to this there are a couple more to keep in mind.

\b    –> border between alphanumeric characters and punctuation

–    –> used for ranges

|   –> or, either

? –>  returns zero or one

.* –> returns everything, 1 or more

^ –> not in the set

{ } –> quantifier.   First “(\w{3})”   Inclusive  ” (\w{2,5})”     At least   “(\w{6,})”

\w+ –> all alphanumeric characters

^\w+ –> word at the start of the string “^(\w+)”

\w+$ –> word at the end of the string “\w+$”

A very good website for exercises, examples or just practice is RegeExr .