Tableau Prep: Using "Group & Replace" to catch & fix spelling errors.

by Jonathan Allenby

Group & Replace is neat little way of catching and fixing potential spelling errors in your data. By choosing one or more group & replace options, Tableau Prep groups and renames data values based on certain criteria. The option can be found when clicking the three little “More options” dots next to a field.

There are four group and replace options:

  1. Manual Selection: it leaves it to you to decide what values you think should be grouped together.
  2. Pronunciation: it finds and group values that sound alike. What?! Magic I say. This option isn’t flawless but is still pretty cool.
  3. Common Characters: it finds and groups values that share the same common letters. For this option it reduces each value to a collection of its unique letters (so London becomes “dlno”) and groups values with the same collection (so “London” gets grouped with “Lodnon”, “noDonl”,  and even “Lllllllloooondddoooon” as they are all made of combinations of “dlno”).
  4. Spelling: it finds and groups text values that are spelled alike. It uses a complicated algorithm to determine how alike words are spelt, but I’ve found it to be pretty accurate!

You can play around with which option or options you use to fix a field with typos, but I often try the Spelling option first. Any values that get missed by one grouping option can either be manually grouped afterwards or can be gathered up by running a second Group & Replace option!

However, as a word of warning for this: always sanity check the results. This is because you could find some erroneous grouping. For example, suppose you have data with a variety of cities, and four of those cities happened to be:

  • Aartukka (Finland)
  • Karatkau (Georgia)
  • Katakura (Japan)
  • Takakura (Japan)

All these cities are anagrams of each other, so there’s a good chance one or more of the Group & Replace options would accidentally group these together!