Picture the scene: you have a list of … Once you have the data, there are five stages to creating a word cloud:
- Split each string into separate words
- Pivot the data into rows
- Clean the data
- Get rid of any words (rows) you don’t want
- Visualise in Tableau
Let’s go through each in turn.
- Split strings into separate words
It’s possible you need to do a data preparation stage before you split the string, but let’s assume you’ve done that step. You have a row for each sentence / tweet / collection of words: to make a word cloud, you first need to separate this single-field string into a field for each individual word.
Start by dragging the Record ID tool into the workflow: it’ll give a unique ID to every row, so that you’ll know later which record each word came from. Use the Text-to-Columns tool in Alteryx to split your string field by a delimiter, which in this case is a space (click in the Delimiter box and hit Space). Make sure that the number of columns you split your string into is greater than the maximum number of words in any string: either check, or just pick a much higher number of columns than you need, as the empty ones are easy to filter out later.
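If it helps to see the logic outside Alteryx, here is a minimal Python sketch of this step. The sample strings, the `split_to_columns` function and the `Word1`, `Word2`… field names are all made up for illustration; they aren’t what Alteryx produces. Note that this sketch simply drops any words beyond the chosen number of columns, which is why you want that number to be generous.

```python
# Hypothetical sample data: one string per record.
tweets = [
    "word clouds are fun",
    "making a word cloud in Tableau",
]


def split_to_columns(strings, n_columns):
    """Give each string a record ID and split it on spaces into n_columns fields."""
    rows = []
    for record_id, text in enumerate(strings, start=1):
        words = text.split(" ")
        # Pad short strings with None (empty columns); truncate anything longer.
        padded = (words + [None] * n_columns)[:n_columns]
        row = {"RecordID": record_id}
        for i, word in enumerate(padded, start=1):
            row[f"Word{i}"] = word
        rows.append(row)
    return rows


# Pick more columns than the longest string needs.
wide = split_to_columns(tweets, n_columns=8)
```

Each entry in `wide` now has a `RecordID` plus eight word fields, most of them empty for short strings.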
- Pivot the data
Now create a row for each word. To do this, use the Transpose tool, select the Record ID and any other dimensions you’ve got as the Key Fields, then make all the word fields into Data Fields (tick the boxes).
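As a rough illustration of what the pivot does, here is a small Python sketch that turns one-row-per-record data into one-row-per-word data. The `transpose` function and the sample rows are hypothetical; the `Name` and `Value` output fields are chosen here to resemble a key/name/value layout, not copied from any particular tool’s output.

```python
# Hypothetical wide data: one row per record, one field per word.
wide = [
    {"RecordID": 1, "Word1": "word", "Word2": "clouds", "Word3": None},
    {"RecordID": 2, "Word1": "making", "Word2": "a", "Word3": "cloud"},
]


def transpose(rows, key_fields):
    """Keep the key fields on every output row; turn each other field into its own row."""
    long_rows = []
    for row in rows:
        keys = {k: row[k] for k in key_fields}
        for name, value in row.items():
            if name not in key_fields:
                long_rows.append({**keys, "Name": name, "Value": value})
    return long_rows


long_rows = transpose(wide, key_fields=["RecordID"])
```

Two records with three word fields each become six rows, including one row with an empty value that will need cleaning out.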
- Clean the data
There will almost certainly be a fair amount of cleaning to do once you’ve split the words. If you’re analysing tweets, there will be a lot of @ symbols and link addresses to remove. Here’s a list of cleaning activities you might have to do.
- Remove blanks and nulls – you can do this with Filter tools.
- Remove whitespace – use the Data Cleansing tool, which has that option.
- Get rid of @ symbols and link words like ‘http’: use Filter and Formula tools to remove these.
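The cleaning activities above can be sketched in a few lines of Python. This is only one way to interpret them: the `clean_words` function is hypothetical, and it assumes you want to strip the @ symbol but keep the rest of the word, and to drop anything starting with ‘http’ entirely.

```python
def clean_words(words):
    """Drop nulls, blanks and links; trim whitespace; strip @ symbols."""
    cleaned = []
    for word in words:
        if word is None:  # remove nulls
            continue
        word = word.strip()  # remove leading/trailing whitespace
        if word.lower().startswith("http"):  # remove link words
            continue
        word = word.replace("@", "")  # strip @ symbols, keep the rest
        if word:  # remove anything left blank
            cleaned.append(word)
    return cleaned


sample = [None, "", "  cloud ", "@tableau", "https://t.co/abc", "http", "fun", "@"]
cleaned = clean_words(sample)
```

With the sample list here, only `cloud`, `tableau` and `fun` survive.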
- Get rid of unwanted words
Depending on your analysis, there will be many words that you don’t want in your word cloud: for example, there’s rarely much point in keeping the most common words like ‘the’ or ‘and’, unless you’re interested specifically in those.
The easiest way to remove words from your word cloud is to create a list of words you don’t want in your analysis (I’d recommend Googling ‘list of common words’ or similar, as people have created these datasets as csv files and shared them), then use a Join tool to join your words to that list. Take the output containing none of those words (the unjoined records from your word data, which is the L output if your words are the left input) and continue with your analysis.
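In code terms, this is an anti-join: keep only the rows that have no match in the unwanted-words list. Here is a minimal Python sketch, where the `remove_unwanted` function, the `Word` field name and the tiny list of unwanted words are all assumptions for illustration. It also assumes you want the match to ignore case.

```python
# Hypothetical list of unwanted words; in practice, load a shared csv of common words.
unwanted = {"the", "and", "a", "in"}

# Hypothetical cleaned data: one row per word.
rows = [
    {"RecordID": 1, "Word": "The"},
    {"RecordID": 1, "Word": "cloud"},
    {"RecordID": 2, "Word": "in"},
    {"RecordID": 2, "Word": "Tableau"},
]


def remove_unwanted(rows, unwanted, field="Word"):
    """Keep only rows whose word has no match in the unwanted list (an anti-join)."""
    return [row for row in rows if row[field].lower() not in unwanted]


kept = remove_unwanted(rows, unwanted)
```

Here `kept` contains just the `cloud` and `Tableau` rows.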
- Visualise in Tableau
Now we can finally go to Tableau. Output your data from Alteryx and connect to it in Tableau. Then select Text from the Marks card drop-down; drag Word to the Filters shelf and do a Top 10 filter by the count of Word; drag Word to Text on the Marks card; and drag the count of Word to Size. You should then have a word cloud.
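If you want to sanity-check what the Top 10 filter should show, a quick count in Python does the job. This is just a sketch: the `top_words` function and the sample words are hypothetical, and it counts words exactly as they are spelled, including case.

```python
from collections import Counter


def top_words(words, n=10):
    """Return the n most frequent words with their counts, most frequent first."""
    return Counter(words).most_common(n)


sample = ["cloud", "word", "cloud", "fun", "word", "cloud"]
top_two = top_words(sample, n=2)
```

The counts are the numbers that drive the size of each word in the cloud.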