In Excel the COUNTIF(S) function counts the number of cells in a range that meets a single or multiple criteria.
The SUMIF(S) is identical apart from it summing the values rather than counting.
Both these functions are extremely useful to Microsoft Excel Users. The functions provide users with the ability to SUMMARIZE (you will find out why I have capitalised this soon!) values by a specific ‘group’. Where the group represents the criteria(s).
For continuity purposes I will continue using my sample tennis data set.
So here I have posed myself with two questions which would cause me to use the two functions above.
Firstly. I want to know the total matches won by the winning player in each match.
So I need to know how many times the winner on that row is repeated in the data set… I want to COUNT the winner name IF it equals that outlined in that specific row.
Secondly. I want to know the sum value of the ranking of players that the winning player has beaten… I want to SUM the ranking points of losing players IF the winner equals the same individual outlined in that specific row.
For those that don’t have the best understanding of the COUNTIFS and SUMIFS functions in Excel I hope this example has given some context. If you feel confused I can always offer a better explanation just contact me at the bottom of the page.
So onto the main reason for this blog series. HOW DO I DO THIS IN ALTERYX.
I’m going to create a workflow which gives me the identical output.
OK, so what am I actually doing?
Well as usual, we start with an INPUT tool pointing at our sample tennis data set.
So then this initial data set splits into two? WHY?
The problem is that in order to get our SUM and COUNT values we must use the SUMMARISE tool in alteryx. This tool allows you to conduct a number of summary processes against a GROUP.
The output from the summary tool will only show the group and it’s score. In other words the aggregated value for each group member. In more other words, it will have a list of all of our players in the dataset who won at least one match (else they wouldn’t appear in the ‘winning player’ column) (so 8 rows) with the count number of matches they have won, and the sum of the ranking players beaten (the output table this creates is actually shown later in the piece and may provide you with a better understanding).
How do we actually configure the summary tool?
Well in every summarize tool we do there must be at least one ‘group by’. What this ‘group by’ action means is what are the dimensions we wish to summarise by. In our case it is ‘Winner’ which represents the winning players names. If we were to run the summarise tool without the extra actions then it would simply give a list of player names.
You can group by multiple fields, in which case the list would contain all the different combinations of the variables within these fields.
Now we simply just add the values which we want to summarize each group by… in our case ‘Sum’ and ‘Count’
Note that it will automatically assign a field name based on the action selected, which you can change by simply highlighting the cell of the ‘Output Field Name’ with which you wish to amend.
So the issue now is how do we re-align these aggregated scores back onto an individual row level. Well this is where my first blog comes in on VLOOKUPS or JOINS. We simply JOIN our match database with our aggregated database by the player name, and de-select one of the ‘Winner’ columns so we don’t get duplicated data and we are Sorted!
Of course, in most cases you don’t actually want the summarised value on each row, instead you may only want a summary table with each individual group listed once along with the summarised value. In this case the workflow is much simpler. We remove the two joints and places browses on the end of each summarise. Now we have one table which simply shows the winner (only once, because this is the value we have grouped by), the number of matches they have won, and the sum of the ranking points of the players they beat.
One bonus about the summarise tool in alteryx when compared to Excel is that you aren’t limited to just COUNTS or SUMS, you can do all manor of summaries on your groups. Just select as you wish!
I hope this blog series is working as intended in helping people transfer from Excel to alteryx, or even just helping alteryx users understand the tools they use better.
I think I will next discuss the string formulas available in alteryx and how they compare with Excel. I do have an overall goal of this series which is to complete a whole excel project within alteryx. I will try and shed some light on this in my next post also.