Preppin' Data - Animal Adoptions

by Emma E. Jones



The aim of this challenge is to validate claims that over 90% of animals in the shelter are either adopted, returned to their owners or transferred to another facility.

The output should contain 3 fields which include:

  • the type of animal
  • the % of this animal that were adopted, returned to their owner or transferred
  • the % of this animal that were not adopted, returned to their owner or transferred

For this challenge I created a solution in Alteryx and in Tableau Prep. Both tools achieved the desired result although different approaches were taken for the different tools.

Alteryx Solution:

The first step is to input the data. It is important to discover if there are any fields that are duplicated or need renaming or altering. In this case, the columns 'DateTime' and 'MonthYear' appear to be identical.

This can be tested by connection a filter tool to your input data and using the custom filter configuration below:

If these fields are identical, there will not be any rows in the 'F' output of this filter tool since in all cases the two columns match. This means one of the duplicate columns can be removed.

This can be done using a select tool and simply unticking the 'MonthYear' field.

Filtering to keep only dogs and cats:

One of the requirements was to keep only dogs and cats since all other animal types had small sample sizes which would skew the results.

To do this a simple custom filter can be connected to the output of the select tool with the following expression:

To test that this filter is working correctly a 'Unique' tool can be connected to both outputs of the filter tool.

Select the 'Animal Type' field in both 'Unique' tools:

Check the 'U' output for both the unique tools.

We can see that output from the 'T' of the custom filter produces only 'Dog' and 'Cat' as these are the 2 unique values show in the unique tool.

Furthermore, the only unique values to be outputted from the 'F' of the custom filter do not include any variation of 'Dog' or 'Cat' so we can be confident that only dogs and cats remain in our data.

Splitting into 2 Groups:

The next requirement is to split 'Adoption', 'Return to Owner' and 'Transfer' Outcomes from all other outcome types.

This can be done using another Custom Filter as shown below:

This can again be tested using the Unique tool. Use the previous method to ensure that all rows have been filtered correctly but instead of selecting the 'Animal Type' in the Unique tool, select 'Outcome Type'.

Cleaning New Groups:

Insert a select tool after each Output from the 'Custom Filter' tool.

In these new select tools click 'Options' > 'Select' > 'Deselect All' to quickly unselect all fields. Then reselect only 'Outcome Type' and 'Animal Type' since this is all that is needed to complete the analysis.

Renaming the 'Outcome Type' is important since we have split its content into 2 groups. The 'Adopted', 'Returned to Owner' or 'Transferred' group and the 'Other group and this should be reflected in the name.

The select tool connecting to the 'T' output of the filter tool represents the 'Adopted', 'Returned to Owner' or 'Transferred' group. Whereas the Select tool connected to the 'F' output contains all 'Other' outcomes. rename the altered 'Outcome Type' fields to their relevant names.

The Select tool configuration connected to the 'T' Output of the Filter tool

Finding Totals:

Connect Summarize tools to the select tools. These will be used to count the total number of cats and dogs adopted, returned to owner or transferred.

Configure the Summarize tool by grouping by 'Animal Type' and then counting by either the ''Adopted', 'Returned to Owner' or 'Transferred' field or the 'Other' field depending on which Summarize tool you are working in.

This will return the number of dogs and cats that were Adopted/returned to owner/transferred and the number that had a different outcome.

The total of dogs and cats is also needed. So, return to the first filter created in which dogs and cats were isolated from the other animal types. From the 'T' output of this tool connect a Summarize tool.

In this summarise tool, Group by 'Animal Type' and also Count by 'Animal Type'. This will give the total number of cats and the total number of dogs before considering the outcome.

Appending Data:

The Summarize tools now need to be brought back together in order to make some calculations.

Bring two Append tools onto the workflow and connect them to the Summarize tools as below:

Calculations:

Drag 2 Formula tools into the view to calculate the % of animals Rehomed or not

This will create an output like this:

It has unnecessary rows that need to be removed.

To do this use another Custom Filter configured so that if the 'Animal Type' field is the same as the 'Source Animal Type' field the row will be kept.

Next add 2 Select tools to change the data type of our new calculations to 'Double'. This is necessary so that these numbers can then be rounded. If it is left as a string the 'round' function will error.

Add 2 formula tools onto each stream to round each calculation to 1 decimal place.

Add 2 select tools to each stream to remove final unnecessary columns.

Join:

Add a join tool to bring the two streams together. Make sure to join on Animal Type and to rename and remove fields.

Final Output:

Final Workflow:

Tableau Prep Solution:

Remove Duplicate Field:

As with the Alteryx flow, the first thing to do is to remove the duplicate column. This is incredibly quick to do in Prep. Simply click the 3 horizontal dots next to the field you wish to remove and simply select 'Remove'.

Filter:

Removing the animals that aren't animals is also very quick and easy. Again, use the 3 dots on the field you wish to change and follow Filter > Selected Values.

Then select only the animals you wish to keep. In this case 'Dog' and 'Cat'.

Grouping Outcome:

To group, use the 3 dots > Group Values > Manual Selection then select the values you wish to group together.

This will keep all rows but leave only 2 groups within the 'Outcome Type' field. To rename these groups simply double click on  the group you wish to rename and type the new name.

Aggregate:

Add an Aggregate step in to your flow. Drag 'Animal Type' and 'Outcome Type' from the 'Additional Fields' pane to the 'Grouped Fields' pane.

Next, drag the 'Number of Rows (Aggregated)' field into the Aggregated Fields pane.

LOD:

Add a new clean step and in the 'Animal Type' column click the 3 dots > Create Calculated Field > Fixed LOD

The calculation should be a fixed LOD grouped by Animal Type and should be a sum of the Number of Rows'.

This will give us the Total number of cats and dogs (Calculation 1).

Create a Calculated Field:

Now that we have the total of cats and dogs we need to calculate the % of total with the calculation below. This calculation includes rounding the output to 1 decimal place to reduce the number of steps needed later.

This will return this output:

This is fundamentally the output desired but will need some tidying up.

Remove Field:

We can remove the 'Number of Rows' field as this is no-longer useful.

Pivot:

Following the removal of the 'Number of Rows' field the output looks like this:

We need to pivot the data to retain only 2 rows. One for dogs and one for cats.

Add a pivot step into the flow. Ensure that you are converting Rows to Columns by selecting the downwards arrow in top right of the 'Pivoted Fields' pane.

Drag Outcome type into the 'Pivoted Fields' pane.

Drag '% of Total' onto the 'Field to aggregate for new columns' pane.

To see the final output simply add an 'Output' step to the flow. This will give you the template of the output.

Final Output:

Final Workflow:

Overall:

In this challenge using Tableau Prep was much faster than using Alteryx. Filtering and grouping took many more steps in Alteryx than in Tableau Prep. Simple steps such as renaming fields can be done by simply double clicking in Tableau Prep but will often require a new tool in Alteryx.

Finding a solution in Alteryx required more steps. It did however, allow me to clearly document my progress. In this way it is easier for someone else to understand the work that I have done if they were to continue where I left off.

The ways of validating work are different in Tableau Prep and Alteryx. In Alteryx you must use the available tools to create your own tests. In Tableau Prep you are able to see clearly which rows belong in which fields by hovering over them.  

Avatar

Emma E. Jones

2 mins read

Fri 26 Nov 2021

Thu 25 Nov 2021