In DS38 we were given our first presentation task, and Andrea and I were given the job of examining inputs and outputs in the world of data analysis and Tableau Prep. We decided to split the task in half, and I was responsible for looking at inputs. Inputs can come from three main sources: files, databases and servers.
Files can range from Excel files to text files and .csv files among many others.
Databases allow you to access an organized group of structured data sets. You also get data warehouses, which store multiple databases that can be accessed.
Servers allow you to access a repository of files stored in the cloud, and they come with all the benefits that the cloud offers to users.
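To make the file input case concrete, here is a minimal sketch of reading a .csv input with Python's standard library. The sample data and the read_rows helper are made up for illustration and are not part of our presentation.

```python
import csv
import io

# Hypothetical sample data standing in for a .csv file on disk.
SAMPLE_CSV = """order_id,region,sales
1,North,120.50
2,South,80.00
3,North,42.25
"""


def read_rows(text_stream):
    """Read CSV rows into a list of dicts keyed by the header row."""
    return list(csv.DictReader(text_stream))


# io.StringIO lets the sketch run without a real file;
# with a real file you would pass open("some_file.csv", newline="") instead.
rows = read_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), rows[0]["region"])
```

Note that every value comes back as text, including the sales figures, which is one small reason an input often needs some prep before analysis.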
Once a data source has been selected, it might need to be prepped to make it suitable for analysis. This is typically easier for data that comes from databases, as it is often organized in the same manner throughout the database and has already been partly prepped. Query languages such as SQL can be used to browse and retrieve data from databases. Multiple Fl
Data sources can be manipulated, cleansed and reshaped for the purposes of data analysis, and often this is necessary. Inputs from all three sources can require some reworking. Multiple datasets can sometimes be combined through unions and joins if certain conditions are met, for example matching columns for a union or a shared key field for a join.
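As a rough illustration of querying a database and combining datasets, the sketch below uses SQL through Python's built-in sqlite3 module. The tables, columns and values are invented for the example; Tableau Prep does the same kind of union and join through its own interface rather than through code like this.

```python
import sqlite3

# In-memory database with hypothetical tables, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_2023 (customer_id INTEGER, amount REAL);
CREATE TABLE sales_2024 (customer_id INTEGER, amount REAL);
CREATE TABLE customers (customer_id INTEGER, name TEXT);
INSERT INTO sales_2023 VALUES (1, 100.0), (2, 50.0);
INSERT INTO sales_2024 VALUES (1, 75.0), (3, 20.0);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben');
""")

# Union: stack two tables that share the same columns.
union_rows = conn.execute("""
SELECT customer_id, amount FROM sales_2023
UNION ALL
SELECT customer_id, amount FROM sales_2024
ORDER BY customer_id, amount
""").fetchall()

# Join: match rows from two tables on a shared key field.
joined_rows = conn.execute("""
SELECT c.name, s.amount
FROM sales_2023 AS s
JOIN customers AS c ON c.customer_id = s.customer_id
ORDER BY c.name
""").fetchall()

print(union_rows)
print(joined_rows)
conn.close()
```

The union only works because both sales tables have the same columns, and the join only works because both tables carry customer_id, which is the kind of condition that has to be met before datasets can be combined.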
Using software such as Excel, Tableau, Power BI and a whole range of others, inputs can be analyzed and visualized into outputs.
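The same input-to-output idea can be sketched in a few lines of Python: take prepared rows in, summarize them, and write the summary out. The rows, the total_by_region helper and the write_summary helper are all hypothetical, and a tool like Tableau would produce a chart rather than a small CSV.

```python
import csv
import io
from collections import defaultdict

# Hypothetical prepared input rows.
rows = [
    {"region": "North", "sales": 120.5},
    {"region": "South", "sales": 80.0},
    {"region": "North", "sales": 42.25},
]


def total_by_region(rows):
    """Sum the sales figure for each region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["sales"]
    return dict(totals)


def write_summary(totals, stream):
    """Write one line per region, sorted by region name, as CSV."""
    writer = csv.writer(stream)
    writer.writerow(["region", "total_sales"])
    for region in sorted(totals):
        writer.writerow([region, totals[region]])


totals = total_by_region(rows)
out = io.StringIO()  # stands in for an output file
write_summary(totals, out)
print(out.getvalue())
```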
Some common output files produced after data has been analyzed and visualized are workflows from software such as Tableau Prep, and .hyper files from Tableau Desktop and Tableau Public.
