Handling the basics of SQL.
SQL stands for Structured Query Language. It is a standard programming language that can extract and manipulate databases. However, there are different ‘regional dialects’ to this language. I have been using Snowflake; an online cloud-based database platform. You can upload data to this platform, ready to manipulate.
Biggest lesson of the day. The order in which you need to write your query…
SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
The hardest bit to get right is ordering the ‘select’ and ‘from’ statement correctly. ‘FROM’ points your query to a specific table within snowflake. ‘SELECT’ is specifying what fields you would like to obtain from the table. BUT, when you come to write your query, the ‘select’ part needs to be written first. A great tip I picked up was that it doesn’t matter what order you write out your query; whatever works for you! Just remember to re-order it before you run the query.
SELECT
When selecting fields to keep, you can keep everything,*, or just fields of interest. You can also wrap fields with functions in this section, to get them into your desired output. If you are aggregating, be sure to include you aggregated field here (e.g. SUM(sales) ) as well as what you will group by (e.g. Category). You need to include your ‘group by’ field again when you get to the ‘GROUP BY’ part of the query (e.g. group by Category).
FROM
Remember, you can only have one ‘FROM’ in your query. So if you are obtaining fields from different databases, you need to join them. To do this, you need to label both tables (o and c in the following example), so Snowflake knows which one is which. For example...
SELECT *
FROM Orders o
INNER JOIN Customers c
ON o.customer_id = c.customer_id
The above example is making an inner join between the Orders table (o) and Customer table (c), based on matching customer IDs. If you then wanted to extract particular fields from this master table, specify the field names instead of using a *. The ‘o’ and ‘c’ are like codes that tell Snowflake from which table a particular field comes from. This is particularly important when you have fields that are shared in both tables.
