The SQL Group BY clause is used to summarize your SQL results. Rather than displaying each row in detail, when GROUP BY is added to your query, one row per grouping is returned. Using groups comes in handy when you want to calculate totals.
All of the examples in this tutorial are based on PizzaDB. You can get the script to build the PizzaDB here.
For this tutorial we’ll use the CutomerOrderDetail view, which is shown below
Suppose we want to know the total purchases by Customer. To get this result we can use the following query:
For this query all rows within the CustomerOrderDetail view are processed; however, only one row per customer is returned. As rows are processed the PurchaseAmount is totaled. As new customer groups are processed to total is reset.
Note: A view is a pre-built query we can treat as a table with our queries. Read up on them in our article What is a View?
SQL GROUP BY Syntax
Here is a simplified SQL syntax incorporating GROUP BY into the SELECT statement.
SELECT Column1, Column2, ... AggregateFunction(Column3)
FROM TableName
WHERE condition
GROUP BY Column1
ORDER BY Column1
In this example
- Column1, Column2, … represent the columns you which to group the result set. These columns must be part of the GROUP BY clause.
- AggregateFunction – A function like COUNT(), SUM(), or AVG() which performs a calculation on a set of values.
- TableName is the table you are querying
- condition is used to filter rows in the result set
- GROUP BY specifies the columns you which to group by.
- ORDER BY allows you to sort the grouped fields.
Group by Two or More Columns
You can also group by one or more columns. Here we group on OrderDate and ProductName
There are couple of things to point out here. First notice how we group by OrderDate and ProductName. You see both columns specified in the SELECT and GROUP BY clauses. Second, we included an ORDER BY clause.
Using the GROUP By statement doesn’t guarantee order. Yes, you get unique values in the group, but to ensure your data is properly sorted, include the ORDER BY statement.
GROUP BY and SQL order of Execution
Now that we are starting to add more clauses to our SELECT statement, lets talk about how each clause is executed.
When a query is run, the SQL database does the following:
- Retrieve data for the table specified.
- Filter any rows per the where clause.
- Group the filtered result per the GROUP BY SQL clause
- Order the date per the ORDER BY clause.
- Display the selected columns.
I think it is important to understand this so you better understand how your queries work. For example understanding that data is filtered out prior to the group by statement lets you understand why Pan Pizza is missing from the following results:
In this new lesson we’ll explore the various aggregate functions you can use along with GROUP BY.