Combine Table Rows Using UNION
In this lesson we are going to talk about the UNION clause. You can use the UNION clause to combine table rows from two different queries into one result. Unlike a join, which combines columns from different tables, a union combines rows from different tables. Here is an illustration of what an UNION looks like
All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012 database. You can get started using these free tools using my Guide Getting Started Using SQL Server.
In SQL this statement looks like
SELECT columnlist FROM table1 UNION SELECT columnlist FROM table2
In order to union two tables there are a couple of requirements:
- The number of columns must be the same for both select statements.
- The columns, in order, must be of the same data type.
When rows are combined duplicate rows are eliminated. If you want to keep all rows from both select statement’s results use the ALL keyword.
Union two tables
Suppose you were asked to provide a list of all AdventureWorks2012 product categories and subcategories. To do this you could write two separate queries and provide two separate results, such as two spreadsheets, or you could use the UNION clause to deliver one combined result:
SELECT C.Name FROM Production.ProductCategory AS C UNION ALL SELECT S.Name FROM Production.ProductSubcategory AS S
From this you get a combined list of names, but suppose you wanted to know which name were categories versus subcategories. To do this you can add a new column indicating the category type:
SELECT 'category', C.Name FROM Production.ProductCategory AS C UNION ALL SELECT 'subcategory', S.Name FROM Production.ProductSubcategory AS S
Union versus Union All
The difference between UNION and UNION ALL is that with UNION returns a unique set of rows from the union result; whereas, UNION ALL returns every row.
SELECT person.Address.City FROM person.Address
Returns 19614 rows.
SELECT person.Address.City FROM person.Address UNION SELECT person.Address.City FROM person.Address
Returns 575 rows, which is the number of distinct city names within the table. Running UNION All returns the entire set of city names twice:
SELECT person.Address.City FROM person.Address UNION ALL SELECT person.Address.City FROM person.Address
It returns 39228 rows.
As you can see there is a big difference with using the ALL qualifier. When not used, the results are distinct values. Duplicates are not only eliminated between rows from each result, but also from within.
Union three tables
Suppose management wants a combined list of people, vendors, and store names identified by source.
To do this we create three separate queries and then use the union clause to put them together. We will then order the list.
SELECT 'Person' AS Source, FirstName + ' ' + LastName AS Name FROM person.Person UNION SELECT 'Vendor', Name FROM Purchasing.Vendor UNION SELECT 'Store', Name FROM Sales.Store ORDER BY Name;
At first glance you may think the ORDER BY clause would only apply to the last select statement, but in fact it applies to all the results returned by the union. The database engine first process all the union statements then the order by.
If you’re in doubt about the processing order, you can use parenthesis “()” to control the order of evaluation much like you can with expressions. Here is what the statement, in general, would look like with parenthesis:
(SELECT 'Person' AS Source, FirstName + ' ' + LastName AS Name FROM person.Person UNION SELECT 'Vendor', Name FROM Purchasing.Vendor UNION SELECT 'Store', Name FROM Sales.Store) ORDER BY Name;