The SQL UNION, SQL INTERSECT, and SQL EXCEPT clauses are used to combine or exclude like rows from two or more tables. They are useful when you need to combine the results from separate queries into one single result. They differ from a join in that entire rows are matched and, as a result, included or excluded from the combined result.
Overview
These operators can be used on any query; however, a couple simple of conditions must be met:
- The number and order columns must be the same in both queries
- The data types must be the same or compatible.
Table of contents
SQL UNION Set Operator
In this section let’s talk about the SQL UNION operator. You can use the UNION clause to combine table rows from two different queries into one result.
What is a SQL UNION?
Unlike a join, which combines columns from different tables, a union combines rows from different tables. Here is an illustration of what a SQL UNION looks like
In SQL this statement looks like
SELECT columnlist FROM table1 UNION SELECT columnlist FROM table2
In order to union two tables there are a couple of requirements:
- The number of columns must be the same for both select statements.
- The columns, in order, must be of the same data type.
When rows are combined duplicate rows are eliminated. If you want to keep all rows from both select statement’s results use the ALL keyword.
Learn More: What is the Difference Between a Join and a Union >>
Combine Table Rows Using SQL UNION
The Union operator returns rows from both tables. Use UNION to return a distinct list of values. Use UNION ALL to return distinct rows. SQL UNION for sorting results from two separate queries as one combined result. For instance, if you have two tables, Vendor, and Customer, and you want a combined list of names, you can easily do so using:
SELECT ‘Vendor’, V.Name FROM Vendor V UNION SELECT ‘Customer’, C.Name FROM Customer C ORDER BY Name
Note the ORDER BY clause applies to the combined result.
Union two tables
Suppose you were asked to provide a list of all AdventureWorks product categories and subcategories. To do this you could write two separate queries and provide two separate results, such as two spreadsheets, or you could use the SQL UNION operator to deliver one combined result:
SELECT C.Name FROM Production.ProductCategory AS C UNION ALL SELECT S.Name FROM Production.ProductSubcategory AS S
From this you get a combined list of names, but suppose you wanted to know which names were categories versus subcategories. To do this you can add a new column indicating the category type:
SELECT 'category', C.Name FROM Production.ProductCategory AS C UNION ALL SELECT 'subcategory', S.Name FROM Production.ProductSubcategory AS S
SQL UNION versus SQL UNION ALL
The difference between UNION and UNION ALL is that UNION returns a unique set of rows from the result; whereas, UNION ALL returns every row.
SELECT person.Address.City FROM person.Address
Returns 19614 rows.
SELECT person.Address.City FROM person.Address UNION SELECT person.Address.City FROM person.Address
Returns 575 rows, which is the number of distinct city names within the table. Running SQL UNION All returns the entire set of city names twice:
SELECT person.Address.City FROM person.Address UNION ALL SELECT person.Address.City FROM person.Address
It returns 39228 rows.
As you can see there is a big difference with using the ALL qualifier. When not used, the results are distinct values. Duplicates are not only eliminated between rows from each result, but also from within.
UNION three tables
Suppose management wants a combined list of people, vendors, and store names identified by source.
To do this we create three separate queries and then use the SQL UNIONclause to put them together. We will then order the list.
SELECT 'Person' AS Source, FirstName + ' ' + LastName AS Name FROM person.Person UNION SELECT 'Vendor', Name FROM Purchasing.Vendor UNION SELECT 'Store', Name FROM Sales.Store ORDER BY Name;
At first glance you may think the ORDER BY clause would only apply to the last select statement, but in fact it applies to all the results returned by the sql union. The database engine first process all the union statements then the order by.
If you’re in doubt about the processing order, you can use parenthesis “()” to control the order of evaluation much like you can with expressions. Here is what the statement, in general, would look like with parenthesis:
(SELECT 'Person' AS Source, FirstName + ' ' + LastName AS Name FROM person.Person UNION SELECT 'Vendor', Name FROM Purchasing.Vendor UNION SELECT 'Store', Name FROM Sales.Store) ORDER BY Name;
SQL Intersect Set Operator
The SQL INTERSECT operator is used to combine like rows from two queries. It returns rows that are in common between both results. To use the SQL INTERSECT operator, both queries must return the same number of columns and those columns must be of compatible data types.
In this example, the circles represent two queries. The orange circle is the left query; whereas, the blue circle is the right. The area within each circle represents that query’s results.
As you can see the green portion represents the result of the SQL INTERSECT operator. This area represents those rows that are in both the left and right query.
SQL Intersect Example
Below is the general format of the INTERSECT operator.
SELECT Name, BirthDate FROM Employee INTERSECT SELECT Name, BirthDate FROM Customer
There are two queries which are separated by the SQL INTERSECT operator. The top query is commonly called the left query.
The query is valid since both the left and right queries contain the same number of columns and each column is a similar data type; Char and Date respectively.
Contrast this to
SELECT Name, BirthDate FROM Employee INTERSECT SELECT Age, BirthDate, Name FROM Customer
Which is invalid on multiple levels. First the number of columns isn’t the same. Additionally, the data type for each column is incompatible. For instance, Name, which is a Char column isn’t a compatible data type with Age.
Uses for SQL INTERSECT
The intersect operator is good when you want to find common rows between two results. The INTERSECT operator is similar to the AND operator; however, they operate on different database objects.
The SQL Intersect operator is used to compare entire rows; whereas, the AND operator is used to compare columns within rows.
Say what?
Don’t worry, it becomes clearer below.
Intersect Two Tables
Let’s assume we want to find all job titles for positions held by both male and female employees. How could we do this? The first set is to compose the queries to find positions held by males, then to do the same for females.
Here is the query for males, the one for the females is very similar:
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M'
To finish we need to find out which titles are in common. To do this we can use the SQL INTERSECT operator.
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' INTERSECT SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'F'
You may be tempted to try and simplify this statement by eliminating the INTERSECT operator all together and use the following
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' AND Gender = 'F'
But this won’t simply work. Why? Because the Where clause is evaluated for each row and you’re never going to find a Gender value equal to both M and F for the same record.
Using Order By with SQL INTERSECT
To order the result by JobTitle we can use an ORDER BY clause. Keep in mind this works on the the final row set returned by the interest operator.
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' INTERSECT SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'F' ORDER BY JobTitle
SQL INTERSECT Equivalence
The INTERSECT hasn’t always been part of SQL Server . Before its introduction to the language you had to mimic the INTERSECT behavior using and INNER JOIN.
Below is the equivalent statement to find job titles in common for both genders:
SELECT DISTINCT M.JobTitle FROM HumanResources.Employee AS M INNER JOIN HumanResources.Employee AS F ON M.JobTitle = F.JobTitle AND M.Gender = 'M' AND F.Gender = 'F'
This join is called a self-join, since we are joining the table to itself. The idea is to match up every JobTitle with same values. By pairing these values together we can then compare their corresponding gender values and keep those where one gender is male and the other female.
NOTE: These are equivalent to a point. AS we have learned, NULL aren’t values, therefore NULL = NULL is always false. Given this, the INNER JOIN fails to match on joins; however, the SQL INTERSECT operator does match NULLS.
You’ll find there is usually more than one way to solve a problem in SQL.
SQL Except Set Operator
The SQL EXCEPT operator is used to exclude like rows that are found in one query but not another. It returns rows that are unique to one result. To use the EXCEPT operator, both queries must return the same number of columns and those columns must be of compatible data types.
In this example, the circles represent two queries. The orange circle is the left query; whereas, the blue circle is the right. The area within each circle represents that query’s results.
As you can see the orange crescent (moon shape) represents the result of the EXCEPT operator. This area represents those rows that are on the left and not in the right query.
SQL EXCEPT Example
Below is the general format of the EXCEPT operator.
SELECT Name, BirthDate FROM Employee EXCEPT SELECT Name, BirthDate FROM Customer
There are two queries that are separated by the EXCEPT operator. The top query is commonly called the left query.
The query is valid since both the left and right queries contain the same number of columns and each column is a similar data type; Char and Date respectively.
Contrast this to
SELECT Name, BirthDate FROM Employee EXCEPT SELECT Age, BirthDate, Name FROM Customer
Which is invalid on multiple levels. First, the number of columns isn’t the same. Additionally, the data type for each column is incompatible. For instance, Name, which is a Char column isn’t a compatible data type with Age.
Uses for SQL Except
The except operator is good when you want to find common rows exclusive to one result. I like to use it when testing query results. I take the result of my un proven query and EXCEPT them against a proven query. This help me identify rows deserving my attention and troubleshooting.
Except Two Tables
Let’s assume we want to find all job titles for positions held by males but not female employees. How could we do this? The first set is to compose the queries to find positions held by males, then to do the same for females.
Here are the query males, the one for females is very similar:
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M'
To finish we need to find out which titles are common to only male employees. To do this we can use the EXCEPT operator.
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' EXCEPT SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'F'
You may be tempted to try and simplify this statement by eliminating the EXCEPT operator all together and use the following
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' AND NOT Gender = 'F'
But this won’t simply work. Why? Because the Where clause is evaluated for each row. Logically this where clause will return all job titles for males.
Using Order By with EXCEPT
To order the result by JobTitle we can use an ORDER BY clause. Keep in mind this works on the final row set returned by the except operator.
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' EXCEPT SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'F' ORDER BY JobTitle
SQL EXCEPT Equivalence
Using Subquery
The EXCEPT operator was just recently added to SQL Server. Before its introduction to the language, you had to mimic the EXCEPT behavior using a subquery.
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' EXCEPT SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'F' ORDER BY JobTitle
Below is the equivalent statement to find job titles only held by Males:
SELECT DISTINCT M.JobTitle FROM HumanResources.Employee AS M WHERE M.Gender = 'M' AND M.JobTitle NOT IN (SELECT F.JOBTITLE FROM HumanResources.Employee AS F WHERE F.Gender = 'F')
I colored the subquery in green. We haven’t talked about sub queries yet, but will in the next series of articles. In general, the subquery is run once for each result returned from the main query. In this example, once we select a job title that is held by a male (the main query) we then do another query asking whether that job title in the set of job titles held by females (the subquery). If not, then the job title is retained in the results.
Using Outer Join
Use the EXCEPT Operator to return only rows found in the left query. It returns unique rows from the left query that aren’t in the right query’s results. This query is useful when you’re looking to find rows that are in one set but not another. For example, to create a list of all vendors that are not customers you could write:
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' EXCEPT SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'F' ORDER BY JobTitle
Like INTERSECTION, EXCEPT has an equivalent SQL statement. In this case, we can use an OUTER JOIN to construct its equivalent:
SELECT Distinct M.JobTitle FROM HumanResources.Employee M LEFT OUTER JOIN HumanResources.Employee F ON M.Title = F.Title AND F.Gender = 'F' WHERE M.Gender = 'M' ORDER BY M.JobTitle
Using Parenthesis
You can build complicated queries using these operators. In fact, there’s nothing stopping you from combining one or more of these operators into a super query. When this is done, be sure to use parenthesis “()” to control which operators are evaluated first.
It may not be apparent to you or another SQL reader that,
SELECT A FROM TA INTERSECT SELECT B FROM TB EXCEPT SELECT C FROM TB UNION SELECT D FROM TD
Evaluates as
((SELECT A FROM TA INTERSECT SELECT B FROM TB) EXCEPT SELECT C FROM TC) UNION SELECT D FROM TD
When there is no parenthesis, the order of evaluation is:
- INTERSECT
- EXCEPT and UNION are evaluated Left to Right
Can you remember this?
My recommendation is to just use parenthesis and make it clear. Tricky is kewl, but you’ll get burned down the road when you misread your own code – trust me on this one…
Out of the three queries, the UNION operator is irreplaceable. There is no other way to combine results from two queries into a single result without using UNION.
On the other hand, as you saw earlier, both EXCEPT and INTERSECT’s results can be reproduced using OUTER and INNER JOINS respectively. In fact, you’ll find that the JOIN version of the queries runs more efficiently than EXCEPT and INTERSECT do and is more versatile as you can include fields from the left table that aren’t in the right.
For instance
SELECT V.Name, V.Address FROM Vendor V EXCEPT SELECT C.Name FROM Customer C ORDER BY Name
Isn’t valid, since the number of columns in both queries doesn’t match, whereas,
SELECT Distinct V.Name, V.Address FROM Vendor V LEFT OUTER JOIN Customer C ON V.Name = C.Name WHERE C.Name is NULL ORDER BY V.Name
Is valid.
Leave a Reply