How to use the Intersect Operator
The INTERSECT operator is used to combine like rows from two queries. It returns rows that are in common between both results. To use the INTERSECT operator, both queries must return the same number of columns and those columns must be of compatible data types.
Note: All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012 database. You can get started using these free tools using my Guide Getting Started Using SQL Server.
Visual Example of Intersect
In this example, the circles represent two queries. The orange circle is the left query; whereas, the blue circle is the right. The area within each circle represents that query’s results.
As you can see the green portion represents the result of the INTERSECT operator. This area represents those rows that are in both the left and right query.
Below is the general format of the INTERSECT operator.
SELECT Name, BirthDate FROM Employee INTERSECT SELECT Name, BirthDate FROM Customer
There are two queries which are separated by the INTERSECT operator. The top query is commonly called the left query.
The query is valid since both the left and right queries contain the same number of columns and each column is a similar data type; Char and Date respectively.
Contrast this to
SELECT Name, BirthDate FROM Employee INTERSECT SELECT Age, BirthDate, Name FROM Customer
Which is invalid on multiple levels. First the number of columns isn’t the same. Additionally, the data type for each column is incompatible. For instance, Name, which is a Char column isn’t a compatible data type with Age.
Uses for Intersect
The intersect operator is good when you want to find common rows between two results. The INTERSECT operator is similar to the AND operator; however, they operate on different database objects.
The Intersect operator is used to compare entire rows; whereas, the AND operator is used to compare columns within rows.
Don’t worry, it becomes clearer below.
Intersect Two Tables
Let’s assume we want to find all job titles for positions held by both male and female employees. How could we do this? The first set is to compose the queries to find positions held by males, then to do the same for females.
Here is the query for males, the one for the females is very similar:
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M'
To finish we need to find out which titles are in common. To do this we can use the INTERSECT operator.
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' INTERSECT SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'F'
You may be tempted to try and simplify this statement by eliminating the INTERSECT operator all together and use the following
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' AND Gender = 'F'
But this won’t simply work. Why? Because the Where clause is evaluated for each row and you’re never going to find a Gender value equal to both M and F for the same record.
To order the result by JobTitle we can use an ORDER BY clause. Keep in mind this works on the the final row set returned by the interest operator.
SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'M' INTERSECT SELECT JobTitle FROM HumanResources.Employee WHERE Gender = 'F' ORDER BY JobTitle
The INTERSECT hasn’t always been part of SQL Server . Before its introduction to the language you had to mimic the INTERSECT behavior using and INNER JOIN.
Below is the equivalent statement to find job titles in common for both genders:
SELECT DISTINCT M.JobTitle FROM HumanResources.Employee AS M INNER JOIN HumanResources.Employee AS F ON M.JobTitle = F.JobTitle AND M.Gender = 'M' AND F.Gender = 'F'
This join is called a self-join, since we are joining the table to itself. The idea is to match up every JobTitle with same values. By pairing these values together we can then compare their corresponding gender values and keep those where one gender is male and the other female.
NOTE: These are equivalent to a point. AS we have learned, NULL aren’t values, there fore NULL = NULL is always false. Given this, the INNER JOIN will fail to match on joins; howver, the INTERSECT operator does match NULLS.