SQL Cross Join

·

·

Cross Join Meme

Use SQL cross joins when you wish to create a combination of every row from two tables.  All row combinations are included in the result; this is commonly called cross product join.  A common use for a cross join is to create obtain all combinations of items, such as colors and sizes.

What is a SQL CROSS JOIN?

Use a SQL CROSS JOIN to combine rows from each table.  Unlike other joins, a cross join uses no join conditions.   It produces a combination of all rows from the tables joined. 

In other words, it produces a cartesian product of the two tables.  

Do you want to about all the join types? If so, check out SQL Joins – The Ultimate Guide

Simple Example

Let’s look at how to use a CROSS JOIN by generating combination of colors and sizes.

SELECT columnlist
FROM   maintable
       CROSS JOIN secondtable

Suppose you wanted to return a report showing every combination of color and size as shown in our sample data below:

Cross Join Two Tables to Get Combinations

To do this in SQL you could write

SELECT c.Color, 
       s.Size
FROM   Color c
       CROSS JOIN Size s

You see that the result contains every combination of color and size.   Since there are two rows in Color and four in Size, the final result is eight rows (2 x 4 = 8).

This combinatoric effect can make cross joins extremely dangerous!  If you CROSS JOIN a table of 1,000 rows with another of 1,000 you end up with 1,000,000 in the result.

Now that you know the basic, let’s look at a really good example where SQL cross joins help create better results.

Use SQL Cross Joins to Create Reports

Let’s consider an example where the president of Adventure Works would like to know by gender, how many employees are employed within each title, even those which contain no employees.

This request on the surface seems easy enough, we can use the GROUP BY and COUNT to create the result!  Here is the SQL:

SELECT   JobTitle,
         Gender,
         COUNT(1) as NumberEmployees
FROM     HumanResources.Employee
GROUP BY JobTitle, Gender
ORDER BY JobTitle, Gender

But notice that there are some missing combinations.  For instance, even though there is a Female (F) Benefits Specialist, there isn’t one that is Male (M).

Missing Group in Report

So how do you get combinations of JobTitle and Gender when the count is zero?

Use CROSS JOIN to Fill In Gaps

We can use a SQL cross join.  The idea is to first do a cross join on distinct values of gender and title.  These results can then be outer joined back to the employee table to obtain the account.

So what does this query really do?

To make it simpler to write and read, I used CTE’s (Common Table Expressions) to create this query.  If you are not familiar with CTE’s, think of them as views for now.  Note: you can read more about CTEs in my article Introduction to Common Table Expressions.

We create two CTE’s of distinct JobTitle and Gender values from the Employee table.  By cross joining these tables we can display all possible combinations of Job Titles and Gender.

Here is the query we can use to create distinct combinations:

WITH cteJobTitle (JobTitle)
AS  (SELECT DISTINCT JobTitle
     FROM   HumanResources.Employee),

cteGender (Gender)
AS (SELECT DISTINCT Gender
    FROM   HumanResources.Employee)

SELECT   J.JobTitle,
         G.Gender
FROM     cteJobTitle AS J
         CROSS JOIN cteGender AS G
ORDER BY J.JobTitle

Once you study the above query you’ll see the SQL cross join is just creating combinations from two separate distinct valued lists, color-coded blue and red, whose results are:
Combinations of Job Titles and Gender using CROSS JOIN

Combining Cross Join with Original Query

Now, all we need to do is take the results and combine them with the summarized data we obtain by grouping the data.  Again a CTE is used to collect the summarized data.  An OUTER JOIN is then used to combine this with all combinations of JobTitle and Gender.

The reason this works is that we are using an OUTER JOIN to return all results from one table regardless of whether they match another.  This is important as we want to include all results from the SQL cross join, which is all the combinations of gender and title, regardless of whether there is actually summarized data.

To summarize, here are the steps we are taking to create this result:

  1. Get a distinct list of JobTitles.  This happens in a CTE.
  2. Get a distinct list of Genders.  This happens in a CTE.
  3. Create all possible combinations of Job Titles and Genders using CROSS JOIN.
  4. Compute a summary count of employees by JobTitle and Gender.
  5. Match the computed summary count with a distinct list.

Here is the final query which accomplishes these steps:

WITH     cteJobTitle (JobTitle)
AS       (SELECT DISTINCT JobTitle
          FROM   HumanResources.Employee),

         cteGender (Gender)
AS       (SELECT DISTINCT Gender
          FROM   HumanResources.Employee),

         cteCounts (JobTitle, Gender, NumberEmployees)
AS       (SELECT   JobTitle,
                   Gender,
                   COUNT(1) AS NumberEmployees
          FROM     HumanResources.Employee
          GROUP BY JobTitle, Gender)

SELECT   J.JobTitle,
         G.Gender,
         COALESCE (C.NumberEmployees, 0) as NumberEmployees
FROM     cteJobTitle AS J
         CROSS JOIN cteGender AS G
         LEFT OUTER JOIN cteCounts AS C
         ON C.JobTitle = J.JobTitle
            AND C.Gender = G.Gender
ORDER BY J.JobTitle, G.Gender;

To make it easier to read, the two CTE’s and CROSS JOIN to create all combinations of JobTitle and Gender are colored blue and red.  The CTE to summarize the data is colored green.

Also notice we use the SQL COALESCE function to replace NULL with a zero value.

Here is the result of the query:

Cross Join Reporting Trick

INNER JOIN as CROSS JOIN

As you get to know SQL you realize there is usually more than one way to write a query.   For instance, there is a way using the WHERE clause to have a CROSS JOIN behave like an INNER JOIN.

Let’s take these two tables:

Table Relationships

Suppose we wish to query all employees and show their birth date and last names.  To do this we have to relate the Employee table to Person.

As we have seen we can write a cross join to combine rows as so

SELECT P.LastName,
       E.BirthDate
FROM   HumanResources.Employee E
       CROSS JOIN Person.Person P

But this query isn’t too useful in this case, as it returns 5,791,880 rows!

To limit the row combinations so the person records are properly matched to the employee rows we can use a WHERE clause.  Here is the final query:

SELECT P.LastName,
       E.BirthDate
FROM   HumanResources.Employee E
       CROSS JOIN Person.Person P
WHERE  P.BusinessEntityID = E.BusinessEntityID

Here are the first 14 rows of 290:

Cross Join Results

This query returns the same results as one written with an INNER JOIN.  The query using an inner join is:

SELECT P.LastName,
       E.BirthDate
FROM   HumanResources.Employee E
       INNER JOIN Person.Person P
       ON P.BusinessEntityID = E.BusinessEntityID

Query Plan Comparison

Which query is better?  If you look at the query plans, you’ll see they are very similar.

Here is the plan for the cross join…

Cross Join Query Plan

Here is the plan for the inner join…

INNER JOIN Query Plan

As you see they are identical.  The reason they are so, is that SQL is a declarative language, meaning we tell the DB what result we want, no necessarily how to do it.  When we provide the DBMS with our query the optimizer put together the best plan.  In most cases it will be the same for equivalent statements.

What to Read Next:

Join the newsletter

Subscribe to get our latest content by email.

Powered by ConvertKit
More from the blog


MySQL PostgreSQL SQLite SqlServer