Is it possible to use Joins, Subqueries, and Unions on Multiple Tables?

·

·

, ,

A reader recently asked me to help them solve a problem combining data from multiple tables into a single result.  What was interesting was that the solution used the three ways to combine data that we have recently covered, namely:  joins, set operators, and subqueries.

Given the comprehensive nature of this problem I thought it would be a great case study for you to read.

The Problem

Query Question

“I have a main data table that lists events. Each event has a series of fields including duration, reason code and status.

The database has a table of statuses and there is a common field between the main table and this allows me to show the status name rather than the code. 

There are range of reasons associated with four statuses: ready, delay, spare, and down.  It is possible for the same code to exist in two statuses (e.g. delay maintenance and down maintenance).

For some reason the source database has four tables of reasons (one for each status).  Each of these has a field which links to the main reason code. What is missing is a status field in those tables.

What I’m looking for is something along the lines of ‘’when the main table record status is ‘delay’ then retrieve the name in the ‘delay’ table which corresponds to the main table delay code.  When the status is ‘down’, do the same thing but look up the name in the ‘down’ table.

If you have any pointers I’d love to hear them!”

The Goal

So the goal is to create a result that uses descriptive values rather than codes.  Let’s walk through the solution together.

SQL Query Goal - Using Multiple Table get the desired result.
Case Study Goal

Here is an overall view of the various tables and the matching challenge, which is to “conditionally match” on one of the four “ready” tables depending on the value of Status:

SQL Matching Chalenges
Challenges Matching “Conditionally”

Formulating a Solution – Working with Multiple Tables

Getting the status name is pretty straight forward.  For this we can use an INNER JOIN; however, displaying the Reason Name is tougher.

Once you put your mind to it, you’ll realize that that answer hinges around getting a better grip on the various “reason” tables.  If they could be treated uniformly, then we could simply join to them once, avoid duplicate entries, and solve the problem.

When I originally thought about this, I was thinking about seeing if I could use SELECT CASE to solve the problem, and then it hit me!  We can do this using UNIONS.  If you notice all the tables have a very similar structure.

Given this, we can create a derived table as the result of combining the four “reason” tables which can then be joined to the original table using both the status code and ready code.  This avoids having to “conditionally match” to each of four tables.

I see three distinct sub solutions:

  1. Joining to the Status table to get the status name.
  2. Using UNION to combine the various “reason” tables; therefore, avoiding the “conditional match” issue.
  3. Incorporating the result from step two into the final query via a subquery (derived table).

Let’s go through each of these sub problems in order.

1. Using INNER JOIN to return StatusName

Since we need to obtain the StatusName from the status table corresponding to the status code in events we can use an INNER JOIN.  When the join condition is met, rows are returned. This is the most common join.

Typically you’ll use an INNER JOIN to match a foreign key value back to its primary key.

This is exactly what you have.  The StatusCode is the primary key of the status table and we can use that to match that to the foreign key value StatusCode from Event.

You don’t have to worry about multiple rows being returned since the status code is unique within the status table.

The diagram below summarizes this concept:

Inner Join Depiction
Inner Join to Status Table

The INNER JOIN portion is colored red in the following example:

SELECT EventID,
       Duration,
       S.Name
FROM   Event E
       INNER JOIN Status S
       ON E.StatusCode = S.StatusCode

Of course, there is more to do, we also need to get the reason name, but this is a good start.

Read More: SQL Joins – The Ultimate Guide >>

2. Utilizing UNION to Combine Reason Table Rows

UNION is called a set operator.  The UNION operator combines rows from several tables into a single result.  Whereas a join is meant to combine columns from different tables into a single row the UNION operator is adding rows from each table.

In the following diagram you can see the four separate “reason” tables.

Using Union to Combine Data
Using UNION to Combine Reason Data

By successfully applying union clauses we’ll combine all the rows.  The other thing you’ll notice is we’re adding a static column which corresponds to the status code.  Static column is just a fancy way of saying we’re adding a column that is a fixed value.

By doing this we can associate each table’s rows with the proper status code.  Remember, this was one of our big hurdles; we didn’t have a good way to conditionally match to each of the four tables based on status code.

By adding the status code to the result, we can eventually use both the status and reason codes to match to the correct row.

This code creates the unified results.

SELECT 'R', ReasonCode, Name
FROM   ReadyReason
UNION
SELECT 'D1', ReasonCode, Name
FROM   DelayReason
UNION
SELECT 'S', ReasonCode, Name
FROM   SpareReason
UNION
SELECT 'D2', ReasonCode, Name
FROM   DownReason

Read More: SQL Union, Intersect, and Except – The Ultimate Guide >>

3. Using Subqueries to Include a Derived Table in Final Result

Once the union is created we are now able to use it to match and pull in the reason names.  As you can see in the diagram below, the matching becomes much easier.  We no longer have to inspect the status code, then decide which one of the four tables to use before matching on reason code to get the name.

Instead we can now use a standard INNER JOIN to match both the status code and reason code to the result of the union.

But that does bring up a question.  How do we introduce that result into our query?

Fortunately there is a way to do so using subqueries.  There are many places we can use sub queries, such as in the SELECT or WHERE clause.  We can also use them in FROM clause.  When used here, they typically refer to derived tables.  The result of our union is a derived table.

We’ll alias it and enclose it in parenthesis.

Union In Derived Table
UNION in Derived Table

If you look closely at the SQL you see the UNION result is given the name SR.

You’ll also notice there is a column alias for the static field.  This is so we can refer to it in the join condition.  It makes the SQL easier to read and follow.

In the sample below I’ve color coded the UNION green and it use in the INNER JOIN blue..

SELECT EventID,
       Duration,
       SR.ReasonName
FROM   Event E
       INNER JOIN
       (
        SELECT 'R' as StatusCode, ReasonCode, Name
        FROM   ReadyReason
        UNION
        SELECT 'D1', ReasonCode, Name
        FROM   DelayReason
        UNION
        SELECT 'S', ReasonCode, Name
        FROM   SpareReason
        UNION
        SELECT 'D2', ReasonCode, Name
        FROM   DownReason
       ) SR
       ON E.StatusCode = SR.StatusCode AND
       E.ReasonCode = SR.ReasonCode

Remember the form of an inner join is

INNER JOIN {table} ON {match conditions}

What we have don is replace {table} with our derived table.

In this case it works very well.





Read More: SQL Subqueries – The Ultimate Guide >>

Final Query – Combine Multiple Tables

To create the final result we combine the three sub solutions together.  From the section above, you can see that each means to do so is relatively simple.  Sure, there is syntax to contend with, but I think overall the ideas are straightforward.

Of course when glopped all together it looks overwhelming, but hopefully you have seen that if you take a step-by-step approach to building your SQL, that becomes less so.

Final Query
Final query using INNER JOIN, UNION, and subquery

To make it easier for you to see the final solution, each section is highlighted according to the sub solution we discussed.

The first INNER JOIN is in red, the UNION in Green, and the INNER JOIN using the derived table in blue.

SELECT EventID,
       Duration,
       S.Name as [Status Name],
       SR.ReasonName as [Reason Name]
FROM   Event E
       INNER JOIN Status S
       ON E.StatusCode = S.StatusCode
       INNER JOIN
       (
        SELECT 'R', ReasonCode, Name
        FROM   ReadyReason
        UNION
        SELECT 'D1', ReasonCode, Name
        FROM   DelayReason
        UNION
        SELECT 'S', ReasonCode, Name
        FROM   SpareReason
        UNION
        SELECT 'D2', ReasonCode, Name
        FROM   DownReason
       ) SR
       ON E.StatusCode = SR.StatusCode AND
       E.ReasonCode = SR.ReasonCode

This concludes the case study.

Conclusion

I hope you found that helpful.  Did you see that, though the end result looked like a mouthful, it was really made up from three simpler concepts?

Many of my students raise questions about this process that is how to write complex queries such as these.

For instance they are not sure where to start.  This is where I can help.  I’ve got some really great idea to help get you going with confidence so you can start writing queries your first day.

Others wonder how to put it all together.  As you can see from the case study, there is a progression.  I like to take things step-by-step.

When solving complicated problems I like to build on success.  Did you see how I started with a simple solution, added a step, and continued?

If you like my style of teaching I would recommend you sign up for my newsletter below so you don’t miss out on any future training opportunities!

Related Posts

5 responses to “Is it possible to use Joins, Subqueries, and Unions on Multiple Tables?”
  1. Aimi

    hi chris i dont know last query even my lecturer dont know.Can you update the new SQL?

  2. Will

    In the Final Query, the image need to be updated. Where did the 4th column within the Derived Table come from?

    In addition, the text for the final SQL should be updated. The outer query should reflect SR.Name as [Reason Name] in the SELECT statement.

    1. I just double checked the image, and it checks out.

      The fourth column in the image comes from the SELECT statement within the derived table.

      Regarding the last query, S.Name references the column Name within the StatusCode table. SR.Name reference Name within the derived table.

  3. Mark

    In step 3, I think you missed something: where does SR.StatusCode come from? I think you failed to name it in the JOIN.

    1. Thanks for noticing that. I got ahead of myself and forgot to put the column alias on the table.

      I fixed both the image and example command.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

More from the blog


MySQL PostgreSQL SQLite SqlServer