Subqueries versus Joins

Both joins and subqueries let you query data from different tables. Though they may even share the same query plan, there are many differences between them. Knowing the differences, and when to use a join or a subquery to search data from one or more tables, is key to mastering SQL.

All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012 database. You can get started with these free tools using my guide, Getting Started Using SQL Server.

Joins versus Subqueries

Joins and subqueries both combine data from different tables into a single result. They share many similarities and differences.

One difference to notice is that subqueries return either a scalar (single) value or a row set, whereas joins return rows.

Example Subquery

A common use for a subquery is to calculate a summary value for use in a query. For instance, we can use a subquery to find all products whose list price is greater than the average list price.

SELECT ProductID,
       Name,
       ListPrice,
       (SELECT AVG(ListPrice)
          FROM Production.Product) AS AvgListPrice
  FROM Production.Product
 WHERE ListPrice > (SELECT AVG(ListPrice)
                      FROM Production.Product)

There are two subqueries in this SELECT statement. The first displays the average list price of all products; the second filters out products whose list price is less than or equal to the average.

The subquery in the WHERE clause returns a single value, and the outer query uses that value to filter the products.

Notice how the subqueries are queries unto themselves. In this example you could paste the subquery, without the parentheses, into a query window and run it.
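
The subqueries above each return a single value. As a contrasting sketch, a subquery can also return a row set, for example when it is used with IN. The query below is an illustration written against AdventureWorks2012 and is not part of the original lesson; the name filter is an arbitrary choice made only for the example.

```sql
-- Row-set subquery: the inner query returns a list of ProductModelID values,
-- and IN keeps the products whose model appears in that list.
-- The LIKE pattern is an arbitrary filter chosen for illustration.
SELECT ProductID,
       Name,
       ListPrice
  FROM Production.Product
 WHERE ProductModelID IN (SELECT ProductModelID
                            FROM Production.ProductModel
                           WHERE Name LIKE 'Mountain%')
```

As with the scalar subquery, the inner SELECT can be run on its own in a query window.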

Example JOIN

Contrast this with a join, whose main purpose is to combine rows from one or more tables based on a match condition. For example, we can use a join to display product names and models.

SELECT Product.Name,
       ProductModel.Name AS ModelName
FROM   Production.Product
       INNER JOIN Production.ProductModel
       ON Product.ProductModelID = ProductModel.ProductModelID

In this statement we’re using an INNER JOIN to match rows from both the Product and ProductModel tables. Notice that the column ProductModel.Name is available for use throughout the query.

The SELECT statement can then display, filter, or group by the columns of the combined row set.
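
To illustrate, here is a minimal sketch that filters and groups on columns from both joined tables. It is an assumption-based example for AdventureWorks2012, not a query from the original lesson; the ListPrice filter and the ProductCount alias are chosen only for illustration.

```sql
-- Columns from both tables are usable in WHERE, GROUP BY, and ORDER BY
-- once the join has combined the rows.
SELECT ProductModel.Name AS ModelName,
       COUNT(*) AS ProductCount
FROM   Production.Product
       INNER JOIN Production.ProductModel
       ON Product.ProductModelID = ProductModel.ProductModelID
WHERE  Product.ListPrice > 0          -- filter on a Product column
GROUP BY ProductModel.Name            -- group on a ProductModel column
ORDER BY ProductCount DESC
```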

Read More: SQL Joins – The Ultimate Guide

This is different from the subquery example, where the subquery returns a single result that is then used to filter the records.

Note that the join is an integral part of the select statement. It cannot stand on its own as a subquery can.

You’ll notice that some subqueries act as separate queries within the main outer query. You can actually copy and run them in their own query window. But there are other times when the outer query is “interwoven” into the subquery’s conditions.

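These are correlated subqueries. Conceptually, the subquery is evaluated once for each outer query row, although the DBMS is free to execute it another way as long as the result is the same.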
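
As a sketch of what a correlated subquery can look like, the query below lists products priced above the average for their own subcategory. It is written for AdventureWorks2012 as an illustration and is not taken from the original lesson; the aliases P and P2 are arbitrary.

```sql
-- The inner query refers to P.ProductSubcategoryID from the outer query,
-- so it cannot be run on its own in a separate query window.
-- Products with a NULL ProductSubcategoryID are not returned, because
-- the comparison against a NULL average is never true.
SELECT P.ProductID,
       P.Name,
       P.ListPrice
  FROM Production.Product AS P
 WHERE P.ListPrice > (SELECT AVG(P2.ListPrice)
                        FROM Production.Product AS P2
                       WHERE P2.ProductSubcategoryID = P.ProductSubcategoryID)
```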

Read More: Correlated Subqueries >>

Where are Joins and Subqueries Found?

Joins are used in the FROM clause of the SELECT statement; however, you’ll find subqueries used in most clauses, such as the:

  • SELECT list – These subqueries typically return single values.
  • WHERE clause – Depending on the conditional operator, you’ll see single-value or row-based subqueries.
  • FROM clause – It is typical to see row-based subqueries used here.
  • HAVING clause – I mostly see subqueries returning single values in this situation.
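
The SELECT list and WHERE clause cases appear in the first example of this lesson. For the other two, here is a rough sketch against AdventureWorks2012; both queries are illustrations with assumed aliases (P, S, AvgPrice, AvgSubcategoryPrice) rather than statements from the original lesson.

```sql
-- FROM clause: a row-based subquery (often called a derived table)
-- is given an alias and joined like any other table.
SELECT P.Name,
       P.ListPrice,
       S.AvgPrice
  FROM Production.Product AS P
       INNER JOIN (SELECT ProductSubcategoryID,
                          AVG(ListPrice) AS AvgPrice
                     FROM Production.Product
                    GROUP BY ProductSubcategoryID) AS S
       ON P.ProductSubcategoryID = S.ProductSubcategoryID;

-- HAVING clause: a single-value subquery compared against an aggregate.
SELECT ProductSubcategoryID,
       AVG(ListPrice) AS AvgSubcategoryPrice
  FROM Production.Product
 GROUP BY ProductSubcategoryID
HAVING AVG(ListPrice) > (SELECT AVG(ListPrice)
                           FROM Production.Product);
```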

Comparing Join and Subquery Execution Plans

Despite their differences, joins and subqueries are used to solve similar problems. In fact, just because you write a SQL statement as a subquery doesn’t mean the DBMS executes it as such.

Let’s look at an example.

Suppose the Sales Manager for Adventure Works wants a detailed listing of all sales orders and the number of order detail lines for each order.

Surprisingly, there are two ways to go about solving this. We can use a join or a subquery.

Here are the two statements side by side:

[Figure: Side-by-Side Comparison of Join and Subquery]
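
As a sketch of what the two statements could look like in text form, here is one way to write each against AdventureWorks2012. These are assumptions written to fit the scenario, not necessarily the exact queries in the figure; the aliases SOH, SOD, and LineCount are chosen for illustration.

```sql
-- Join version: combine headers with details, then count lines per order.
SELECT SOH.SalesOrderID,
       SOH.OrderDate,
       COUNT(SOD.SalesOrderDetailID) AS LineCount
  FROM Sales.SalesOrderHeader AS SOH
       INNER JOIN Sales.SalesOrderDetail AS SOD
       ON SOH.SalesOrderID = SOD.SalesOrderID
 GROUP BY SOH.SalesOrderID,
          SOH.OrderDate;

-- Subquery version: a correlated subquery in the SELECT list
-- counts the detail lines for each order.
SELECT SOH.SalesOrderID,
       SOH.OrderDate,
       (SELECT COUNT(*)
          FROM Sales.SalesOrderDetail AS SOD
         WHERE SOD.SalesOrderID = SOH.SalesOrderID) AS LineCount
  FROM Sales.SalesOrderHeader AS SOH;
```

One difference in these sketches: the inner join omits any order that has no detail lines, while the subquery version returns such an order with a count of 0.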

Obviously they look different, but did you know they have very similar query plans?

Here is the query plan for the subquery:

[Figure: Subquery Query Plan]

If you look closely you’ll see there is a Merge Join operation. The subquery uses the same set of operations to return a result as you see with the join! In fact, if you look at the corresponding join’s query plan, you’ll see it is very similar. You can get more detail about this in my article, What Is a Query Plan.
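
If you would like to inspect a plan yourself, one option in SQL Server is SET SHOWPLAN_TEXT, which returns the estimated plan as text instead of running the query. The sketch below assumes a subquery formulation of the order-line count written for illustration, which may differ from the statement shown in the figure.

```sql
-- GO is the batch separator used by SSMS and sqlcmd;
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

-- While SHOWPLAN_TEXT is ON, this statement is not executed;
-- SQL Server returns its estimated plan instead.
SELECT SOH.SalesOrderID,
       (SELECT COUNT(*)
          FROM Sales.SalesOrderDetail AS SOD
         WHERE SOD.SalesOrderID = SOH.SalesOrderID) AS LineCount
  FROM Sales.SalesOrderHeader AS SOH;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

Running the same steps with a join formulation lets you compare the operators in the two plans.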


9 responses to “Subqueries versus Joins”
  1. […] Subqueries versus Joins – What’s the difference? […]

  2. CUMHUR

    In your examples, you use subquery when it will only calculate same thing for any rows, like average of a table. What happens when you write a subquery that contains data from the rows. Will it run for each row? For example, lets say we are looking for name of the companies which has at least 1 employe over age 40. When you compare these 2 :

    SELECT DISTINCT(c.NAME) FROM COMPANY c LEFT JOIN EMPLOYEE e ON c.C_ID = e.C_ID WHERE e.AGE > 40

    and

    SELECT c.NAME FROM COMPANY c WHERE EXISTS (SELECT * FROM EMPLOYEE e WHERE e.AGE > 40 AND c.C_ID = e.C_ID )

    would there be a major performance difference? Thanks !

  3. Tom Uden

    Is it possible to join a table to a subquery in the from clause?

    1. Yes, you can use what is called a derived table to do the join.

  4. Carl Jr.

    Thanks. :)

  5. KC Abramson

    This is amazing. My entire life has been a lie.

    I blame being a regular programmer vs sql programmer for thinking that subqueries do the query for each row (just like they would in a program with nested FOR loops)

    This was very eye opening. I really need to recheck all of my DB fundamentals as i’m doing heavy SQL programming now

  6. Ian

    Thanks, this is a helpful introduction. Aside from one’s preferred solution methodology, when should one be used over the other? What sort of situations might arise where the dbms uses a different query plan? It’d be helpful to know if there are instances where one has a significantly greater computational cost.

    I’ll experiment with my own benchmark comparisons, but it would be worth adding this info to a section of the article. Otherwise, thanks, this was helpful.

  7. Thanks for clarification ..join and sub query details examples

  8. eve

    thank you, this is very helpful for me to conceptually understand the difference between the two concepts, especially the attached image of the query plan.
