All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012 database. You can get started using these free tools using my Guide Getting Started Using SQL Server. Let’s get started and learn about a query plan.
Table of contents
What is a Query Plan?
A query plan is a set of steps that the database management system executes in order to complete the query. The reason we have query plans is that the SQL you write may declare your intentions, but it does not tell SQL the exact logic flow to use. The query optimizer determines that. The result of that is the query plan
Optimization is the process the DBMS uses to choose the most efficient execution plan for the query you have written.
Since SQL is a declarative language, you tell SQL what to do, but not how to do it. This means, many of the mechanics and in’s and out’s are lets to the DBMS. It gets to choose whether to use an index or just scan a table.
In SQL Server a query plan is called an execution plan.
Parts of an Execution Plan
There are several parts of an execution plan worth mentioning.
First, each plan is made up of one or more execution steps. These steps describe the database operations taken to create the query results. It is important to understand the steps and their implications. For instance, some steps, such as Nested Loops can be very expensive to complete.
In MS SQL Server you can hover over steps to see even more information, such as the relative cost of the step, number of rows processed, and the actual instructions SQL server will use to complete it. This information allows you to further understand the amount of work the step performs.
Another part of the plan is the flow from one step to another. In simple queries this is sequential. The output of one step flows into another. However, as queries get more complicated the plan contain several branches.
Each branch represents a different data source, such as another table from a query, and those branches are ultimately combined using steps such as a Merge step.
You can also hover over branches to see the number of rows output by the step. Here is a complete query plan for a simple query.
When reading a query plan, read them from right to left. The steps on the right are first executed, and their results fed into the next step on the left.
Viewing a Query Plan
Each query executed generates a query plan. It is easy to see the plan using the Microsoft SSMS (SQL Server Management Studio).
To do so, create a query, and then make sure Include Actual Execution Plan (1) is selected. Once the query run (2), the plan is shown.
One you have created the plan, run the query and then select the Execution plan tab (3) to view it.
Plans Take the Mystery out of SQL
When doing research for new articles I commonly came across statements such as “sub queries are less efficient than joins.” I got me thinking. Are these just opinions or are there solid facts behind these assertions?
I really wanted to know as I was putting together a series of articles about subqueries and, if they were truly inefficient and to be avoided, I wanted to tell my readers as such.
In order to understand how sub queries and joins were excecuted, I decided to look at their query plan to understand how the optimizer created each query plan. I was really expecting to see some costly steps in the subquery’s plan.
To my surprise the plans were almost the same.
Here is the subquery I used as test
SELECT SalesOrderID, OrderDate, TotalDue, (SELECT COUNT(SalesOrderDetailID) FROM Sales.SalesOrderDetail WHERE SalesOrderID = SO.SalesOrderID) as LineCount FROM Sales.SalesOrderHeader SO
And here is the equivalent INNER JOIN
SELECT SO.SalesOrderID, OrderDate, TotalDue, COUNT(SOD.SalesOrderDetailID) as LineCount FROM Sales.SalesOrderHeader SO INNER JOIN Sales.SalesOrderDetail SOD ON SOD.SalesOrderID = SO.SalesOrderID GROUP BY SO.SalesOrderID, OrderDate, TotalDue
They are nearly the same. Surprised? I was but after thinking about it it made sense. The SQL is declarative, that is we use it to express our intentions, but not how to actually extract the data out of the database.
After examining the SQL the DBMS broke down the steps, examined the most efficient way to complete the query and generated a plan. In both cases the plan was the same.
The lesson to learn from this when in doubt check the execution plan. If you feel a query is running slow and an equivalent query, such as a join, may be faster, write one up and check the plan. Check to see which uses more efficient steps. This much better than guessing. As you get better a reading plans you’ll start to notice things about your databases, such as whether you need to add an index.
In general I trust the DBMS to make the right decision and generate an optimal plan for my queries. In some cases can’t and then need to optimize.
Why Query Plans Fail
A good plan is only as good and the information from which it is based. When the DBMS parses your SQL and generates a query plan, it goes over many choices and picks the one with the lowest cost.
The cost is based on database statistics such as table row counts. If the statistics are out of date, then the plans are based on poor information. In addition, the absence of indexes also affect the plan.
I also want to point out that though the subquery and inner join plans turn out to be the same for the example I’ve shown with the AdventureWorks2012 database, it may not be the case for other databases or vendors such as Oracle. Each situation should be considered as there are many factors that can influence the optimizer.