SQL Server Hierarchical Query using the hierarchyid type

In this SQL minute we’ll look at how to create a SQL server hierarchical query using the hierarchyid data type.  This is a two-part article.  In part one you were introduced to the hierarchyid type.  In this article we dig deeper into the hierarchyid functions.

If you’re studying for the 70-761 exam, you’ll find there aren’t many intermediate SQL problems with explanations for you to solve.  To help you out I’ve put together my SQL Practice Work book.  When you subscribe to receive it, you’ll not only get my workbook with video answers, but an engaging problem to solve sent to your inbox each week.

For this problem we will use the AdventureWorks database.

Solving the Request using a SQL Server Hierarchical Query

AdventureWorks is concerned their manager’s span-of-control, that is how many employees report to them, is out of whack.  To address this, the HR director would like a report that displays the job title, employee full name, their boss’ full name, and how many people, including themselves are in their organization.

Here is an example of of what she is looking to receive from you:

SQL Server Hierarchical Query Results
SQL Server Hierarchical Query Results

We’ll tackle this query in steps.

Step 1 – Start Small

From our last problem you should be familiar with the hierarchyid SQL type and that it’s use to represent a tree like structure and it provide several functions, such as GetAncestor, to navigate hierarchies, such as org structures.

To start with this query, lets get the employee and their name.  To do this, we’ll use an INNER JOIN:

SELECT E.JobTitle, 
P.FirstName + ' ' + P.LastName as FullName
FROM HumanResources.Employee E
INNER JOIN Person.Person P
ON P.BusinessEntityID = E.BusinessEntityID

Step 2 – Build up Success

Now let’s build on that success and go grab the “report to” column values.  This is a bit more complicated, as we need to introduce a self-join on the employee table to get the employee’s boss.  In SQL server hierarchical query speak, this is the ancestor.

As you can see from the diagram the top of the tree is the root, for the Server Manager, bosses are ancestors, and bosses’ direct reports descendants.  Another way to look at the tree is to thing of it as a family tree, where there are parents and children. 

So, back to getting the Report To name… to do this we’ll be joining to employee, and then again to person to get the name.

Here is a table diagram showing the relationships:

SQL Query Table Relationships

The tables are aliased:  P and E are the person and employee table from our original query.  R is the alias for the employee table to get the boss, and RP is an alias of the person table so we can match the boss to their name.

SELECT E.JobTitle, 
P.FirstName + ' ' + P.LastName as FullName,
RP.FirstName + ' ' + RP.LastName as ReportsTo
FROM HumanResources.Employee E
INNER JOIN Person.Person P
ON P.BusinessEntityID = E.BusinessEntityID
INNER JOIN HumanResources.Employee R
ON R.OrganizationNode = E.OrganizationNode.GetAncestor(1)
INNER JOIN Person.Person RP
ON RP.BusinessEntityID = R.BusinessEntityID
ORDER BY E.OrganizationNode

We’re getting closer! We not have a way to get an employee’s boss, but check out the results, there is a small problem.  The CEO is missing!

The problem with this query is that the CEO doesn’t report to anyone, so they aren’t included… we need to fix…

Step 3 – Include CEO in SQL Server Hierarchical Query

The issue with the query is the GetAncestor function doesn’t return a value for the top-most person in the tree.  Given this, our join condition can’t match.  To fix this we can add another join condition which specifically addresses the CEO.

Keep in mind when working with joins you join conditions can include Boolean logic.  We’re going to rewrite our join so it matches a record from aliased table R, if either of the following conditions are met:

  • The employee’s ancestor OrganizationNode matches another OrganizationNode.  This is the existing join condition.
  • The employee’s OrganizationNode is the Root and OrganizationNode matches another.  This is the CEO, as the root node is the topmost node, just like the CEO is the top person in an organization.

Here is the modified query, note the OR operator within the second join condition.  Also, note how I enclose this second join condition, the one to handle the CEO, in parenthesis.

SELECT E.JobTitle, 
P.FirstName + ' ' + P.LastName as FullName,
RP.FirstName + ' ' + RP.LastName as ReportsTo
FROM HumanResources.Employee E
INNER JOIN Person.Person P
ON P.BusinessEntityID = E.BusinessEntityID
INNER JOIN HumanResources.Employee R
ON R.OrganizationNode = E.OrganizationNode.GetAncestor(1) OR
(E.OrganizationNode = hierarchyid::GetRoot() AND
R.OrganizationNode = E.OrganizationNode)
INNER JOIN Person.Person RP
ON RP.BusinessEntityID = R.BusinessEntityID
ORDER BY E.OrganizationNode

Address Org Size using Hierarchyid Functions

The last step to complete is adding the org size to our results.  At first you may think you’ll need to write some sort of recursive CTE to traverse the tree and count up reports, but once you think about it, you’ll realize you can simply test whether an employee’s OrganizationNode is a descendant of the current employees.

This becomes a simple job, as you can use the IsDecendantOf function and a correlated subquery to count rows where this condition is met.  The subquery is correlated, as part of its criteria includes a reference to the outer query.

Here is an example of the completed SQL server hierarchical query.

SELECT E.JobTitle, 
P.FirstName + ' ' + P.LastName as FullName,
RP.FirstName + ' ' + RP.LastName as ReportsTo,
(SELECT COUNT(1)
FROM HumanResources.Employee
WHERE OrganizationNode.IsDescendantOf(E.OrganizationNode) = 1) as OrgSize
FROM HumanResources.Employee E
INNER JOIN Person.Person P
ON P.BusinessEntityID = E.BusinessEntityID
INNER JOIN HumanResources.Employee R
ON R.OrganizationNode = E.OrganizationNode.GetAncestor(1) OR
(E.OrganizationNode = hierarchyid::GetRoot() AND
R.OrganizationNode = E.OrganizationNode)
INNER JOIN Person.Person RP
ON RP.BusinessEntityID = R.BusinessEntityID
ORDER BY E.OrganizationNode

Now that you have seen how to query tables that use the hierarchyid type, what other kinds of problems can you see solving with this type’s built-in functions?

Also, please keep in mind, depending on your database size, you may be better off using a different type of design to represent your hierarchy.

If you have any comments regarding this post, or your ideas on representing a tree let us know in the comments below.

Kris Wenzel
 

Kris Wenzel has been working with databases over the past 28 years as a developer, analyst, and DBA. He has a BSE in Computer Engineering from the University of Michigan and a MBA from the University of Notre Dame. Kris has written hundreds of blog articles and many online courses. He loves helping others learn SQL.

>