Use SQL to Find the MEDIAN

By: Kris Wenzel   |   Updated: May 2, 2022  
Works With: SqlServer

Problem

You need use SQL to calculate the Median of a result from SQL Server.

Background

Suppose you need to calculate the Median using SQL.

You can loosely define the median as the “middle” value of your data set.  If you were calculating the median by hand, you would use the following rules to do so:

1. When there is an odd number of rows, you can easily find the middle:

How to find the Median when there are odd number of values.

2. It’s a bit more complicated where there are even rows, as you need to computer the middle:

How to find the median when there are an even number of values.

There isn’t a built-in function to calculate the median, as there is average.  So instead, we we’ll resort to using a windows analytic function.

We’ll calculate the median Total Sales form our PizzaDB sample database.  Here is the base query we’ll use to get the sample results:

select OrderDate, sum(FinalOrderPrice) TotalSales
from CustomerOrderSummary
group by OrderDate

Solution

To find the median we’ll use the PERCENT_RANK() window function.  By calculating the 50 percent rank, we arrive at the median.  Let’s see how you can do this using the PizzaDB sample database:

Calculate SQL Median

select OrderDate, TotalSales,
    percentile_cont(.5)  within group (order by TotalSales) over() MedianSales
from (
    select OrderDate, sum(FinalOrderPrice) TotalSales
    from CustomerOrderSummary
    group by OrderDate
) d

Our sample has eight rows, so falls into the “even number of rows” case for calculating the median. You’ll see that the MedianSales is the average row 4 and 5 TotalSales: (255.53 _ 262.96) / 2 = 258.745, which is also the result percentil_cont(.5) achieves.

Discussion

Though there isn’t a built-in function to handle MEDIAN, we can use PERCENTILE_CONT().  There are some restrictions to the data types PERCENTILE_CONT() handles.  Avoid floating point.  Instead, use Numeric, Decimal, or Money types.

Also, PERCENTILE_CONT() function’s OVER() clause is restricted.  Most importantly, you can’t use rows or range clauses within in it, meaning, it becomes very difficult to calculate a moving Median as we can do for averages.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}
>