Problem
You need use SQL to calculate the Median of a result from SQL Server.
Background
Suppose you need to calculate the Median using SQL.
You can loosely define the median as the “middle” value of your data set. If you were calculating the median by hand, you would use the following rules to do so:
1. When there is an odd number of rows, you can easily find the middle:

2. It’s a bit more complicated where there are even rows, as you need to computer the middle:

There isn’t a built-in function to calculate the median, as there is average. So instead, we we’ll resort to using a windows analytic function.
We’ll calculate the median Total Sales form our PizzaDB sample database. Here is the base query we’ll use to get the sample results:
select OrderDate, sum(FinalOrderPrice) TotalSales from CustomerOrderSummary group by OrderDate

Solution
To find the median we’ll use the PERCENT_CONT() window function. By calculating the 50 percent rank, we arrive at the median. Let’s see how you can do this using the PizzaDB sample database:
Calculate SQL Median
select OrderDate, TotalSales, percentile_cont(.5) within group (order by TotalSales) over() MedianSales from ( select OrderDate, sum(FinalOrderPrice) TotalSales from CustomerOrderSummary group by OrderDate ) d

Our sample has eight rows, so falls into the “even number of rows” case for calculating the median. You’ll see that the MedianSales is the average row 4 and 5 TotalSales: (255.53 _ 262.96) / 2 = 258.745, which is also the result percentil_cont(.5) achieves.
Discussion
Though there isn’t a built-in function to handle MEDIAN, we can use PERCENTILE_CONT(). There are some restrictions to the data types PERCENTILE_CONT() handles. Avoid floating point. Instead, use Numeric, Decimal, or Money types.
PERCENTILE_CONT() is preferred over PERCENTILE_RANK() since it calculates the ranking based on a continuous distribution. This allows for results not observed, such as the case where we have an even number of observations and need to interpolate the result.
Also, PERCENTILE_CONT() function’s OVER() clause is restricted. Most importantly, you can’t use rows or range clauses within in it, meaning, it becomes very difficult to calculate a moving Median as we can do for averages.
Related SQL Median Articles:
- Calculate Moving Median (coming soon!)
- Calculate Moving Average
Leave a Reply