How to Create A Bar Chart Using SQL Server
In this puzzle, we’re going to learn how to create a bar chart using SQL Server. Sometimes it’s fun to see what you can do with the humble SELECT statement. Today I figured it would be fun to see if I could create an “old fashioned” bar chart, much like we used to do with dot-matrix printers!
Solving puzzles is a great way to learn SQL. Nothing beats practicing what you’ve learned. Once you have figured out the puzzle, post you answer in the comments so we all can learn from one another. We also discuss puzzle and more in Essential SQL Learning Group on Facebook. Be sure to find us there!
SQL Puzzle Question
You and a couple of other parents have organized a games tournament for your kids. The kids were assigned teams, with each team playing five rounds of games. Now that the rounds are completed, it’s time to determine the overall winning team!
Your job is to create a bar chart show each team’s average score. The team with the highest average winning score is the overall winner.
The games data is housed in the @teamscore table:
DECLARE @TeamScore TABLE ( Team Varchar(20), Game Int, Score Int )
The bar chart you’re aiming to produce should look like this:
Are you up to the challenge of creating this chart using a SELECT statement?
Modify the display so the bar starts at the minimum score, indicates where the average lies, and ends at the maximum score. Here is an example:
Again, single SELECT statement need only apply…
If you wish, you can use this script to get started.
Answer to SQL Puzzle Question
To answer the question, break the problem down into several steps. The first step is to determine the average score. To do this, use the AVG aggregate function along with GROUP BY.
SELECT Team, Min(Score) MinScore, AVG(Score) AvgScore, Max(Score) MaxScore FROM @TeamScore GROUP BY Team
Here are the results:
Now that we have the average scores, we can start focusing on building the bar. Given our scores are so large, it isn’t practical to have each point represent a star. Instead we need to come up with a scaling factor. The scaling factor describes the number of points each start represents.
For our example the scaling factor is the maximum number of points attained divided by fifty.
The maximum number of points is calculated using this query:
SELECT MAX(SCORE) FROM @TeamScore
We’ll include this in our example as a sub query to avoid hard coding the maximum score. For our purposes, the scaling factor is:
50.0 / CAST((SELECT MAX(SCORE) FROM @TeamScore) as float)
We can use the REPLICATE command to build the bar. It makes it easy to repeat a set of characters. For instance,
So now all we need to do is multiply the team’s average score by the scaling factor and use this result as a parameter to the REPLICATE function to build our bar.
Here is the final query showing the result:
SELECT Team, AVG(Score) as AvgScore, REPLICATE('*', ROUND(CAST(AVG(Score) AS float) * 50.0 / CAST((SELECT MAX(SCORE) FROM @TeamScore) as float),0)) as Spark FROM @TeamScore GROUP BY Team
To make it easier to read, I colored the average score green and scaling factor blue.
Answer to Bonus Question
In order to answer the bonus question we use what we learned from solving the original question and build upon that.
In this case, rather than building a solid bar of start to represent the average score, we need to have the bar extend to the team’s maximum score; the trick being the start should start as their minimum score.
To do this we build up the base by replicating spaces until we reach the minimum score. At this point we replicate stars until the maximum score is reached.
Here is the bit used to build up the spaces:
REPLICATE(' ', ROUND(CAST(MIN(Score) AS float) * 50.0 / CAST((SELECT MAX(SCORE) FROM @TeamScore) as float), 0))
Here is the bit used to calculate the number of stars between the minimum and maximum scores
REPLICATE('*', ROUND(CAST((MAX(Score) - MIN(Score)) AS float) * 50.0 / CAST((SELECT MAX(SCORE) FROM @TeamScore) as float), 0))
It’s the same technique used before, except now we are only interested in building start from the minimum score to the maximum score. To do this we just take the difference. To build the bar, we just concatenate the space base with the stars.
Now that we’ve gotten the bar build, we have an issue: How do we indicate the team’s average score? It should be located somewhere in the middle of the bar, but who do we put it there?
STUFF to the Rescue!
Fortunately, we can use the STUFF built-in function to replace, at the position of our choosing, one character with another.
Given this, we can take the bar, which was built, and then knowing the position representing the team’s average score, exchange it with a “|”
To help you understand how this all works, I put together a diagram showing how a bar is built
Now for the answer to the bonus question! SELECT Team, STUFF( REPLICATE(' ', ROUND(CAST(MIN(Score) AS float) * 50.0 / CAST((SELECT MAX(SCORE) FROM @TeamScore) as float), 0)) + REPLICATE('*', ROUND(CAST((MAX(Score) - MIN(Score)) AS float) * 50.0 / CAST((SELECT MAX(SCORE) FROM @TeamScore) as float), 0)), CAST(ROUND(CAST(AVG(Score) AS float) * 50.0 / CAST((SELECT MAX(SCORE) FROM @TeamScore) as float), 0) as int),1,'|') as Spark FROM @TeamScore GROUP BY Team
Which produces these results:
In summary, I used several concepts to answer this question:
- GROUP BY to summarize scores
- The REPLICATE function to repeat characters, such as an asterisk or star (*)
- A subquery to find the maximum score.
- Data conversion functions
- STUFF function to replace one character in a string with another.