In this article let’s go over the 10 SQL data types you need to know! Imagine it’s your week on the job as an entry level Application Developer. The application you work on is closely tied with Microsoft SQL Server.
Your boss asks you to make a change to one of the tables used by the application. The table is called ‘Inventory‘, and you need to add a column called ‘Qty‘. Basically, we need a way to store the number of each item we have in our inventory, and this new ‘Qty‘ column is the answer.
Ok, that sounds easy enough, right? You already know the column will store numeric values, so this means you will want to use a numeric data type….
But did you know there are ELEVEN numeric data types?
When you’re just starting out, it might feel overwhelming to try to learn all the different data types out there (of which there are over 30!). Instead, what you ought to do is learn the most basic, common data types used in the real world first (then learn the rest as you go!).
Now that you understand the problem, here are the topic we’re covering to help you understand what SQL types to use.
In this tutorial, we will discuss the top 10 most common sql data types you will see and use as a database developer.
As I’ve said, this is not a full list. This tutorial is just meant to get your foot in the door, and get you thinking like a developer.
Here is the list of topics we’ll cover in this tutorial, including the top 10 data types you need to be familiar with when working with SQL Server:
Choosing the Correct Data Type
- Numeric Data types:
- Date and Time Data Types:
- Character String Data Types:
- “Other” Data Types:
- Tips, tricks and links
- Next Steps:
Numeric Data types:
Need to store a number? If so, you should obviously use a numeric data type.
But what kind of number do you need to store? Maybe a whole number (like the quantity of a product in your inventory), or maybe a decimal (like the result of a math division)? The answer to that question will determine what kind of numeric data type to use.
The most basic numeric data types are:
Integers and INT
Oh yes, the classic INT. You’ll want to use this data type if you need to store a whole number. You remember math class, right? A whole number doesn’t contain a decimal!
A good example of when you can use this data type is in the example I started this tutorial with, where you want to store the quantity of each product in your inventory.
Check it out:
Here are the specifics about the INT data type:
- Range: -2,147,483,648 to 2,147,483,647
- Size: 4 bytes
I wanted to include the SMALLINT data type to get you thinking about an important rule when it comes to choosing a data type: Choose the smallest data type that will serve your needs!
For example, if you know your data won’t ever come close to the maximum number you can store for an INT data type, maybe you should think about using a smaller data type.
An example: What if you are a used car salesman, and you use a database table to store information about all the cars in your inventory. One of the columns in that table stores the ‘year‘ of each car.
What’s the smallest value you might need to store in that column? Maybe 1990? Certainly nothing less than, say, 1950. What’s the largest number you would need to store? At the time of this writing, the largest number would be 2021.
So, using the INT data type for this ‘year‘ column would be overkill. The range of INT is about -2 billion to +2 billion!
SMALLINT would be a better choice. It’s range is about -32,000 to +32,000. Yes, it’s still overkill, but not as much overkill.
But the bigger benefit is this: SMALLINT numbers can be stored using less space than INT numbers. SMALLINT uses 2 bytes for storage, as opposed to INT which uses 4 bytes.
Here are the specifics on the SMALLINT data type:
- Range: -32,768 to 32,767
- Size: 2 bytes
The BIT data type is a handy little type. All it stores is the value 1 or zero.
This data type is great for if you need a flag value, or if you need a column to represent true or false.
For example, you could use the BIT data type if you want to add a column called ‘InStock‘ to your ‘Inventory‘ table.
So this column would answer the question “Is this item in stock?”. A value of 1 is true, and a value of zero is false.
Let’s see it in action:
So when the Qty of an item is zero (meaning it’s not in stock), we should set our bit column to zero (aka ‘false’)!
Here are the specifics about the BIT data type:
- Range: 0 to 1
- Size: 1 byte
The DECIMAL Data Type
Need to store a decimal value? I can’t think of a more appropriate data type than DECIMAL!
An example would be if you want to store the dimensions of your furniture products in inches. It’s certainly possible for a dimension to not be a nice round whole number.
Can a coffee table be 42 and a half inches long? Yes!
Can a bookshelf be 60 and three quarters inches tall? Yes!
The syntax for the DECIMAL data type is a little weird. Here it is:
You replace the letters pand s with numbers.
To explain what these numbers mean, let’s look at an example. Let’s create a ‘FurnitureProducts‘ table with three DECIMAL columns:
Ok, so the letter ‘p‘ stands for ‘precision‘. This is just a fancy word to say “The total number of digits to store”.
The letter ‘s‘ stands for ‘scale‘. This is another fancy word to say “The number of digits to the right of the decimal point”.
So let’s take our ‘Length‘ column as an example. We gave it a DECIMAL(5,2) data type. This means we can store up to 5 digits in the column, 2 of which will be after the decimal point.
(So I guess that means we can store up to 3 digits to the left of the decimal)
Pop quiz: How many digits would be to the left and right of the decimal with a DECIMAL(24, 10) data type?
Answer: 14 digits to the left of the decimal, and 10 to the right, giving us a grand total of 24 digits.
Here is an example of the DECIMAL data type in action:
All my dimension columns have a data type of DECIMAL(5,2), which means I can store up to 3 digits to the left of the decimal, and up to 2 digits to the right.
Here are the specifics on the DECIMAL data type:
- p can be between 1 and 38
- s can be between 0 and p
- Think about that for a minute. You can’t have something like DECIMAL(3,4). This is saying the total number of digits can be up to 3, with 4 of them being after the decimal? Folks, that doesn’t make any sense.
- The maximum range for a decimal is between -10^38 +1 and 10^38 – 1.
- The value p determines the amount of storage allocated to the value.
|When p is between…||Storage per value is…|
|1 and 9||5 bytes|
|10 and 19||9 bytes|
|20 and 28||13 bytes|
|29 and 38||17 bytes|
FLOAT Type for Engineering and Science
The FLOAT data type is one of those types you might not use very much, but it’s an interesting data type that is worth discussing and thinking about.
The FLOAT data type is known as an “approximate number” data type. This basically means not all numbers in the data type range can be represented exactly. Instead, it may give you an approximate number.
You might think “Why the heck would I want to use FLOAT if the number I’m storing might not be exact?” You would use FLOAT for extremely large or extremely small numbers where it’s ok if you are off by a little bit.
An example: What if you wanted to store the number of inches between Earth and the moon?
If you’re off by a few inches, who cares? Even if you’re off my several inches, WHO CARES?
The benefit of FLOAT is it can represent numbers far beyond what any other numeric data type can. The cost is it is imprecise, but maybe that’s ok!
Here are the specifics on the FLOAT data type:
- The syntax of FLOAT is this: FLOAT(n). The letter ‘n‘ represents the number of bits used to store something called the ‘mantissa’ of the number in scientific notation.
I know, what the heck?
Here is an example. Take the number 4.2345 * 10^7. This is a number in scientific notation. The ‘mantissa‘ is the 4.2345 part.
So when you write a number using scientific notation, the ‘mantissa’ is the part before the ” * 10^x” part.
Here is a link to an extremely simple explanation: MathIsFun.com – Mantissa.
The letter ‘n‘ can be between 1 and 53
- The range of FLOAT is -1.79E+308 to -2.23E-308, 0 and 2.23E-308 to 1.79E+308
Like I said, extremely large or extremely small numbers!
- If ‘n‘ is between 1 and 24, the number will use 4 bytes of storage
- If ‘n‘ is between 25 and 53, the number will use 8 bytes of storage.
Date and Time Data Types:
Need to store a date value? You should use a date data type!
An example of when you would want to use a date and time data type is when you want to store the date and time an order was placed for one of your products. This would obviously be good information to have!
The most basic date/time data type is:
Wow, how appropriate is that name? The DATETIME data types stores a date and time value.
Let’s look at the example I already discussed, where you want to store the date and time an order was placed:
Pretty straightforward, right? You can see the format is YYYY-MM-DD HH:MM:SS:nnn
A good question would be “Is there an easy way to populate a DATETIME column with a value?“
I’m glad you asked.
You could use a system function called GETDATE(). This function simply returns the current date and time.
For example, let’s insert another row into our ‘OnlineOrders‘ table, and use GETDATE() to populate our DATETIME column (at the time of this writing, it is April 13, 2021 at 8:23 AM):
Here are the specifics on the DATETIME data type:
- Range: January 1, 1753, through December 31, 9999
- Size: 8 bytes
Character String Data Types:
Need to store a string value, like ‘First Name‘, ‘Product Code‘, or ‘Email‘? You should probably use a string data type!
The difference in string data types come down to the length of data that will be stored, and also whether or not you want your string value to be stored in different languages.
We’ll look at examples, I promise.
The most basic string data types are:
When you use the CHAR data type (or any of the string data types, for that matter). You need to specify a size. Here is the syntax for using the CHAR data type:
CHAR(n), where ‘n‘ represents the maximum number of bytes you want to store.
Wait, what? What do you mean “the number of bytes you want to store”?
Microsoft makes it very clear that ‘n‘ is not the number of characters to store, but it is the number of bytes to store. As it turns out, in the English language, one character takes up one byte of storage. So, if you want to store 10 characters, what would the value of ‘n‘ be?
I know it sounds like we’re comparing apples to apples, but in the eyes of Microsoft, they are different.
Anyway, moving on.
Let’s say you have a table called ‘CustomerInfo‘ which stores information about your customers. One of the things you want to track is the State they live in. You can create a ‘StateVal‘ column to store the two-character abbreviation for the State the customer lives in.
Pretty simple right? But wait, there’s more….
Don’t Waste Space
You’ll want to use the CHAR data type for when the values stored are relatively consistent in size. For example, all U.S. states have a 2 character abbreviation, so I know I won’t have much variance in the size of the values in this column (in fact, there shouldn’t be any variance at all).
This is because when you use the CHAR data type, a fixed amount of memory is reserved for the values.
If I had made this column CHAR(10) instead, for example, there would be 10 bytes reserved for each value in each column. It doesn’t matter that I only need 2 of those bytes. SQL Server is still going to reserve all 10 bytes.
In other words, I would be wasting space. The first two bytes would be used, sure, but the next 8 bytes would be literally empty and unused.
It’s like if you built a giant warehouse for a single truck:
This might not seem like a big deal with only 4 rows in my table, but imagine how this problem could grow exponentially as I add more rows to the table. Essentially, each column reserves 500% more space than it needs!
Which leads us to the next data type…
VARCHAR (most popular!)
The VARCHAR data type solves this problem of wasted space. With the VARCHAR data type, SQL Server reserves enough space to store the actual value, and nothing more.
So if you want to store the letters ‘WY‘ in a VARCHAR(10) column, SQL Server would reserve only the needed amount of space to store the actual ‘WY‘ value, which would only be 2 bytes.
The other 8 bytes would be given to memory, ready to be used for something else!
SQL Server says “I was prepared to give you 10 bytes if you needed it, but I see you only need 2, so I’ll give you those 2 bytes and give the rest back to the operating system”.
How sweet :)
The Big Difference Between VARCHAR and CHAR
Compare that with regular CHAR(10), where SQL Server would reserve the full 10 bytes, even though you don’t need all 10.
SQL Server would say “You told me you needed 10 bytes, so you freaking got 10 bytes. I don’t care how much space is wasted, bro”.
The VARCHAR data type is ideal for when your values will vary greatly in size. Because again, SQL Server is prepared to give you the full space if you need it, but if not, SQL Server will reserve only the amount of space your actual value takes up.
Take a column called ‘LastName‘ for example. Let’s give it a VARCHAR(15) data type. If you enter a last name of ‘Hicks‘, there would be 5 bytes reserved for this value in memory, and nothing more.
But if you enter a last name of ‘Pottersfield‘, there would be 12 bytes reserved for this value in memory, and nothing more. So I guess these two values would take up a total of only 17 bytes.
But what if we had chosen to use CHAR(15) instead? Shoot, those two values would have taken up 30 BYTES! So much space wasted!
When Should I use NVARCHAR?
The NCHAR (and NVARCHAR) Data Type
A question I asked myself very often when I was learning SQL Server was “What’s up with this weird NCHAR and NVARCHAR?”.
I sorta told myself they could be used interchangeably with their regular CHAR and VARCHAR counterparts, because “Who cares? Things seem to work fine if I use either/or.”
There’s a BIG difference you need to be aware of.
NCHAR and NVARCHAR are called ‘Unicode‘ data types. They are only ideal for if you need to store string values in multiple languages.
Compare that to regular CHAR and VARCHAR which will let you store string values in English and only one other language.
Remember how I said with CHAR and VARCHAR, SQL Server will use 1 byte per character? Well, with NCHAR and NVARCHAR, SQL Server will use 2 bytes per character. That extra byte is just in case you need to store your string value in multiple languages.
So if you have a column with the NCHAR(10) data type, SQL Server will essentially reserve space for 10 2-byte pairs!
Can you imagine how much space we would waste with our ‘WY‘ value if we used an NCHAR(10) data type? Essentially SQL Server would reserve 20 bytes for something that only needs 2.
The whole space-saving thing is the same with NCHAR and NVARCHAR as it is with regular CHAR and VARCHAR. NVARCHAR reserves only as much space as the value needs, compared to NCHAR which is going to reserve the entire space, regardless of how much is actually used.
So again, the main takeaway is this: You should have a solid reason for using NCHAR or NVARCHAR, because they are going to use much more space than their CHAR and VARCHAR counterparts.
“Other” Data Types:
There are a handful of data types that fit into the “other” category. They certainly have their time and place, but you shouldn’t worry about them too much when you are just starting out with SQL Server.
But there is one “other” data type you should know, and more importantly, the circumstances where you could maybe use it. It is:
UNIQUEIDENTIFIER Data Type
This data type is good if you need a column to be given a truly unique value for each row that gets inserted into a table.
What do I mean by “truly unique“? I mean no other table or column across time and space has the exact same value.
This is a good data type to use if you want Primary Key values to be truly unique, for example. No other table or column in your database management system would have the exact same values as this primary key column.
Let’s see it in action. I created a simple table called ‘ProductInfo‘, with a column called ‘ProdID‘ which is my primary key:
In order to populate this column with a value, you need to use the NEWID() system function. Like this:
Using NEWID with UNIQUEIDENTIFIER
This NEWID() function is what generates that “truly unique value across time and space“.
Here is what the values actually look like:
So if you created another table, with another UNIQUEIDENTIFIER column, that second table will never have the same exact values as this first table.
Compare that to if we just used an INT as the data type for our primary key columns. Could two tables have the same exact value?
Here are some rows in our CustomerInfo table:
Notice there is a Customer with a CustID of 4.
Here are some rows from my Products table:
Notice there is a Product with a ProdID that is also 4.
So the number 4 is NOT “truly unique across time and space“, is it?
That’s what I mean. With UNIQUEIDENTIFIER, no two tables will share a value!
There is a BIG CATCH with UNIQUEIDENTIFIER you should be aware of: It uses A LOT of space. Each unique value uses 16 bytes of space.
So this is another data type you should use only if you mean to.
Tips, tricks and links
Here is a list of tips and tricks you should know when working with SQL Server data types:
- Remember, you should use the smallest data type needed to satisfy your business requirements. The use of appropriate data types will reduce the amount of memory that gets wasted.
- CAST and CONVERT are two extremely handy system functions to basically convert a value from one data type to another on the fly.
Converting Values from One Type to Another
Kris has a great tutorial about CAST and CONVERT on his blog. Check it out here: Know when to use Cast versus Convert.
- For string data types, you can get away with not specifying a size. In that case, the size defaults to one (which is probably less than you need, so you’ll probably be explicit anyway).
Here is an example of adding a CHAR data type with the default size of 1 to my ‘EmployeeInfo‘ table:
You can add values like ‘Y‘ or ‘N‘ to this column. But again, the best practice is to be explicit about the size you want your string data types to be.
- Here is great link to all the data types offered by Microsoft: Data Types (Transact-SQL). This is an overview page about all the different data types available to use. It contains links to the more in-depth discussions about each individual data type. Definitely check that out.
Data types are used for more than just creating columns in tables. You specify data types when creating stored procedures, for example. The input and output parameters to a stored procedure are all given data types, and there might be local variables within the definition of the stored procedure that also use data types.
Stored procedures are another thing you will definitely encounter as a database professional, so it’s important to know what they are and how to write them. I have a full tutorial on stored procedures, here: Stored Procedures: The Ultimate Guide for Beginners.
The same is true for user defined functions. They can also use local variables that use data types. Functions are another common tool we use in SQL Server. Learn everything you need to know here: SQL Server User Defined Functions: A Complete Guide.
Thank you for reading! I hope you found this tutorial helpful. If you have any questions, or if you are struggling with a different topic related to SQL Server, I’d be happy to discuss it. Visit my contact page and send me an email!