
Learn about the First Normal Form and Database Design

First Normal Form

This is the second in a series of posts teaching normalization.

The first post introduced database normalization, its importance, and the types of issues it solves.

In this article we’ll explore the first normal form. For the examples, we’ll use the Sales Staff Information shown below as a starting point.  As we pointed out in the last post’s modification anomalies section, there are several issues with keeping the information in this form.  By normalizing the data, we’ll eliminate duplicate data as well as modification anomalies.

Unnormalized Data

1NF – First Normal Form Definition

The first step in making a proper SQL table is to ensure the information is in first normal form.  Once a table is in first normal form it is easier to search, filter, and sort the information. The rules to satisfy first normal form are:

  • The data is in a database table.  The table stores information in rows and columns where one or more columns, called the primary key, uniquely identify each row.
  • Each column contains atomic values, and there are no repeating groups of columns.

Tables in first normal form cannot contain sub-columns.  That is, if you are listing several cities, you cannot list them in one column and separate them with a semicolon.

When a value is atomic, it cannot be further subdivided.  For example, the value “Chicago” is atomic, whereas “Chicago; Los Angeles; New York” is not. Related to this requirement is the concept that a table should not contain repeating groups of columns such as Customer1Name, Customer2Name, and Customer3Name.

Unnormalized Data - Repeating Groups
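To make the two problems concrete, here is a minimal Python sketch. It assumes a made-up row with the Customer1Name, Customer2Name, and Customer3Name columns mentioned above; the helper names `split_values` and `unpivot_customers`, the `CustomerName` key, and the sample customer names are hypothetical and only for illustration.

```python
# Hypothetical unnormalized row with a repeating group of customer columns.
wide_row = {
    "EmployeeID": 1,
    "Customer1Name": "Ford",
    "Customer2Name": "Toyota",
    "Customer3Name": None,  # unused slot in the repeating group
}


def split_values(value, sep=";"):
    """Split a delimited, non-atomic string into separate atomic values."""
    return [part.strip() for part in value.split(sep) if part.strip()]


def unpivot_customers(row, max_customers=3):
    """Turn the CustomerNName columns into one row per customer."""
    rows = []
    for i in range(1, max_customers + 1):
        name = row.get(f"Customer{i}Name")
        if name:  # skip empty slots
            rows.append({"EmployeeID": row["EmployeeID"], "CustomerName": name})
    return rows


cities = split_values("Chicago; Los Angeles; New York")
customer_rows = unpivot_customers(wide_row)

print(cities)
print(customer_rows)
```

Each city and each customer now occupies its own value, which is the shape a first normal form table expects.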

Our example table is transformed to first normal form by placing the repeating customer related columns into their own table.  This is shown below:

Data Model in First Normal Form

The repeating groups of columns now become separate rows in the Customer table linked by the EmployeeID foreign key.  As mentioned in the lesson on Data Modeling, a foreign key is a value which matches back to another table’s primary key.  In this case, each row in the Customer table contains the EmployeeID of its corresponding SalesStaffInformation row. Here is our data in first normal form.

First Normal Form Example Data
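The two-table layout described above might be sketched as follows, using Python’s built-in sqlite3 module. The table names and the EmployeeID and SalesPerson columns follow the article; the CustomerID and CustomerName columns, the data types, and the sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE SalesStaffInformation (
    EmployeeID  INTEGER PRIMARY KEY,
    SalesPerson TEXT NOT NULL
);

CREATE TABLE Customer (
    CustomerID   INTEGER PRIMARY KEY,  -- assumed surrogate key
    EmployeeID   INTEGER NOT NULL
                 REFERENCES SalesStaffInformation (EmployeeID),
    CustomerName TEXT NOT NULL         -- assumed column name
);
""")

# Made-up sample data: one salesperson with four customers,
# one more than the original three-column design allowed.
conn.execute("INSERT INTO SalesStaffInformation VALUES (1, 'Salesperson A')")
conn.executemany(
    "INSERT INTO Customer (EmployeeID, CustomerName) VALUES (?, ?)",
    [(1, "Ford"), (1, "Toyota"), (1, "Jeep"), (1, "Honda")],
)

# Follow the foreign key back to the salesperson for each customer.
rows = conn.execute("""
    SELECT s.SalesPerson, c.CustomerName
    FROM SalesStaffInformation AS s
    JOIN Customer AS c ON c.EmployeeID = s.EmployeeID
    ORDER BY c.CustomerName
""").fetchall()

for sales_person, customer_name in rows:
    print(sales_person, customer_name)
```

With the foreign key enforced, inserting a Customer row whose EmployeeID has no match in SalesStaffInformation is rejected by the database.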

This design is superior to our original table in several ways:

  1. The original design limited each SalesStaffInformation entry to three customers.  In the new design, the number of customers associated with each entry is practically unlimited.
  2. It was nearly impossible to sort the original data by customer.  You could, if you used the UNION operator, but it would be cumbersome.  Now, it is simple to sort customers.
  3. The same holds true for filtering on the customer table.  It is much easier to filter on one customer name column than three.
  4. The insert and deletion anomalies for Customer have been eliminated.  You can delete all the customers for a SalesPerson without having to delete the entire SalesStaffInformation row.
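The sorting and deletion points in the list above can be illustrated with a rough sqlite3 sketch. The `SalesStaffWide` table is a hypothetical stand-in for the original three-column design, and the CustomerID and CustomerName columns and all sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the original design with repeating columns.
CREATE TABLE SalesStaffWide (
    EmployeeID    INTEGER PRIMARY KEY,
    SalesPerson   TEXT,
    Customer1Name TEXT,
    Customer2Name TEXT,
    Customer3Name TEXT
);
INSERT INTO SalesStaffWide VALUES (1, 'Salesperson A', 'Toyota', 'Ford', NULL);
INSERT INTO SalesStaffWide VALUES (2, 'Salesperson B', 'Jeep', NULL, NULL);

-- First normal form version of the same data.
CREATE TABLE SalesStaffInformation (
    EmployeeID  INTEGER PRIMARY KEY,
    SalesPerson TEXT
);
CREATE TABLE Customer (
    CustomerID   INTEGER PRIMARY KEY,
    EmployeeID   INTEGER REFERENCES SalesStaffInformation (EmployeeID),
    CustomerName TEXT
);
INSERT INTO SalesStaffInformation VALUES (1, 'Salesperson A');
INSERT INTO SalesStaffInformation VALUES (2, 'Salesperson B');
INSERT INTO Customer (EmployeeID, CustomerName) VALUES (1, 'Toyota');
INSERT INTO Customer (EmployeeID, CustomerName) VALUES (1, 'Ford');
INSERT INTO Customer (EmployeeID, CustomerName) VALUES (2, 'Jeep');
""")

# Original design: one SELECT per repeating column, stitched together.
# UNION ALL is used here so duplicate names would be kept.
wide_sorted = conn.execute("""
    SELECT Customer1Name FROM SalesStaffWide WHERE Customer1Name IS NOT NULL
    UNION ALL
    SELECT Customer2Name FROM SalesStaffWide WHERE Customer2Name IS NOT NULL
    UNION ALL
    SELECT Customer3Name FROM SalesStaffWide WHERE Customer3Name IS NOT NULL
    ORDER BY 1
""").fetchall()

# First normal form: a single column to sort or filter on.
normalized_sorted = conn.execute(
    "SELECT CustomerName FROM Customer ORDER BY CustomerName"
).fetchall()
filtered = conn.execute(
    "SELECT CustomerName FROM Customer WHERE CustomerName LIKE 'F%'"
).fetchall()

# Deleting every customer of employee 2 leaves the staff row in place.
conn.execute("DELETE FROM Customer WHERE EmployeeID = 2")
staff_left = conn.execute(
    "SELECT COUNT(*) FROM SalesStaffInformation"
).fetchone()[0]
customers_left = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
```

Both queries return the same sorted names, but the first one needs another SELECT for every extra customer column, while the second stays the same no matter how many customers a salesperson has.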

Modification anomalies remain in both tables, but these are fixed once we reorganize the tables into second normal form.

More tutorials are to follow! Remember, if you have other questions you want answered, post a comment or tweet me.  I’m here to help you. What other topics would you like to know more about?

Kris Wenzel
 

Kris Wenzel has been working with databases over the past 28 years as a developer, analyst, and DBA. He has a BSE in Computer Engineering from the University of Michigan and an MBA from the University of Notre Dame. Kris has written hundreds of blog articles and many online courses. He loves helping others learn SQL.

  • martin says:

    transitively … is all you have said about 3rd normal form and google doesnt even clearly define that word.

  • may says:

    that is very useful for beginner

  • Radek O. says:

    This is one of the best explanations publicly available. Great stuff, thanks heaps!

  • Stephan says:

    I don’t believe you should have the first and last name in your column SalesPerson. I believe 1NF is supposed to have atomic values

    • Hi,

      There is confusion on atomicity. It is now accepted that being atomic, at least when applied to databases, doesn’t mean indivisible. If that were the case then we would be compelled to break out date and time stamps into their component fields: seconds, minutes, hours, days, months, and years.

      Check out this article for more information: http://en.wikipedia.org/wiki/First_normal_form (See the section on atomicity)

  • Kimete says:

    Hi Chris,

    I want to ask how you added the columns CustomerCity and PostalCode to the Customers table, because they are not in the first table, SalesStaff. Or maybe they are part of the database?
    What did you base this on?
    Could you explain this?

    Thankyou!

    All the best,

    Kimete

  • Christian Meyer says:

    This was really helpful, thanks for writing. I’m wondering what happens when multiple employees are associated with multiple customers. For example,
    Employee Customer
    123 Ford
    124 Ford
    123 Toyota
    124 Jeep
    How do you deal with the primary and foreign keys in cases like this? Sorry if this question is out of the scope of this article. Although, I do feel that this scenario could arise pretty commonly.
    Also should note that I haven’t read up on 2NF/3NF yet, looking forward to it though!

    • I think you’ll see that the 2nd and 3rd normalization rules take care of that situation.

      Long story short, you’ll find yourself with three tables at some point:
      employee
      employee-customer
      customer

  • black hawk says:

    hey,
    can a table be in 1NF if it has missing values?

  • Khalid says:

    Excellent site for database
