Database Design Principles – Part 3: Eliminate Columns Not Dependent on Key

databaseIn the third of this five part database design series, we review the database design rule of removing columns not dependent on the key.  Similar to removing redundant data, this rule will make the database more flexible and maintainable throughout the lifecycle of the system.

A basic example of columns not dependent on the key can be shown in the following table:

  • Book ID
  • Book Title
  • Book Publish Date
  • Publisher Name
  • Publisher Address
  • Publisher Phone

There are two separate entities stored in the same table – book information, and for each book, that book’s publisher information.  If every book had a different publisher, this would be fine.  More than likely, however, many books will share the same publisher.  If the publisher’s phone number changes, the users will need to go through every single book from that publisher, and make sure they update the phone number.  This leads to a lot of overhead for users.  In addition, if one of the users forgets to change a row, as they often do, the data integrity will be compromised, and some rows will have incorrect information.

Another weakness of this approach is that there is no capability to add a publisher that is not associated with a book.  Let’s say that a publisher is connected to a book which is removed from the system.  Once we delete that book, we would lose the publisher’s address and phone number as well.

The solution to this dilemma is to add an ID column, and separate the table into two tables, one for books, and one for publishers:

Book

  • Book ID
  • Book Title
  • Book Publish Date
  • Publisher ID

Publisher

  • Publisher ID
  • Publisher Name
  • Publisher Address
  • Publisher Phone

This database schema update will make sure that publisher updates are be automatically propagated to any related books.  It will also remove the storage space wasted by repeating publisher details on each book record, and result in a much more stable foundation for the software development.

Join us for the next article in this series where we explore isolating “independent multiple relationships.”

Written by Andrew Palczewski

About the Author
Andrew Palczewski is CEO of apHarmony, a Chicago software development company. He holds a Master's degree in Computer Engineering from the University of Illinois at Urbana-Champaign and has over ten years' experience in managing development of software projects.
Google+

RSS Twitter LinkedIn Facebook Email

Leave a Reply

Your email address will not be published. Required fields are marked *