Tutorials‎ > ‎

Basics Of Database Table Design - Pt7

posted 19 Nov 2012, 05:35 by Alistair Hamilton   [ updated 15 Jun 2015, 07:59 ]
Basics of Database Table Design
By now, you should have a reasonable understanding of the process of table normalisation and why it is important for the integrity of your database and the data contained therein. 

Following the process described in earlier sections of this series will ensure that your database doesn't need to be redesigned whenever you need a new field added to a table, or a completely new table bolted on to your design. It will make querying easier and accurate, removing any possible ambiguities within the results. Duplicate data entries will be eliminated by storing the data in one place and have it related to other pieces of data held in other tables.

Casting our minds back to part one of the series, in order to discuss table design, we jumped ahead of the overall database design process by omitting the specification phase. This is perhaps the most important phase of the process.

For the final part of this series, we are going to look at a technique that sits between the specification and the implementation phases.

Entity Relationship Diagrams

An Entity Relationship Diagram (ERD) is a graphical representation of the design as laid out in the specification. It provides a quick method of laying out the table design before going anywhere near your database design software. In addition, it can often flag up issues with the specification at an early stage that may make it necessary to to revisit the design requirements.

As the name suggests, it allows the relationship between tables to be defined and, in my view, helps with the normalisation process as it is easy to see where tables should be separated.

It should be noted that desktop database software such as MS Access and LibreOffice, not to mention some server based database software, have inbuilt functionality that allows the user to create relationships between tables in a graphical manner in a similar vein to ERDs. However, these are tools for establishing the relationship within the database and is not part of the design process. Indeed, in order for these relationships to be established the corresponding tables must be created beforehand.

In other words, from a design perspective, using these tools is like putting the cart before the horse. An ERD should be created before you go anywhere near creating your tables for real.

You can use numerous packages to create your diagram. Most organisational diagram software could be used. Very large database designs, particularly where numerous designers are involved in the project, will use dedicated ERD software but for the most part, desktop database designers do not need those applications. ERDs for many such databases could be drawn by hand.

Here, I'll be using the Draw program contained within LibreOffice.

So, on to the first example. Here is the ERD representation of the work we did earlier in the series.

Entity Relationship Diagram

A straightforward looking diagram that still needs a little explanation...

The arrows represent the relationship between the tables and this diagram shows two different types corresponding to the type of relationship that can exist between tables.

  1. The relationship between the Department and Employee tables is a one-to-many relationship from Department to Employee.
  2. The relationship between the Employee and Project tables is a many-to-many relationship.

How do we go about ascertaining what type of relationship exists between tables? To do this, one must consider the relationship viewed from one record on each side of the relationship.

For example, looking at the Employee/Department relationship, consider one employee. Any one employee works in one department – a one-to-one relationship.

Now, look from the other side. Any one department will have a number of employees working in it – a one-to-many relationship.

Combining the two results in a one-to-many relationship 'flowing' from Department to Employee.

Please note at this stage that it is quite possible to end up with a one-to-one relationship going both ways., i.e. a net one-to-one relationship between tables. A database designer could then argue that the tables could be merged. However, keep in mind what we have learned about database normalisation and always work to the smallest configuration that makes sense.

Although not part of this series, when creating relationships within MS Access et al, the direction of the arrow defines how the relationship is established which has an effect on how a one-to-one relationship behaves in terms of record updates and deletions. Again, the specification should make that clear.

For the reasons outlined above, it is quite common for a database designer to label each end of the arrow with a 1 and an 'M' (or a specific number) to make it clear exactly what the relationship is.

Now, repeat the process for the Employee and Project tables...

Any one employee could work on a number of projects (one-to-many) and any one project could have a number employees working on it (one-to-many). The net result of this is a many-to-many relationship.

Here we have a problem. Relational database applications aren't great at handling many-to-many relationships. To get around this, we must use another technique to represent the many-to-many relationship with two one-to-many relationships. This can be done by introducing an intermediary table as shown in the updated ERD below.

Entity Relationship Diagram

The name of the intermediary table can be anything you like, though combining the table names makes it clear what it is.

Now that the relationships between table are established, how are they linked? Quite simply, fields containing the same data are used to define how the tables are linked through the relationship. They provide the 'anchor points' of the relationship if you like.

Before we look at that, it's time to introduce the next part of why an ERD is useful.

Field Definition and Types

When designing a database table, we are specifying the data that that table will hold. We do so by defining fields for each record that is pertinent to each record. Part of that process is to define the size of the data for each field.

For example, if we have a surname field, that field will contain text and not numbers. In addition, specifying such a field as a text field will, in most database design applications give a maximum number of characters of 254.

Now, clearly, in the case of a field designed to hold employees' surname, specifying 254 characters per record is going to result in a lot of wasted storage space within the database. Do the same for every other field in each table and it is easy to see how space inefficient it would be.

In addition, defining the correct type helps to minimise data input errors. For example, if a field is configured to hold an integer, then entering anything other then numerical digits would not be allowed.

Therefore correct definition of the field type and size reduces storage requirements and helps to maintain the integrity of the data contained within the database. The importance of this should not be underestimated.

The ERD is an ideal place to put this information – field names, data type and size – as shown in the revised ERD below. Note that those fields in bold and underlined represent the primary key field of the table.

Entity Relationship Diagram

With the field definitions incorporated into the ERD, we can establish how the relationships are configured between tables. We already know what the table relationships are. All that needs to be identified now are the fields that define where the tables are joined through the relationship.

In general terms, relationships are established on corresponding fields in each table that hold the same data (which needs to be of the same data type). More specifically, the one side of the relationship should be 'anchored' to a key field.

Looking at the Department to Employee relationship we see that the 'one' side of the relationship should be 'anchored' to the key field for that table, i.e. DeptID. However, there is no corresponding field in the Employee table.

In order to correct this, we need to introduce an additional field in the Employee table that will hold the DeptID data. This might seem contrary to the normalisation process described in earlier instalments but it is a necessary step in order to establish the relationship within the database.

In other words, the DeptID primary key field from the Department table is introduced to the Employee table. Please note that the new DeptID field in the Employee table is NOT a key field in that table even though it is referred to as a 'foreign key'.

Compare this to the imported fields in the intermediary table, EmployeeProject. These foreign keys, whilst not key fields in themselves within the EmployeeProject table, combine to make a joint key field of the EmployeeProject table.

Note that most intermediary tables contain just the foreign keys from the tables involved in the relationship. That is all that is needed to simulate the many-to-many relationship involved. However, there is nothing to stop such a table holding additional information.

For example, the EmployeeProject table in our small database may be the place to record the total number of hours spent by each employee on each project.

The final ERD is shown below.

Entity Relationship Diagram

That diagram draws the series to an end. I'd encourage you to practise the techniques discussed throughout this series. Hopefully you will feel better equipped to design your own databases in MS Access et al.

If you have any questions or would like further assistance with your database design, then please call Alistair on 0131 3333972 or 07790 860067 to discuss your requirements.

Thanks for your time..