Relational Databases & Normalisation Grade 12
A relational database stores data in structured tables linked together. Normalisation is the process of cleaning up a database design to remove repetition and prevent errors.
What is a Relational Database?
A relational database stores data in tables (also called relations). Each table represents one type of thing — for example, a table for students, a separate table for subjects. Tables are then linked together using key fields.
Key Fields
| Key Type | Description | Example |
|---|---|---|
| Primary Key (PK) | Uniquely identifies each record. No duplicates, no blanks. | StudentID in tblStudents |
| Foreign Key (FK) | A field that links to the primary key of another table. Creates the relationship. | ClassID in tblStudents → links to tblClasses |
| Composite Key | Two or more fields combined to create a unique identifier | StudentID + SubjectID in tblMarks |
| Alternate Key | Could be a primary key but wasn't chosen as one | Email address (also unique, but ID was chosen as PK) |
Referential Integrity
Referential integrity is a rule that ensures relationships between tables remain consistent. It means: a foreign key value must always match an existing primary key value in the related table.
If ClassID C03 doesn't exist in tblClasses, you cannot add a student with ClassID = C03. The database will reject this to prevent orphan records (records with no matching parent).
Entity Relationship Diagram (ERD)
An ERD is a diagram that shows the entities (tables) in a database and the relationships between them. Each box is a table; the line between boxes shows how they relate. You are often asked to draw one in the exam, so practise the crow's-foot notation below.
The relationship is created by putting the primary key of the "one" table into the "many" table as a foreign key (here BookID links tblBooks to tblLoans).
Types of Relationships
| Relationship | Meaning | Example |
|---|---|---|
| One-to-One (1:1) | One record links to exactly one record in the other table | One person ↔ one ID number |
| One-to-Many (1:∞) | One record links to many records in the other table | One book can appear in many loans |
| Many-to-Many (∞:∞) | Many records link to many records — resolved with a third "link" table | Students ↔ Activities (via a link table) |
A relational database can't store a many-to-many relationship directly. You break it into two one-to-many relationships using a junction/link table (e.g. tblLoans links books and borrowers).
Physical vs Logical Integrity
| Type | What it protects | How it is ensured |
|---|---|---|
| Physical integrity | The data is not lost or damaged by hardware failure, power loss or disaster | RAID, UPS, regular backups, off-site storage |
| Logical integrity | The data stays correct, consistent and valid within the database | Validation rules, referential integrity, record locking |
Record locking temporarily "locks" a record while one user is editing it, so that a second user cannot change the same record at the same time. This prevents two users overwriting each other's changes and keeps the data consistent (a key way to protect logical integrity in a multi-user database).
Database Anomalies
When a database is poorly designed (not normalised), three types of errors can occur when working with data:
| Anomaly | When it happens | Example |
|---|---|---|
| Insertion anomaly | You can't add new data without first adding other unrelated data | Can't add a new class until at least one student enrols in it |
| Deletion anomaly | Deleting one record accidentally removes other important data | Deleting the only student in a class also deletes the class's information |
| Modification anomaly | Changing one value means updating many rows — and missing some creates inconsistency | A teacher's name appears in 50 student records; update one and the others are wrong |
Normalisation splits large, messy tables into smaller, focused tables and links them properly. This eliminates all three anomalies.
Normalisation — The Three Normal Forms
First Normal Form (1NF)
A table is in 1NF when:
- Every cell contains a single value (no lists of values in one cell)
- There are no repeating groups of columns
- Every record is unique (has a primary key)
Not in 1NF:
| StudentID | Name | Subjects |
|---|---|---|
| 1 | Alice | Maths, IT, English |
❌ Problem: Three values in one cell (Subjects column)
In 1NF:
| StudentID | Name | Subject |
|---|---|---|
| 1 | Alice | Maths |
| 1 | Alice | IT |
| 1 | Alice | English |
Second Normal Form (2NF)
A table is in 2NF when it is in 1NF AND every non-key field depends on the whole primary key (not just part of it). This only matters when the primary key is a composite key.
Example violation:
| StudentID (PK) | SubjectID (PK) | StudentName | Mark |
|---|---|---|---|
| 1 | IT | Alice | 85 |
❌ Problem: StudentName only depends on StudentID, not on (StudentID + SubjectID) together
In 2NF — split the table:
| StudentID (PK) | StudentName |
|---|---|
| 1 | Alice |
| StudentID (PK+FK) | SubjectID (PK) | Mark |
|---|---|---|
| 1 | IT | 85 |
Third Normal Form (3NF)
A table is in 3NF when it is in 2NF AND no non-key field depends on another non-key field (no transitive dependency).
Example violation:
| StudentID (PK) | Name | GradeLevel | GradeDescription |
|---|---|---|---|
| 1 | Alice | 12 | Matric |
❌ Problem: GradeDescription depends on GradeLevel (a non-key field), not on StudentID
In 3NF — split it:
| StudentID (PK) | Name | GradeLevel (FK) |
|---|---|---|
| 1 | Alice | 12 |
| GradeLevel (PK) | GradeDescription |
|---|---|
| 12 | Matric |
Quick Normalisation Checklist
| Normal Form | Check for | Fix by |
|---|---|---|
| 1NF | Multiple values in one cell? Repeating columns? | Split into separate rows; add PK |
| 2NF | Non-key field depends on only PART of the composite PK? | Move that field to its own table |
| 3NF | Non-key field depends on another non-key field? | Move the dependency to its own table |
Good Database Characteristics
| Characteristic | Meaning |
|---|---|
| Data integrity | Data is accurate, consistent and complete throughout its lifecycle |
| Data independence | Changing the database structure doesn't break the programs that use it |
| Data redundancy | Minimum — the same data should not be stored in multiple places |
| Data security | Only authorised users can access or modify data |
| Ease of maintenance | The database design makes it simple to insert, update and delete without causing problems |