Even a relation in BCNF can exhibit redundancy and anomalies when it stores independent multi-valued facts in a single relation.
Consider the relation:
CTX(course, teacher, textbook)
with the following business rule:
| course | teacher | textbook |
|---|---|---|
| Physics | Dr. A | Halliday |
| Physics | Dr. A | Serway |
| Physics | Dr. B | Halliday |
| Physics | Dr. B | Serway |
Observation:
This redundancy stems from a multi-valued dependency, not a functional dependency. BCNF cannot detect or remove it.
A multi-valued dependency (MVD) X →→ Y holds in relation R if, for any two tuples t1, t2 in R with t1[X] = t2[X], there exists a tuple t3 in R such that:
Intuitive meaning: The set of Y values associated with a given X value is independent of the set of (R - X - Y) values.
“For a given X, the Y-values and Z-values (where Z = R - X - Y) are independent — every Y value appears with every Z value.”
Example: In CTX:
course →→ teacher — for a given course, the set of teachers is independent of the textbookscourse →→ textbook — for a given course, the set of textbooks is independent of the teachersX →→ Y read as "X multi-determines Y"
If X →→ Y holds in R(X, Y, Z) where Z = R - X - Y, then X →→ Z also holds.
MVDs always come in complementary pairs!
An MVD X →→ Y is trivial if:
Trivial MVDs hold in every relation and carry no semantic content.
If X → Y (functional dependency), then X →→ Y (multi-valued dependency).
A relation R(X, Y, Z) can be decomposed losslessly into R1(X, Y) and R2(X, Z) if and only if X →→ Y holds in R (equivalently, X →→ Z holds).
This is the foundation for 4NF decomposition.
A relation R is in 4NF with respect to a set of FDs and MVDs F if, for every non-trivial MVD X →→ Y in R:
If X →→ Y is non-trivial and X is NOT a superkey, then R stores independent multi-valued facts for multiple X values — causing the exact redundancy we saw in CTX.
In CTX(course, teacher, textbook):
course →→ teacher is non-trivial (teacher ⊄ course, and course ∪ teacher ≠ CTX)course a superkey? No — course alone doesn’t uniquely identify tuplesDecompose using the violating MVD:
CTX(course, teacher, textbook)
with MVD: course →→ teacher (equivalently: course →→ textbook)
Decompose into:
CT(course, teacher) ← course →→ teacher is now trivial (course ∪ teacher = CT)
CX(course, textbook) ← course →→ textbook is now trivial
Result:
CT(course, teacher)
| course | teacher |
|---------|---------|
| Physics | Dr. A |
| Physics | Dr. B |
CX(course, textbook)
| course | textbook |
|---------|----------|
| Physics | Halliday |
| Physics | Serway |
Adding Dr. C → one row in CT only. Adding a new textbook → one row in CX only. No redundancy!
1NF ⊃ 2NF ⊃ 3NF ⊃ BCNF ⊃ 4NF
Each is a strictly stronger condition than the previous.
| Normal Form | Eliminates |
|---|---|
| 2NF | Partial dependencies |
| 3NF | Transitive dependencies (while preserving FDs) |
| BCNF | All FD-based redundancy |
| 4NF | MVD-based redundancy (independent multi-valued facts) |
| Rule | Statement |
|---|---|
| Complementation | If X →→ Y in R, then X →→ (R - X - Y) |
| Augmentation | If X →→ Y and W ⊇ Z, then XZ →→ YW |
| Transitivity | If X →→ Y and Y →→ Z, then X →→ (Z - Y) |
| Replication | If X → Y, then X →→ Y |
| Coalescence | If X →→ Y, Y ⊆ W, W ∩ Y = ∅, W → Z, Z ⊆ Y, then X → Z |
Given relation R and FDs/MVDs F:
result = {R}
while some relation Ri is not in 4NF:
find a non-trivial MVD X →→ Y in Ri where X is not a superkey of Ri
decompose Ri into:
R1 = X ∪ Y
R2 = Ri - Y (keeps X and everything not in Y)
replace Ri with R1 and R2
return result
Guarantee: 4NF decomposition is always lossless-join (by Fagin’s theorem). However, it may not preserve all dependencies.
Given a relation and business rules, identify MVDs:
emp_id →→ skill, emp_id →→ languageDecompose to 4NF step by step:
Check if decomposition is lossless: