Normalisation in DBMS – 1NF 2NF 3NF BCNF and Functional Dependencies

Q: What is the difference between 3NF and BCNF?

In 3NF, for every non-trivial functional dependency X → Y: either X is a super key, or Y is a prime attribute (part of some candidate key). BCNF is stricter: for every non-trivial FD X → Y, X must be a super key (no exceptions for prime attributes). Every BCNF relation is also in 3NF, but not vice versa. BCNF always guarantees lossless decomposition but may not preserve all dependencies. 3NF always allows a dependency-preserving, lossless decomposition. In GATE, if a relation violates BCNF but satisfies 3NF, the left-hand side is a non-super-key and right-hand side is a prime attribute.

Q: How do you find the closure of a set of attributes?

To find the closure X+ of attribute set X under functional dependencies F: (1) Start with closure = X. (2) For each FD A → B in F, if A ⊆ closure, add B to closure. (3) Repeat step 2 until no more attributes can be added. The closure X+ is the set of all attributes functionally determined by X. Use closure to: find if X is a super key (X+ = all attributes), check if an FD follows from F (A → B holds iff B ⊆ A+), find candidate keys.

Q: What is lossless decomposition in database normalisation?

A decomposition of R into R1 and R2 is lossless (lossless-join) if the original relation can be reconstructed exactly by joining R1 and R2: R = R1 ⋈ R2. Lossy decomposition generates spurious tuples not in the original relation. Test (for binary decomposition into R1, R2): the decomposition is lossless iff (R1 ∩ R2) → R1 or (R1 ∩ R2) → R2 holds — the common attributes form a key of at least one of the two relations.

What You Will Learn

Functional dependencies: types and inference
Armstrong’s axioms and attribute closure computation
Candidate key finding using closure
Normal forms: 1NF, 2NF, 3NF, BCNF with examples
Lossless decomposition and dependency preservation
GATE PYQs with complete solutions

GATE Weightage: 2–3 marks. Normalisation is the single highest-yield DBMS topic; questions appear every year.

1. Functional Dependencies

A functional dependency X → Y means: if two tuples have the same value for X, they must have the same value for Y. X functionally determines Y.

Types of FDs

Type	Definition	Example
Trivial FD	Y ⊆ X (Y is a subset of X)	AB → A (always holds)
Non-trivial FD	Y ⊄ X	Eid → Name
Partial FD	X is a composite key; Y depends on only part of X	{Eid, Pid} → Name (only on Eid)
Full FD	Y depends on all of X; removing any attribute from X breaks the FD	{Eid, Pid} → Hours
Transitive FD	X → Y → Z where X is not a super key and Y is not a prime attribute	Eid → Did → DLocation

2. Armstrong’s Axioms

Armstrong’s axioms are a complete and sound set of inference rules for deriving all functional dependencies logically implied by a given set F.

Three Basic Axioms:
1. Reflexivity: If Y ⊆ X, then X → Y (trivial FDs)
2. Augmentation: If X → Y, then XZ → YZ
3. Transitivity: If X → Y and Y → Z, then X → Z

Derived Rules:
4. Union: If X → Y and X → Z, then X → YZ
5. Decomposition: If X → YZ, then X → Y and X → Z
6. Pseudotransitivity: If X → Y and WY → Z, then WX → Z

Closure of F (F+): The set of ALL functional dependencies that can be derived from F using Armstrong’s axioms. It is always larger than F itself.

3. Attribute Closure & Candidate Keys

Computing X+ (Attribute Closure of X)

Algorithm:
1. result = X
2. For each FD A → B in F:
if A ⊆ result, add B to result
3. Repeat step 2 until no change
4. X+ = result

Example: R(A, B, C, D, E), FDs: A→BC, BC→D, D→E

Find A+:
Start: result = {A}
A→BC applies: result = {A, B, C}
BC→D applies: result = {A, B, C, D}
D→E applies: result = {A, B, C, D, E}
A+ = {A, B, C, D, E} = all attributes → A is a candidate key

Finding All Candidate Keys

A candidate key is a minimal set of attributes whose closure includes all attributes.

Find attributes that appear only on LHS of FDs — they must be in every CK
Find attributes that appear only on RHS of FDs — they cannot be in any CK
Attributes on both sides or neither: may or may not be in CK (try combinations)

Example: R(A, B, C, D), FDs: A→B, B→C, C→D
A never on RHS → must be in CK
A+ = {A, B, C, D} → A alone is a CK
Candidate keys: {A} only

4. First Normal Form (1NF)

A relation is in 1NF if every attribute has atomic (indivisible) values — no multi-valued attributes, no repeating groups, no nested tables.

Violates 1NF:
STUDENT(Sid, Name, PhoneNumbers{p1, p2, p3})

In 1NF:
STUDENT(Sid, Name)
STUDENT_PHONE(Sid, Phone)

Note: The relational model by definition only allows atomic attribute values. So if you’re working with a proper relation, it is already in 1NF. 1NF violations occur when designing the initial schema.

5. Second Normal Form (2NF)

A relation is in 2NF if it is in 1NF AND every non-prime attribute is fully functionally dependent on the entire primary key (no partial dependencies).

Partial dependency: A non-prime attribute depends on only a part of the composite primary key.

Violates 2NF:
WORKS_ON(Eid, Pid, Hours, EmpName, ProjName)
PK = {Eid, Pid}
Eid → EmpName (partial — EmpName depends only on Eid, not full PK)
Pid → ProjName (partial)

Decomposed to 2NF:
EMPLOYEE(Eid, EmpName)
PROJECT(Pid, ProjName)
WORKS_ON(Eid, Pid, Hours)

2NF only matters when the primary key is composite. If PK is a single attribute, there can be no partial dependencies, so the relation is automatically in 2NF (if in 1NF).

6. Third Normal Form (3NF)

A relation is in 3NF if it is in 2NF AND there are no transitive dependencies (no non-prime attribute depends on another non-prime attribute).

Formal definition of 3NF: For every non-trivial FD X → Y, either:

X is a super key, OR
Y is a prime attribute (part of some candidate key)

Violates 3NF (transitive dependency):
EMPLOYEE(Eid, Name, Did, DLocation)
Eid → Did (Did depends on PK — OK)
Did → DLocation (DLocation depends on Did, not directly on Eid — TRANSITIVE)

Decomposed to 3NF:
EMPLOYEE(Eid, Name, Did)
DEPARTMENT(Did, DLocation)

7. Boyce-Codd Normal Form (BCNF)

A relation is in BCNF if for every non-trivial FD X → Y, X must be a super key. BCNF is strictly stronger than 3NF.

Property	3NF	BCNF
Condition for X → Y (non-trivial)	X is superkey OR Y is prime attribute	X must be a superkey (always)
Lossless decomposition	Always possible	Always possible
Dependency preservation	Always possible	Not always possible
Redundancy	Some redundancy may remain	Eliminates all FD-based redundancy

Classic BCNF violation example:
TEACHING(Student, Course, Teacher)
FDs: {Student, Course} → Teacher; Teacher → Course
Candidate keys: {Student, Course}, {Student, Teacher}

Teacher → Course: Teacher is NOT a super key (not in any CK alone); Course IS a prime attribute
→ Satisfies 3NF (Y is prime) but violates BCNF (X is not a super key)

BCNF decomposition:
TC(Teacher, Course) — Teacher → Course is now a super key
ST(Student, Teacher)
Note: FD {Student, Course} → Teacher cannot be preserved in this decomposition!

GATE Key: A relation violates BCNF but satisfies 3NF when and only when there exists a non-trivial FD X → A where X is not a superkey but A is a prime attribute.

8. Decomposition Properties

Lossless-Join Decomposition

Decomposition of R into {R1, R2} is lossless iff the common attributes form a key for R1 or R2:

(R1 ∩ R2) → R1 OR (R1 ∩ R2) → R2

Equivalently: R1 ∩ R2 is a super key of R1 or R2

Example: R(A, B, C, D), FDs: A→BC, BC→D
Decompose into R1(A, B, C) and R2(B, C, D).
R1 ∩ R2 = {B, C}. Is BC a key of R1 or R2?
BC+ = {B, C, D} ≠ {A, B, C} — BC is not a key of R1
BC+ covers {B, C, D} = all of R2 — BC IS a key of R2
Decomposition is lossless.

Dependency-Preserving Decomposition

A decomposition preserves dependencies if every FD in F+ can be enforced using only the projected FDs of R1 and R2 (without joining). Check: every FD in the original set can be derived from FDs restricted to individual relations.

Trade-off:
BCNF decomposition: always lossless, but may NOT preserve dependencies
3NF decomposition: always both lossless AND dependency-preserving

If asked which normal form guarantees both properties → answer is 3NF.

9. GATE Examples

GATE 2015: Relation R(A, B, C, D, E) with FDs: A→BC, CD→E, B→D, E→A. Find the candidate keys.

View Solution

Try A: A+ = A → {A,B,C} → B→D: {A,B,C,D} → CD→E: {A,B,C,D,E} = all. A is a CK.

Try E: E→A, then A+: E → A → BC → D → E. E+ = all. E is a CK.

Try BC: BC+ → B→D: {B,C,D} → CD→E: {B,C,D,E} → E→A: {A,B,C,D,E}. BC is a CK.

Try CD: CD+ → CD→E: {C,D,E} → E→A: {A,C,D,E} → A→BC: {A,B,C,D,E}. CD is a CK.

Candidate keys: {A}, {E}, {BC}, {CD}

GATE 2017: Relation R(P, Q, R, S, T) with FDs: P→QR, RS→T, Q→S, T→P. Which of the following is NOT a candidate key?

View Solution

Test P: P→QR, Q→S: {P,Q,R,S}, RS→T: {P,Q,R,S,T}, T→P: confirmed. P+ = all. P is CK.

Test T: T→P, then P+: all. T is CK.

Test Q: Q→S, can Q determine P or R? Q doesn’t determine P or R directly. Q+ = {Q,S} ≠ all. Q is NOT a CK.

Q is not a candidate key.

GATE 2019: Relation R(A, B, C) with FDs: A→B, B→C, C→A. How many candidate keys does R have?

View Solution

A+ = {A,B,C} = all → A is CK.

B+ = {B,C,A} = all → B is CK.

C+ = {C,A,B} = all → C is CK.

All single attributes are candidate keys. AB, AC, BC, ABC are super keys but not minimal.

Answer: 3 candidate keys — {A}, {B}, {C}

Since all attributes are prime, there are no partial or transitive dependencies → R is in BCNF.

10. Common Mistakes

Confusing 3NF and BCNF conditions: 3NF allows X → A where A is prime (even if X is not a super key). BCNF requires X to always be a super key with NO exceptions. A relation satisfying 3NF may fail BCNF.
Assuming single-attribute PK means 3NF: A single-attribute PK means no partial dependencies (automatically in 2NF), but transitive dependencies can still exist, violating 3NF.
Missing candidate keys: Always check all possible minimal attribute combinations, not just the “obvious” primary key. A relation can have multiple candidate keys.
Closure computation errors: Apply FDs repeatedly until no more attributes can be added. Students often stop after one pass; apply ALL FDs in each round until the closure stabilises.
Lossless test: The common attributes (R1 ∩ R2) must form a key for AT LEAST ONE of the two relations, not both.

11. Frequently Asked Questions

What is the difference between 3NF and BCNF?

Both address the same issue (FD-based anomalies) but at different strictness. 3NF allows X → A if A is a prime attribute, even when X is not a super key. BCNF requires X to always be a super key — no exceptions. Every BCNF relation is in 3NF. BCNF may sacrifice dependency preservation; 3NF guarantees both lossless and dependency-preserving decomposition, making it the practical choice for most applications.

How do you find the closure of a set of attributes?

Start with result = X. Scan all FDs: if the entire left side is in result, add the right side to result. Repeat until no new attributes are added. This is X+. Use it to: test if X is a super key (X+ = all attributes), find implied FDs (A → B holds iff B ⊆ A+), find minimal keys (try subsets of candidate key candidates).

What is lossless decomposition in database normalisation?

A decomposition is lossless if joining the pieces always gives back the original relation with no spurious (extra, incorrect) tuples. Test for binary decomposition: (R1 ∩ R2)+ must include all attributes of R1 OR all attributes of R2 — meaning the common attributes form a key of at least one sub-relation. Lossy decompositions generate spurious tuples that weren’t in the original data.

What is Armstrong’s axioms in DBMS?

Armstrong’s axioms are three inference rules for FDs: Reflexivity (supersets determine subsets), Augmentation (add same attributes to both sides), and Transitivity (chain dependencies). They are sound (only derive valid FDs) and complete (can derive all implied FDs). Derived rules like Union and Decomposition make computation faster. Used to find F+ (all implied FDs) and to test equivalence of two sets of FDs.

Normalisation in DBMS – 1NF 2NF 3NF BCNF and Functional Dependencies

Normalisation in DBMS – 1NF 2NF 3NF BCNF and Functional Dependencies

What You Will Learn

1. Functional Dependencies

Types of FDs

2. Armstrong’s Axioms

3. Attribute Closure & Candidate Keys

Computing X+ (Attribute Closure of X)

Finding All Candidate Keys

4. First Normal Form (1NF)

5. Second Normal Form (2NF)

6. Third Normal Form (3NF)

7. Boyce-Codd Normal Form (BCNF)

8. Decomposition Properties

Lossless-Join Decomposition

Dependency-Preserving Decomposition

9. GATE Examples

10. Common Mistakes

11. Frequently Asked Questions

What is the difference between 3NF and BCNF?

How do you find the closure of a set of attributes?

What is lossless decomposition in database normalisation?

What is Armstrong’s axioms in DBMS?

Next Topics

Leave a Comment