DBMS Formulas Quick Reference – Complete Cheat Sheet for GATE CS

What This Page Covers

Relational model: super keys, candidate keys counting
Functional dependencies: closure, Armstrong’s axioms
Normalisation: normal form identification rules
B+ tree: order, height, minimum/maximum key counts
Query cost: selection, join algorithms
SQL aggregate behaviour with NULLs
Quick GATE facts table

1. Relational Model Formulas

Super keys from a candidate key CK in relation R with n attributes:
Number of super keys = 2^(n – |CK|)
(every superset of CK is a super key)

Super keys from multiple candidate keys:
Use inclusion-exclusion principle.

Cartesian product result size:
|R × S| = |R| × |S| tuples; attributes = degree(R) + degree(S)

Natural join result size:
0 ≤ |R ⋈ S| ≤ |R| × |S|
= |R| × |S| only when join attribute matches all tuples
= 0 when no tuples share a join attribute value

Join size estimate (uniform distribution):
|R ⋈_{A=B} S| ≈ |R| × |S| / max(V(A,R), V(B,S))
where V = number of distinct values

Super key example: R(A, B, C, D), CK = {A, B}
n = 4, |CK| = 2
Super keys = 2^(4-2) = 4 super keys: {A,B}, {A,B,C}, {A,B,D}, {A,B,C,D}

2. Functional Dependency Formulas

Attribute Closure X+:
Start: X+ = X
For each FD A→B: if A ⊆ X+, add B to X+
Repeat until stable.

X is a super key iff: X+ = all attributes of R

FD A→B holds in F iff: B ⊆ A+ (computed under F)

Armstrong’s Axioms:
Reflexivity: Y ⊆ X → X→Y
Augmentation: X→Y → XZ→YZ
Transitivity: X→Y, Y→Z → X→Z
Union: X→Y, X→Z → X→YZ
Decomposition: X→YZ → X→Y and X→Z

Lossless decomposition test (R → R1, R2):
(R1 ∩ R2) → R1 ∈ F+ OR (R1 ∩ R2) → R2 ∈ F+
i.e., common attributes form a key for R1 or R2

Test A→D in R(A,B,C,D) with F = {A→B, B→C, C→D}:
A+ = {A} → A→B: {A,B} → B→C: {A,B,C} → C→D: {A,B,C,D}
D ∈ A+ → A→D holds. ✓

3. Normalisation Quick Rules

Normal Form	Violation	Condition for Compliance
1NF	Multi-valued or composite attributes	All attributes are atomic
2NF	Partial dependency (non-prime attr depends on part of CK)	Every non-prime attribute fully depends on every CK
3NF	Transitive dependency (non-prime → non-prime → non-prime chain)	For X→A: X is superkey OR A is prime
BCNF	Non-superkey determines any attribute	For every non-trivial X→Y: X is a superkey

Identifying prime vs non-prime attributes:
Prime attribute: appears in at least ONE candidate key
Non-prime attribute: does NOT appear in ANY candidate key

BCNF vs 3NF violation pattern:
Relation violates BCNF but satisfies 3NF when:
∃ non-trivial X→A where X is NOT a super key AND A IS a prime attribute

Number of tables in 3NF decomposition:
One table per FD in canonical cover + one table for a candidate key (if not already covered)

Canonical Cover F_c:
Minimal set of FDs equivalent to F; no extraneous attributes or redundant FDs

4. B+ Tree Formulas

Internal node order n (max pointers per internal node):
n × P + (n-1) × K ≤ B
n = ⌊(B + K) / (P + K)⌋
P = pointer size, K = key size, B = block size

Leaf node capacity n_leaf (max keys per leaf):
n_leaf × (K + P_data) + P_next ≤ B
n_leaf = ⌊(B – P_next) / (K + P_data)⌋

Min/Max keys in nodes:
Non-root internal: min ⌈n/2⌉-1 keys, max n-1 keys
Non-root leaf: min ⌈n_leaf/2⌉ keys, max n_leaf keys
Root: min 1 key

Min/Max keys in B+ tree of height h (order n):
Max keys = (n-1) × (1 + n + n² + … + n^(h-1)) + n^h × n_leaf
Min keys (height h, h ≥ 1): 2 × ⌈n/2⌉^(h-1) × ⌈n_leaf/2⌉

Min leaves for N records:
Min leaf nodes = ⌈N / n_leaf⌉

Disk accesses for point query (B+ tree height = h):
= h + 1 (h for tree traversal + 1 for data block)

Example: B = 4096 B, K = 8 B, P = 8 B, P_data = 8 B, P_next = 8 B
Internal order: n = ⌊(4096 + 8) / (8 + 8)⌋ = ⌊4104/16⌋ = 256
Leaf capacity: n_leaf = ⌊(4096 – 8) / (8 + 8)⌋ = ⌊4088/16⌋ = 255
Min keys per non-root internal node = ⌈256/2⌉ – 1 = 127

5. Query Cost Formulas

Selection cost (b_r = data blocks, h_i = index height):
Linear scan: b_r
Binary search (sorted key): ⌈log₂(b_r)⌉
Primary B+ index, key equality: h_i + 1
Secondary B+ index, key equality: h_i + 1
Secondary B+ index, non-key: h_i + n_matches (worst case)

Join cost (b_R, b_S = blocks; n_R = tuples in R; M = memory buffers):
Tuple nested loop: b_R + n_R × b_S
Block nested loop: b_R + ⌈b_R/(M-2)⌉ × b_S
Merge join (pre-sorted): b_R + b_S
Hash join: 3(b_R + b_S) [assumes partition fits in M-2 pages]

Sort cost (external merge sort):
Passes = 1 + ⌈log_{M-1}(⌈b_r/M⌉)⌉
Total I/Os = 2 × b_r × passes

Join cost example: b_R = 200, b_S = 500, M = 12
Block nested loop (R outer): 200 + ⌈200/10⌉ × 500 = 200 + 20 × 500 = 10,200
Block nested loop (S outer): 500 + ⌈500/10⌉ × 200 = 500 + 50 × 200 = 10,500
Hash join: 3 × (200 + 500) = 2,100 ← best choice

6. Concurrency Control Key Rules

Conflict detection:
Two operations conflict if: different transactions + same data item + at least one WRITE
Conflict pairs: R-W, W-R, W-W (R-R does NOT conflict)

Precedence graph:
Edge Ti → Tj if Ti has conflicting operation before Tj’s on same item
Acyclic graph → conflict serializable; cyclic → not conflict serializable

2PL lock compatibility:
S+S: compatible (grant both) | S+X: incompatible | X+X: incompatible

Timestamp ordering:
Read(Ti, Q): abort if TS(Ti) < W-timestamp(Q)
Write(Ti, Q): abort if TS(Ti) < R-timestamp(Q)
ignore (Thomas Write Rule) if TS(Ti) < W-timestamp(Q)

7. Quick GATE Facts

Topic	Key Fact
Super key from CK {A,B} in n-attribute relation	2^(n-2) super keys
Cartesian product size	\|R\| × \|S\| tuples, degree(R) + degree(S) columns
Natural join max size	\|R\| × \|S\|; min size = 0
Dense index entries	One per record = n_r entries
Sparse index entries	One per block = b_r entries
Secondary index type	Always dense (data not ordered)
BCNF vs 3NF	BCNF may not preserve FDs; 3NF always does
2NF and single-attribute PK	Single-attribute PK → automatically in 2NF
BCNF violation but 3NF satisfaction	∃ X→A: X not superkey, A is prime attribute
Conflict: R-R	NOT a conflict — both are reads
2PL and deadlock	2PL does NOT prevent deadlock; use wait-for graph
Hash join condition	Only for equi-joins (equality condition)
Sort-merge join advantage	Works for both equality and range join conditions
B+ tree vs B tree	B+ tree: all data in leaves, leaves linked, shorter height
Primary B+ index, key lookup	h_i + 1 disk accesses
NOT IN with NULL in subquery	Returns empty result (UNKNOWN for all rows)
SQL clause execution order	FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY
COUNT(*) vs COUNT(A)	COUNT(*) includes NULLs; COUNT(A) ignores NULLs
UNION vs UNION ALL	UNION removes duplicates; UNION ALL keeps all rows
Lossless decomposition test	(R1∩R2) must be a key of R1 or R2
Wait-Die protocol	Older waits for younger; younger dies (aborts)
Wound-Wait protocol	Older wounds (aborts) younger; younger waits for older

ER to Schema — Minimum Tables Needed

ER Component	Tables Needed
Strong entity	1 table
Weak entity	1 table (+ owner FK)
1:1 relationship	0 new tables (FK added to one entity table)
1:N relationship	0 new tables (FK added to N-side table)
M:N relationship	1 new relationship table
Multi-valued attribute	1 new table

DBMS Formulas Quick Reference – Complete Cheat Sheet for GATE CS

DBMS Formulas Quick Reference – Complete Cheat Sheet for GATE CS

What This Page Covers

1. Relational Model Formulas

2. Functional Dependency Formulas

3. Normalisation Quick Rules

4. B+ Tree Formulas

5. Query Cost Formulas

6. Concurrency Control Key Rules

7. Quick GATE Facts

ER to Schema — Minimum Tables Needed

Continue Studying

Leave a Comment