DBMS Formulas Quick Reference – Complete Cheat Sheet for GATE CS
What This Page Covers
- Relational model: super keys, candidate keys counting
- Functional dependencies: closure, Armstrong’s axioms
- Normalisation: normal form identification rules
- B+ tree: order, height, minimum/maximum key counts
- Query cost: selection, join algorithms
- SQL aggregate behaviour with NULLs
- Quick GATE facts table
1. Relational Model Formulas
Number of super keys = 2^(n – |CK|)
(every superset of CK is a super key)
Super keys from multiple candidate keys:
Use inclusion-exclusion principle.
Cartesian product result size:
|R × S| = |R| × |S| tuples; attributes = degree(R) + degree(S)
Natural join result size:
0 ≤ |R ⋈ S| ≤ |R| × |S|
= |R| × |S| only when join attribute matches all tuples
= 0 when no tuples share a join attribute value
Join size estimate (uniform distribution):
|R ⋈_{A=B} S| ≈ |R| × |S| / max(V(A,R), V(B,S))
where V = number of distinct values
n = 4, |CK| = 2
Super keys = 2^(4-2) = 4 super keys: {A,B}, {A,B,C}, {A,B,D}, {A,B,C,D}
2. Functional Dependency Formulas
Start: X+ = X
For each FD A→B: if A ⊆ X+, add B to X+
Repeat until stable.
X is a super key iff: X+ = all attributes of R
FD A→B holds in F iff: B ⊆ A+ (computed under F)
Armstrong’s Axioms:
Reflexivity: Y ⊆ X → X→Y
Augmentation: X→Y → XZ→YZ
Transitivity: X→Y, Y→Z → X→Z
Union: X→Y, X→Z → X→YZ
Decomposition: X→YZ → X→Y and X→Z
Lossless decomposition test (R → R1, R2):
(R1 ∩ R2) → R1 ∈ F+ OR (R1 ∩ R2) → R2 ∈ F+
i.e., common attributes form a key for R1 or R2
A+ = {A} → A→B: {A,B} → B→C: {A,B,C} → C→D: {A,B,C,D}
D ∈ A+ → A→D holds. ✓
3. Normalisation Quick Rules
| Normal Form | Violation | Condition for Compliance |
|---|---|---|
| 1NF | Multi-valued or composite attributes | All attributes are atomic |
| 2NF | Partial dependency (non-prime attr depends on part of CK) | Every non-prime attribute fully depends on every CK |
| 3NF | Transitive dependency (non-prime → non-prime → non-prime chain) | For X→A: X is superkey OR A is prime |
| BCNF | Non-superkey determines any attribute | For every non-trivial X→Y: X is a superkey |
Prime attribute: appears in at least ONE candidate key
Non-prime attribute: does NOT appear in ANY candidate key
BCNF vs 3NF violation pattern:
Relation violates BCNF but satisfies 3NF when:
∃ non-trivial X→A where X is NOT a super key AND A IS a prime attribute
Number of tables in 3NF decomposition:
One table per FD in canonical cover + one table for a candidate key (if not already covered)
Canonical Cover F_c:
Minimal set of FDs equivalent to F; no extraneous attributes or redundant FDs
4. B+ Tree Formulas
n × P + (n-1) × K ≤ B
n = ⌊(B + K) / (P + K)⌋
P = pointer size, K = key size, B = block size
Leaf node capacity n_leaf (max keys per leaf):
n_leaf × (K + P_data) + P_next ≤ B
n_leaf = ⌊(B – P_next) / (K + P_data)⌋
Min/Max keys in nodes:
Non-root internal: min ⌈n/2⌉-1 keys, max n-1 keys
Non-root leaf: min ⌈n_leaf/2⌉ keys, max n_leaf keys
Root: min 1 key
Min/Max keys in B+ tree of height h (order n):
Max keys = (n-1) × (1 + n + n² + … + n^(h-1)) + n^h × n_leaf
Min keys (height h, h ≥ 1): 2 × ⌈n/2⌉^(h-1) × ⌈n_leaf/2⌉
Min leaves for N records:
Min leaf nodes = ⌈N / n_leaf⌉
Disk accesses for point query (B+ tree height = h):
= h + 1 (h for tree traversal + 1 for data block)
Internal order: n = ⌊(4096 + 8) / (8 + 8)⌋ = ⌊4104/16⌋ = 256
Leaf capacity: n_leaf = ⌊(4096 – 8) / (8 + 8)⌋ = ⌊4088/16⌋ = 255
Min keys per non-root internal node = ⌈256/2⌉ – 1 = 127
5. Query Cost Formulas
Linear scan: b_r
Binary search (sorted key): ⌈log₂(b_r)⌉
Primary B+ index, key equality: h_i + 1
Secondary B+ index, key equality: h_i + 1
Secondary B+ index, non-key: h_i + n_matches (worst case)
Join cost (b_R, b_S = blocks; n_R = tuples in R; M = memory buffers):
Tuple nested loop: b_R + n_R × b_S
Block nested loop: b_R + ⌈b_R/(M-2)⌉ × b_S
Merge join (pre-sorted): b_R + b_S
Hash join: 3(b_R + b_S) [assumes partition fits in M-2 pages]
Sort cost (external merge sort):
Passes = 1 + ⌈log_{M-1}(⌈b_r/M⌉)⌉
Total I/Os = 2 × b_r × passes
Block nested loop (R outer): 200 + ⌈200/10⌉ × 500 = 200 + 20 × 500 = 10,200
Block nested loop (S outer): 500 + ⌈500/10⌉ × 200 = 500 + 50 × 200 = 10,500
Hash join: 3 × (200 + 500) = 2,100 ← best choice
6. Concurrency Control Key Rules
Two operations conflict if: different transactions + same data item + at least one WRITE
Conflict pairs: R-W, W-R, W-W (R-R does NOT conflict)
Precedence graph:
Edge Ti → Tj if Ti has conflicting operation before Tj’s on same item
Acyclic graph → conflict serializable; cyclic → not conflict serializable
2PL lock compatibility:
S+S: compatible (grant both) | S+X: incompatible | X+X: incompatible
Timestamp ordering:
Read(Ti, Q): abort if TS(Ti) < W-timestamp(Q)
Write(Ti, Q): abort if TS(Ti) < R-timestamp(Q)
ignore (Thomas Write Rule) if TS(Ti) < W-timestamp(Q)
7. Quick GATE Facts
| Topic | Key Fact |
|---|---|
| Super key from CK {A,B} in n-attribute relation | 2^(n-2) super keys |
| Cartesian product size | |R| × |S| tuples, degree(R) + degree(S) columns |
| Natural join max size | |R| × |S|; min size = 0 |
| Dense index entries | One per record = n_r entries |
| Sparse index entries | One per block = b_r entries |
| Secondary index type | Always dense (data not ordered) |
| BCNF vs 3NF | BCNF may not preserve FDs; 3NF always does |
| 2NF and single-attribute PK | Single-attribute PK → automatically in 2NF |
| BCNF violation but 3NF satisfaction | ∃ X→A: X not superkey, A is prime attribute |
| Conflict: R-R | NOT a conflict — both are reads |
| 2PL and deadlock | 2PL does NOT prevent deadlock; use wait-for graph |
| Hash join condition | Only for equi-joins (equality condition) |
| Sort-merge join advantage | Works for both equality and range join conditions |
| B+ tree vs B tree | B+ tree: all data in leaves, leaves linked, shorter height |
| Primary B+ index, key lookup | h_i + 1 disk accesses |
| NOT IN with NULL in subquery | Returns empty result (UNKNOWN for all rows) |
| SQL clause execution order | FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY |
| COUNT(*) vs COUNT(A) | COUNT(*) includes NULLs; COUNT(A) ignores NULLs |
| UNION vs UNION ALL | UNION removes duplicates; UNION ALL keeps all rows |
| Lossless decomposition test | (R1∩R2) must be a key of R1 or R2 |
| Wait-Die protocol | Older waits for younger; younger dies (aborts) |
| Wound-Wait protocol | Older wounds (aborts) younger; younger waits for older |
ER to Schema — Minimum Tables Needed
| ER Component | Tables Needed |
|---|---|
| Strong entity | 1 table |
| Weak entity | 1 table (+ owner FK) |
| 1:1 relationship | 0 new tables (FK added to one entity table) |
| 1:N relationship | 0 new tables (FK added to N-side table) |
| M:N relationship | 1 new relationship table |
| Multi-valued attribute | 1 new table |