DBMS Formulas Quick Reference – Complete Cheat Sheet for GATE CS



DBMS Formulas Quick Reference – Complete Cheat Sheet for GATE CS

What This Page Covers

  • Relational model: super keys, candidate keys counting
  • Functional dependencies: closure, Armstrong’s axioms
  • Normalisation: normal form identification rules
  • B+ tree: order, height, minimum/maximum key counts
  • Query cost: selection, join algorithms
  • SQL aggregate behaviour with NULLs
  • Quick GATE facts table

1. Relational Model Formulas

Super keys from a candidate key CK in relation R with n attributes:
Number of super keys = 2^(n – |CK|)
(every superset of CK is a super key)

Super keys from multiple candidate keys:
Use inclusion-exclusion principle.

Cartesian product result size:
|R × S| = |R| × |S| tuples; attributes = degree(R) + degree(S)

Natural join result size:
0 ≤ |R ⋈ S| ≤ |R| × |S|
= |R| × |S| only when join attribute matches all tuples
= 0 when no tuples share a join attribute value

Join size estimate (uniform distribution):
|R ⋈_{A=B} S| ≈ |R| × |S| / max(V(A,R), V(B,S))
where V = number of distinct values

Super key example: R(A, B, C, D), CK = {A, B}
n = 4, |CK| = 2
Super keys = 2^(4-2) = 4 super keys: {A,B}, {A,B,C}, {A,B,D}, {A,B,C,D}

2. Functional Dependency Formulas

Attribute Closure X+:
Start: X+ = X
For each FD A→B: if A ⊆ X+, add B to X+
Repeat until stable.

X is a super key iff: X+ = all attributes of R

FD A→B holds in F iff: B ⊆ A+ (computed under F)

Armstrong’s Axioms:
Reflexivity: Y ⊆ X → X→Y
Augmentation: X→Y → XZ→YZ
Transitivity: X→Y, Y→Z → X→Z
Union: X→Y, X→Z → X→YZ
Decomposition: X→YZ → X→Y and X→Z

Lossless decomposition test (R → R1, R2):
(R1 ∩ R2) → R1 ∈ F+   OR   (R1 ∩ R2) → R2 ∈ F+
i.e., common attributes form a key for R1 or R2

Test A→D in R(A,B,C,D) with F = {A→B, B→C, C→D}:
A+ = {A} → A→B: {A,B} → B→C: {A,B,C} → C→D: {A,B,C,D}
D ∈ A+ → A→D holds. ✓

3. Normalisation Quick Rules

Normal FormViolationCondition for Compliance
1NFMulti-valued or composite attributesAll attributes are atomic
2NFPartial dependency (non-prime attr depends on part of CK)Every non-prime attribute fully depends on every CK
3NFTransitive dependency (non-prime → non-prime → non-prime chain)For X→A: X is superkey OR A is prime
BCNFNon-superkey determines any attributeFor every non-trivial X→Y: X is a superkey
Identifying prime vs non-prime attributes:
Prime attribute: appears in at least ONE candidate key
Non-prime attribute: does NOT appear in ANY candidate key

BCNF vs 3NF violation pattern:
Relation violates BCNF but satisfies 3NF when:
∃ non-trivial X→A where X is NOT a super key AND A IS a prime attribute

Number of tables in 3NF decomposition:
One table per FD in canonical cover + one table for a candidate key (if not already covered)

Canonical Cover F_c:
Minimal set of FDs equivalent to F; no extraneous attributes or redundant FDs

4. B+ Tree Formulas

Internal node order n (max pointers per internal node):
n × P + (n-1) × K ≤ B
n = ⌊(B + K) / (P + K)⌋
P = pointer size, K = key size, B = block size

Leaf node capacity n_leaf (max keys per leaf):
n_leaf × (K + P_data) + P_next ≤ B
n_leaf = ⌊(B – P_next) / (K + P_data)⌋

Min/Max keys in nodes:
Non-root internal: min ⌈n/2⌉-1 keys, max n-1 keys
Non-root leaf: min ⌈n_leaf/2⌉ keys, max n_leaf keys
Root: min 1 key

Min/Max keys in B+ tree of height h (order n):
Max keys = (n-1) × (1 + n + n² + … + n^(h-1)) + n^h × n_leaf
Min keys (height h, h ≥ 1): 2 × ⌈n/2⌉^(h-1) × ⌈n_leaf/2⌉

Min leaves for N records:
Min leaf nodes = ⌈N / n_leaf⌉

Disk accesses for point query (B+ tree height = h):
= h + 1 (h for tree traversal + 1 for data block)

Example: B = 4096 B, K = 8 B, P = 8 B, P_data = 8 B, P_next = 8 B
Internal order: n = ⌊(4096 + 8) / (8 + 8)⌋ = ⌊4104/16⌋ = 256
Leaf capacity: n_leaf = ⌊(4096 – 8) / (8 + 8)⌋ = ⌊4088/16⌋ = 255
Min keys per non-root internal node = ⌈256/2⌉ – 1 = 127

5. Query Cost Formulas

Selection cost (b_r = data blocks, h_i = index height):
Linear scan: b_r
Binary search (sorted key): ⌈log₂(b_r)⌉
Primary B+ index, key equality: h_i + 1
Secondary B+ index, key equality: h_i + 1
Secondary B+ index, non-key: h_i + n_matches (worst case)

Join cost (b_R, b_S = blocks; n_R = tuples in R; M = memory buffers):
Tuple nested loop: b_R + n_R × b_S
Block nested loop: b_R + ⌈b_R/(M-2)⌉ × b_S
Merge join (pre-sorted): b_R + b_S
Hash join: 3(b_R + b_S) [assumes partition fits in M-2 pages]

Sort cost (external merge sort):
Passes = 1 + ⌈log_{M-1}(⌈b_r/M⌉)⌉
Total I/Os = 2 × b_r × passes

Join cost example: b_R = 200, b_S = 500, M = 12
Block nested loop (R outer): 200 + ⌈200/10⌉ × 500 = 200 + 20 × 500 = 10,200
Block nested loop (S outer): 500 + ⌈500/10⌉ × 200 = 500 + 50 × 200 = 10,500
Hash join: 3 × (200 + 500) = 2,100 ← best choice

6. Concurrency Control Key Rules

Conflict detection:
Two operations conflict if: different transactions + same data item + at least one WRITE
Conflict pairs: R-W, W-R, W-W (R-R does NOT conflict)

Precedence graph:
Edge Ti → Tj if Ti has conflicting operation before Tj’s on same item
Acyclic graph → conflict serializable; cyclic → not conflict serializable

2PL lock compatibility:
S+S: compatible (grant both) | S+X: incompatible | X+X: incompatible

Timestamp ordering:
Read(Ti, Q): abort if TS(Ti) < W-timestamp(Q)
Write(Ti, Q): abort if TS(Ti) < R-timestamp(Q)
              ignore (Thomas Write Rule) if TS(Ti) < W-timestamp(Q)

7. Quick GATE Facts

TopicKey Fact
Super key from CK {A,B} in n-attribute relation2^(n-2) super keys
Cartesian product size|R| × |S| tuples, degree(R) + degree(S) columns
Natural join max size|R| × |S|; min size = 0
Dense index entriesOne per record = n_r entries
Sparse index entriesOne per block = b_r entries
Secondary index typeAlways dense (data not ordered)
BCNF vs 3NFBCNF may not preserve FDs; 3NF always does
2NF and single-attribute PKSingle-attribute PK → automatically in 2NF
BCNF violation but 3NF satisfaction∃ X→A: X not superkey, A is prime attribute
Conflict: R-RNOT a conflict — both are reads
2PL and deadlock2PL does NOT prevent deadlock; use wait-for graph
Hash join conditionOnly for equi-joins (equality condition)
Sort-merge join advantageWorks for both equality and range join conditions
B+ tree vs B treeB+ tree: all data in leaves, leaves linked, shorter height
Primary B+ index, key lookuph_i + 1 disk accesses
NOT IN with NULL in subqueryReturns empty result (UNKNOWN for all rows)
SQL clause execution orderFROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY
COUNT(*) vs COUNT(A)COUNT(*) includes NULLs; COUNT(A) ignores NULLs
UNION vs UNION ALLUNION removes duplicates; UNION ALL keeps all rows
Lossless decomposition test(R1∩R2) must be a key of R1 or R2
Wait-Die protocolOlder waits for younger; younger dies (aborts)
Wound-Wait protocolOlder wounds (aborts) younger; younger waits for older

ER to Schema — Minimum Tables Needed

ER ComponentTables Needed
Strong entity1 table
Weak entity1 table (+ owner FK)
1:1 relationship0 new tables (FK added to one entity table)
1:N relationship0 new tables (FK added to N-side table)
M:N relationship1 new relationship table
Multi-valued attribute1 new table

Leave a Comment