Advanced Trees – Data Structures | GATE CS


Advanced Trees

B-trees, B+ trees, Red-Black trees, Segment trees — appear across DS, DBMS, and OS questions in GATE CS. Know the properties and height formulas cold.

Last updated: April 2026  |  GATE CS 2024–2026 syllabus

Key Takeaways

  • B-tree of order m: each node has at most m children and m−1 keys. Min children (non-root): ⌈m/2⌉. Height = O(logmn).
  • B+ tree: all data in leaf nodes; leaves linked as a list. Better for range queries. Used in DBMS indexes.
  • Red-Black tree: BST with 5 coloring properties guaranteeing height ≤ 2log2(n+1). All operations O(log n).
  • Red-Black vs AVL: AVL is more strictly balanced (faster lookup). RB has fewer rotations on insert/delete (better write performance).
  • Segment tree: O(n) build, O(log n) range query, O(log n) point update. Array of size 4n.
  • Fenwick (BIT) tree: simpler, O(log n) prefix sum and update. Uses lowbit trick.
  • Trie: prefix tree for strings. Search O(m) where m = string length. Used for autocomplete, spell check.

1. B-Trees

A B-tree of order m is an m-way search tree where all leaves are at the same level (perfectly height-balanced). Designed for disk-based storage (minimises I/O by packing many keys per node).

Properties of B-tree of order m:
• Every node has at most m children and m−1 keys.
• Every non-root internal node has at least ⌈m/2⌉ children and ⌈m/2⌉−1 keys.
• Root has at least 2 children (unless it is a leaf).
• All leaves at the same level.

Height: h = O(log⌈m/2⌉n) = O(log n / log m)
Max keys in B-tree of height h, order m: mh+1 − 1
Min keys in B-tree of height h, order m: 2⌈m/2⌉h − 1

B-tree Operations

OperationTimeNotes
SearchO(log n)O(h) disk reads, O(m) comparisons per node
InsertO(log n)May cause node split propagating to root
DeleteO(log n)May cause merge or redistribution

2. B+ Trees

A B+ tree is a variant of B-tree where all actual data/records are stored only in leaf nodes. Internal nodes contain only key copies (routing keys) to guide search.

Key differences from B-tree:
• Leaf nodes linked as a doubly linked list → efficient range scans.
• Internal nodes store only keys, not data → higher branching factor → shorter tree.
• All keys appear in leaves (even those also appearing in internal nodes).

Range query: Find start key in O(log n), scan linked list for range in O(range size).
Applications: Database indexes (InnoDB, PostgreSQL), file systems (NTFS, ext4).

B-tree vs B+ tree Comparison

PropertyB-treeB+ tree
Data storageAll nodesLeaf nodes only
Range queriesO(n) traversalO(log n + range)
Search (single key)May stop early (key in internal)Always reaches leaf
Space (internal nodes)Larger (stores data)Smaller (keys only)
Use caseGeneral purposeDatabases, file systems

3. Red-Black Trees

A Red-Black tree (RBT) is a self-balancing BST with an extra colour bit per node.

Red-Black Tree Properties:
1. Every node is Red or Black.
2. The root is Black.
3. Every NIL leaf is Black.
4. If a node is Red, both children are Black (no two consecutive Reds).
5. All paths from any node to its descendant NIL leaves have the same number of Black nodes (Black-height bh).

Height guarantee: h ≤ 2×log2(n+1)
All operations: O(log n)
Insert/Delete: at most 2 rotations (vs AVL: up to O(log n) rotations on delete)

Black-Height Formula

If bh = Black-height of the tree (Black nodes on any root-to-leaf path, not counting root if Red):
Minimum n = 2bh − 1 (all-black perfect tree of height bh−1)
Maximum n = 4bh − 1 (alternating Red-Black full tree)

4. Red-Black Tree vs AVL Tree

PropertyAVL TreeRed-Black Tree
Balance criterion|BF| ≤ 1 (strict)Black-height equal on all paths
HeightAt most 1.44 log2nAt most 2 log2n
Search performanceSlightly faster (shorter)Slightly slower
Insert rotationsAt most 2At most 2
Delete rotationsO(log n)At most 3
Best forRead-heavy workloadsWrite-heavy / balanced R&W
Used inDatabases needing fast lookupLinux kernel, Java TreeMap, C++ std::map

5. Segment Trees

A segment tree answers range aggregate queries (sum, min, max) on an array, with point or range updates.

Build: O(n) — recursively split [l,r] into [l,mid] and [mid+1,r]; leaves store A[i].
Range query [l,r]: O(log n)
Point update A[i]: O(log n)
Range update (lazy propagation): O(log n)
Space: O(4n) array (safe upper bound for segment tree size)

Array indexing: Node i has children 2i (left) and 2i+1 (right). (1-indexed root)

Range Query Algorithm

query(node, node_l, node_r, query_l, query_r):
• No overlap (node_r < query_l or node_l > query_r): return identity (0 for sum, ∞ for min)
• Complete overlap (query_l ≤ node_l and node_r ≤ query_r): return tree[node]
• Partial overlap: return combine(query(2*node,…), query(2*node+1,…))

6. Fenwick Tree (Binary Indexed Tree)

A Fenwick tree (BIT) supports prefix sum queries and point updates in O(log n) with O(n) space — simpler than a segment tree for sum-only problems.

Key operation: lowbit(i) = i & (−i) (isolates rightmost set bit)
Prefix sum A[1..i]: sum = 0; while i>0: sum += BIT[i]; i -= lowbit(i)
Update A[i] by delta: while i≤n: BIT[i]+=delta; i+=lowbit(i)
Time: O(log n)    Space: O(n)

7. Trie (Prefix Tree)

A trie stores strings character by character. Each root-to-node path represents a prefix; root-to-marked-leaf = complete word.

Insert/Search: O(m) where m = string length (independent of number of strings)
Space: O(ALPHABET_SIZE × m × n) worst case for n strings of length m
Applications: Autocomplete, spell check, IP routing (radix trie), dictionary.

8. GATE CS Solved Examples

Example 1 — B-tree capacity (GATE CS 2019)

Q: A B-tree of order 5 has height 3. What is the maximum number of keys it can store?

Max keys = mh+1−1 = 54−1 = 625−1 = 624.

Answer: 624 keys.

Example 2 — Red-Black tree height (GATE CS 2020)

Q: A Red-Black tree has n = 15 nodes. What is the maximum possible height?

h ≤ 2×log2(n+1) = 2×log2(16) = 2×4 = 8.

Maximum height = 8. (Tight bound; minimum height = ⌈log2(n+1)⌉ = 4)

Example 3 — Segment tree query (GATE CS 2022)

Q: Array A = [1, 3, 5, 7, 9, 11]. Build a min-segment tree. Answer: range-min query for indices 2 to 5 (1-indexed)?

A[2..5] = [3, 5, 7, 9]. Minimum = 3.

Answer: 3. Query time O(log 6) = O(log n).

9. Common Mistakes

Mistake 1 — B-tree order vs degree confusion

Order m = maximum number of children. Some textbooks define order as maximum keys (= m−1). Always clarify: “order m = at most m children = at most m−1 keys per node.”

Mistake 2 — B+ tree: all keys in all nodes

In a B+ tree, internal nodes have KEY COPIES (routing keys) that appear again in leaves. All actual data is in leaves. The total keys across the tree is greater than in a B-tree of the same order.

Mistake 3 — Red-Black tree root must be Black

After insert, the newly inserted root (for an empty tree) is always painted Black. Some implementations insert Red and then fix — the fix must include making the root Black.

Mistake 4 — Segment tree size = 2n

The safe array size for a segment tree is 4n, not 2n. 2n is only sufficient for perfect binary trees. Use 4n to handle all cases without index-out-of-bounds.

Mistake 5 — Fenwick tree is 1-indexed only

Fenwick trees MUST be 1-indexed. The lowbit operation on index 0 gives 0, causing an infinite loop. Shift input arrays to be 1-indexed when implementing.

10. FAQ

What is the difference between a B-tree and a B+ tree?

B-tree: data in all nodes. B+ tree: data only in leaves, which are linked as a list for efficient range scans. B+ trees have higher fanout (shorter height) and better range query performance — preferred in databases.

What are the properties of a Red-Black tree?

Root is Black. NIL leaves are Black. Red node → both children Black. All root-to-leaf paths have same Black-height. Guarantees height ≤ 2log(n+1). All operations O(log n).

What is a Segment tree and what problems does it solve?

Segment tree: O(n) build, O(log n) range query (sum/min/max), O(log n) point update. Solves range aggregate problems where queries and updates are frequent. Stored as array of size 4n.

What is the order of a B-tree?

Order m: each node has at most m children and m−1 keys. Min children for non-root: ⌈m/2⌉. Height = O(logmn). Higher m = shorter tree = fewer disk I/Os.