Greedy Algorithms
Activity selection, Huffman, Kruskal, Prim, Dijkstra, and fractional knapsack — with correctness proofs for GATE CS.
Last updated: April 2026 | GATE CS 2024–2026 syllabus
Key Takeaways
- Greedy: make the locally optimal choice at each step, never backtrack.
- Greedy is correct when: greedy choice property + optimal substructure.
- Correctness proven by: exchange argument or greedy stays ahead.
- Activity selection: sort by finish time, always pick earliest finishing. O(n log n).
- Huffman: build prefix-free code. O(n log n). Minimises expected code length.
- Dijkstra: single-source shortest paths, non-negative edges. O((V+E) log V) with binary heap.
- Kruskal/Prim: MST. Kruskal: O(E log E). Prim with Fibonacci heap: O(E + V log V).
1. Greedy vs Dynamic Programming
| Property | Greedy | Dynamic Programming |
|---|---|---|
| Subproblem dependency | Subproblem chosen by greedy choice | All subproblems solved |
| Decisions | One decision per step, irrevocable | Considers all alternatives |
| Speed | Usually faster | Usually slower |
| When correct | Only with greedy choice property | With optimal substructure (plus overlapping subproblems) |
| Example | Fractional knapsack, Dijkstra | 0-1 Knapsack, Bellman-Ford |
2. Activity Selection
Select maximum number of non-overlapping activities. Each activity has start s[i] and finish f[i].
Algorithm: Sort by f[i]. Select first activity. For each next: if s[i] ≥ f[last], select it.
Time: O(n log n) for sort + O(n) scan = O(n log n)
Proof (exchange argument): Let OPT be optimal. If OPT’s first choice has later finish than greedy’s, swap it with greedy choice — OPT size doesn’t decrease (earlier finish ⇒ at least as much room for remaining). So greedy stays ahead.
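The algorithm above can be sketched in Python (a minimal version, assuming activities are given as (start, finish) pairs):

```python
def select_activities(activities):
    """Greedy activity selection: sort by finish time, then pick each
    activity whose start is >= the finish of the last one selected."""
    chosen = []
    last_finish = float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen

# Example: picks (1,4), (5,7), (8,11) — three non-overlapping activities.
acts = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
print(select_activities(acts))
```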
3. Fractional Knapsack
Greedy rule: Sort by ratio v[i]/w[i] descending. Fill greedily; take fraction of last item.
Time: O(n log n) for sort + O(n) fill = O(n log n)
Proof: Exchange argument — swapping any fraction of a lower-ratio item for a fraction of a higher-ratio item increases total value. So taking highest ratios first is optimal.
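A short Python sketch of the greedy rule above (assuming items are (value, weight) pairs); note the same items appear as the 0-1 knapsack counterexample in Common Mistakes below, where fractional greedy does succeed:

```python
def fractional_knapsack(items, capacity):
    """items: list of (value, weight). Take items in decreasing
    value/weight ratio; take a fraction of the last item if needed."""
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)        # whole item, or the remaining fraction
        total += value * (take / weight)
        capacity -= take
    return total

# Items (60,10), (100,20), (120,30) with W=50: take the first two whole,
# then 20/30 of the third → 60 + 100 + 80 = 240.
print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))
```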
4. Huffman Coding
Assign variable-length prefix-free binary codes to symbols to minimise expected code length.
1. Insert all symbols as leaf nodes into a min-heap (key = frequency).
2. Repeat n−1 times: extract two minima x,y; create parent node with freq x+y; insert parent.
3. Root is Huffman tree. Left edge = 0, right edge = 1.
Time: O(n log n) using min-heap
Expected code length = ∑ freq[i] × depth[i]
Huffman's algorithm minimises this quantity: the result is an optimal prefix-free code.
Properties: More frequent symbols → shorter codes. No code is prefix of another.
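The three steps above can be sketched with Python's `heapq` (a counter breaks frequency ties so the heap never compares tree nodes directly):

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """freqs: {symbol: frequency}. Returns {symbol: bit string}.
    Nodes are (symbol, left, right); leaves have left = right = None."""
    tie = count()
    heap = [(f, next(tie), (sym, None, None)) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # extract two minima,
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (None, left, right)))  # merge
    codes = {}
    def walk(node, prefix):
        sym, left, right = node
        if sym is not None:                  # leaf: record its code
            codes[sym] = prefix or "0"       # single-symbol edge case
            return
        walk(left, prefix + "0")             # left edge = 0
        walk(right, prefix + "1")            # right edge = 1
    walk(heap[0][2], "")
    return codes
```

With frequencies {5, 10, 20, 25, 40} the merges are forced, so the code lengths are 4, 4, 3, 2, 1 regardless of tie-breaking.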
5. Job Scheduling
Minimise maximum lateness: sort jobs by deadline (Earliest Deadline First) and schedule in that order.
Lateness of job i = max(0, completion_time − d[i]); EDF minimises the maximum lateness. O(n log n).
Maximize profit (each job takes 1 unit, has deadline d[i], profit p[i]):
Sort by profit descending. Schedule each job in latest available slot before its deadline using Union-Find.
Time: O(n log n) sort + O(n α(n)) scheduling.
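A Python sketch of the profit-maximising variant above, with the Union-Find slot trick (parent[t] points to the latest free slot ≤ t):

```python
def job_sequencing(jobs):
    """jobs: list of (profit, deadline); each job takes one unit of time.
    Greedy: sort by profit descending, place each job in the latest
    free slot <= its deadline, found in near-O(1) via union-find."""
    max_d = max(d for _, d in jobs)
    parent = list(range(max_d + 1))   # parent[t] = latest free slot <= t

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t

    total = 0
    for profit, deadline in sorted(jobs, reverse=True):
        slot = find(min(deadline, max_d))
        if slot > 0:                  # slot 0 means no slot is free in time
            total += profit
            parent[slot] = slot - 1   # slot used: point to the one before it
    return total

# Classic example: profits 100+27+15 = 142 is the best achievable.
print(job_sequencing([(100, 2), (19, 1), (27, 2), (25, 1), (15, 3)]))
```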
6. Dijkstra’s Algorithm
Single-source shortest paths on graphs with non-negative edge weights.
Extract vertex u with min dist. For each neighbour v: if dist[u]+w(u,v)<dist[v]: relax.
Time: O((V+E) log V) with binary heap
Time: O(E + V log V) with Fibonacci heap
Space: O(V)
Greedy choice: The vertex with smallest dist[] that hasn’t been settled has its true shortest distance.
Fails on: Negative edges (settled vertices may be re-optimised by negative edge later).
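A minimal binary-heap Dijkstra in Python, using lazy deletion (stale heap entries are skipped rather than decreased in place); the graph format `{u: [(v, w), ...]}` is an assumption for illustration:

```python
import heapq

def dijkstra(adj, source):
    """adj: {u: [(v, w), ...]} with non-negative weights w.
    Returns dict of shortest distances from source. O((V+E) log V)."""
    dist = {u: float("inf") for u in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale entry: u already settled
        for v, w in adj[u]:
            if d + w < dist[v]:           # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist
```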
7. Minimum Spanning Tree: Kruskal & Prim
Kruskal's algorithm:
Sort all edges by weight. Add each edge if it doesn't create a cycle (use Union-Find).
Time: O(E log E) = O(E log V). Space: O(V)
Best for sparse graphs.
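A compact Python sketch of Kruskal with path-halving Union-Find (vertices assumed numbered 0..n−1):

```python
def kruskal(n, edges):
    """edges: list of (w, u, v) triples over vertices 0..n-1.
    Sort by weight; union-find rejects cycle-forming edges."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                        # joins two components: no cycle
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return total, mst
```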
Prim’s algorithm:
Start with any vertex. Repeatedly add minimum-weight edge connecting tree to non-tree vertex.
Time: O(E log V) with binary heap; O(E + V log V) with Fibonacci heap
Best for dense graphs (adjacency matrix): O(V²).
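Prim's algorithm can be sketched the same way with a binary heap (adjacency lists `{u: [(w, v), ...]}`, undirected graph assumed):

```python
import heapq

def prim(adj, start=0):
    """adj: {u: [(w, v), ...]}, undirected. Grow a tree from `start`,
    always taking the cheapest edge to a vertex outside the tree."""
    in_tree = set()
    total = 0
    heap = [(0, start)]                    # (edge weight into tree, vertex)
    while heap:
        w, u = heapq.heappop(heap)
        if u in in_tree:
            continue                       # stale entry: u already in tree
        in_tree.add(u)
        total += w
        for wv, v in adj[u]:
            if v not in in_tree:
                heapq.heappush(heap, (wv, v))
    return total
```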
Correctness (Cut property): For any cut of the graph, the minimum-weight edge crossing the cut belongs to some MST.
Cycle property: The maximum-weight edge in any cycle (if it is the unique maximum) is NOT in any MST.
| Property | Kruskal | Prim |
|---|---|---|
| Approach | Edge-based | Vertex-based |
| Data structure | Union-Find | Priority queue |
| Best for | Sparse (E ≈ V) | Dense (E ≈ V²) |
| Time (binary heap) | O(E log V) | O(E log V) |
8. Proving Correctness
Method 1 — Greedy stays ahead:
Define a measure of solution quality. Show by induction that after each step, the greedy solution is at least as good as any other partial solution. Conclude greedy is optimal at the end.
Method 2 — Exchange argument:
Take an arbitrary optimal solution OPT. Show you can transform OPT into greedy solution step by step without decreasing objective value. Conclude greedy ≥ OPT ⇒ greedy is optimal.
9. GATE Examples
Example 1 — Huffman coding with frequencies 5, 10, 20, 25, 40 (total 100):
Build tree: merge 5+10=15, 15+20=35, 25+35=60, 40+60=100.
Codes: 40→0 (1 bit), 25→10 (2), 20→110 (3), 10→1110 (4), 5→1111 (4)
Expected length = (40×1 + 25×2 + 20×3 + 10×4 + 5×4)/100 = (40+50+60+40+20)/100 = 210/100 = 2.1 bits/symbol
Total for a 1000-symbol message = 2.1 × 1000 = 2100 bits.
Example 2 — Dijkstra on a graph with negative edge weights: Dijkstra may give incorrect answers, because settled vertices cannot be reopened even if a negative edge later offers a shorter path. Answer: Dijkstra is not guaranteed to give correct SSSP with negative edges.
Example 3 — MST edge count for a connected graph on 5 vertices: an MST always has V−1 = 4 edges. Answer: 4.
10. Common Mistakes
- Applying Dijkstra to graphs with negative edges: It gives incorrect results. Use Bellman-Ford for negative edges (O(VE)).
- Confusing Huffman optimality with minimum characters: Huffman minimises expected (average) code length weighted by frequency, not total characters.
- Kruskal needs connected graph: Kruskal produces MST only if graph is connected; otherwise produces MSF (minimum spanning forest).
- Activity selection: sort by start time, not finish time: Sorting by start time does NOT give maximum activities. Sort by finish time.
- Greedy for 0-1 Knapsack: Greedy by ratio fails for 0-1 knapsack. Classic counterexample: items (v=60,w=10), (v=100,w=20), (v=120,w=30) with W=50 — greedy by ratio takes the first two items for value 160, but the optimum (items 2 and 3) is worth 220.
11. FAQ
- What is the greedy choice property?
- A greedy algorithm has the greedy choice property if making the locally optimal choice at each step leads to a globally optimal solution. Formally: there always exists an optimal solution that includes the greedy choice made at the first step.
- Why does Dijkstra fail on negative edges?
- Dijkstra assumes once a vertex u is extracted from the priority queue (settled), dist[u] is final. A negative-weight edge could later provide a shorter path to u through an as-yet unsettled vertex, contradicting this assumption. Bellman-Ford handles negative edges by relaxing all edges V−1 times.
- What is the time complexity of Huffman coding?
- O(n log n) using a min-heap. Building initial heap: O(n). Each of the n−1 merge operations extracts twice and inserts once: O(log n) each. Total: O(n log n).
- How do you prove a greedy algorithm is correct?
- Two standard methods: (1) Greedy stays ahead — prove by induction that after k steps, greedy is at least as good as any other algorithm. (2) Exchange argument — take any optimal solution, swap its choices with greedy choices one by one, proving each swap doesn’t decrease objective value, until you arrive at the greedy solution.