- Code optimisation improves program performance (speed, memory, power) without changing what the program computes.
- Local optimisation works within a single basic block. Global optimisation works across basic blocks using the Control Flow Graph.
- Common Subexpression Elimination (CSE) avoids recomputing the same expression by reusing its previously computed value.
- Constant folding evaluates constant expressions at compile time. Constant propagation replaces a variable with its known constant value at each use.
- Loop optimisations (invariant code motion, strength reduction, induction variable elimination) are the most impactful for programs with tight loops.
- Data flow analysis (reaching definitions, live variable analysis) computes global information about values and variable lifetimes across the CFG.
1. Optimisation Taxonomy
Optimisations are classified by scope:
| Level | Scope | Examples |
|---|---|---|
| Peephole | Few consecutive instructions | Remove redundant loads, replace multiply-by-2 with shift |
| Local | Within a single basic block | CSE, constant folding, dead code elimination |
| Global | Across basic blocks within a function | Global CSE, constant propagation, live variable analysis |
| Loop | Within and around loops | Code motion, strength reduction, loop unrolling |
| Interprocedural | Across function boundaries | Inlining, alias analysis, whole-program optimisation |
The fundamental constraint: An optimisation must be safe — it must never change the observable behaviour of a correct program. Compilers prove safety using data flow analysis before applying transformations.
2. Peephole Optimisation
Peephole optimisation examines a small sliding window of instructions (the “peephole”) and replaces pattern sequences with more efficient equivalents. It is simple, machine-specific, and applied after code generation or on intermediate code.
Common Peephole Transformations
| Pattern (Before) | Replacement (After) | Reason |
|---|---|---|
| x = x + 0 | (delete) | Adding 0 is identity |
| x = x * 1 | (delete) | Multiplying by 1 is identity |
| x = x * 0 | x = 0 | Zero property |
| x = x * 2 | x = x << 1 | Shift is cheaper than multiply |
| x = x / 4 | x = x >> 2 | Shift is cheaper than divide (unsigned) |
| STORE x; LOAD x | STORE x | Redundant load after store to same location |
| JUMP L; L: | L: | Useless jump to next instruction |
| if (2 > 3) goto L | (delete) | Always-false branch |
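The table above can be sketched as a tiny peephole pass. This is a hypothetical illustration (the `Instr` three-address representation and rule set are invented for this example, not any particular compiler's IR); it slides over an instruction list and rewrites two of the patterns in place:

```c
#include <assert.h>

/* Hypothetical three-address instruction: dst = src1 op imm */
typedef enum { OP_ADD, OP_MUL, OP_SHL, OP_NOP } Op;
typedef struct { Op op; int dst, src1, imm; } Instr;

/* Return k if v == 2^k, else -1. Used to detect power-of-two multiplies. */
static int log2_exact(int v) {
    int k = 0;
    if (v <= 0) return -1;
    while ((v & 1) == 0) { v >>= 1; k++; }
    return v == 1 ? k : -1;
}

void peephole(Instr *code, int n) {
    for (int i = 0; i < n; i++) {
        /* Rule: x = y * 2^k  ->  x = y << k (shift is cheaper than multiply) */
        if (code[i].op == OP_MUL) {
            int k = log2_exact(code[i].imm);
            if (k >= 0) { code[i].op = OP_SHL; code[i].imm = k; }
        }
        /* Rule: x = x + 0  ->  delete (adding 0 is the identity) */
        if (code[i].op == OP_ADD && code[i].imm == 0)
            code[i].op = OP_NOP;
    }
}
```

A production peephole optimiser keeps a table of such rules and reapplies the pass until no rule fires, since one rewrite can expose another.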
3. Local Optimisations (Within a Basic Block)
Constant Folding
Evaluate expressions with constant operands at compile time.
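As an implementation sketch (the `Expr` node type here is hypothetical, assuming a simple expression tree with only constants, addition, and multiplication), folding works bottom-up: once both children of a binary node are constant leaves, the node itself collapses into a constant:

```c
#include <assert.h>

/* Hypothetical expression node: either a constant leaf or a binary operator. */
typedef enum { K_CONST, K_ADD, K_MUL } Kind;
typedef struct Expr { Kind kind; int value; struct Expr *lhs, *rhs; } Expr;

/* Fold constant subtrees bottom-up; a node whose children are both
 * constants is replaced in place by a constant leaf. */
void fold(Expr *e) {
    if (e->kind == K_CONST) return;
    fold(e->lhs);
    fold(e->rhs);
    if (e->lhs->kind == K_CONST && e->rhs->kind == K_CONST) {
        int l = e->lhs->value, r = e->rhs->value;
        e->value = (e->kind == K_ADD) ? l + r : l * r;
        e->kind = K_CONST;   /* node is now a literal; children become dead */
    }
}
```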
t = 3 * 4 + 2 → t = 14
if (3 > 5) goto L → (deleted — always false)
Constant Propagation
Replace a variable with its known constant value at all subsequent uses within the block.
x = 5; y = x + 3 → x = 5; y = 8
Common Subexpression Elimination (CSE)
If the same expression is computed multiple times and its operands have not changed between computations, replace the second computation with a reference to the first result.
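A minimal sketch of local CSE over a block of three-address quads (the `Quad` format is invented for this example; real compilers use local value numbering with a hash table rather than this quadratic scan). For each quad it reports which earlier temp, if any, already holds the same value:

```c
#include <assert.h>

/* Hypothetical three-address quad: dst = src1 op src2 (operands are temp ids). */
typedef struct { char op; int dst, src1, src2; } Quad;

/* For each quad i, out_temp[i] is the temp holding its value: either its own
 * dst, or an earlier quad's result if the same expression is still valid. */
void local_cse(Quad *q, int n, int *out_temp) {
    for (int i = 0; i < n; i++) {
        out_temp[i] = q[i].dst;
        for (int j = 0; j < i; j++) {
            if (q[j].op == q[i].op && q[j].src1 == q[i].src1
                                   && q[j].src2 == q[i].src2) {
                /* reuse is safe only if neither operand was redefined between */
                int killed = 0;
                for (int k = j + 1; k < i; k++)
                    if (q[k].dst == q[i].src1 || q[k].dst == q[i].src2)
                        killed = 1;
                if (!killed) out_temp[i] = out_temp[j];
            }
        }
    }
}
```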
t1 = a * b; t2 = a * b; c = t1 + t2 → t1 = a * b; c = t1 + t1
Dead Code Elimination
Remove instructions whose results are never used. An assignment x = expr is dead if x is never read after this point.
t = a + b; /* t is never used */ → (deleted)
Copy Propagation
After a copy x = y, replace subsequent uses of x with y (while y is not redefined).
x = y; z = x + 1 → x = y; z = y + 1 (then dead code elimination may remove x = y if x is unused)
4. Loop Optimisations
Loops are the hotspot of most programs — a statement executed 1000 times in a loop contributes 1000 times as much to runtime as a statement executed once. Loop optimisations therefore yield the greatest performance gains.
Loop Invariant Code Motion (LICM)
Move computations that produce the same result on every iteration out of the loop, into a pre-header block that runs once before the loop is entered.
for (i=0; i<n; i++) { t = a * b; x[i] = t + i; }
→ t = a * b; for (i=0; i<n; i++) { x[i] = t + i; }
Strength Reduction
Replace expensive operations with cheaper equivalent ones. Most commonly: replace multiplication by an induction variable with addition.
for (i=0; i<n; i++) { t = i * 4; a[t] = 0; }
→ t = 0; for (i=0; i<n; i++) { a[t] = 0; t = t + 4; }
Induction Variable Elimination
After strength reduction, the original loop counter (induction variable) may no longer be needed — it can be eliminated if the loop exit condition can be reformulated in terms of the new variable.
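The combined effect can be seen in C. In the first version the counter `i` drives both the exit test and the address computation; in the second, the pointer `p` is the strength-reduced induction variable and the exit test has been reformulated as `p < end`, so `i` disappears entirely (a sketch of what the compiler produces, written as source):

```c
#include <assert.h>

/* Before: counter i drives the exit test and the (hidden) multiply a + i*4. */
void zero_with_counter(int *a, int n) {
    for (int i = 0; i < n; i++)
        a[i] = 0;
}

/* After strength reduction + induction variable elimination: p advances by
 * addition each iteration, the exit test uses p, and i is gone. */
void zero_with_pointer(int *a, int n) {
    for (int *p = a, *end = a + n; p < end; p++)
        *p = 0;
}
```

Both functions compute the same result; only the induction variable differs.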
Loop Unrolling
Replicate the loop body k times and divide the iteration count by k, eliminating k−1 out of every k branch-back-to-loop-start instructions (at the cost of larger code).
for (i=0; i<8; i++) a[i] = 0;
→ a[0]=0; a[1]=0; a[2]=0; a[3]=0; a[4]=0; a[5]=0; a[6]=0; a[7]=0;
Loop Fusion
Combine two adjacent loops with the same iteration range into one loop, reducing loop overhead and improving cache locality.
Loop Interchange
Swap the order of nested loops to improve cache performance (accessing array memory in row-major order).
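A concrete illustration of interchange (array sizes chosen arbitrarily for the example). C stores arrays row-major, so the second version's inner loop touches consecutive memory addresses and reuses cache lines; the interchange is legal here because the iterations are independent:

```c
#include <assert.h>
#define ROWS 64
#define COLS 64

/* Before: inner loop walks down a column, striding COLS ints per access. */
long sum_column_major(int m[ROWS][COLS]) {
    long s = 0;
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            s += m[i][j];
    return s;
}

/* After interchange: inner loop walks along a row, so consecutive accesses
 * are adjacent in memory and hit the same cache lines. */
long sum_row_major(int m[ROWS][COLS]) {
    long s = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            s += m[i][j];
    return s;
}
```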
5. Data Flow Analysis
Data flow analysis computes information about values and variable uses across the entire CFG — not just within one basic block. It uses equations defined for each basic block (gen and kill sets) and propagates solutions iteratively until a fixed point is reached.
Each analysis defines a direction (forward or backward) and a meet operator (∪ or ∩).
| Analysis | Direction | Meet | Purpose |
|---|---|---|---|
| Reaching Definitions | Forward | ∪ | Which definitions reach each point? |
| Live Variables | Backward | ∪ | Which variables are live (needed) at each point? |
| Available Expressions | Forward | ∩ | Which expressions have already been computed? |
| Very Busy Expressions | Backward | ∩ | Which expressions are computed on every path? |
6. Reaching Definitions
A definition d of variable x reaches point p if there exists a path from d to p along which x is not redefined. Used for constant propagation and identifying uses of uninitialised variables.
Transfer Functions
OUT[B] = gen[B] ∪ (IN[B] − kill[B])
IN[B] = ∪ OUT[P] for all predecessors P of B
- gen[B]: Definitions generated in B (last definition of each variable in B)
- kill[B]: All other definitions of variables defined in B (from anywhere in the program)
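These equations can be solved by the standard iterative algorithm. The sketch below (a hypothetical three-block CFG, with definitions encoded as bit positions in an unsigned word) repeats the transfer functions until nothing changes — the fixed point:

```c
#include <assert.h>

enum { NBLOCKS = 3 };

/* Iterative forward solver: IN[B] = union of OUT[P] over predecessors P,
 * OUT[B] = gen[B] | (IN[B] & ~kill[B]), repeated to a fixed point. */
void reaching_definitions(const unsigned gen[], const unsigned kill[],
                          int pred[][NBLOCKS], const int npred[],
                          unsigned in[], unsigned out[]) {
    for (int b = 0; b < NBLOCKS; b++) { in[b] = 0; out[b] = gen[b]; }
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int b = 0; b < NBLOCKS; b++) {
            unsigned newin = 0;
            for (int p = 0; p < npred[b]; p++)
                newin |= out[pred[b][p]];
            unsigned newout = gen[b] | (newin & ~kill[b]);
            if (newin != in[b] || newout != out[b]) changed = 1;
            in[b] = newin; out[b] = newout;
        }
    }
}
```

In the test below, B0 and B1 each define x (definitions d0 = bit 0 and d1 = bit 1, which kill each other), with edges B0→B1, B0→B2, B1→B2 — so both definitions reach B2.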
7. Live Variable Analysis
A variable x is live at point p if there is a path from p to some use of x along which x is not redefined. Used for dead code elimination and register allocation (a register holding a live variable's value must not be reused until the variable is dead).
Transfer Functions (Backward Analysis)
IN[B] = use[B] ∪ (OUT[B] − def[B])
OUT[B] = ∪ IN[S] for all successors S of B
- use[B]: Variables used in B before being defined in B
- def[B]: Variables defined in B before being used in B
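The backward solver mirrors the forward one, propagating from successors instead of predecessors. Again a sketch on a hypothetical three-block straight-line CFG, with variables as bit positions:

```c
#include <assert.h>

enum { NB = 3 };

/* Iterative backward solver: OUT[B] = union of IN[S] over successors S,
 * IN[B] = use[B] | (OUT[B] & ~def[B]), repeated to a fixed point. */
void live_variables(const unsigned use[], const unsigned def[],
                    int succ[][NB], const int nsucc[],
                    unsigned in[], unsigned out[]) {
    for (int b = 0; b < NB; b++) { out[b] = 0; in[b] = use[b]; }
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int b = NB - 1; b >= 0; b--) {   /* reverse order converges faster */
            unsigned newout = 0;
            for (int s = 0; s < nsucc[b]; s++)
                newout |= in[succ[b][s]];
            unsigned newin = use[b] | (newout & ~def[b]);
            if (newout != out[b] || newin != in[b]) changed = 1;
            out[b] = newout; in[b] = newin;
        }
    }
}
```

The test models B0: `x = 1`; B1: `y = x + 1`; B2: `return y` (x = bit 0, y = bit 1), chained B0→B1→B2: x is live across the B0→B1 edge, y across B1→B2, and nothing is live at entry.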
8. Available Expressions
An expression x op y is available at point p if on every path from entry to p, x op y has been computed and neither x nor y has been subsequently modified. Used for global CSE.
Transfer Functions (Forward Analysis)
OUT[B] = e_gen[B] ∪ (IN[B] − e_kill[B])
IN[B] = ∩ OUT[P] for all predecessors P of B
Note: the meet operator is intersection — an expression is only available if it was computed on all preceding paths (conservative: it must be available everywhere to safely reuse it).
9. Common Misconceptions
Misconception 1: Code optimisation can always make programs faster.
Optimisation is a best-effort process. Some transformations trade speed for memory (loop unrolling increases code size). Some optimisations help on one architecture but hurt on another (cache effects vary). Compilers offer optimisation levels (O0–O3) precisely because aggressive optimisation is not always beneficial.
Misconception 2: Constant folding and constant propagation are the same thing.
Constant folding evaluates constant expressions at compile time (3*4 becomes 12). Constant propagation replaces a variable with its known constant value (if x=5, replace later uses of x with 5). Folding applies to literal expressions; propagation tracks variables. They are often applied together iteratively.
Misconception 3: Dead code elimination removes unreachable code.
Unreachable code (e.g., code after a return statement) is one form of dead code. Dead code in the data-flow sense is reachable code that produces results nobody reads — assignments to variables never subsequently used. Both are eliminated, but they are detected differently: unreachable code by CFG analysis, dead assignments by live variable analysis.
Misconception 4: Loop invariant code motion is always safe.
Moving code out of a loop is safe only if the instruction would have been executed at least once (the loop runs at least once) and has no side effects (or the side effect can safely happen earlier). Moving an expression that could throw an exception changes the program’s observable behaviour. Compilers check these conditions carefully.
Misconception 5: Data flow analysis gives exact answers.
Data flow analysis gives conservative (safe) approximations. Reaching definitions analysis may report that a definition “might reach” a point even if it never does at runtime (due to infeasible paths). The analysis is correct in the sense that it never misses a real dependency — but it may report false dependencies. This conservatism ensures transformations based on it are always safe.
10. Frequently Asked Questions
What is the difference between local and global optimisation?
Local optimisation works within a single basic block — a straight-line code segment with no branches. It can see all values computed in that block and apply transformations like CSE, constant folding, and dead code elimination. Global optimisation works across multiple basic blocks using the Control Flow Graph and data flow analysis. It can move code across branches (like hoisting loop-invariant computations out of a loop), but requires more complex analysis to prove safety.
How does a compiler perform common subexpression elimination globally?
Global CSE uses available expressions analysis. The compiler computes which expressions are available at the entry of each basic block (have been computed on all preceding paths and not subsequently killed). If expression x op y is available at a point where it is computed again, the second computation is replaced by the previously computed value. This requires inserting a temporary variable to hold the value across basic blocks.
What is live variable analysis and why is it important for register allocation?
A variable is live at a point if its current value will be used later on some execution path. Live variable analysis computes the set of live variables at every program point. For register allocation: a register holding a variable’s value must not be reused for another variable as long as the first variable is still live. Two variables that are never simultaneously live can share a register. This analysis is the foundation of graph-colouring register allocation.
What is strength reduction and when is it used?
Strength reduction replaces a computationally expensive operation with a cheaper one that produces the same result. The most common example is replacing multiplication by an induction variable (i * k) with repeated addition (add k each iteration). It is applied during loop optimisation. On modern CPUs, multiplication is fast, so strength reduction matters mainly for memory address calculations and embedded/RISC systems where multiply is expensive.