Code Optimisation – CSE, Constant Folding, Loop Optimisation and Data Flow Analysis Explained

Key Takeaways:

  • Code optimisation improves program performance (speed, memory, power) without changing what the program computes.
  • Local optimisation works within a single basic block. Global optimisation works across basic blocks using the Control Flow Graph.
  • Common Subexpression Elimination (CSE) avoids recomputing the same expression by reusing its previously computed value.
  • Constant folding evaluates constant expressions at compile time. Constant propagation replaces a variable with its known constant value at each use.
  • Loop optimisations (invariant code motion, strength reduction, induction variable elimination) are the most impactful for programs with tight loops.
  • Data flow analysis (reaching definitions, live variable analysis) computes global information about values and variable lifetimes across the CFG.

1. Optimisation Taxonomy

Optimisations are classified by scope:

  • Peephole: a few consecutive instructions. Examples: removing redundant loads, replacing multiply-by-2 with a shift.
  • Local: within a single basic block. Examples: CSE, constant folding, dead code elimination.
  • Global: across basic blocks within a function. Examples: global CSE, constant propagation, live variable analysis.
  • Loop: within and around loops. Examples: code motion, strength reduction, loop unrolling.
  • Interprocedural: across function boundaries. Examples: inlining, alias analysis, whole-program optimisation.

The fundamental constraint: An optimisation must be safe — it must never change the observable behaviour of a correct program. Compilers prove safety using data flow analysis before applying transformations.

2. Peephole Optimisation

Peephole optimisation examines a small sliding window of instructions (the “peephole”) and replaces pattern sequences with more efficient equivalents. It is simple, machine-specific, and applied after code generation or on intermediate code.

Common Peephole Transformations

  • x = x + 0 → (delete). Adding 0 is the identity.
  • x = x * 1 → (delete). Multiplying by 1 is the identity.
  • x = x * 0 → x = 0. Zero property.
  • x = x * 2 → x = x << 1. A shift is cheaper than a multiply.
  • x = x / 4 → x = x >> 2. A shift is cheaper than a divide (valid for unsigned operands).
  • STORE x; LOAD x → STORE x. The load is redundant immediately after a store to the same location.
  • JUMP L; L: → L:. A jump to the immediately following instruction is useless.
  • if (2 > 3) goto L → (delete). The branch condition is always false.
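As a sketch, a few of the patterns above can be implemented as a single pass over a toy three-address representation (the tuple IR below is invented for illustration, not any real compiler's):

```python
# Minimal peephole pass over a toy three-address IR.
# Each instruction is a tuple (dest, op, arg1, arg2) for binary ops,
# or (dest, "copy", src) for a copy. The IR is invented for illustration.

def peephole(instrs):
    out = []
    for ins in instrs:
        if len(ins) == 4:
            dest, op, a, b = ins
            # x = x + 0  or  x = x * 1  -> delete (identity operations)
            if dest == a and ((op == "+" and b == 0) or (op == "*" and b == 1)):
                continue
            # x = y * 0  ->  x = 0 (zero property)
            if op == "*" and b == 0:
                out.append((dest, "copy", 0))
                continue
            # x = y * 2  ->  x = y << 1 (shift is cheaper)
            if op == "*" and b == 2:
                out.append((dest, "<<", a, 1))
                continue
        out.append(ins)
    return out

code = [
    ("x", "+", "x", 0),   # deleted: identity
    ("y", "*", "y", 2),   # rewritten as a shift
    ("z", "*", "a", 0),   # rewritten as z = 0
]
print(peephole(code))
# [('y', '<<', 'y', 1), ('z', 'copy', 0)]
```

A production peephole optimiser works the same way but matches machine-specific instruction sequences after code generation.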

3. Local Optimisations (Within a Basic Block)

Constant Folding

Evaluate expressions with constant operands at compile time.

t = 3 * 4 + 2   →   t = 14
if (3 > 5) goto L  →   (deleted — always false)

Constant Propagation

Replace a variable with its known constant value at all subsequent uses within the block.

x = 5; y = x + 3  →  x = 5; y = 8
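Folding and propagation are naturally combined in a single left-to-right pass over a basic block. A minimal sketch, assuming a toy tuple IR invented for illustration:

```python
# Combined constant folding + constant propagation over one basic block.
# Instructions are (dest, op, arg1, arg2) or (dest, "copy", src); constants
# are Python ints, variables are strings. The representation is illustrative.

def fold_and_propagate(block):
    consts = {}          # variable -> known constant value at this point
    out = []
    for ins in block:
        if ins[1] == "copy":
            dest, _, src = ins
            src = consts.get(src, src)            # propagate into the copy
            if isinstance(src, int):
                consts[dest] = src
            else:
                consts.pop(dest, None)
            out.append((dest, "copy", src))
        else:
            dest, op, a, b = ins
            a = consts.get(a, a)                  # propagate into operands
            b = consts.get(b, b)
            if isinstance(a, int) and isinstance(b, int):
                val = {"+": a + b, "-": a - b, "*": a * b}[op]
                consts[dest] = val                # fold at "compile time"
                out.append((dest, "copy", val))
            else:
                consts.pop(dest, None)            # dest no longer constant
                out.append((dest, op, a, b))
    return out

# x = 5; y = x + 3   =>   x = 5; y = 8
print(fold_and_propagate([("x", "copy", 5), ("y", "+", "x", 3)]))
# [('x', 'copy', 5), ('y', 'copy', 8)]
```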

Common Subexpression Elimination (CSE)

If the same expression is computed multiple times and its operands have not changed between computations, replace the second computation with a reference to the first result.

t1 = a * b; t2 = a * b; c = t1 + t2
→ t1 = a * b; c = t1 + t1
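Within one block, CSE can be implemented by remembering which variable currently holds each (op, arg1, arg2) value and invalidating entries whenever an operand is redefined. A minimal sketch over a toy tuple IR (invented for illustration):

```python
# Local common subexpression elimination over one basic block.
# Instructions are (dest, op, arg1, arg2) tuples; the IR is illustrative.

def local_cse(block):
    available = {}       # (op, arg1, arg2) -> variable holding that value
    out = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if key in available:
            # Same expression, operands unchanged: reuse the earlier result.
            out.append((dest, "copy", available[key]))
        else:
            out.append((dest, op, a, b))
        # dest is (re)defined: kill every entry that uses dest as an operand
        # or that relied on dest to hold its value.
        available = {k: v for k, v in available.items()
                     if dest not in (k[1], k[2], v)}
        if key not in available and dest not in (a, b):
            available[key] = dest
    return out

# t1 = a*b; t2 = a*b; c = t1+t2   =>   t1 = a*b; t2 = t1; c = t1+t2
print(local_cse([("t1", "*", "a", "b"),
                 ("t2", "*", "a", "b"),
                 ("c", "+", "t1", "t2")]))
```

Copy propagation and dead code elimination would then rewrite c = t1 + t1 and delete t2, matching the transformation shown above.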

Dead Code Elimination

Remove instructions whose results are never used. An assignment x = expr is dead if x is never read after this point.

t = a + b;  /* t is never used */  →  (deleted)

Copy Propagation

After a copy x = y, replace subsequent uses of x with y (while y is not redefined).

x = y; z = x + 1  →  x = y; z = y + 1
(then dead code elimination may remove x = y if x unused)

4. Loop Optimisations

Loops are the hotspot of most programs — a statement executed 1000 times in a loop contributes 1000× more to runtime than a statement executed once. Loop optimisations yield the greatest performance gains.

Loop Invariant Code Motion (LICM)

Move computations that produce the same result on every iteration to outside the loop (into the loop header/pre-header).

for (i=0; i<n; i++) { t = a * b; x[i] = t + i; }
→ t = a * b; for (i=0; i<n; i++) { x[i] = t + i; }

Strength Reduction

Replace expensive operations with cheaper equivalent ones. Most commonly: replace multiplication by an induction variable with addition.

for (i=0; i<n; i++) { t = i * 4; a[t] = 0; }
→ t = 0; for (i=0; i<n; i++) { a[t] = 0; t = t + 4; }
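A quick way to see the equivalence: both loops must generate the same sequence of offsets. A check in Python (illustrative only):

```python
# The original loop computes t = i * 4 on every iteration; the
# strength-reduced loop maintains t with a cheap addition instead.
n = 8

original = [i * 4 for i in range(n)]   # multiply each iteration

reduced, t = [], 0
for i in range(n):                     # addition each iteration
    reduced.append(t)
    t += 4

assert original == reduced == [0, 4, 8, 12, 16, 20, 24, 28]
print("index sequences match:", original)
```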

Induction Variable Elimination

After strength reduction, the original loop counter (induction variable) may no longer be needed — it can be eliminated if the loop exit condition can be reformulated in terms of the new variable.

Loop Unrolling

Replicate the loop body k times and reduce the iteration count by k, eliminating k-1 out of every k branch-back-to-loop-start instructions.

for (i=0; i<8; i++) a[i] = 0;
→ a[0]=0; a[1]=0; a[2]=0; a[3]=0; a[4]=0; a[5]=0; a[6]=0; a[7]=0;

Loop Fusion

Combine two adjacent loops with the same iteration range into one loop, reducing loop overhead and improving cache locality.

Loop Interchange

Swap the order of nested loops so that array elements are accessed in the order they are laid out in memory (row-major order in C), improving cache performance.

5. Data Flow Analysis

Data flow analysis computes information about values and variable uses across the entire CFG — not just within one basic block. It uses equations defined for each basic block (gen and kill sets) and propagates solutions iteratively until a fixed point is reached.

Each analysis defines a direction (forward or backward) and a meet operator (∪ or ∩).

  • Reaching Definitions: forward, meet = ∪. Which definitions reach each point?
  • Live Variables: backward, meet = ∪. Which variables are live (needed) at each point?
  • Available Expressions: forward, meet = ∩. Which expressions have already been computed on every path?
  • Very Busy Expressions: backward, meet = ∩. Which expressions are computed on every path from this point?

6. Reaching Definitions

A definition d of variable x reaches point p if there exists a path from d to p along which x is not redefined. Used for constant propagation and identifying uses of uninitialised variables.

Transfer Functions

OUT[B] = gen[B] ∪ (IN[B] − kill[B])
IN[B] = ∪ OUT[P] for all predecessors P of B
  • gen[B]: Definitions generated in B (last definition of each variable in B)
  • kill[B]: All other definitions of variables defined in B (from anywhere in the program)
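These equations can be solved by round-robin iteration until nothing changes. A minimal sketch on a small made-up CFG (block names, definition labels, and gen/kill sets are invented for illustration):

```python
# Iterative reaching-definitions analysis (forward, meet = union).
# CFG: B1 -> B2 -> B3, plus a back edge B3 -> B2 (a loop).
preds = {"B1": [], "B2": ["B1", "B3"], "B3": ["B2"]}
gen   = {"B1": {"d1"},            # B1 contains d1: i = 0
         "B2": {"d2"},            # B2 contains d2: i = i + 1 (kills d1)
         "B3": set()}
kill  = {"B1": {"d2"}, "B2": {"d1"}, "B3": set()}

IN  = {b: set() for b in preds}
OUT = {b: set() for b in preds}

changed = True
while changed:                    # iterate to a fixed point
    changed = False
    for b in preds:
        IN[b] = set().union(*(OUT[p] for p in preds[b]))
        new_out = gen[b] | (IN[b] - kill[b])
        if new_out != OUT[b]:
            OUT[b], changed = new_out, True

# Both definitions of i reach the loop header B2: d1 from outside the
# loop, d2 along the back edge.
print("defs reaching B2:", sorted(IN["B2"]))
```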

7. Live Variable Analysis

A variable x is live at point p if there is a path from p to some use of x along which x is not redefined. Used for dead code elimination and register allocation: a live variable's value must be preserved in a register or memory until its last use.

Transfer Functions (Backward Analysis)

IN[B] = use[B] ∪ (OUT[B] − def[B])
OUT[B] = ∪ IN[S] for all successors S of B
  • use[B]: Variables used in B before being defined in B
  • def[B]: Variables defined in B before being used in B
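A minimal sketch of the backward iteration on a made-up three-block CFG (use/def sets invented for illustration; `DEF` is capitalised because `def` is a Python keyword):

```python
# Iterative live-variable analysis (backward, meet = union).
# CFG: B1 -> B2 -> B3 (straight line, for simplicity).
succs = {"B1": ["B2"], "B2": ["B3"], "B3": []}
use = {"B1": set(),               # B1: a = 1; b = 2
       "B2": {"a", "b"},          # B2: c = a + b
       "B3": {"c"}}               # B3: return c
DEF = {"B1": {"a", "b"}, "B2": {"c"}, "B3": set()}

IN  = {b: set() for b in succs}
OUT = {b: set() for b in succs}

changed = True
while changed:                    # iterate to a fixed point
    changed = False
    for b in succs:
        OUT[b] = set().union(*(IN[s] for s in succs[b]))
        new_in = use[b] | (OUT[b] - DEF[b])
        if new_in != IN[b]:
            IN[b], changed = new_in, True

# a and b are live entering B2 (used there); c is live leaving B2.
# Nothing is live entering B1: a and b are defined before any use.
print("live into B2:", sorted(IN["B2"]), "| live out of B2:", sorted(OUT["B2"]))
```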

8. Available Expressions

An expression x op y is available at point p if on every path from entry to p, x op y has been computed and neither x nor y has been subsequently modified. Used for global CSE.

Transfer Functions (Forward Analysis)

OUT[B] = e_gen[B] ∪ (IN[B] − e_kill[B])
IN[B] = ∩ OUT[P] for all predecessors P of B

Note: the meet operator is intersection — an expression is only available if it was computed on all preceding paths (conservative: must be available everywhere to safely reuse it).
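A minimal sketch on a made-up diamond-shaped CFG, showing why the intersection matters (block names, expressions, and gen/kill sets invented for illustration):

```python
# Iterative available-expressions analysis (forward, meet = intersection).
# CFG: B1 branches to B2 and B3, which rejoin at B4 (a diamond).
preds  = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3"]}
e_gen  = {"B1": {"a*b"}, "B2": {"c+d"}, "B3": set(), "B4": set()}
e_kill = {"B1": set(), "B2": set(), "B3": {"c+d"}, "B4": set()}
UNIVERSE = {"a*b", "c+d"}

# For an intersection-based analysis, initialise OUT of every block except
# the entry to the full set, then iterate down to a fixed point.
OUT = {b: set(UNIVERSE) for b in preds}
OUT["B1"] = set(e_gen["B1"])
IN = {b: set() for b in preds}

changed = True
while changed:
    changed = False
    for b in preds:
        if preds[b]:
            IN[b] = set.intersection(*(OUT[p] for p in preds[b]))
        new_out = e_gen[b] | (IN[b] - e_kill[b])
        if new_out != OUT[b]:
            OUT[b], changed = new_out, True

# a*b is computed in B1, so it reaches B4 on both paths. c+d is computed
# only on the B2 path, so the intersection removes it: reusing c+d at B4
# would be unsafe.
print("available at entry of B4:", sorted(IN["B4"]))
```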

9. Common Misconceptions

Misconception 1: Code optimisation can always make programs faster.
Optimisation is a best-effort process. Some transformations trade speed for memory (loop unrolling increases code size). Some optimisations help on one architecture but hurt on another (cache effects vary). Compilers offer optimisation levels (O0–O3) precisely because aggressive optimisation is not always beneficial.

Misconception 2: Constant folding and constant propagation are the same thing.
Constant folding evaluates constant expressions at compile time (3*4 becomes 12). Constant propagation replaces a variable with its known constant value (if x=5, replace later uses of x with 5). Folding applies to literal expressions; propagation tracks variables. They are often applied together iteratively.

Misconception 3: Dead code elimination removes unreachable code.
Unreachable code (e.g., code after a return statement) is one form of dead code. Dead code in the data flow sense is reachable code that produces results nobody reads: assignments to variables never subsequently used. Both are eliminated, but they are detected differently: unreachable code by CFG analysis, dead assignments by live variable analysis.

Misconception 4: Loop invariant code motion is always safe.
Moving code out of a loop is safe only if the instruction would have been executed at least once (the loop runs at least once) and has no side effects (or the side effect can safely happen earlier). Moving an expression that could throw an exception changes the program’s observable behaviour. Compilers check these conditions carefully.

Misconception 5: Data flow analysis gives exact answers.
Data flow analysis gives conservative (safe) approximations. Reaching definitions analysis may report that a definition “might reach” a point even if it never does at runtime (due to infeasible paths). The analysis is correct in the sense that it never misses a real dependency — but it may report false dependencies. This conservatism ensures transformations based on it are always safe.

10. Frequently Asked Questions

What is the difference between local and global optimisation?

Local optimisation works within a single basic block — a straight-line code segment with no branches. It can see all values computed in that block and apply transformations like CSE, constant folding, and dead code elimination. Global optimisation works across multiple basic blocks using the Control Flow Graph and data flow analysis. It can move code across branches (like hoisting loop-invariant computations out of a loop), but requires more complex analysis to prove safety.

How does a compiler perform common subexpression elimination globally?

Global CSE uses available expressions analysis. The compiler computes which expressions are available at the entry of each basic block (have been computed on all preceding paths and not subsequently killed). If expression x op y is available at a point where it is computed again, the second computation is replaced by the previously computed value. This requires inserting a temporary variable to hold the value across basic blocks.

What is live variable analysis and why is it important for register allocation?

A variable is live at a point if its current value will be used later on some execution path. Live variable analysis computes the set of live variables at every program point. For register allocation: a register holding a variable’s value must not be reused for another variable as long as the first variable is still live. Two variables that are never simultaneously live can share a register. This analysis is the foundation of graph-colouring register allocation.

What is strength reduction and when is it used?

Strength reduction replaces a computationally expensive operation with a cheaper one that produces the same result. The most common example is replacing multiplication by an induction variable (i * k) with repeated addition (add k each iteration). It is applied during loop optimisation. On modern CPUs, multiplication is fast, so strength reduction matters mainly for memory address calculations and embedded/RISC systems where multiply is expensive.
