Evaluate constant expressions at compile time.
# Before var x := 10 + 20; var y := x * 2; # After var x := 30; var y := 60;Multi-pass propagation through the AST.
Dead Code EliminationBuild CFG, compute reachability, remove unreachable code.
# Before sub Foo() is return; print("never"); # dead end sub; # After sub Foo() is return; end sub;Common Subexpression Elimination (CSE)Track available expressions, reuse computed values.
# Before var a := x + y; var b := x + y; # redundant # After var a := x + y; var b := a;For 8080, this is critical - arithmetic is expensive.
Strength ReductionReplace expensive operations with cheaper ones.
On 8080 without MUL/DIV instructions, this is huge.
Move computations that don't change inside a loop to outside.
# Before while i < n loop var offset := base + stride; # invariant arr[i] := arr[i] + offset; i := i + 1; end loop; # After var offset := base + stride; while i < n loop arr[i] := arr[i] + offset; i := i + 1; end loop;Loop Unrolling (Small Loops)For loops with known small iteration counts (2-8), unroll completely.
# Before var i: uint8 := 0; while i < 4 loop arr[i] := 0; i := i + 1; end loop; # After arr[0] := 0; arr[1] := 0; arr[2] := 0; arr[3] := 0;Eliminates loop overhead (compare, branch, increment).
Backward dataflow to determine which variables are live at each point.
Register AllocationThe biggest single optimization for 8080.
Strategy: Linear Scan (simpler than graph coloring)
8080 Register Priorities:
HL - pointer operations, array access DE - secondary pointer, 16-bit values BC - loop counters, 16-bit values A - arithmetic (implicit, always available)Allocation Rules:
Eliminate unnecessary MOVs by giving source and dest the same register.
; Before MOV A, B MOV C, A ; A is now dead ; After (allocate B=C) ; eliminated entirelyCowgol already does this conservatively. With full call-graph analysis, we can do it optimally.
sub A() calls B(), C() sub B() uses x, y sub C() uses z, w # x,y and z,w can share memory (B and C never concurrent)Workspace MinimizationNested subroutines share workspace. Minimize total workspace by optimal variable packing.
Constant PoolingShare identical constants in memory.
print("Error"); print("Error"); # share the stringSince Cowgol forbids recursion, we can inline aggressively.
Inline Criteria:
Benefits:
When an interface variable has only one possible implementation, inline it.
interface Cmp(a: uint8, b: uint8): (r: int8); sub NumCmp implements Cmp is ... end sub; var cmp: Cmp := NumCmp; # If NumCmp is the only implementation, inline calls to cmp()After code generation, pattern-match and replace.
Choose optimal instructions during code gen.
; Load 0 into A MVI A, 0 ; 7 cycles, 2 bytes XRA A ; 4 cycles, 1 byte ← better ; Compare A to 0 CPI 0 ; 7 cycles, 2 bytes ORA A ; 4 cycles, 1 byte ← better (sets Z flag) ; 16-bit increment INR L ; need to handle carry ; vs INX H ; 5 cycles, 1 byte ← betterConvert to Static Single Assignment for:
Reorder instructions to minimize stalls (less critical on 8080 than modern CPUs).
Profile-Guided OptimizationRun program, collect execution counts, optimize hot paths.