Local Optimizations
Table of Contents
- Intuition
- Review
- Classic optimizations and their analyses
- Reaching definitions analysis
- Defining the analysis with set operations
- An example
- Available expressions analysis
- Defining the analysis with set operations
- An example
- Live variable analysis
- Informal algorithm
- Defining the analysis with set operations
- An example
- Generalizing data-flow analyses
- Transfer functions
- Formalizing the analyses
- GEN and KILL
Intuition
Many equivalent programs
a := 2; b := a + 3;
We chose an easy translation
_t0 := 2
a := _t0
_t1 := 3
_t2 := a + _t1
This leads to lots of extra assignments, operations, etc.
For instance, why bother performing an assignment when we know a will be a constant?
_t0 := 2
a := _t0
=>
_t0 := 2
a := 2 # rewrite a with _t0 = 2
Can we remove _t0 := 2?
Moreover, now that we know a is constant, we can conclude that _t2 is constant too.
_t0 := 2
a := 2
_t1 := 3
_t2 := a + _t1
=>
_t0 := 2
a := 2
_t1 := 3
_t2 := 5 # 2 + 3 = 5
How can we do this? One way is peephole optimization.
We can look for specific patterns of instruction sequences and rewrite them.
For instance, for constant propagation, we can check every pair of adjacent instructions for the pattern
Const(t, n)
Assign(a, t)
and always rewrite the Assign to a Const instruction:
Const(t, n)
Const(a, n)
We can prove by hand (or convince ourselves intuitively) that each transformation produces an equivalent program.
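The two-instruction pattern above could be sketched in Python roughly as follows. The `Const` and `Assign` classes and the name `peephole_const_prop` are assumptions made for illustration, not the course's IR implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:            # dst := value
    dst: str
    value: int

@dataclass(frozen=True)
class Assign:           # dst := src
    dst: str
    src: str

def peephole_const_prop(instrs):
    """Rewrite Const(t, n); Assign(a, t) into Const(t, n); Const(a, n)."""
    out = list(instrs)
    for i in range(len(out) - 1):
        first, second = out[i], out[i + 1]
        # Only adjacent pairs are inspected: that is the peephole window.
        if (isinstance(first, Const) and isinstance(second, Assign)
                and second.src == first.dst):
            out[i + 1] = Const(second.dst, first.value)
    return out

program = [Const("_t0", 2), Assign("a", "_t0")]
optimized = peephole_const_prop(program)
```

Because the window covers only neighboring instructions, a single unrelated instruction between the two is enough to hide the pattern from this sketch.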
Other peephole optimizations include strength reduction and related algebraic simplifications, such as
_t0 := 0
_t1 := x * _t0
=>
_t0 := 0
_t1 := 0 # x * 0 = 0

_t0 := 0
_t1 := x + _t0
=>
_t0 := 0
_t1 := x # x + 0 = x
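The two zero patterns above might be written as one more peephole pass. This is only a sketch: the `Op` class, its field names, and `simplify_with_zero` are hypothetical, and it checks only the case where the zero temporary is the right operand, as in the examples.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:            # dst := value
    dst: str
    value: int

@dataclass(frozen=True)
class Assign:           # dst := src
    dst: str
    src: str

@dataclass(frozen=True)
class Op:               # dst := left op right
    dst: str
    left: str
    op: str
    right: str

def simplify_with_zero(instrs):
    """Rewrite x * 0 to 0 and x + 0 to x when the 0 was set on the previous line."""
    out = list(instrs)
    for i in range(len(out) - 1):
        first, second = out[i], out[i + 1]
        if not (isinstance(first, Const) and first.value == 0):
            continue
        if not (isinstance(second, Op) and second.right == first.dst):
            continue
        if second.op == "*":
            out[i + 1] = Const(second.dst, 0)              # x * 0 = 0
        elif second.op == "+":
            out[i + 1] = Assign(second.dst, second.left)   # x + 0 = x
    return out

times_zero = simplify_with_zero([Const("_t0", 0), Op("_t1", "x", "*", "_t0")])
plus_zero = simplify_with_zero([Const("_t0", 0), Op("_t1", "x", "+", "_t0")])
```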
But what if the pattern isn't just 2-3 lines, or there are unrelated instructions between the lines of the pattern?
a := 2
_t1 := 3
x := y # intervening line
b := 5 # intervening line
_t2 := a + _t1
Let's devise a way to make this work on any number of instructions.
How can we do this? What info do we need at the line _t2 := a + _t1 to know that we can replace it with a constant?
If we knew which definition reached each usage, we could do constant folding.
What about with branching?
- CFG helps visualize this
- can break our assumptions (multiple paths)
- so let's leave aside branches for now and look at straightline code
Let's come up with an intuitive algorithm
- keep track of each definition (just store the whole thing)
- when we see its use, we look it up!
- what if there is a redefinition? update the definition
Frame this as a set: in the math world now, set operations are cheap and easy.
- At each program point, record the set of definitions that reach that point in the program.
- After each instruction, check: is a variable being reassigned?
- If so, remove the old definition (kill) and add the new one (gen)
This is called reaching definitions analysis and we can use it for (among other things) constant propagation (also copy propagation).
Let's make sure we have this defined for all instructions in our language
What instructions do we have?
What are the rules?
Just for assign and const
Putting this all together
Reaching definitions for our example.
{}
a := 7
{ Const(a, 7) }
_t0 := 2
{ Const(a, 7), Const(_t0, 2) }
a := _t0
{ Assign(a, _t0), Const(_t0, 2) }
_t1 := 3
{ Assign(a, _t0), Const(_t0, 2), Const(_t1, 3) }
_t2 := a + _t1
{ Assign(a, _t0), Const(_t0, 2), Const(_t1, 3), Op(_t2, a, +, _t1) }
Once we have computed this for each line, we have the information we need to do the rewriting
Let's define the rewriting rules for optimization: when we have an assignment, look up the right hand side in the data flow set and see if there is a const instruction.
Assign(var, fromvar):
  look for Const(fromvar, num)
  rewrite assignment to Const(var, num)
Op(var, left, op, right):
  look up left and right
  if both constant, rewrite to Const(var, leftnum op rightnum)
  look for strength reduction opportunities as well
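As a rough illustration, the two rewriting rules could be coded like this. The class definitions, the helper `lookup_const`, and the small operator table `OPS` are assumptions for the sketch. The strength reduction step is left out.

```python
import operator
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:            # dst := value
    dst: str
    value: int

@dataclass(frozen=True)
class Assign:           # dst := src
    dst: str
    src: str

@dataclass(frozen=True)
class Op:               # dst := left op right
    dst: str
    left: str
    op: str
    right: str

# Hypothetical operator table; a real IR may support more operators.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def lookup_const(var, facts):
    """Return the constant that reaches `var`, or None if there is none."""
    for d in facts:
        if isinstance(d, Const) and d.dst == var:
            return d.value
    return None

def rewrite(instr, facts):
    """Apply the Assign and Op rules using the facts that reach `instr`."""
    if isinstance(instr, Assign):
        num = lookup_const(instr.src, facts)
        if num is not None:
            return Const(instr.dst, num)
    elif isinstance(instr, Op) and instr.op in OPS:
        left = lookup_const(instr.left, facts)
        right = lookup_const(instr.right, facts)
        if left is not None and right is not None:
            return Const(instr.dst, OPS[instr.op](left, right))
    return instr

facts = {Const("a", 2), Const("_t0", 2), Const("_t1", 3)}
folded = rewrite(Op("_t2", "a", "+", "_t1"), facts)
```

The facts passed in are the ones holding just before the instruction. An operand whose reaching definition is an `Assign` rather than a `Const` is left alone until a later round.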
After rewriting
{}
a := 7
{ Const(a, 7) }
_t0 := 2
{ Const(a, 7), Const(_t0, 2) }
a := 2 # rewrite this one
{ Assign(a, _t0), Const(_t0, 2) }
_t1 := 3
{ Assign(a, _t0), Const(_t0, 2), Const(_t1, 3) }
_t2 := a + _t1
{ Assign(a, _t0), Const(_t0, 2), Const(_t1, 3), Op(_t2, a, +, _t1) }
Do another round of analysis and rewriting
Reaching definitions on the rewritten example
{}
a := 7
{ Const(a, 7) }
_t0 := 2
{ Const(a, 7), Const(_t0, 2) }
a := 2 # rewrote this one
{ Const(a, 2), Const(_t0, 2) }
_t1 := 3
{ Const(a, 2), Const(_t0, 2), Const(_t1, 3) } # new data flow fact
_t2 := a + _t1
{ Const(a, 2), Const(_t0, 2), Const(_t1, 3), Op(_t2, a, +, _t1) }
Rewrite again based on new dataflow facts
a := 7
{ Const(a, 7) }
_t0 := 2
{ Const(a, 7), Const(_t0, 2) }
a := 2
{ Const(a, 2), Const(_t0, 2) }
_t1 := 3
{ Const(a, 2), Const(_t0, 2), Const(_t1, 3) }
_t2 := 5 # rewrite this one
{ Const(a, 2), Const(_t0, 2), Const(_t1, 3), Op(_t2, a, +, _t1) }
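The analyze-then-rewrite rounds above can be repeated until nothing changes. Here is one possible sketch of that loop in Python, run on the same five instructions. All names (`facts_before`, `const_of`, `rewrite`, `optimize`) are illustrative assumptions, and only straightline code is handled.

```python
import operator
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:            # dst := value
    dst: str
    value: int

@dataclass(frozen=True)
class Assign:           # dst := src
    dst: str
    src: str

@dataclass(frozen=True)
class Op:               # dst := left op right
    dst: str
    left: str
    op: str
    right: str

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def facts_before(instrs):
    """Reaching definitions just before each instruction."""
    facts, before = frozenset(), []
    for instr in instrs:
        before.append(facts)
        facts = frozenset(d for d in facts if d.dst != instr.dst) | {instr}
    return before

def const_of(var, facts):
    for d in facts:
        if isinstance(d, Const) and d.dst == var:
            return d.value
    return None

def rewrite(instr, facts):
    if isinstance(instr, Assign):
        num = const_of(instr.src, facts)
        if num is not None:
            return Const(instr.dst, num)
    if isinstance(instr, Op) and instr.op in OPS:
        left, right = const_of(instr.left, facts), const_of(instr.right, facts)
        if left is not None and right is not None:
            return Const(instr.dst, OPS[instr.op](left, right))
    return instr

def optimize(instrs):
    """Alternate a full analysis with a full rewrite until a round changes nothing."""
    current, rounds = list(instrs), 0
    while True:
        rewritten = [rewrite(i, f) for i, f in zip(current, facts_before(current))]
        if rewritten == current:
            return current, rounds
        current, rounds = rewritten, rounds + 1

program = [Const("a", 7), Const("_t0", 2), Assign("a", "_t0"),
           Const("_t1", 3), Op("_t2", "a", "+", "_t1")]
optimized, rounds = optimize(program)
```

The loop terminates because each rewrite turns an `Assign` or `Op` into a `Const`, and a `Const` is never rewritten again.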
Review
- Optimize by rewriting IR
- Faster operations, e.g., constants
- Fewer operations, e.g., dead code elimination
- Collect facts about program at each line
- Avoid having to search every time we want to rewrite
- Conceptually, use a set to store this information
- Rewrite according to a pattern
- E.g., replacing Assign with Const
- Ensure rewriting is correct for all inputs of the program
Recall that we are only considering straightline code right now.
Classic optimizations and their analyses
Optimization | Analysis |
---|---|
Constant propagation | Reaching definitions |
Copy propagation | Available expressions |
Common subexpression elimination | Available expressions |
Dead code elimination | Live variables |
Note that these analyses aren't the only way to perform optimization or find related information about the program.
Reaching definitions analysis
For each program line, record which prior definitions reach the current line
- If the current line is an assignment (Assign or Const)
- Remove any assignments to that variable from the prior set of facts
- Add the current line's assignment to the set of facts
Recall that we are only considering straightline code right now.
Defining the analysis with set operations
If the next instruction matches x := y, then we update the set of reaching definitions \(R\) like this
\(R' = (R - \{ \texttt{x := \_} \}) \cup \{ \texttt{x := y} \}\)
where x := _ matches any existing instruction assigning to x in \(R\).
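The set update can be sketched directly with Python sets. The instruction classes and the function name `reaching_definitions` are assumptions for illustration. The program is the IR of the example that follows.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:            # dst := value
    dst: str
    value: int

@dataclass(frozen=True)
class Assign:           # dst := src
    dst: str
    src: str

@dataclass(frozen=True)
class Op:               # dst := left op right
    dst: str
    left: str
    op: str
    right: str

def reaching_definitions(instrs):
    """Return the set R' after each instruction of straightline code."""
    facts = frozenset()                                       # R starts empty
    after = []
    for instr in instrs:
        killed = {d for d in facts if d.dst == instr.dst}     # matches x := _
        facts = (facts - killed) | {instr}                    # (R - KILL) U GEN
        after.append(facts)
    return after

program = [
    Const("_t0", 7), Assign("a", "_t0"),
    Const("_t1", 2), Assign("a", "_t1"),
    Const("_t2", 3), Op("_t3", "a", "+", "_t2"),
    Assign("outparam", "_t3"),
]
facts_after = reaching_definitions(program)
```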
An example
begin a := 7; a := 2; outparam := a + 3 end
_t0 := 7
a := _t0
_t1 := 2
a := _t1
_t2 := 3
_t3 := a + _t2
outparam := _t3
Reaching definitions analysis
{}
_t0 := 7
{ Const(_t0, 7) }
a := _t0
{ Const(_t0, 7), Assign(a, _t0) }
_t1 := 2
{ Const(_t0, 7), Assign(a, _t0), Const(_t1, 2) }
a := _t1
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1) }
_t2 := 3
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3) }
_t3 := a + _t2
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2) }
outparam := _t3
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2), Assign(outparam, _t3) }
Constant propagation identified
{}
_t0 := 7
{ Const(_t0, 7) }
a := _t0 # Const(_t0, 7) reaches this line, so rewrite to a := 7
{ Const(_t0, 7), Assign(a, _t0) }
_t1 := 2
{ Const(_t0, 7), Assign(a, _t0), Const(_t1, 2) }
a := _t1 # Const(_t1, 2) reaches this line, so rewrite to a := 2
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1) }
_t2 := 3
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3) }
_t3 := a + _t2
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2) }
outparam := _t3
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2), Assign(outparam, _t3) }
Constant propagation applied
{}
_t0 := 7
{ Const(_t0, 7) }
a := 7 # rewritten
{ Const(_t0, 7), Assign(a, _t0) }
_t1 := 2
{ Const(_t0, 7), Assign(a, _t0), Const(_t1, 2) }
a := 2 # rewritten
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1) }
_t2 := 3
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3) }
_t3 := a + _t2
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2) }
outparam := _t3
{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2), Assign(outparam, _t3) }
Fresh reaching definitions analysis
{}
_t0 := 7
{ Const(_t0, 7) }
a := 7
{ Const(_t0, 7), Const(a, 7) }
_t1 := 2
{ Const(_t0, 7), Const(a, 7), Const(_t1, 2) }
a := 2
{ Const(_t0, 7), Const(_t1, 2), Const(a, 2) }
_t2 := 3
{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3) }
_t3 := a + _t2
{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3), Op(_t3, a, +, _t2) }
outparam := _t3
{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3), Op(_t3, a, +, _t2), Assign(outparam, _t3) }
Constant propagation and folding identified and applied
{}
_t0 := 7
{ Const(_t0, 7) }
a := 7
{ Const(_t0, 7), Const(a, 7) }
_t1 := 2
{ Const(_t0, 7), Const(a, 7), Const(_t1, 2) }
a := 2
{ Const(_t0, 7), Const(_t1, 2), Const(a, 2) }
_t2 := 3
{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3) }
_t3 := 5 # a = 2 and _t2 = 3, so 2 + 3 = 5
{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3), Op(_t3, a, +, _t2) }
outparam := _t3
{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3), Op(_t3, a, +, _t2), Assign(outparam, _t3) }
Available expressions analysis
For each program line, record which prior expressions are still available at the current line
- If the current line is an assignment
- Remove any expressions that involve the variable (left- or right-hand side)
- Add the current line's assignment
Similar to reaching definitions, but it also removes expressions when a variable used in the expression gets redefined.
Defining the analysis with set operations
If the next instruction matches x := e, then we update the set of available expressions \(R\) like this
\(R' = (R - \{ \texttt{x := \_}, \texttt{\_ := x} \}) \cup \{ \texttt{x := e} \}\)
where x := _ matches any existing instruction assigning to x in \(R\) and _ := x matches any instruction with x somewhere on the right-hand side.
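A small Python sketch of this update follows, using the IR from the example below. The classes, the `uses` helper, and `available_expressions` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:            # dst := value
    dst: str
    value: int

@dataclass(frozen=True)
class Assign:           # dst := src
    dst: str
    src: str

@dataclass(frozen=True)
class Op:               # dst := left op right
    dst: str
    left: str
    op: str
    right: str

def uses(instr):
    """Variables read on the right-hand side of an instruction."""
    if isinstance(instr, Assign):
        return {instr.src}
    if isinstance(instr, Op):
        return {instr.left, instr.right}
    return set()

def available_expressions(instrs):
    """Return the set R' after each instruction of straightline code."""
    facts = frozenset()
    after = []
    for instr in instrs:
        x = instr.dst
        # Kill both x := _ and _ := x (x anywhere on the right-hand side).
        killed = {f for f in facts if f.dst == x or x in uses(f)}
        facts = (facts - killed) | {instr}
        after.append(facts)
    return after

program = [
    Assign("x", "inparam"), Assign("y", "x"),
    Const("_t0", 10), Const("x", 10),
    Op("_t1", "y", "+", "x"), Assign("z", "_t1"),
    Assign("outparam", "z"),
]
facts_after = available_expressions(program)
```

The sketch follows the formula literally, so an instruction that reads its own destination (for example, a hypothetical x := x + y) would still be added to the set. A fuller implementation would likely treat that case separately.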
An example
While
begin x := inparam; y := x; x := 10; z := y + x; outparam := z end
IR (with constant prop)
x := inparam
y := x
_t0 := 10
x := 10 # constant prop applied
_t1 := y + x
z := _t1
outparam := z
Available expressions analysis
{}
x := inparam
{ Assign(x, inparam) }
y := x
{ Assign(x, inparam), Assign(y, x) }
_t0 := 10
{ Assign(x, inparam), Assign(y, x), Const(_t0, 10) }
x := 10
{ Const(_t0, 10), Const(x, 10) }
_t1 := y + x
{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x) }
z := _t1
{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1) }
outparam := z
{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1), Assign(outparam, z) }
Copy propagation identified
{}
x := inparam
{ Assign(x, inparam) }
y := x # Assign(x, inparam) is available, so rewrite to y := inparam
{ Assign(x, inparam), Assign(y, x) }
_t0 := 10
{ Assign(x, inparam), Assign(y, x), Const(_t0, 10) }
x := 10
{ Const(_t0, 10), Const(x, 10) }
_t1 := y + x
{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x) }
z := _t1
{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1) }
outparam := z # Assign(z, _t1) is available, so rewrite to outparam := _t1
{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1), Assign(outparam, z) }
Copy propagation applied
{}
x := inparam
{ Assign(x, inparam) }
y := inparam # rewritten
{ Assign(x, inparam), Assign(y, x) }
_t0 := 10
{ Assign(x, inparam), Assign(y, x), Const(_t0, 10) }
x := 10
{ Const(_t0, 10), Const(x, 10) }
_t1 := y + x
{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x) }
z := _t1
{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1) }
outparam := _t1 # rewritten
{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1), Assign(outparam, z) }
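The copy propagation step above could be sketched as a lookup into the available expressions that hold just before each assignment. The function names and classes here are assumptions, and the rule shown (replace the source of an `Assign` when a copy into that source is still available) is one reading of the example rather than a definitive statement of the algorithm.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:            # dst := value
    dst: str
    value: int

@dataclass(frozen=True)
class Assign:           # dst := src
    dst: str
    src: str

@dataclass(frozen=True)
class Op:               # dst := left op right
    dst: str
    left: str
    op: str
    right: str

def uses(instr):
    if isinstance(instr, Assign):
        return {instr.src}
    if isinstance(instr, Op):
        return {instr.left, instr.right}
    return set()

def available_before(instrs):
    """Available expressions just before each instruction."""
    facts, before = frozenset(), []
    for instr in instrs:
        before.append(facts)
        x = instr.dst
        facts = frozenset(f for f in facts
                          if f.dst != x and x not in uses(f)) | {instr}
    return before

def copy_propagate(instrs):
    out = []
    for instr, facts in zip(instrs, available_before(instrs)):
        if isinstance(instr, Assign):
            for f in facts:
                # A copy src := t is still available, so read t directly.
                if isinstance(f, Assign) and f.dst == instr.src:
                    instr = Assign(instr.dst, f.src)
                    break
        out.append(instr)
    return out

program = [
    Assign("x", "inparam"), Assign("y", "x"),
    Const("_t0", 10), Const("x", 10),
    Op("_t1", "y", "+", "x"), Assign("z", "_t1"),
    Assign("outparam", "z"),
]
propagated = copy_propagate(program)
```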
Live variable analysis
- In order to eliminate dead code, we need to know whether a variable is ever used again in the program.
- We need information about lines later in the program
- Solution: step through instructions in reverse
Informal algorithm
For each program line in reverse order, record which variables will be used later in the program
- If the current line is an assignment
- Remove the variable from the set (since any earlier definition would be overwritten by this instruction)
- Add the variables from the right-hand side (since we know an earlier definition will be needed by this instruction)
Recall that we are only considering straightline code right now.
Defining the analysis with set operations
If the current instruction (stepping backward) matches x := y, then we update the set of live variables \(R\) like this
\(R' = (R - \{ \texttt{x} \}) \cup \{ \texttt{y} \}\)
For an operation such as x := y + z, both y and z are added.
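A backward walk applying this update might look like the sketch below. The classes, the `uses` helper, and `live_variables` are illustrative assumptions. The program is the IR after copy propagation, and the initial set is passed in as `live_out`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:            # dst := value
    dst: str
    value: int

@dataclass(frozen=True)
class Assign:           # dst := src
    dst: str
    src: str

@dataclass(frozen=True)
class Op:               # dst := left op right
    dst: str
    left: str
    op: str
    right: str

def uses(instr):
    """Variables read on the right-hand side of an instruction."""
    if isinstance(instr, Assign):
        return {instr.src}
    if isinstance(instr, Op):
        return {instr.left, instr.right}
    return set()

def live_variables(instrs, live_out):
    """Return the live set just before each instruction, in program order."""
    live = frozenset(live_out)
    before = []
    for instr in reversed(instrs):                 # step through in reverse
        live = (live - {instr.dst}) | uses(instr)
        before.append(live)
    before.reverse()
    return before

program = [
    Assign("x", "inparam"), Assign("y", "inparam"),
    Const("_t0", 10), Const("x", 10),
    Op("_t1", "y", "+", "x"), Assign("z", "_t1"),
    Assign("outparam", "_t1"),
]
live_before = live_variables(program, {"outparam"})
```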
An example
Live variable analysis
{ inparam }
x := inparam
{ inparam }
y := inparam
{ y }
_t0 := 10
{ y }
x := 10
{ y, x }
_t1 := y + x
{ _t1 }
z := _t1
{ _t1 }
outparam := _t1
{ outparam }
outparam is defined to be the return value in our implementation of While, so it is in the initial set (the set at the end of the program, where the backward analysis starts).
Dead code identified
{ inparam }
x := inparam # dead: x is not live after this line
{ inparam }
y := inparam
{ y }
_t0 := 10 # dead: _t0 is not live after this line
{ y }
x := 10
{ y, x }
_t1 := y + x
{ _t1 }
z := _t1 # dead: z is not live after this line
{ _t1 }
outparam := _t1
{ outparam }
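Once the live sets are known, removing dead code amounts to dropping every assignment whose target is not live afterward. A minimal sketch, assuming instructions have no side effects and using hypothetical names throughout:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:            # dst := value
    dst: str
    value: int

@dataclass(frozen=True)
class Assign:           # dst := src
    dst: str
    src: str

@dataclass(frozen=True)
class Op:               # dst := left op right
    dst: str
    left: str
    op: str
    right: str

def uses(instr):
    if isinstance(instr, Assign):
        return {instr.src}
    if isinstance(instr, Op):
        return {instr.left, instr.right}
    return set()

def eliminate_dead_code(instrs, live_out):
    """Keep only instructions whose destination is live right after them."""
    live = frozenset(live_out)
    kept = []
    for instr in reversed(instrs):
        if instr.dst in live:                      # value is needed later
            kept.append(instr)
        # Same transfer function as the analysis, applied to every line.
        live = (live - {instr.dst}) | uses(instr)
    kept.reverse()
    return kept

program = [
    Assign("x", "inparam"), Assign("y", "inparam"),
    Const("_t0", 10), Const("x", 10),
    Op("_t1", "y", "+", "x"), Assign("z", "_t1"),
    Assign("outparam", "_t1"),
]
trimmed = eliminate_dead_code(program, {"outparam"})
```

Removing an instruction can make earlier ones dead too, so this pass, like the constant propagation rounds, may be worth repeating until nothing changes.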
Generalizing data-flow analyses
- Direction \(D\), forwards or backwards
- Values \(V\), the set of values possible at each program point
- Transfer functions \(F\), the state update function for each expression in the program, i.e., \(f : V \rightarrow V\)
- An initial state \(I\)
Transfer functions
- We can construct this automatically for each expression in the language
Functions for the (local) analyses seen so far can be framed as
\(R' = (R - \text{KILL}) \cup \text{GEN}\)
- GEN and KILL are based on the analysis used
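One way to picture this framing is a single driver that takes GEN, KILL, an initial state, and a direction as parameters. The sketch below is an assumption about how that could look in Python (`run_analysis` and its parameters are hypothetical). It instantiates the driver with the reaching definitions and live variable updates given earlier.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:            # dst := value
    dst: str
    value: int

@dataclass(frozen=True)
class Assign:           # dst := src
    dst: str
    src: str

def run_analysis(instrs, gen, kill, initial=(), forward=True):
    """Apply R' = (R - KILL) U GEN across straightline code.

    Returns one state per instruction in program order: the state after the
    instruction for a forward analysis, before it for a backward analysis.
    """
    order = list(instrs) if forward else list(reversed(instrs))
    state, states = frozenset(initial), []
    for instr in order:
        state = (state - kill(instr, state)) | gen(instr)
        states.append(state)
    return states if forward else states[::-1]

def uses(instr):
    return {instr.src} if isinstance(instr, Assign) else set()

program = [Const("a", 7), Const("_t0", 2),
           Assign("a", "_t0"), Assign("outparam", "a")]

# Forward: GEN is the instruction itself, KILL is every x := _ already in R.
reaching = run_analysis(
    program,
    gen=lambda i: {i},
    kill=lambda i, r: {d for d in r if d.dst == i.dst},
)

# Backward: GEN is the variables read, KILL is the variable written.
live = run_analysis(
    program,
    gen=uses,
    kill=lambda i, r: {i.dst},
    initial={"outparam"},
    forward=False,
)
```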
Formalizing the analyses
What are the direction \(D\), values \(V\), transfer functions \(F\), and initial states \(I\) for our analyses?
Analysis | \(D\) | \(V\) | \(F\) | \(I\) |
---|---|---|---|---|
Reaching definitions | | | | |
Available expressions | | | | |
Live variables | | | | |
GEN and KILL
What are GEN and KILL for each analysis?
Analysis | GEN | KILL |
---|---|---|
Reaching definitions | | |
Available expressions | | |
Live variables | | |