Local Optimizations

Intuition

Many equivalent programs

a := 2;
b := a + 3;

We chose an easy translation

0: _t0 := 2
1:   a := _t0
2: _t1 := 3
3: _t2 := a + _t1

This leads to lots of extra assignments, operations, etc.

For instance, why bother performing an assignment when we know a will be a constant?

0: _t0 := 2
1:   a := _t0

=>

0: _t0 := 2
1:   a := 2  # rewrite a with _t0 = 2

Can we remove _t0 := 2?

Moreover, now that we know a is constant, we can conclude that _t2 is constant too.

0: _t0 := 2
1:   a := 2
2: _t1 := 3
3: _t2 := a + _t1

=>

0: _t0 := 2
1:   a := 2
2: _t1 := 3
3: _t2 := 5  # 2 + 3 = 5

How can we do this? One way is peephole optimization.

We can look for specific patterns of instruction sequences and rewrite them.

For instance, for constant propagation, we can check every two instructions and see if we have the pattern,

Const(t, n)
Assign(a, t)

and always rewrite the Assign to a Const instruction,

Const(t, n)
Const(a, n)

We can prove by hand (or convince ourselves intuitively) that each transformation produces an equivalent program.
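A minimal Python sketch of this two-instruction peephole pass. Const and Assign mirror the instruction names above; the class layout and the function name peephole_const_prop are assumptions made for illustration, not the course compiler's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    var: str  # variable being defined
    num: int  # constant value


@dataclass(frozen=True)
class Assign:
    var: str      # variable being defined
    fromvar: str  # variable being copied


def peephole_const_prop(instrs):
    """Rewrite Assign(a, t) to Const(a, n) when it directly follows Const(t, n)."""
    out = list(instrs)
    for i in range(1, len(out)):
        prev, cur = out[i - 1], out[i]
        if isinstance(prev, Const) and isinstance(cur, Assign) and cur.fromvar == prev.var:
            out[i] = Const(cur.var, prev.num)
    return out


program = [Const("_t0", 2), Assign("a", "_t0")]
optimized = peephole_const_prop(program)
print(optimized)
```

The pass only inspects adjacent pairs, so it misses the pattern as soon as another instruction sits between the Const and the Assign.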

Other peephole optimizations include strength reduction

_t0 := 0
_t1 := x * _t0

=>

_t0 := 0
_t1 := 0  # x * 0 = 0

_t0 := 0
_t1 := x + _t0

=>

_t0 := 0
_t1 := x  # x + 0 = x
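The two rewrites above can be sketched as one more adjacent-pair pass. This is an illustrative sketch only: it assumes integer arithmetic with no side effects, and the Op class layout and the name peephole_simplify are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    var: str
    num: int


@dataclass(frozen=True)
class Assign:
    var: str
    fromvar: str


@dataclass(frozen=True)
class Op:
    var: str
    left: str
    op: str
    right: str


def peephole_simplify(instrs):
    """Simplify x * 0 and x + 0 when the zero was defined by the previous instruction."""
    out = list(instrs)
    for i in range(1, len(out)):
        prev, cur = out[i - 1], out[i]
        zero_before = isinstance(prev, Const) and prev.num == 0
        if not (zero_before and isinstance(cur, Op) and cur.right == prev.var):
            continue
        if cur.op == "*":
            out[i] = Const(cur.var, 0)           # x * 0 = 0
        elif cur.op == "+":
            out[i] = Assign(cur.var, cur.left)   # x + 0 = x
    return out


times_zero = peephole_simplify([Const("_t0", 0), Op("_t1", "x", "*", "_t0")])
plus_zero = peephole_simplify([Const("_t0", 0), Op("_t1", "x", "+", "_t0")])
print(times_zero)
print(plus_zero)
```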

But what if the pattern isn't just 2-3 lines, or there are unrelated instructions between the lines in the pattern?

  a := 2
_t1 := 3
  x := y  # intervening line
  b := 5  # intervening line
_t2 := a + _t1

Let's devise a way to make this work on any number of instructions.

How can we do this? What info do we need at this line _t2 := a + _t1 to know that we can replace this with a constant?

If we knew which definition reached each usage, we could do constant folding.

What about with branching?

  • CFG helps visualize this
  • can break our assumptions (multiple paths)
  • so let's leave aside branches for now and look at straightline code

Let's come up with an intuitive algorithm

  • keep track of each definition (just store the whole thing)
  • when we see its use, we look it up!
  • what if there is a redefinition? update the definition

Frame this as a set: in the math world now, set operations are cheap and easy.

  • At each program point, record the set of definitions that reach that point in the program.
  • After each instruction, check whether its variable is being reassigned.
  • Remove any old definition of that variable (kill) and add the new one (gen).

This is called reaching definitions analysis and we can use it for (among other things) constant propagation (also copy propagation).

Let's make sure we have this defined for all instructions in our language.

What instructions do we have?

What are the rules?

Just for assign and const

Putting this all together

Reaching definitions for our example.

{}
  a := 7
{ Const(a, 7) }
_t0 := 2
{ Const(a, 7), Const(_t0, 2) }
  a := _t0
{ Assign(a, _t0), Const(_t0, 2) }
 _t1 := 3
{ Assign(a, _t0), Const(_t0, 2), Const(_t1, 3) }
 _t2 := a + _t1
{ Assign(a, _t0), Const(_t0, 2), Const(_t1, 3), Op(_t2, a, +, _t1) }

Once we have computed this for each line, we have the information we need to do the rewriting

Let's define the rewriting rules for optimization: when we have an assignment, look up the right hand side in the data flow set and see if there is a const instruction.

Assign(var, fromvar):
  look for Const(fromvar, num)
  rewrite assignment to Const(var, num)
Op(var, left, op, right):
  look up left and right
  if both constant, rewrite to Const(var, leftnum op rightnum)
  look for strength reduction opportunities as well
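The rules above can be sketched in Python as a function that takes one instruction plus the set of facts reaching it. This is a sketch under assumptions: the dataclass layout, the helper lookup_const, and the small OPS table are illustrative names, and strength reduction is left out.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    var: str
    num: int


@dataclass(frozen=True)
class Assign:
    var: str
    fromvar: str


@dataclass(frozen=True)
class Op:
    var: str
    left: str
    op: str
    right: str


# Operators this sketch knows how to fold (an assumed subset).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def lookup_const(facts, var):
    """Return n if Const(var, n) is among the facts, otherwise None."""
    for fact in facts:
        if isinstance(fact, Const) and fact.var == var:
            return fact.num
    return None


def rewrite(instr, facts):
    """Apply the Assign and Op rules; return the instruction unchanged if neither fires."""
    if isinstance(instr, Assign):
        num = lookup_const(facts, instr.fromvar)
        if num is not None:
            return Const(instr.var, num)
    elif isinstance(instr, Op) and instr.op in OPS:
        left = lookup_const(facts, instr.left)
        right = lookup_const(facts, instr.right)
        if left is not None and right is not None:
            return Const(instr.var, OPS[instr.op](left, right))
    return instr


facts = {Const("a", 7), Const("_t0", 2)}
print(rewrite(Assign("a", "_t0"), facts))
```

In straightline code at most one definition of each variable reaches a given point, so the lookup never has to choose between competing facts.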

After rewriting

{}
  a := 7
{ Const(a, 7) }
_t0 := 2
{ Const(a, 7), Const(_t0, 2) }
  a := 2  # rewrite this one
{ Assign(a, _t0), Const(_t0, 2) }
 _t1 := 3
{ Assign(a, _t0), Const(_t0, 2), Const(_t1, 3) }
 _t2 := a + _t1
{ Assign(a, _t0), Const(_t0, 2), Const(_t1, 3), Op(_t2, a, +, _t1) }

Do another round of analysis and rewriting

Reaching definitions on the rewritten example

{}
  a := 7
{ Const(a, 7) }
_t0 := 2
{ Const(a, 7), Const(_t0, 2) }
  a := 2  # rewrote this one
{ Const(a, 2), Const(_t0, 2) }
 _t1 := 3
{ Const(a, 2), Const(_t0, 2), Const(_t1, 3) }  # new data flow fact
 _t2 := a + _t1
{ Const(a, 2), Const(_t0, 2), Const(_t1, 3), Op(_t2, a, +, _t1) }

Rewrite again based on new dataflow facts

{}
  a := 7
{ Const(a, 7) }
_t0 := 2
{ Const(a, 7), Const(_t0, 2) }
  a := 2
{ Const(a, 2), Const(_t0, 2) }
 _t1 := 3
{ Const(a, 2), Const(_t0, 2), Const(_t1, 3) }
 _t2 := 5  # rewrite this one
{ Const(a, 2), Const(_t0, 2), Const(_t1, 3), Op(_t2, a, +, _t1) }
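The rounds of analysis and rewriting walked through above can be sketched as a loop that stops once a round changes nothing. This is a simplified sketch: all names are illustrative assumptions, and only + is folded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    var: str
    num: int


@dataclass(frozen=True)
class Assign:
    var: str
    fromvar: str


@dataclass(frozen=True)
class Op:
    var: str
    left: str
    op: str
    right: str


def reaching_definitions(instrs):
    """For each instruction, the definitions reaching the point just before it."""
    before, current = [], frozenset()
    for instr in instrs:
        before.append(current)
        current = frozenset(d for d in current if d.var != instr.var) | {instr}
    return before


def const_value(facts, var):
    for fact in facts:
        if isinstance(fact, Const) and fact.var == var:
            return fact.num
    return None


def rewrite(instr, facts):
    if isinstance(instr, Assign):
        num = const_value(facts, instr.fromvar)
        if num is not None:
            return Const(instr.var, num)
    if isinstance(instr, Op) and instr.op == "+":
        left = const_value(facts, instr.left)
        right = const_value(facts, instr.right)
        if left is not None and right is not None:
            return Const(instr.var, left + right)
    return instr


def optimize(instrs):
    """Alternate analysis and rewriting until a round makes no change."""
    instrs, rounds = list(instrs), 0
    while True:
        facts = reaching_definitions(instrs)
        rewritten = [rewrite(i, f) for i, f in zip(instrs, facts)]
        if rewritten == instrs:
            return instrs, rounds
        instrs, rounds = rewritten, rounds + 1


program = [Const("a", 7), Const("_t0", 2), Assign("a", "_t0"),
           Const("_t1", 3), Op("_t2", "a", "+", "_t1")]
result, rounds = optimize(program)
print(rounds, result)
```

On this program the first round rewrites a := _t0 and the second folds _t2, matching the two rewriting steps shown above. The loop terminates because every rewrite turns a non-constant instruction into a Const.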

Review

  • Optimize by rewriting IR
    • Faster operations, e.g., constants
    • Fewer operations, e.g., dead code elimination
  • Collect facts about program at each line
    • Avoid having to search every time we want to rewrite
    • Conceptually, use a set to store this information
  • Rewrite according to a pattern
    • E.g., replacing Assign with Const
    • Ensure rewriting is correct for all inputs of the program

Recall that we are only considering straightline code right now.

Classic optimizations and their analyses

Optimization                      Analysis
Constant propagation              Reaching definitions
Copy propagation                  Available expressions
Common subexpression elimination  Available expressions
Dead code elimination             Live variables

Note that these analyses aren't the only way to perform optimization or find related information about the program.

Reaching definitions analysis

For each program line, record which prior definitions reach the current line

  • If the current line is an assignment (Assign or Const)
    • Remove any assignments to that variable from the prior set of facts
    • Add the current line's assignment to the set of facts

Recall that we are only considering straightline code right now.

Defining the analysis with set operations

If the next instruction matches x := y, then we update the set of reaching definitions \(R\) like this

\(R' = (R - \{ \text{x := _} \}) \cup \{ \texttt{x := y} \}\)

where x := _ matches any existing instruction assigning to x in \(R\).
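The update rule can be written almost verbatim with Python sets. A minimal sketch, assuming instructions are small frozen dataclasses (an illustrative encoding) so they can be stored in a set:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    var: str
    num: int


@dataclass(frozen=True)
class Assign:
    var: str
    fromvar: str


def transfer(R, instr):
    """R' = (R - { x := _ }) | { x := y } for an instruction defining x."""
    killed = {d for d in R if d.var == instr.var}
    return (R - killed) | {instr}


def reaching_definitions(instrs):
    """Return the set at every program point: before the first instruction, then after each one."""
    R = set()
    points = [set(R)]
    for instr in instrs:
        R = transfer(R, instr)
        points.append(set(R))
    return points


program = [Const("_t0", 7), Assign("a", "_t0"), Const("_t1", 2), Assign("a", "_t1")]
points = reaching_definitions(program)
for point in points:
    print(point)
```

The second assignment to a kills Assign(a, _t0), so only Assign(a, _t1) survives in the final set.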

An example

begin
a := 7;
a := 2;
outparam := a + 3
end

0: _t0 := 7
1: a := _t0
2: _t1 := 2
3: a := _t1
4: _t2 := 3
5: _t3 := a + _t2
6: outparam := _t3

Reaching definitions analysis

{}

_t0 := 7

{ Const(_t0, 7) }

a := _t0

{ Const(_t0, 7), Assign(a, _t0) }

_t1 := 2

{ Const(_t0, 7), Assign(a, _t0), Const(_t1, 2) }

a := _t1

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1) }

_t2 := 3

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3) }

_t3 := a + _t2

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2) }

outparam := _t3

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2), Assign(outparam, _t3) }

Constant propagation identified

{}

_t0 := 7

{ Const(_t0, 7) }

a := _t0  # Const(_t0, 7) reaches here, so this can become a := 7

{ Const(_t0, 7), Assign(a, _t0) }

_t1 := 2

{ Const(_t0, 7), Assign(a, _t0), Const(_t1, 2) }

a := _t1  # Const(_t1, 2) reaches here, so this can become a := 2

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1) }

_t2 := 3

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3) }

_t3 := a + _t2

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2) }

outparam := _t3

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2), Assign(outparam, _t3) }

Constant propagation applied

{}

_t0 := 7

{ Const(_t0, 7) }

a := 7

{ Const(_t0, 7), Assign(a, _t0) }

_t1 := 2

{ Const(_t0, 7), Assign(a, _t0), Const(_t1, 2) }

a := 2

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1) }

_t2 := 3

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3) }

_t3 := a + _t2

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2) }

outparam := _t3

{ Const(_t0, 7), Const(_t1, 2), Assign(a, _t1), Const(_t2, 3), Op(_t3, a, +, _t2), Assign(outparam, _t3) }

Fresh reaching definitions analysis

{}

_t0 := 7

{ Const(_t0, 7) }

a := 7

{ Const(_t0, 7), Const(a, 7) }

_t1 := 2

{ Const(_t0, 7), Const(a, 7), Const(_t1, 2) }

a := 2

{ Const(_t0, 7), Const(_t1, 2), Const(a, 2) }

_t2 := 3

{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3) }

_t3 := a + _t2

{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3), Op(_t3, a, +, _t2) }

outparam := _t3

{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3), Op(_t3, a, +, _t2), Assign(outparam, _t3) }

Constant propagation and folding identified and applied

{}

_t0 := 7

{ Const(_t0, 7) }

a := 7

{ Const(_t0, 7), Const(a, 7) }

_t1 := 2

{ Const(_t0, 7), Const(a, 7), Const(_t1, 2) }

a := 2

{ Const(_t0, 7), Const(_t1, 2), Const(a, 2) }

_t2 := 3

{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3) }

_t3 := 5

{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3), Op(_t3, a, +, _t2) }

outparam := _t3

{ Const(_t0, 7), Const(_t1, 2), Const(a, 2), Const(_t2, 3), Op(_t3, a, +, _t2), Assign(outparam, _t3) }

Available expressions analysis

For each program line, record which prior expressions reach the current line

  • If the current line is an assignment
    • Remove any expressions that involve the variable (left- or right-hand side)
    • Add the current line's assignment

Similar to reaching definitions, but an expression is also removed when a variable it uses gets redefined.

Defining the analysis with set operations

If the next instruction matches x := e, then we update the set of available expressions \(R\) like this

\(R' = (R - \{ \text{x := _}, \text{_ := x} \}) \cup \{ \texttt{x := e} \}\)

where x := _ matches any existing instruction assigning to x in \(R\) and _ := x matches any instruction with x somewhere on the right-hand side.
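A small Python sketch of this update rule, following the formula as written. The dataclass encoding and the helper uses are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    var: str
    num: int


@dataclass(frozen=True)
class Assign:
    var: str
    fromvar: str


@dataclass(frozen=True)
class Op:
    var: str
    left: str
    op: str
    right: str


def uses(instr):
    """Variables read on the right-hand side of an instruction."""
    if isinstance(instr, Assign):
        return {instr.fromvar}
    if isinstance(instr, Op):
        return {instr.left, instr.right}
    return set()  # Const reads no variables


def transfer(R, instr):
    """R' = (R - { x := _, _ := x }) | { x := e } for an instruction defining x."""
    x = instr.var
    killed = {e for e in R if e.var == x or x in uses(e)}
    return (R - killed) | {instr}


R = set()
for instr in [Assign("x", "inparam"), Assign("y", "x"), Const("_t0", 10), Const("x", 10)]:
    R = transfer(R, instr)
print(R)
```

Redefining x removes both Assign(x, inparam) and Assign(y, x), which is the difference from reaching definitions. An instruction that reads its own target, such as x := x + 1, would need extra care that this sketch does not attempt.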

An example

  • While

    begin
    x := inparam;
    y := x;
    x := 10;
    z := y + x;
    outparam := z
    end
    
  • IR (with constant prop)

    0: x := inparam
    1: y := x
    2: _t0 := 10
    3: x := 10  # constant prop applied
    4: _t1 := y + x
    5: z := _t1
    6: outparam := z
    

Available expressions analysis

{}

x := inparam

{ Assign(x, inparam) }

y := x

{ Assign(x, inparam), Assign(y, x) }

_t0 := 10

{ Assign(x, inparam), Assign(y, x), Const(_t0, 10) }

x := 10

{ Const(_t0, 10), Const(x, 10) }

_t1 := y + x

{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x) }

z := _t1

{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1) }

outparam := z

{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1), Assign(outparam, z) }

Copy propagation identified

{}

x := inparam

{ Assign(x, inparam) }

y := x  # Assign(x, inparam) is available, so this can become y := inparam

{ Assign(x, inparam), Assign(y, x) }

_t0 := 10

{ Assign(x, inparam), Assign(y, x), Const(_t0, 10) }

x := 10

{ Const(_t0, 10), Const(x, 10) }

_t1 := y + x

{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x) }

z := _t1

{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1) }

outparam := z  # Assign(z, _t1) is available, so this can become outparam := _t1

{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1), Assign(outparam, z) }

Copy propagation applied

{}

x := inparam

{ Assign(x, inparam) }

y := inparam

{ Assign(x, inparam), Assign(y, x) }

_t0 := 10

{ Assign(x, inparam), Assign(y, x), Const(_t0, 10) }

x := 10

{ Const(_t0, 10), Const(x, 10) }

_t1 := y + x

{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x) }

z := _t1

{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1) }

outparam := _t1

{ Const(_t0, 10), Const(x, 10), Op(_t1, y, +, x), Assign(z, _t1), Assign(outparam, z) }
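The copy propagation step above can be sketched as follows: compute the available expressions before each instruction, then replace the source of a copy when that source is itself an available copy. Only Assign instructions are rewritten, as in the example; all names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    var: str
    num: int


@dataclass(frozen=True)
class Assign:
    var: str
    fromvar: str


@dataclass(frozen=True)
class Op:
    var: str
    left: str
    op: str
    right: str


def uses(instr):
    if isinstance(instr, Assign):
        return {instr.fromvar}
    if isinstance(instr, Op):
        return {instr.left, instr.right}
    return set()


def available_before(instrs):
    """Available expressions at the point just before each instruction."""
    before, current = [], frozenset()
    for instr in instrs:
        before.append(current)
        x = instr.var
        current = frozenset(e for e in current if e.var != x and x not in uses(e)) | {instr}
    return before


def copy_propagate(instrs):
    out = []
    for instr, facts in zip(instrs, available_before(instrs)):
        if isinstance(instr, Assign):
            for fact in facts:
                # y := x with x := src available becomes y := src
                if isinstance(fact, Assign) and fact.var == instr.fromvar:
                    instr = Assign(instr.var, fact.fromvar)
                    break
        out.append(instr)
    return out


program = [Assign("x", "inparam"), Assign("y", "x"), Const("_t0", 10), Const("x", 10),
           Op("_t1", "y", "+", "x"), Assign("z", "_t1"), Assign("outparam", "z")]
propagated = copy_propagate(program)
print(propagated)
```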

Live variable analysis

  • In order to eliminate dead code, we need to know whether a variable is ever used again in the program.
  • We need information about lines later in the program
  • Solution: step through instructions in reverse

Informal algorithm

For each program line in reverse order, record which variables will be used later in the program

  • If the current line is an assignment
    • Remove the variable from the set (since any earlier definition would be overwritten by this instruction)
    • Add the variables from the right-hand side (since we know an earlier definition will be needed by this instruction)

Recall that we are only considering straightline code right now.

Defining the analysis with set operations

If the next instruction (stepping in reverse) matches x := y, then we update the set of live variables \(R\) like this

\(R' = (R - \{ \text{x} \}) \cup \{ \texttt{y} \}\)
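A minimal Python sketch of this backwards pass. The dataclass encoding, the helper uses, and the live_out parameter (the variables assumed live at the end of the program) are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    var: str
    num: int


@dataclass(frozen=True)
class Assign:
    var: str
    fromvar: str


@dataclass(frozen=True)
class Op:
    var: str
    left: str
    op: str
    right: str


def uses(instr):
    """Variables read on the right-hand side of an instruction."""
    if isinstance(instr, Assign):
        return {instr.fromvar}
    if isinstance(instr, Op):
        return {instr.left, instr.right}
    return set()


def live_variables(instrs, live_out):
    """points[i] is the live set just before instrs[i]; points[-1] is live_out."""
    live = set(live_out)
    points = [set(live)]
    for instr in reversed(instrs):
        live = (live - {instr.var}) | uses(instr)  # R' = (R - {x}) | {y}
        points.append(set(live))
    points.reverse()
    return points


program = [Assign("x", "inparam"), Assign("y", "inparam"), Const("_t0", 10), Const("x", 10),
           Op("_t1", "y", "+", "x"), Assign("z", "_t1"), Assign("outparam", "_t1")]
points = live_variables(program, {"outparam"})
for point in points:
    print(point)
```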

An example

Live variable analysis

{ inparam }

x := inparam

{ inparam }

y := inparam

{ y }

_t0 := 10

{ y }

x := 10

{ y, x }

_t1 := y + x

{ _t1 }

z := _t1

{ _t1 }

outparam := _t1

{ outparam }

outparam is defined to be the return value in our implementation of While, so it is in the initial set (the set at the end of the program, where the reverse pass starts).

Dead code identified

{ inparam }

x := inparam  # dead: x is not live after this line

{ inparam }

y := inparam

{ y }

_t0 := 10  # dead: _t0 is not live after this line

{ y }

x := 10

{ y, x }

_t1 := y + x

{ _t1 }

z := _t1  # dead: z is not live after this line

{ _t1 }

outparam := _t1

{ outparam }
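Once the live sets are known, removing dead code is a filter: keep an instruction only if the variable it defines is live just after it. A sketch under the assumption that instructions have no side effects; the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    var: str
    num: int


@dataclass(frozen=True)
class Assign:
    var: str
    fromvar: str


@dataclass(frozen=True)
class Op:
    var: str
    left: str
    op: str
    right: str


def uses(instr):
    if isinstance(instr, Assign):
        return {instr.fromvar}
    if isinstance(instr, Op):
        return {instr.left, instr.right}
    return set()


def live_after(instrs, live_out):
    """result[i] is the set of variables live just after instrs[i]."""
    result, live = [], set(live_out)
    for instr in reversed(instrs):
        result.append(set(live))
        live = (live - {instr.var}) | uses(instr)
    result.reverse()
    return result


def eliminate_dead_code(instrs, live_out):
    """Drop every instruction whose defined variable is not live afterwards."""
    after = live_after(instrs, live_out)
    return [instr for instr, live in zip(instrs, after) if instr.var in live]


program = [Assign("x", "inparam"), Assign("y", "inparam"), Const("_t0", 10), Const("x", 10),
           Op("_t1", "y", "+", "x"), Assign("z", "_t1"), Assign("outparam", "_t1")]
cleaned = eliminate_dead_code(program, {"outparam"})
print(cleaned)
```

Removing an instruction can make earlier instructions dead as well, so in general the analysis and the filter may need to be repeated until nothing changes.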

Generalizing data-flow analyses

  • Direction \(D\), forwards or backwards
  • Values \(V\), the set of values possible at each program point
  • Transfer functions \(F\), the state update function for each expression in the program, i.e., \(f : V \rightarrow V\)
  • An initial state \(I\)

Transfer functions

  • We can construct this automatically for each expression in the language
  • Functions for the (local) analyses seen so far can be framed as

    \(R' = (R - \text{KILL}) \cup \text{GEN}\)

  • GEN and KILL are based on the analysis used
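One way to see the shared shape is a single driver parameterized by direction, initial state, GEN, and KILL. This is a sketch for straightline code only; run_analysis and the gen/kill helper names are hypothetical, and the two instantiations simply restate the update rules given earlier for reaching definitions and live variables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    var: str
    num: int


@dataclass(frozen=True)
class Assign:
    var: str
    fromvar: str


@dataclass(frozen=True)
class Op:
    var: str
    left: str
    op: str
    right: str


def uses(instr):
    if isinstance(instr, Assign):
        return {instr.fromvar}
    if isinstance(instr, Op):
        return {instr.left, instr.right}
    return set()


def run_analysis(instrs, gen, kill, initial, forwards=True):
    """Apply R' = (R - KILL) | GEN per instruction; states are listed in program order."""
    order = list(instrs) if forwards else list(reversed(instrs))
    state = frozenset(initial)
    states = [state]
    for instr in order:
        state = (state - kill(state, instr)) | gen(instr)
        states.append(state)
    return states if forwards else states[::-1]


# Reaching definitions: forwards, facts are instructions.
def rd_gen(instr):
    return {instr}


def rd_kill(state, instr):
    return {d for d in state if d.var == instr.var}


# Live variables: backwards, facts are variable names.
def lv_kill(state, instr):
    return {instr.var}


defs = [Const("_t0", 7), Assign("a", "_t0"), Const("_t1", 2), Assign("a", "_t1")]
reach = run_analysis(defs, rd_gen, rd_kill, set())

tail = [Const("x", 10), Op("_t1", "y", "+", "x"), Assign("outparam", "_t1")]
live = run_analysis(tail, uses, lv_kill, {"outparam"}, forwards=False)
print(reach[-1])
print(live)
```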

Formalizing the analyses

What are the direction \(D\), values \(V\), transfer functions \(F\), and initial states \(I\) for our analyses?

Analysis \(D\) \(V\) \(F\) \(I\)
Reaching definitions        
Available expressions        
Live variables        

GEN and KILL

What are GEN and KILL for each analysis?

Analysis GEN KILL
Reaching definitions    
Available expressions    
Live variables    

Author: Paul Gazzillo

Created: 2024-02-19 Mon 21:52
