Local optimization (Local)
Compiler Project
Table of Contents
- Local optimizer driver function
- The
analyzeblock
function - Example analysis: reaching definitions
- The
optimizeblock
function - Example optimization: copy propagation
- Implementing constant propagation
- Implementing available expressions
- Implementing live variable analysis
- Implementing dead-code elimination
In this project, you will implement block-local optimizations that work by first analyzing the instructions in a block to infer data-flow facts then rewriting eligible instructions according to those facts. Provided below is an infrastructure that handles running the analyses and optimizations. For this project, you will write the state transfer functions and instruction rewrite functions that are taken by the infrastructure and run on each instruction in a block for you.
Here are the analyses and optimizations to implement for the project. One analysis, reaching definitions, is given, but the constant propagation function is not. The copy propagation optimization is given, but its the available expressions analysis it depends on isn't.
Analysis | Optimization |
---|---|
(given) Reaching definitions | (TODO) Constant propagation |
(TODO) Available expressions | (given) Copy propagation |
(optional) Common subexpression elimination | |
(TODO) Live variables | (TODO) Dead code elimination |
Local optimizer driver function
Here is the driver for local optimization. It will not work correctly in general on programs that have branching, but we will replace it with the global optimizer in the next project. The local optimizer works by iterating over over basic block nodes (in an arbtrary and running both the analysis analyzeblock
and the rewriter optimizeblock
on each block.
def optimizer(cfg): return localoptimizer(cfg) # call the local optimizer def localoptimizer(cfg): logging.debug(textualize_graph(cfg)) # example debugging, outputting the graph for node in cfg.nodes: initial_state = set() analysis, final_state = analyzeblock(initial_state, cfg.nodes[node]['instrs'], reaching_definitions) cfg.nodes[node]['instrs'] = optimizeblock(analysis, constant_propagation) logging.debug(textualize_graph(cfg)) # (optional) # for node in cfg.nodes: # initial_state = set() # analysis, final_state = analyzeblock(initial_state, cfg.nodes[node]['instrs'], available_expressions) # cfg.nodes[node]['instrs'] = optimizeblock(analysis, common_subexpression_elimination) # logging.debug(textualize_graph(cfg)) for node in cfg.nodes: analysis, final_state = analyzeblock({}, cfg.nodes[node]['instrs'], available_expressions) cfg.nodes[node]['instrs'] = optimizeblock(analysis, copy_propagation) logging.debug(textualize_graph(cfg)) for node in cfg.nodes: initial_state = set([util.output_variable]) analysis, final_state = analyzeblock(initial_state, reversed(cfg.nodes[node]['instrs']), liveness) # example debugging logging.debug("block\n" + "\n".join( f'{instr}: {live}' for (live, instr) in analysis)) logging.debug(f"final state: {final_state}") instrs = reversed(optimizeblock(analysis, dead_code_elimination)) cfg.nodes[node]['instrs'] = list(filter(lambda x: x != None, instrs)) logging.debug(textualize_graph(cfg)) return cfg
The analyzeblock
function
def analyzeblock(initial_state, instrs, transfer): """Take a list of instructions and a state transfer function and return a list of (state, instruction) pairs and a final state. The state transfer function takes the current state and the next instruction, then returns the updated state after interpreting the instruction. :param initial_state: The initial state before the first instruction :param instrs: The list of instructions. :param transfer: The transfer function, which takes an input state and an instruction and returns the output state. def transfer(state: set, instr: Instruction) -> set: # body of the function return updated_state """ state = initial_state analysis = [] for instr in instrs: analysis.append((state, instr)) state = transfer(state, instr) final_state = state return analysis, final_state
Called by localoptimizer, analyzeblock
runs the state transition function transfer
on each instruction in the given basic block instr
, starting with an initial_state
. It returns the analysis result as a list of (state, instruction)
pairs that will be used by the rewriter to modify the instructions.
Example analysis: reaching definitions
def reaching_definitions(reaching, instr): """The reaching definitions flow function for a single instruction. :param reaching: The state before the instruction. :param instr: The instruction being interpreted. :return: The state after the instruction. """ def is_definition_of(defined_var, instr): match instr: case Const(var, num): return var == defined_var case Assign(var, fromvar): return var == defined_var case Op(var, left, op, right): return var == defined_var case _: return False # logging.debug("reaching definitions") # logging.debug(f"{instr}") # logging.debug(f"{reaching}") match instr: case Const(var, num): # kill reaching = { x for x in reaching if not is_definition_of(var, x) } # gen reaching = reaching | {instr} case Assign(var, fromvar): # kill reaching = { x for x in reaching if not is_definition_of(var, x) } # gen reaching = reaching | {instr} case Op(var, left, op, right): # kill reaching = { x for x in reaching if not is_definition_of(var, x) } # gen reaching = reaching | {instr} case _: # no definition, nothing to update pass return reaching
The optimizeblock
function
def optimizeblock(analysis, optimizeinstr): """Takes the analysis result (a list of (state, instr) pairs), the final_state from the analysis, and an optimizeinstr function for each instruction. The optimizeinstr function takes the state before the instruction and the instruction and return an optimized instruction (or the same instruction if no optimization was made). def optimizeinstr(state: set, instr: Instruction) -> Instruction: ... return updated_instruction """ instrs = [] for state, instr in analysis: instrs.append(optimizeinstr(state, instr)) return instrs
optimizeblock
takes the result of an analysis and runs the given rewriting function optimizeinstr
.
Example optimization: copy propagation
def copy_propagation(available, instr): """The copy propagation optimizer for a single instruction. :param available: The state before the instruction. :param instr: The instruction being optimized. :return: The instruction after optimization. """ # logging.debug(f"copy_propagation {instr} {available}") match instr: case Assign(var, fromvar): for candidate in available: if (isinstance(candidate, Assign) and fromvar == candidate.var): logging.debug(f"optimize {instr} by replacing {fromvar} with {candidate.fromvar} (copy propagation)") instr = Assign(var, candidate.fromvar) break case Op(var, left, op, right): for candidate in available: if (isinstance(candidate, Assign) and left == candidate.var): logging.debug(f"optimize {instr} by replacing {left} with {candidate.fromvar} (copy propagation)") left = candidate.fromvar break for candidate in available: if (isinstance(candidate, Assign) and right == candidate.var): logging.debug(f"optimize {instr} by replacing {right} with {candidate.fromvar} (copy propagation)") right = candidate.fromvar break instr = Op(var, left, op, right) case Test(var, opr): for candidate in available: if (isinstance(candidate, Assign) and var == candidate.var): logging.debug(f"optimize {instr} by replacing {var} with {candidate.fromvar}") instr = Test(candidate.fromvar, opr) break pass case _: pass return instr
Implementing constant propagation
The reaching definitions analysis is given to you above. You'll be implementing the constant_propagation
rewriter, which takes in the data-flow state (reaching) just before the instruction and the instruction itself. It returns an instruction. If there is no rewriting possible, then return the original instruction that was passed in. Here is when rewriting can occur:
- You see an Assign statement, and the right-hand side (fromvar) is a constant, i.e., it is in the reaching on the left-hand-side of a Const. Rewrite the Assign to a Const.
- You see an Op statement, and both variables on the right-hand side (left and right) are constants. (Our While3Addr language doesn't support operations mixing variables and constants, so both must be constant.) Fold the constants and rewrite the Op as a Const.
- (Optional) There are also opportunities for strength reduction optimizations, i.e., rewriting
x := y * 0
tox := 0
.
See the copy_propagation
rewriter given above for examples of accessing information about the instruction and creating new instructions.
Implementing available expressions
Copy propagation depends on the available expressions analysis, which you'll be implementing. The copy propagation optimzer has been given to you above. The rewriter replaces any variable used on the instruction's right-hand side with the right-hand side of a previous definition of that variable, e.g., x:=a; y:=x
becomes x:=a; y:=a
. The available expressions analysis tracks definitions and removes them when variables on their right-hand-side get reassigned, thereby invalidating the use of that expression. Accordingly, the available expressions state transfer function will kills any definitions of or uses of a variable if and instruction redefines it, e.g., in x:=a; a:=1; y:=x
, the a:=1
definition removes x:=a
from the state. Formally, if our instruction is a definition x:=_
, where _
means the rest of the instruction is irrelevant, the transfer function will update the set of available expressions set \(A\) to \(A'\) as follows:
Implementing live variable analysis
def liveness(live, instr): """The liveness flow function for a single instruction. :param live: The state before the instruction. :param instr: The instruction being interpreted. :return: The state after the instruction. # TODO: implement live variable analysis return live
The live variable analysis is a backwards analysis; it walks the instructions from last to first in the basic block. This is handled for you in the provided driver function by reversing the instructions before analysis and optimization and re-reversing them aftwards. Additionally, the initial state containing outparam
is also provided for you.
Live variable analysis works by removing any redefined variables from the set and adding any used variables. Variables can be defined with Const
, Assign
, or Op
and used in Assign
, Op
, or Test
.
Implementing dead-code elimination
def dead_code_elimination(live, instr): """The dead code elimination optimizer for a single instruction. :param available: The state before the instruction. :param instr: The instruction being optimized. :return: The instruction after optimization, which can be Nop() for removed instructions. """ # TODO: implement dead-code elimination, replacing dead code with a Nop() instruction return instr
The dead-code eliminator uses the results of the live variable analysis. The instructions are reversed before both analysis and optimizations, then re-reversed aftwards, so the instructions are provided in reverse to the dead_code_elimination
function. This ensures that the live
set provided with the instr
is the right one (the set of live variables after executing the statement).
Dead code elimination happens when a variable assignment, Const
, Assign
, or Op
, is to a variable that isn't in the live set.