ASSEMBLY LANGUAGE AND ASSEMBLERS

def: *Assembly language* is a language
     that directly manipulates a (virtual) machine's state
     and in which each statement corresponds
     to one (1) machine instruction.


def: An *assembler* translates
     from assembly language
     to machine code


       ASSEMBLERS VS. COMPILERS

Statements correspond to

 In assembly language:
    one machine instruction


 In a higher-level programming language:
   several machine instructions


Expressions:

 In assembly language:
  must be explicitly programmed
  with many machine instructions
  and explicit use of temporary storage

 In a higher-level programming language:
  are implicitly computed
  with many machine instructions
  but with implicit use of temporary storage
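
A rough illustration of the contrast, sketched in Python
rather than in any real assembly language. The temporaries
t1 and t2 and the mnemonics in the comments (ADD, SUB, MUL)
are made up for this example.

```python
# Higher-level style: one statement, temporaries are implicit.
def high_level(a, b, c, d):
    return (a + b) * (c - d)

# Assembly style: one operation per statement,
# with explicit temporary storage (t1, t2).
def assembly_style(a, b, c, d):
    t1 = a + b          # like: ADD t1, a, b
    t2 = c - d          # like: SUB t2, c, d
    result = t1 * t2    # like: MUL result, t1, t2
    return result

print(high_level(1, 2, 7, 4), assembly_style(1, 2, 7, 4))
```

Both functions compute the same value; only the second
shows the intermediate storage that a compiler would
otherwise manage for the programmer.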

Names are abstractions of

 In assembly language:
   locations (of data)
   and addresses (of program code)


 In a higher-level programming language:
   variables or constants
   or computations (procedures, functions)


       GOALS OF ASSEMBLERS

- Relieve tedium of machine code by:
    translating names to locations
    translating mnemonics to opcodes
    translating decimal to binary


- Help communication between people with:
    comments
    symbolic names (labels)
    numbers written in decimal (or hex, octal) instead of binary


The programmer still needs to know how the machine works
   when writing assembly language
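
A minimal sketch of the mnemonic and number translations,
in Python. The opcode table and the 16-bit word format
(4-bit opcode, 12-bit operand) are assumptions made for
illustration, not the encoding of any real machine.

```python
# Hypothetical mnemonic-to-opcode table.
OPCODES = {"NOP": 0x0, "LOAD": 0x1, "STORE": 0x2, "ADD": 0x3, "JMP": 0x4}

def parse_number(text):
    """Accept decimal, hex (0x...), or octal (0o...) literals."""
    return int(text, 0)

def encode(mnemonic, operand="0"):
    """Pack a 4-bit opcode and a 12-bit operand into one 16-bit word."""
    opcode = OPCODES[mnemonic.upper()]
    value = parse_number(operand)
    if not 0 <= value < (1 << 12):
        raise ValueError(f"operand out of range: {operand}")
    return (opcode << 12) | value

word = encode("ADD", "10")
print(format(word, "016b"))   # prints 0011000000001010
```

Writing "10", "0xA", or "0o12" as the operand gives the
same word, which is the point of letting the assembler
do the conversion to binary.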


          BASIC PROBLEM FOR ASSEMBLERS

  ...
         JMP ahead  ; forward reference
  ...
  ahead:            ; label

How can the assembler know the address of
the label "ahead" before it has seen the label's definition?


       TWO PASS DESIGN OF ASSEMBLER

Pass 1:
   count instructions
   determine address of each label


Pass 2:
   check that all labels are defined
   generate machine code
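
The two passes can be sketched in Python as follows.
This is a simplified illustration under stated assumptions:
the opcodes are hypothetical, every instruction occupies
one address, a label stands alone on its line, and the
only operands are labels.

```python
# Hypothetical opcodes; each instruction occupies one address.
OPCODES = {"NOP": 0, "JMP": 1, "HLT": 2}

def strip_comment(line):
    """Drop a ';' comment and surrounding whitespace."""
    return line.split(";", 1)[0].strip()

def pass1(lines):
    """Count instructions and record the address of each label."""
    symbols, address = {}, 0
    for line in lines:
        text = strip_comment(line)
        if not text:
            continue
        if text.endswith(":"):
            label = text[:-1]
            if label in symbols:
                raise ValueError(f"duplicate label: {label}")
            symbols[label] = address
        else:
            address += 1
    return symbols

def pass2(lines, symbols):
    """Check that labels are defined; emit (opcode, operand) pairs."""
    code = []
    for line in lines:
        text = strip_comment(line)
        if not text or text.endswith(":"):
            continue
        parts = text.split()
        mnemonic, operand = parts[0], 0
        if len(parts) > 1:
            if parts[1] not in symbols:
                raise ValueError(f"undefined label: {parts[1]}")
            operand = symbols[parts[1]]
        code.append((OPCODES[mnemonic], operand))
    return code

program = [
    "JMP ahead  ; forward reference",
    "NOP",
    "ahead:     ; label",
    "HLT",
]
symbols = pass1(program)
print(symbols, pass2(program, symbols))
```

Because pass 1 has already seen every label, the forward
reference to "ahead" is no longer a problem in pass 2.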


          RELATIVE JUMPS

Additional task for assembler:

  If computer architecture has jumps
   to (absolute) addresses

  Then translate jumps relative to PC into


        SECTIONS OF EXECUTABLE FILES

(header: info about each section)

Text Section:
  executable instructions (binary format)

Data Section:
  data (e.g., for constants) in binary format

Relocation Section:
  identifies locations needing adjustment
  when the program is moved in memory

(debugging sections:
   symbol table section: global labels
   debugging section: file + line info)

        EXECUTABLE ELF FILE LAYOUT

  |------------------------|
  [ General info           ] (header
  [ Program name           ]  section)
  [ Start address of text  ]
  [ Length of text section ]
  [ Start address of data  ]
  [ Length of data section ]
  [ Start address of reloc.]
  [ Length of reloc. sect. ]
  |------------------------|
  |                        |
  |                        |
  |    Text Section        |
  |                        |
  |                        |  
  |------------------------|
  |                        |
  |                        |
  |    Data Section        |
  |                        |
  |                        |  
  |------------------------|
  |                        |
  |                        |
  |    Relocation Section  |
  |                        |
  |                        |  
  |------------------------|
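
A sketch of reading and writing a file with this kind of
layout, in Python. It follows the simplified picture above,
not the actual ELF header format. Assumptions: the header
holds only the six start/length fields (general info and
program name are omitted), each field is a 32-bit
little-endian unsigned integer, and "start address" means
an offset within the file.

```python
import struct

# Hypothetical header: start and length of the text, data,
# and relocation sections, as six 32-bit unsigned integers.
HEADER = struct.Struct("<6I")

def write_image(text, data, reloc):
    """Lay out header, text, data, and relocation sections in order."""
    text_start = HEADER.size
    data_start = text_start + len(text)
    reloc_start = data_start + len(data)
    header = HEADER.pack(text_start, len(text),
                         data_start, len(data),
                         reloc_start, len(reloc))
    return header + text + data + reloc

def read_sections(image):
    """Use the header to slice the sections back out of the image."""
    ts, tl, ds, dl, rs, rl = HEADER.unpack_from(image, 0)
    return image[ts:ts + tl], image[ds:ds + dl], image[rs:rs + rl]

image = write_image(b"\x01\x02\x03\x04", b"hi", b"\x00\x00")
print(len(image), read_sections(image))
```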


     RELOCATION OF EXECUTABLE FILES

Assemblers generate code assuming
   the starting address is 0

Relocation data identifies
  the parts of instructions that need an offset
  added if the starting address is
    changed from 0 to 0 + offset
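
A small Python sketch of applying relocation data.
It assumes, for simplicity, that the code is a list of
whole words and that the relocation data is a list of
indexes of words that hold addresses; on a real machine
the address is usually a field inside an instruction.

```python
def relocate(words, reloc_indexes, offset):
    """Return a copy of words with offset added at each listed index."""
    moved = list(words)
    for i in reloc_indexes:
        moved[i] += offset
    return moved

# Code assembled for starting address 0:
# word 1 is the operand of a jump to address 3.
code = [4, 3, 0, 2]
reloc = [1]                          # only word 1 holds an address
print(relocate(code, reloc, 100))    # prints [4, 103, 0, 2]
```

Words not listed in the relocation data (opcodes, plain
constants) are left unchanged.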

       WHAT A LINKER DOES

Combines object files,
   resolving symbolic names
  For example, a program (P) and
   a library (L):

Object Code P              Executable
 [ header   ]              [ header    ]
 [ text P   ]              [ text P    ]
 [ data P   ] \            [ text L    ]
 [ sym tab P]  \           [ data P    ]
                > Linker ->[ data L    ]
Object Code L   >          [ sym tab P ]
 [ header   ]  /           [ sym tab L ]
 [ text L   ] /
 [ data L   ]
 [ sym tab L]
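
A minimal sketch of what the linker does with the text
sections and symbol tables, in Python. The representation
is an assumption made for illustration: each object module
is a dict with a "text" list of (mnemonic, operand) pairs
and a "symtab" mapping names to offsets in that module's
text; an operand that is a string is a symbolic name still
to be resolved. Data sections are left out for brevity.

```python
def link(modules):
    """Combine object modules: concatenate text, resolve names."""
    symbols, text, base = {}, [], 0
    # Lay out the text sections and rebase each symbol table.
    for m in modules:
        for name, offset in m["symtab"].items():
            if name in symbols:
                raise ValueError(f"duplicate symbol: {name}")
            symbols[name] = base + offset
        base += len(m["text"])
    # Copy the text, replacing symbolic operands by addresses.
    for m in modules:
        for op, operand in m["text"]:
            if isinstance(operand, str):
                if operand not in symbols:
                    raise ValueError(f"undefined symbol: {operand}")
                operand = symbols[operand]
            text.append((op, operand))
    return {"text": text, "symtab": symbols}

program = {"text": [("CALL", "square"), ("HLT", 0)],
           "symtab": {"main": 0}}
library = {"text": [("MUL", 0), ("RET", 0)],
           "symtab": {"square": 0}}
exe = link([program, library])
print(exe["text"], exe["symtab"])
```

Here "square" is at offset 0 in its own module but ends up
at address 2 in the combined text, so the CALL operand
becomes 2.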

           KINDS OF LINKING

def: *static linking* is linking that
     happens before runtime, once and for all time


def: *dynamic linking* is linking that
     happens during runtime or at the beginning of runtime;
     the library code is typically shared among processes

Advantages:

  + static linking:
      faster at runtime
      code is not shared among processes,
        which may prevent security problems caused by changing code


  + dynamic linking:
     security updates to a library are immediately reflected
        in all programs that use it (so faster updates)
     less memory is used for code


           WHAT A LOADER DOES

A loader places
   a program (its instructions and data) into memory
   so it can be run


Types of loaders:

 - Absolute loader:
     puts a program's text (code) at a specified
     memory address

 - Bootstrap loader:
     loads (a loader to load) the operating system itself
 
 - Relocating loader:
     puts a program anywhere in memory
     (that there is space),
     using information in the relocation section
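
The difference between an absolute and a relocating loader
can be sketched in Python. Assumptions for illustration:
memory is a list of words, the program text is a list of
words assembled for starting address 0, and the relocation
section is a list of indexes of words that hold addresses.
The function names are hypothetical.

```python
def load_absolute(memory, text, address):
    """Absolute loader: copy text to the given address, unchanged."""
    if address < 0 or address + len(text) > len(memory):
        raise ValueError("program does not fit at that address")
    memory[address:address + len(text)] = text
    return address

def load_relocating(memory, text, reloc_indexes, free_start):
    """Relocating loader: load at free_start, then adjust the
    words that the relocation section says hold addresses."""
    load_absolute(memory, text, free_start)
    for i in reloc_indexes:
        memory[free_start + i] += free_start
    return free_start

memory = [0] * 16
start = load_relocating(memory, [4, 3, 0, 2], [1], free_start=8)
print(start, memory[8:12])    # prints 8 [4, 11, 0, 2]
```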