UP | HOME

Functions
Compiler Implementation
COP-3402 COP-3402

Function semantics

What are functions in programming languages?

Function abstraction: encapsulate a computation

  • Defined by name (usually) and its input/output
  • User-defined extension of the language
  • Lots of uses: abstraction, reuse, organization, interfaces, and more
  • Higher-order functions can take functions as inputs
  • Lambda functions allow runtime creation of anonymous functions

Function behavior

  • Caller transfers control to callee function
  • Caller provides input values
  • Callee provides output value(s)
  • Execution resumes in caller once callee is finished

Function calls "freeze" state of caller

  • Function-local state
  • Stack-allocated state

Example

int g(int b) {
  return b * 2;
}

int f(int a) {
  return g(a) + 1;
}

int main() {
  printf("%d\n", f(3));
}

Diagram

  • Save state on stack
  • Unconditional branch
  • Save return value
  • Another branch to go back to where we left off

Local state

No function-local state

int x;

int f() {
  x = 1;
  y = g();
  print(x);
  print(y);
}

int g() {
  x = 2;
  return x;
}

With function-local state

int f() {
  int x;
  x = 1;
  y = g();
  print(x);
  print(y);
}

int g() {
  int x;
  x = 2;
  return x;
}

Stack allocation

Statically-allocated state (factorial)

int factorial(int x) {
  static int old_x;

  if (x <= 0) {
    return 1;
  } else {
                old_x = x;
                x = x - 1;
    return factorial(x) * old_x;
  }
}

Stack-allocated state (factorial)

  • Change static to auto
    • auto is the default in C
    • i.e., int x is really auto int x in a function
int factorial(int x) {
  auto int old_x;

  if (x <= 0) {
    return 1;
  } else {
                old_x = x;
                x = x - 1;
    return old_x * factorial(x);
  }
}

Statically-allocated state (fib)

int fib(int n) {
        static int n_minus_1;
        static int n_minus_2;

        if (n <= 0) {
                // F_0 = 0
                return 0;
        } else if (n == 1) {
                // F_1 = 1
                return 1;
        } else {
                // F_n = F_n-1 + F_n-2
                n_minus_1 = fib(n - 1);
                n_minus_2 = fib(n - 2);
                return n_minus_1 + n_minus_2;
        }
}

Stack-allocated state (fib)

int fib(int n) {
        auto int n_minus_1;
        auto int n_minus_2;

        if (n <= 0) {
                // F_0 = 0
                return 0;
        } else if (n == 1) {
                // F_1 = 1
                return 1;
        } else {
                // F_n = F_n-1 + F_n-2
                n_minus_1 = fib(n - 1);
                n_minus_2 = fib(n - 2);
                return n_minus_1 + n_minus_2;
        }
}

Function implementation

Stack frame (or activation record)

  • Holds all information needed to "freeze" state of function
    • Parameters and local variables
    • Return address
    • Caller's stack frame (nested calls)

Parameter passing

  • Registers and/or stack
  • Registers are faster, but limited in number
  • May need to save them before making call

Application binary interface (ABI)

  • Calling conventions and stack frame layout
    • How to pass parameters
    • Layout of data in the stack frame
    • How to return values
    • Caller and callee responsbilities
  • Architecture- and OS-dependent

Intel x86-64 support for functions

  • %rbp - base pointer points to the current function's stack frame
  • %rsp - stack pointer points to the top of the stack
  • push/pop - push to and pop from the stack (move data and update %rsp)
  • call - saves next instruction address (%rip) onto stack and branches to function's address
  • ret - pops the caller's next instruction address and branches to it

Recall that "points to" just means that the register holds an address

Writing and calling ABI-compatible functions

Recall that compiler is printing out instructions that will implement the stack at runtime, not actually creating the stack while compiling.

# https://github.com/longld/peda
# for C compile with -g for debug symbols
# run gdb
gdb example
b main # break at main
si # step instruction, assembly instructions (intead of code)
info file # get address of rodata
x/8xb 0x0000555555556000 # print memory, 8 he(x) (b)ytes
x/i addr # print as instruction

# dump symtab
objdump -s test

gdb resources:

https://sourceware.org/gdb/onlinedocs/gdb/Memory.html

https://sourceware.org/gdb/onlinedocs/gdb/Registers.html#Registers

https://visualgdb.com/gdbreference/commands/set_disassembly-flavor

http://dbp-consulting.com/tutorials/debugging/basicAsmDebuggingGDB.html

https://github.com/longld/peda

https://sourceware.org/gdb/onlinedocs/gdb/Auto-Display.html#Auto-Display https://sourceware.org/gdb/onlinedocs/gdb/Continuing-and-Stepping.html https://sourceware.org/binutils/docs-2.16/as/index.html https://sourceware.org/binutils/docs/as/i386_002dMemory.html

Code generation overview

  • Recall: compiler takes source language and produces target language
    • Translate, not execute
  • Generates equivalent program in assembly
    • Each language construct has corresponding assembly code patterns

Assembly file layout

  • data
    • Fixed size, global data section (bss section is zeroed out)
  • rodata
    • Immutable data, e.g., for string constants
  • text
    • Executable part

https://wiki.osdev.org/ELF

#+BEGIN_NOTES Note that the heap and stack and created at runtime by the OS

https://wiki.osdev.org/ELF#Loading_ELF_Binaries https://stackoverflow.com/questions/9226088/how-are-the-different-segments-like-heap-stack-text-related-to-the-physical-me

What about local variables, malloc'ed data?

Memory layout

  • Local variables and heap-allocated variables are stored in memory allocated at runtime
  • Running program (process) works with the OS to be allocated memory as needed at load and runtime

Implementing functions for SimpleIR

Recall: compiler is translating (not executing) the function

  • Need to print out (emit) equivalent assembly
  • Retrieve name, parameters from AST
  • Type-checker provides function type guarantees

Generate while walking the tree

  • (Diagram)

Implement by emitting assembly "templates" from the compiler

Assembly language resources

Author: Paul Gazzillo

Created: 2024-10-31 Thu 12:02

Validate