CIS 6614 meeting -*- Outline -*-

* Return-oriented Programming

------------------------------------------
 HOW ATTACKERS CAN DEFEAT ASLR AND W xor X         


ASLR
   - hard to know address
     of attacker's code

W xor X protection
   - can't execute shell code put on stack


Both are widely used in Linux and Windows

But there is a kind of attack that
   could defeat both...


------------------------------------------
        Q: The combination of ASLR and W xor X protection is powerful,
           but could there still be a way to:
              (a) do arbitrary computations, and
              (b) not need to know addresses ahead of time?
          Yes (unfortunately), that is what we'll talk about
           
------------------------------------------
          RETURN-ORIENTED PROGRAMMING

Characteristics:
  - only needs to use return addresses
    on stack, so no code executed in stack


  - doesn't need to know fixed addresses


Idea:


------------------------------------------
    ... so W xor X protection doesn't help

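    ... so ASLR doesn't help
        (once the attacker can discover addresses at run time,
        as in the blind attack later in these notes)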

    ... use return addresses (on stack)
        to string together pre-existing code
        (from program's code segment,
        so no code needs to be
        executed from the stack)

** examples to motivate the idea
*** simplest motivating example
------------------------------------------
     MOTIVATING EXAMPLE

C program:

void run_shell() {
   system("/bin/sh");
}
void get_msg() {
   char buf[100];
   gets(buf);
}


------------------------------------------
        Q: How could an attacker call run_shell?
           Run gdb to find the address of run_shell,
           then overflow buf so that address overwrites the
           return address in get_msg's stack frame;
           when get_msg returns, control jumps to run_shell,
           just as if it had been called

------------------------------------------
    EXPLAINING THE MOTIVATING EXAMPLE

          |                     |
          |---------------------|
          | return address      |
          |---------------------|
          |                     |
          |---------------------|
          | ... buf[98], buf[99]|
          |        ...          |
  %esp -> | buf[0], buf[1], ... |
          |vvvvvvvvvvvvvvvvvvvvv|
          
------------------------------------------

        The overflow puts the address of run_shell in the return
        address spot

        This shows, in miniature, the goal of return-oriented
        programming: steer control into code that is already
        in the program by overwriting return addresses
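
A minimal sketch of the overflow above, written as a memory-safe toy model rather than a real attack. It assumes a 32-bit frame laid out as buf, saved frame pointer, return address; struct frame, unchecked_copy, build_input and the example target address are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model of get_msg's frame (assumed layout, lowest address first). */
struct frame {
    char     buf[100];
    uint32_t saved_fp;   /* saved frame pointer */
    uint32_t ret_addr;   /* return address */
};

/* Models gets(): copies n bytes starting at buf[0] and never checks
   sizeof buf. The assert only keeps this toy itself memory-safe. */
static void unchecked_copy(struct frame *f, const unsigned char *in, size_t n) {
    assert(n <= sizeof *f);
    memcpy(f, in, n);
}

/* Builds the input: filler covering buf and saved_fp, then the
   (hypothetical) target address. Returns the input length. */
static size_t build_input(unsigned char *out, uint32_t target) {
    size_t off = offsetof(struct frame, ret_addr);
    memset(out, 'A', off);
    memcpy(out + off, &target, sizeof target);
    return off + sizeof target;
}
```

After unchecked_copy runs on the output of build_input, the ret_addr field holds the chosen target, which is what the return from get_msg would then jump to.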

*** slightly more difficult example
------------------------------------------
       SLIGHTLY MORE DIFFICULT EXAMPLE

C program:

char sh_path[] = "/bin/sh";
void list_dir() {
   system("/bin/ls");
}
void get_msg() {
   char buf[100];
   gets(buf);
}

------------------------------------------
        Q: How could this program be attacked to run the shell?
        Want to call system, but with a different argument
        Use gdb to find the addresses of system and sh_path
        Then invoke system with an argument of our choosing

------------------------------------------
         FAKING A CALL TO SYSTEM

Want to make a stack frame like a call:


          |                     |
          |---------------------|
          |  argument value     |
          |---------------------|
  %esp -> | return address      |
          |vvvvvvvvvvvvvvvvvvvvv|


------------------------------------------

        This is the kind of stack that system expects when starting
        The argument value is used by system as its argument
        The return address is where the call returns to

------------------------------------------
          HOW TO CREATE A CALL FRAME

char sh_path[] = "/bin/sh";
void list_dir() {
   system("/bin/ls");
}
void get_msg() {
   char buf[100];
   gets(buf);
}


          |                     |
          |---------------------|
          |                     |
          |---------------------|
          |                     |
          |---------------------|
          | return address      |
          |---------------------|
          |                     |
          |---------------------|
          | ... buf[98], buf[99]|
          |        ...          |
  %esp -> | buf[0], buf[1], ... |
          |vvvvvvvvvvvvvvvvvvvvv|
          

------------------------------------------

        Remember that we overwrite from the bottom up,
        so we:
         1. overwrite the return address
           (for the call to get_msg)
           with the address of system
         2. put something in the next item up the stack
             (where system returns, we don't care for now)
         3. put the address of sh_path above that
             (where system will look for its argument)

        See how that contains the call frame for system in it?

        Q: So how does system get called?
           by returning to it!

------------------------------------------
     AFTER THE CALL TO GET_MSG RETURNS

          |                     |
          |---------------------|
          |  argument value     |
          |---------------------|
  %esp -> | return address      |
          |vvvvvvvvvvvvvvvvvvvvv|

------------------------------------------

        And we have system looking at this stack
           which looks just like a call to system!

       Q: What if sh_path isn't already in the program?
          Use the overflow to write "/bin/sh" into the stack
          and make the argument point to that copy

       Q: What should the attack set the return address to be?
          (i.e., where should system return?)
          have it do the next thing the attacker wants.

** return-oriented programming gadgets
------------------------------------------
     RETURN-ORIENTED PROGRAMMING

See
Ryan Roemer, Erik Buchanan, Hovav Shacham,
and Stefan Savage.
Return-Oriented Programming:
Systems, Languages, and Applications.
ACM Trans. Inf. Syst. Secur.
Vol. 15, num. 1, Article 2 (March 2012).
https://doi.org/10.1145/2133375.2133377


"can induce arbitrary computation
without injecting any code" (p. 2:3)

Shows that "preventing the introduction
of malicious code is" not "sufficient
to prevent the introduction of malicious
computation"


------------------------------------------
      This shows that protection mechanisms such as:
      "W xor X, memory tainting,
      virus scanners, and most of 'trusted computing'"
      don't always prevent attacks (p. 2:2)

*** pop-ret gadget
------------------------------------------
     POP-RET GADGET

Def: *a gadget* is


Key gadget:

    pop %eax
    ret


Call this the "pop-ret gadget"
------------------------------------------
    ... a (short) sequence of instructions
        ending in a return (ret instruction)

        Q: What does the key gadget do?
          pop %eax  // pops TOS (top of stack) into the eax register
          ret       // pops the new TOS into eip (instruction pointer)

           (eax is the "accumulator" register,
            but for this gadget, it doesn't matter
            which register receives the result of the pop,
            so any pop instruction will do)
        
        Easy to find this gadget (esp. if you have the binary)

*** using the stack pointer as an instruction pointer
------------------------------------------
   USING THE STACK POINTER AS THE IP

How would we repeat a call to system() N times?

Set up the stack like:

          |       ...           |
          |---------------------|
          |  address of sh_path |
          |---------------------|
          |  address of pop-ret |
          |---------------------|
          | address of system() |
          |---------------------|
          |  address of sh_path |
          |---------------------|
          |  address of pop-ret |
          |---------------------|
          | address of system() |
          |---------------------|
          |  address of sh_path |
          |---------------------|
          |  address of pop-ret |
          |---------------------|
          | address of system() |
          |---------------------|
          |                     |
          |---------------------|
          | ... buf[98], buf[99]|
          |        ...          |
  %esp -> | buf[0], buf[1], ... |
          |vvvvvvvvvvvvvvvvvvvvv|
          
------------------------------------------
        Q: How did we make the stack look like that?
           By overflowing buf...
        Q: What happens when the current function returns?
           1. the return runs the code for system()
           2. code for system() finds a stack frame that looks like a call
                 with (the address of) sh_path as the argument
           3. When system() returns,
                 it jumps (returns) to the pop-ret gadget
           4. The pop-ret gadget pops (the address of) sh_path off the stack
                 and then returns, which calls system() (as in 1 above...)

        Q: Does this execute any code in the stack?
            No, the instructions just use addresses the attacker
               wrote into the stack
            So the OS sees no violation of the W xor X policy
            Can we prohibit writing to the stack?
               No, that's needed for normal computation...

        Q: In what sense is the stack pointer being used as the
        instruction pointer?
           Each ret pops the next address off the stack and jumps to it,
           so the stack pointer advances through the attacker's list
           of addresses, telling the CPU what to do next...
           At end of each gadget, return to stack to find what to do next
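
The walk up the stack described above can be sketched as a toy interpreter. This is only a model, not real machine code: the small integers stand in for addresses, and ADDR_HALT is a hypothetical marker added so the simulation has a place to stop.

```c
#include <assert.h>
#include <stddef.h>

/* Symbolic stand-ins for addresses (all hypothetical). */
enum { ADDR_HALT = 0, ADDR_SYSTEM = 1, ADDR_POP_RET = 2, ADDR_SH_PATH = 3 };

/* stack[0] is the overwritten return address; higher indexes are
   higher addresses. Returns how many times system() was entered with
   sh_path as its argument, or -1 if control reaches an unknown address. */
static int run_chain(const int *stack, size_t len) {
    size_t sp = 0;
    int calls = 0;
    int ip;
    assert(len > 0);
    ip = stack[sp++];                 /* get_msg's ret */
    while (ip != ADDR_HALT) {
        if (ip == ADDR_SYSTEM) {
            /* on entry: stack[sp] is the return address,
               stack[sp + 1] is the argument */
            assert(sp + 1 < len);
            if (stack[sp + 1] == ADDR_SH_PATH)
                calls++;
            ip = stack[sp++];         /* system's ret; argument stays */
        } else if (ip == ADDR_POP_RET) {
            assert(sp + 1 < len);
            sp++;                     /* pop: discard the old argument */
            ip = stack[sp++];         /* ret: fetch the next address */
        } else {
            return -1;                /* models a crash */
        }
    }
    return calls;
}
```

Note that no "instruction" is ever fetched from the stack array itself; the array only supplies the addresses that each ret consumes.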

** guessing random stack canaries
------------------------------------------
  DEFEATING RANDOM STACK CANARIES

Some assumptions needed:
 - server has a buffer overflow
    vulnerability
 - server will crash and restart
    if the canary value is bad
 - after restart, the canary and ASLR
    layout are NOT rerandomized

Attack approach:
 - Probe the bytes of the canary 1 by 1
 - Use crash to tell if guess was wrong
 
------------------------------------------

        Q: Why would the server restart without rerandomization?
           Because the new worker is created with the POSIX fork() system call,
             and the child gets a copy of the parent's address space,
             so child will have the same values for canaries and ASLR

*** Probing to find the canary
------------------------------------------
        PROBING TO FIND THE CANARY

Stack (bytes shown):

          
          |          |
          |----------|
          |canary[3] |
          |----------|
          |canary[2] |
          |----------|
          |canary[1] |
          |----------|
          |canary[0] |
          |----------|
          | buf[99]  |
          |----------|
          | buf[98]  |
          |----------|
          |  ...     |
          | buf[1]   |
          |----------|
  %esp -> | buf[0]   |
          |vvvvvvvvvv|


------------------------------------------

        Algorithm used:

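         subroutine probe(int i) {
           // assumes canary[0..i-1] are already known
           for j in 0..255 do
              overflow buf so as to rewrite canary[0..i-1]
                 with their known values and canary[i] with j
                 (leaving the bytes above canary[i] untouched)
              let the process return
              if process crashes then continue
              else return j
         }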

         for k in 0..(CANARY_SIZE - 1) do
           canary[k] = probe(k)

         then at the end we have all bytes of the stack canary...

         The same byte-by-byte probing can also recover other stack values
            that make the process crash when they are wrong
            (e.g., the saved frame pointer and the return address)
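
The probing loop can be tried out against a simulated oracle. In this sketch the "server" is just a function that compares the guessed prefix with a secret; secret_canary, worker_survives, recover_canary and probes_sent are names invented for the illustration, and a 4-byte canary is assumed as in the picture.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CANARY_SIZE 4

/* Known only to the simulated server (arbitrary example value). */
static const unsigned char secret_canary[CANARY_SIZE] = { 0x00, 0x3c, 0xa7, 0x5e };
static unsigned long probes_sent = 0;

/* Models one request that overwrites the first n canary bytes.
   Returns 1 if the worker returns normally, 0 if it "crashes". */
static int worker_survives(const unsigned char *guess, size_t n) {
    assert(n >= 1 && n <= CANARY_SIZE);
    probes_sent++;
    return memcmp(guess, secret_canary, n) == 0;
}

/* Recovers the canary one byte at a time; returns 1 on success. */
static int recover_canary(unsigned char *out) {
    for (size_t i = 0; i < CANARY_SIZE; i++) {
        int found = 0;
        for (int j = 0; j <= 255 && !found; j++) {
            out[i] = (unsigned char)j;            /* guess for byte i */
            found = worker_survives(out, i + 1);  /* bytes 0..i-1 known */
        }
        if (!found)
            return 0;
    }
    return 1;
}
```

Because each byte is confirmed separately, the search needs at most 4 * 256 = 1024 probes, rather than up to 2^32 guesses for the whole canary at once.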

** Blind Return-oriented Programming
*** 64 bit background
------------------------------------------
    64-BIT X86 ARCHITECTURE

Arguments passed in registers
   not on the stack


E.g., write() system call takes 3 args:
     socket (in rdi)
     buffer (in rsi)
     length (in rdx)

So need the following gadgets:
 1) pop rdi; ret (socket)
 2) pop rsi; ret (buffer)
 3) pop rdx; ret (length)
 4) pop rax; ret (write syscall number)
 5) syscall

------------------------------------------
        Q: What does that mean for the attacker?
          Need to put arguments in the right registers
            (e.g., to call write)
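
A small register-machine sketch of what the five gadgets accomplish together. It is a simulation only: the enum values stand in for gadget addresses, run_chain64 and struct regs are hypothetical names, and the descriptor, buffer address and length used with it are arbitrary example values. The one real constant is 1, the number of write on x86-64 Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Symbolic gadget "addresses" (hypothetical). */
enum gadget { G_POP_RDI = 1, G_POP_RSI, G_POP_RDX, G_POP_RAX, G_SYSCALL };
#define SYS_WRITE_NO 1   /* write's syscall number on x86-64 Linux */

struct regs { uint64_t rdi, rsi, rdx, rax; };

/* Walks the chain as a sequence of rets. Returns 1 if a syscall
   gadget is reached, 0 if the chain would crash. */
static int run_chain64(const uint64_t *stack, size_t len, struct regs *r) {
    size_t sp = 0;
    assert(stack != NULL && r != NULL);
    while (sp < len) {
        uint64_t ip = stack[sp++];     /* ret: next gadget address */
        uint64_t *dst;
        switch (ip) {
        case G_POP_RDI: dst = &r->rdi; break;
        case G_POP_RSI: dst = &r->rsi; break;
        case G_POP_RDX: dst = &r->rdx; break;
        case G_POP_RAX: dst = &r->rax; break;
        case G_SYSCALL: return 1;      /* registers are now set up */
        default:        return 0;      /* unknown address: crash */
        }
        if (sp >= len)
            return 0;
        *dst = stack[sp++];            /* pop: load the next stack word */
    }
    return 0;
}
```

The chain on the stack therefore alternates gadget address, value, gadget address, value, and ends with the address of the syscall gadget.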

*** Reference for BROP
------------------------------------------
 BLIND RETURN-ORIENTED PROGRAMMING (BROP)

Blind?
   to start attacker does NOT have copy
   of the binary to disassemble

See
  A. Bittau, A. Belay, A. Mashtizadeh,
  D. Mazieres and D. Boneh,
  "Hacking Blind,"
  In 2014 IEEE Symposium on
  Security and Privacy, 2014, pp. 227-242.
  doi: 10.1109/SP.2014.22.

and https://youtu.be/xSQxaie_h1o

Assumptions:
  - server has
    a buffer overflow vulnerability
  - server restarts after crash
     without re-randomizing
     
Goal:


Approach:
  1. Find stop gadget
  2. Find gadgets that pop stack entries
  3. Determine which registers the
     pop gadgets pop into
  4. Invoke write() system call
     to send program's code to attacker

------------------------------------------
     Q: Why does the server not re-randomize after restarting?
        because it restarts workers with fork() alone (no exec()),
        as noted before...
        also note: no new stack canaries!

     Attack model:
        "The threat model for a BROP attack is an attacker that
        knows an input string that crashes a server due to a stack
        overflow bug. The attacker must be able to overwrite a variable
        length of bytes including a return instruction pointer. The
        attacker need not know the source or binary of the server."
        (p. 230)

     Goal: 
   ... find enough gadgets to invoke write()
         so can get copy of all the code
       (then can analyze it using gdb and do standard return-oriented
        programming on it)

*** stop gadget
------------------------------------------
            STOP GADGET

A gadget that when called


How to find a stop gadget?

------------------------------------------

        will pause the program's execution
        but will NOT cause a crash
            e.g., a call to sleep(), a pause() system call,
                  or an infinite loop

        ... randomly call
             (via buffer overflow, overwriting return address)
            locations in code
               if it crashes (connection closes), you haven't found it...
               if the connection stays open, you have

*** pop gadgets
------------------------------------------
         FINDING POP GADGETS

Put on the stack
 - a return address to test (probe)
 - address of stop gadget
 - address of trap (causes crash)
   (and more traps)

Looks like:

          |         ...         |
          |---------------------|
          | addr of trap (0x0)  |
          |---------------------|
          | addr of trap (0x0)  |
          |---------------------|
          | addr of trap (0x0)  |
          |---------------------|
          | addr of trap (0x0)  |
          |---------------------|          
          | addr of stop gadget |
          |---------------------|
          | addr of trap (0x0)  |
          |---------------------|
          | addr to probe (A)   |
          |---------------------|
          |                     |
          |---------------------|

Notation:
   (probe(A), stop, trap, trap,...)

Find address P that does NOT pop stack:
   (probe(P), stop, trap, trap, trap, ...)

Find address A that pops 1 stack entry:
   (probe(A), trap, stop, trap, trap,...)
  
------------------------------------------
        Q: What kind of address would cause a crash?
            0x0 is one: it is never a valid return address
             (page 0 is left unmapped so that uses of C's NULL fault)

        Q: How does the first layout (pictured) work?
           if (returning to) address A does NOT pop the stack,
              then when A's code returns, it returns to the trap
                                          which causes a crash
           else if the address A pops the stack (throws away the trap),
              then it returns to the stop gadget (so no crash)

           The attacker can tell if a crash happens or not
        
        Q: What stack layout finds an address that pops exactly N stack entries?
           (probe(A), trap, trap, ..., trap, stop, trap, trap, trap...)
                      ------ N traps ------
           Why are the first N traps there?
            because each pop in the gadget discards one of them,
              so the gadget's ret then returns to the stop gadget;
              a gadget that pops fewer (or more) than N entries
              returns to a trap and crashes
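
The layouts above can be exercised on a toy model. In this sketch the table hidden_pops plays the role of the unknown server code (how many entries each candidate address pops before its ret, or -1 if jumping there crashes); the table contents and the names probe_hangs and count_pops are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum word { TRAP, STOP };       /* kinds of stack words in a probe */
#define MAX_POPS 6
#define LAYOUT_LEN (MAX_POPS + 4)

/* Hidden from the attacker: pops performed at each candidate address. */
static const int hidden_pops[] = { -1, -1, 0, -1, 1, 2, -1, 6 };
#define N_ADDRS (sizeof hidden_pops / sizeof hidden_pops[0])

/* Sends the layout (probe(addr), n_traps traps, stop, trap, ...).
   Returns 1 if the stop gadget is reached (connection hangs),
   0 if the server crashes. */
static int probe_hangs(size_t addr, int n_traps) {
    enum word stack[LAYOUT_LEN];
    size_t sp = 0;
    assert(addr < N_ADDRS && n_traps >= 0 && n_traps <= MAX_POPS);
    for (size_t i = 0; i < LAYOUT_LEN; i++)
        stack[i] = (i == (size_t)n_traps) ? STOP : TRAP;
    if (hidden_pops[addr] < 0)
        return 0;                          /* not a usable gadget */
    sp += (size_t)hidden_pops[addr];       /* the gadget's pops */
    return sp < LAYOUT_LEN && stack[sp] == STOP;   /* its ret */
}

/* Returns how many entries addr pops, or -1 if every probe crashes. */
static int count_pops(size_t addr) {
    for (int n = 0; n <= MAX_POPS; n++)
        if (probe_hangs(addr, n))
            return n;
    return -1;
}
```

Only the hang-or-crash answer from probe_hangs is used by count_pops, which mirrors what a remote attacker can observe.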
          
*** Determine what registers pop gadgets use
------------------------------------------
  FINDING WHICH GADGET POPS INTO RAX

Set up stack like:

          |                      |
          |----------------------|
          | addr of syscall      |
          |----------------------|
          |         ...          |
          |----------------------|
          | syscall no. of pause |
          |----------------------|
          | addr of pop gadget 3 |
          |----------------------|          
          | syscall no. of pause |
          |----------------------|
          | addr of pop gadget 2 |
          |----------------------|
          | syscall no. of pause |
          |----------------------|
          | addr of pop gadget 1 |
          |----------------------|
          |                      |
          |----------------------|


Note: to make a syscall:
  - put number of call desired in rax
  - then execute a syscall instruction
    (e.g., one found in libc)
------------------------------------------
       Q: What does this do?
           it puts syscall number of pause into some register
           (attacker doesn't know which register,
            hope that one of them puts it in rax)
           if get a pause, then:
              (a) know the address of a syscall instruction
                   (because there was no crash, but a pause)
              (b) know that one of the pop gadgets pops into rax

       Q: How would you find which pop gadget pops into rax?
           try the pop gadgets out one at a time...

       So now we know which gadget pops into rax, and the address of syscall.

       Q: How would you find which gadgets pop into other registers?
           do something similar, with calls that need more arguments
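
Trying the pop gadgets one at a time might look like the following simulation. The table hidden_target stands for what the attacker cannot see (which register each candidate gadget loads), and chain_pauses models the single chain (pop gadget, pause number, syscall); both names and the table contents are assumptions made for this sketch. The value 34 is the number of pause on x86-64 Linux.

```c
#include <assert.h>
#include <stddef.h>

#define SYS_PAUSE_NO 34   /* pause's syscall number on x86-64 Linux */
enum reg { RAX, RDI, RSI, RDX, N_REGS };

/* Hidden from the attacker: register loaded by each pop gadget. */
static const enum reg hidden_target[] = { RDI, RDX, RAX, RSI };
#define N_GADGETS (sizeof hidden_target / sizeof hidden_target[0])

/* Models the chain (pop gadget g, SYS_PAUSE_NO, syscall).
   Returns 1 if the server pauses, 0 otherwise. Registers the gadget
   does not load are modelled as holding a useless value (-1). */
static int chain_pauses(size_t g) {
    long regs[N_REGS] = { -1, -1, -1, -1 };
    assert(g < N_GADGETS);
    regs[hidden_target[g]] = SYS_PAUSE_NO;
    return regs[RAX] == SYS_PAUSE_NO;
}

/* Returns the index of the gadget that pops into rax, or -1. */
static int find_pop_rax(void) {
    for (size_t g = 0; g < N_GADGETS; g++)
        if (chain_pauses(g))
            return (int)g;
    return -1;
}
```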

*** conclusion of attack
------------------------------------------
        CONCLUDING THE ATTACK

Now can call write using the
following gadgets:

 1) pop rdi; ret (socket)
 2) pop rsi; ret (buffer address)
 3) pop rdx; ret (length)
 4) pop rax; ret (write syscall number)
 5) syscall

------------------------------------------
        Q: How to find the socket number for the attacker's connection?
           there are only a limited number of these, just try them all...
        Q: What buffer address to use?
           the program's .text segment (where the code is)
             so will be able to analyze the code off line to find gadgets
               (so the attacker wins!)
               
        
*** How to defend against BROP?
------------------------------------------
        HOW TO DEFEND AGAINST BROP?

Basic idea:
 - re-randomize
      On Linux: follow fork with exec
        (exec builds a fresh address space)
      On Windows: no fork(); process creation
        always works like exec

 - do bounds checking
 
------------------------------------------
        Q: What if the server keeps connections open instead of re-randomizing?
           That would stop information about which probes work
              from leaking to the attacker
            However, that would tie up resources (inviting denial of service)