CIS 6614 meeting -*- Outline -*-

* Memory Corruption Attacks

  based on slides by Suman Jana, which were themselves
  based on slides by Dan Boneh

  See also:
  https://ocw.mit.edu/courses/6-858-computer-systems-security-fall-2014/resources/lecture-2-control-hijacking-attacks/
  which is also https://youtu.be/r4KjHEgg9Wg

  Goal is to understand how attackers think and how attacks work,
  in order to defend against them...

** attacker goals

------------------------------------------
         MEMORY ATTACK GOALS

Goals of attacker:



Attack examples:
  - Buffer overflows
  - Integer overflows
  - Format string attacks
------------------------------------------
        ... - take over target machine
              ultimately get a shell to execute attacker's commands
              on the target machine

** Buffer overflow attacks

        Q: What is a buffer overflow attack?
            It's where giving too much input causes a program
            to write past the end of an array,
            typically overwriting the return address of a function
            so that the attacker's code runs

*** background

------------------------------------------
  BACKGROUND FOR BUFFER OVERFLOW ATTACKS

- C calling conventions with stack/heap layout

- How addressing in C works

------------------------------------------

        Q: Why do we focus on C for this?
            Because most OS and network software is written in C
            Attackers are limited to using software that is
              already running (to start with)
            C uses memory addresses and does not do bounds checking

        For C, need to know the stack/heap layout
        Details vary by machine and OS

        Q: Is there a way for an attacker to find out what machine/OS
           is running a program like a web server?
            Yes, sometimes directly (but that's probably deprecated)
            and often in error messages.
            The server shouldn't make this info easy to find...
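
        As a small illustration of the point above that C does no bounds
        checking, here is a minimal sketch. The struct record, its fields,
        and the function fill_record are hypothetical names chosen for this
        example (they are not from the slides). To keep the example
        well-defined, the copy is capped at the size of the whole struct,
        so overflowing buf only spills into the adjacent field rather than
        into a return address.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a small buffer followed by another field. */
struct record {
    char buf[8];
    char tag[8];
};

/* Copies n bytes of input to the start of *r, treating it as raw bytes.
   Nothing compares n against sizeof(r->buf); n is only capped at
   sizeof(*r) so that this demonstration stays inside the object. */
void fill_record(struct record *r, const char *input, size_t n) {
    if (n > sizeof(*r)) {
        n = sizeof(*r);
    }
    memcpy((char *)r, input, n);
}
```

        With the input "AAAAAAAAXYZ" and n = 12 (11 chars plus the null),
        the first 8 bytes fill buf and the rest land in tag, which then
        reads "XYZ". In a stack-based attack the same kind of spill
        reaches the saved return address instead.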
------------------------------------------
LINUX PROCESS MEMORY LAYOUT WITHOUT ASLR

0xC0000000 |------------------|
           |   user stack     |
           |vvvvvvvvvvvvvvvvvv| <- %esp
           |                  |
           |                  |
           |^^^^^^^^^^^^^^^^^^|
           | shared libraries |
0x40000000 |------------------|
           |                  |
           |^^^^^^^^^^^^^^^^^^| <- brk
           |     runtime      |
           |      heap        |
           |------------------|  loaded
           |  instructions    |  by exec()
0x08048000 |------------------|
           |                  |  unused
         0 |------------------|
------------------------------------------
        The addresses are just examples
        (for a version of Linux without ASLR) and not important

        The heap grows upwards (towards higher addresses)
        The space used by shared libraries also grows upwards
        The stack grows downwards (towards lower addresses)

*** how attacks work

------------------------------------------
         BUFFER OVERFLOW ATTACK

Example C (or C++) function

    void f(char *strarg) {
        char buf[100];
        strcpy(buf, strarg);
        /* ... */
    }

Stack frame of f when called:

        |------------------|
        |     strarg       |
        |------------------|
        |  return address  |
        |------------------|
        |     old ebp      |
        |------------------|
        |   buf[99]        |
        |                  |
        |   ... buf ...    |
        |                  |
        |   buf[1]         |
        |   buf[0]         |
        |vvvvvvvvvvvvvvvvvv| <- %esp
------------------------------------------
        strarg is a pointer (value) pushed on the stack by the caller;
        the call also pushes the return address,
        and f then saves the old frame pointer (ebp)
        Then space for buf is allocated on the stack below that...

        Q: What happens if strarg has 108 characters before the null char?
            Then strcpy will copy past the end of the space for buf
            (on 32-bit x86, assuming no padding above buf):
            chars 100-103 overwrite the saved ebp,
            and chars 104-107 overwrite the return address!
            (because strcpy doesn't know how long buf is)

------------------------------------------
       BUFFER OVERFLOW AFTER STRCPY

Example C (or C++) function

    void f(char *strarg) {
        char buf[100];
        strcpy(buf, strarg);
        /* ... */
    }

Stack frame of f after strcpy

        |------------------|
        |     strarg       |
        |------------------|
        |                  |
        |------------------|
        |                  |
        |------------------|
        |   buf[99]        |
        |                  |
        |   ... buf ...    |
        |                  |
        |   buf[1]         |
        |   buf[0]         |
        |vvvvvvvvvvvvvvvvvv| <- %esp
------------------------------------------

        Q: What is now in the place where the return address was?
            the 4 chars of strarg that correspond
            to where the return address was!

        Q: How could an attacker exploit that?
            encode an address as characters, so that
            the return address points at other chars in buf;
            then the return jumps to code that the attacker sent!
            e.g., it could start a shell... (code for exec("/bin/sh");)

        Q: Is there anything special about the size of 100?
            No, just a convenient example...

        Q: Will the attack work if the function f crashes?
            No, f has to return normally,
            since the jump to the attacker's code happens at the return

        Q: Why doesn't the OS stop this attack?
            It's not watching what goes on in a process,
            it just sets the process up and lets it run,
            handling system calls...
            (Only the hardware is involved in each instruction.)

------------------------------------------
       BUFFER OVERFLOW ATTACK STEPS

a. overwrite the stack so that
   code to execute is at location P

b. overwrite the return address to be


c. when the call returns, control jumps to P

------------------------------------------
        ... start of P

        Q: Is the new code to execute in the heap or the stack?
            the stack, but could be anywhere...

        Q: What does the attacker need to know?
            It looks like stack layout and thus where P starts
            needs to be known, but...

------------------------------------------
            THE NOP SLIDE

Where should attacker's code start?

        |       ...        |
        | attacker's code  |
        |    start of P    |
        |------------------|
        |       ...        |
        | nop instructions |
        |==================|
        |                  |
        |------------------|
        |  return address  |
        |------------------|
        |                  |
        |------------------|
        |                  |
        |   ... buf ...    |
        |                  |
        |vvvvvvvvvvvvvvvvvv| <- %esp
------------------------------------------

        Q: How does the attacker get the nop instructions
           and P on the stack?
            By overflowing buf...

        Q: Will the attacker's code work if the return address
           points into the list of nop instructions?
            Yes, the nops just do nothing, and execution
            slides along them until it reaches P

        Q: Will the attacker's code work if it contains a byte of zero (0)?
            No, not if the attacker relies on C string routines
            (like strcpy) to copy it into memory,
            since they stop copying at the first zero byte

*** background, why C?

------------------------------------------
   WHY NOT USE A MEMORY-SAFE LANGUAGE?

Why not use a memory-safe language like
Java, Python, C#, Rust, ...?



------------------------------------------

        Q: Why not use a memory-safe language?
        ... - legacy code (already in C for OS, networking, ...)
            - need low-level access to hardware
                 (for device drivers, OS, ...)
            - performance (but other languages are gaining on C,
                 esp. with JIT compilers...,
                 and performance doesn't matter
                 if the program isn't compute bound)

*** background, unsafe libc functions

------------------------------------------
       UNSAFE FUNCTIONS IN LIBC

strcpy(char *dest, const char *src)
strcat(char *dest, const char *src)
gets(char *s)
scanf(const char *format, ...)
...
------------------------------------------

        Q: Why can't these functions in libc check sizes?
            Because they don't know what the sizes are!

        A modern compiler will warn about these unsafe functions,
        and tell you to use a safer alternative

        Q: How could such C functions be made safer?
            Pass the size as a parameter

------------------------------------------
        SAFER FUNCTIONS IN LIBC

strncpy(char *dest, const char *src, size_t n)
strncat(char *dest, const char *src, size_t n)
fgets(char *s, int n, FILE *stream)

with scanf, use width specifier to read strings
   e.g., %100s
------------------------------------------

        Note: strncpy doesn't guarantee dest is null-terminated;
              on Windows (in the CRT library) can use
                  strcpy_s(char *dest, size_t n, const char *src)
              to ensure proper termination

        Note: a scanf width such as %100s does not count the
              terminating null, so the buffer needs 101 bytes

        However, a lot of programs manipulate buffers directly
        without using such functions... so still could be attacked...

*** background: other possibilities for buffer overflows

**** SEH overwrite attack

        See David Litchfield.  Defeating the Stack Based Buffer Overflow
        Prevention Mechanism of Microsoft Windows 2003 Server.  Sep, 2003.
        www.ngssoftware.com/papers/defeating-w2k3-stack-protection.pdf

------------------------------------------
 STRUCTURED EXCEPTION HANDLER OVERWRITES

Structured Exception Handler (SEH)
  - exception dispatch in Windows
  - registration record on stack contains:
      - pointer to handler
      - pointer to next record

------------------------------------------

        Q: When are SEHs used in Windows?
            When there is an exception,
            e.g., trying to access a bad address;
            then Windows calls each handler in the list...

        Q: How could such an attack be prevented?
            1. compile code with metadata needed to detect overwrites
               (Microsoft tried this in Visual Studio,
                but it requires recompilation...)
            2. SEHOP, a dynamic technique:
               checks integrity of the next pointers and also uses ASLR

                 "verifying that a thread's exception handler list
                  is intact before allowing any of the registered
                  exception handlers to be called..."

               mostly works because

                 "When the majority of stack-based buffer overflows
                  occur, an attacker will implicitly overwrite the
                  next pointer of an exception registration record
                  prior to overwriting the record's exception handler
                  function pointer"

               so ...

                 "the integrity of the exception handler chain
                  is broken."

               Quoted from
               https://msrc-blog.microsoft.com/2009/02/02/preventing-the-exploitation-of-structured-exception-handler-seh-overwrites-with-sehop/

                 "SEHOP is enabled by default on Windows Server 2008
                  and disabled by default on Windows Vista SP1.
                  The primary reason this feature was disabled by
                  default on Windows Vista SP1 was due to a lack of
                  adequate application compatibility data."

**** Function pointer overwriting

        Q: What would happen if there is a data structure with
           a function pointer above an array in C or C++?
            A buffer overflow could change the pointer...
            then a call through that pointer
            could call the attacker's code...

------------------------------------------
       FUNCTION POINTERS IN DATA

Is this safe?

        |------------------|
        | function pointer |
        |------------------|
        |                  |
        |   ... buf ...    |
| | | |------------------| ------------------------------------------ No, it's not safe... affected: PHP 4.0.2 and Microsoft MediaPlayer Bitmaps **** Setjmp buffer overwriting ------------------------------------------ BACKGROUND: SETJMP and LONGJMP in C setjmp(buf) saves in array buf: - stack pointer (sp), - frame pointer (fp), and - program counter (pc). longjmp() restores: - the stack and frame pointers - the program counter (jumps) So, what does the following do? #include jmp_buf env; setjmp(env); label1: ; /* ... */ longjmp(env, 2); ------------------------------------------ When longjmp() is executed, picks up just after the setjmp call (at label1) ------------------------------------------ SETJMP and LONGJMP in C #include jmp_buf handler; char msg[100]; void set_handler(char *s, fptr rest) { int i = setjmp(saved); /* ... */ strcpy(msg, s); *rest(); } void call_handler() { longjmp(handler, 2); } Eventually program does: read_from_user(s); set_handler(s); /* ... */ call_handler(); ------------------------------------------ Q: What could go wrong in that code? overflow of msg buffer could change the stat structure saved by setjmp (i.e., handler) - changing the address of the jump...