CIS 6614 meeting -*- Outline -*-

* Injection Attacks

  Based in part on the book
  Michael Howard and David LeBlanc and John Viega.
  24 Deadly Sins of Software Security:
  Programming Flaws and How to Fix Them.
  McGraw-Hill, Inc., 2009.

** What is the problem?

   Note: this is no longer just a C language problem

   See https://xkcd.com/327
   (or the file exploits_of_a_mom.png in this directory)

------------------------------------------
             THE PROBLEM

Code Injection is 3rd on OWASP Top Ten
(owasp.org/www-project-top-ten/,
 formerly in first place)

From OWASP A03 Injection
(owasp.org/Top10/A03_2021-Injection/):

"App vulnerable when:

 "User-supplied data is not validated,
  filtered, or sanitized"

 "Dynamic queries or non-parameterized calls
  without context-aware escaping are used
  directly in the interpreter."

 "Hostile data is directly used or concatenated."

"Common injections are:
  SQL, NoSQL, OS command,
  Object Relational Mapping (ORM), LDAP, and
  Expression Language (EL) or
  Object Graph Navigation Library (OGNL) injection."
------------------------------------------

*** examples

**** SQL Injection Attacks

------------------------------------------
        EXAMPLE SCENARIO IN SQL

Scenario 1 from OWASP A03
(owasp.org/Top10/A03_2021-Injection/):

String query =
    "SELECT * FROM accounts WHERE custID='"
    + request.getParameter("id") + "'";
------------------------------------------

Q: What happens if the input starts with ' or '1'='1 ?

   Then the user's input is taken as part of the SQL command and executed!
   With this input the WHERE clause becomes custID='' or '1'='1',
   which is always true, so the query selects all accounts!

   "More dangerous attacks could modify or delete data
    or even invoke stored procedures."

   This problem can also be caused by leaving the database port
   open to the internet (and using a default sysadmin password)!

   This is CWE-89 and is also covered in the
   PCI Data Security Standard requirement 6.5.6.

Q: What programming languages can have this problem?
   Almost any language that interfaces with SQL, including:
   C#, PHP, Python, Ruby, Java, C, C++

**** Example in Python

     The following Python example is from the 24 Deadly Sins book, p. 9

------------------------------------------
  SQL INJECTION VULNERABILITY IN PYTHON

import MySQLdb
conn = MySQLdb.connect(host="...",
                       port=3306,
                       user="admin",
                       password="passwd",
                       db="clientsDB")
cursor = conn.cursor()
# id is untrusted user input, concatenated
# directly into the query text (the sin!)
cursor.execute("select * from customer where id="
               + id)
results = cursor.fetchall()
conn.close()
------------------------------------------

**** Example in SQL

     The vulnerability also affects SQL itself.
     (This is from the 24 Deadly Sins book, p. 12)

------------------------------------------
         SQL INJECTION IN SQL

CREATE PROCEDURE dbo.doQuery(
            @id nchar(128))
AS
  DECLARE @query nchar(256)
  SELECT @query =
     'select ccnum from cust where id = '''
     + @id + ''''
  EXEC @query
RETURN
------------------------------------------

Q: What does this do?

   It looks for credit card numbers where the id is given by a parameter.

Q: How could this cause an injection attack?

   If @id comes from user input, then that input is concatenated
   directly into the query string that is executed.

***** What an Attacker Could Do

------------------------------------------
        WHAT ATTACKER WOULD DO

Add more clauses to the query,
Comment out clauses not needed for attack

Example input:

     1 or 2>1 ---
------------------------------------------

Q: What is the effect of that input?

   It selects all rows in the table (since 2>1 is true).
   The classic input is "1=1" but systems may look for that...

Q: What is necessary for the attack to work?

   That the attacker can give input that is passed directly
   to the SQL interpreter.

***** Attack can be obfuscated

------------------------------------------
        AN EXPLOIT FROM 2008

orderitem.asp?IT=GM-204;DECLARE%20@S
%20NVARCHAR(4000);SET%20@S=CAST(
0x4400450043004C0041005200450020004000...
...
...F007200%20AS%20NVARCHAR(4000));
EXEC(@S);--

which decodes to:

DECLARE @T varchar(255),@C varchar(255)
DECLARE Table_Cursor CURSOR FOR
select...
------------------------------------------

Q: How does the obfuscation work in this example?

   It uses the CAST primitive of SQL to convert a hex string into text.

Q: What lesson can we learn from such examples?

   Injected code doesn't have to look like code at first...
   It's best to completely avoid using untrusted inputs.

**** Example in HQL

------------------------------------------
        EXAMPLE SCENARIO in HQL

Scenario 2 from OWASP A03
(owasp.org/Top10/A03_2021-Injection/):

Query HQLQuery = session.createQuery(
    "FROM accounts WHERE custID='"
    + request.getParameter("id") + "'");
------------------------------------------

Q: What happens if the input starts with ' or '1'='1 ?

   Again, the query selects all accounts!

Q: So, is this kind of attack limited to SQL?

   No, but that is a particularly popular kind of attack...
   It can affect any interpreter.

*** Format String Attacks

    This is CWE-134: Uncontrolled Format String.

    See also the 24 Deadly Sins book, chapter 6
    and (for details) https://seclists.org/bugtraq/2000/Sep/214

    This kind of attack mostly affects C and C++,
    but could also affect languages that are translated into C/C++.

**** Simple Example

------------------------------------------
 EXAMPLE OF FORMAT STRING VULNERABILITY

/* A Unix command, written in C */
#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc > 1) {
        /* user input used as the format! */
        printf(argv[1]);
    }
    return 0;
}
------------------------------------------

Q: What could go wrong?

   If the input is "%p" then the output is the address
   of the top of the runtime stack.
   This can leak information about ASLR!
   (including the main function's return address, etc.)
   The current gcc will warn about such a usage of printf.

**** The %n format specifier

------------------------------------------
       THE %n FORMAT SPECIFIER

%n writes the number of characters
   written so far
   into the corresponding argument

Useful example:

   unsigned int bytes;
   printf("%s%n\n", argv[1], &bytes);

then bytes is set to the number of
characters in argv[1]
------------------------------------------

Q: How could an attacker abuse such a format string?

   From the 24 Deadly Sins book (p. 111):

   1. Put the address desired on the stack
      (e.g., using a buffer overflow)
   2. Give input of the right length to write the number desired
      into that address

   These format strings allow attackers to probe the stack
   and correct the attack dynamically (p. 112)

**** More Revealing Example

     This example is in the current directory,
     in the files fmtme.c and the Makefile

------------------------------------------
    EXAMPLE FROM TIM NEWSHAM'S BLOG
https://seclists.org/bugtraq/2000/Sep/214

// code from:
// seclists.org/bugtraq/2000/Sep/214
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    char buf[100];
    int x;
    if(argc != 2) {
        exit(1);
    }
    x = 1;
    /* argv[1] is used as the format string */
    snprintf(buf, sizeof(buf), argv[1]);
    buf[sizeof(buf) - 1] = 0;
    printf("buf (length is %lu): \"%s\"\n",
           strlen(buf), buf);
    for (int i = 0; i < strlen(buf); i++) {
        printf("buf[%d] is '%c' (0x%x)\n",
               i, buf[i], buf[i]);
    }
    printf(
      "x is %d (0x%x) (@ address %p)\n",
      x, x, &x);
    return 0;
}
------------------------------------------

Q: Does gcc give any warning when this is compiled?

   No, it apparently doesn't worry about the use of a format string
   as an argument to snprintf...

   The commands to run this are in the Makefile

------------------------------------------
            RUNNING FMTME

$ ./fmtme "hello"
buf (length is 5): hello
x is 1 (0x1) (@ address 0x7ffffcbcc)
------------------------------------------

Q: What does the output of the command ./fmtme "hello" show?

   That it's copying the command line argument into buf
   and printing that, and other information.

Q: Could we use the address printed to help defeat ASLR?

   Yes, but we won't do that now...

Q: Is that command the same as running the command
   perl -e 'system "./fmtme", "hello"'?

   Yes, and that form will be convenient later.

------------------------------------------
        PASSING A FORMAT STRING

Consider the command

$ ./fmtme \
    'abcd\n0x%lx 0x%lx 0x%lx 0x%lx'
buf (length is 43): abcd
0x0 0x0 0x1ffffcd30 0x3078300a64636261
x is 1 (0x1) (@ address 0x7ffffcbcc)
------------------------------------------

   You can see in the output that the format string interpreter
   is being run on the input (and that it's printing out extra
   information), so the attacker will realize that their input
   is being interpreted as a format string.

   The extra information output is showing something about the
   runtime stack when snprintf is called.
   It seems to have 2 words of zeros (space for registers?),
   then it shows the caller's frame pointer
   (i.e., main's %rbp register),
   then the first 8 chars printed into buf.
   Since the x86 is little-endian, the first character is at the
   far right of the last word, 0x3078300a64636261.
   Reading its bytes from right to left:
   a's hex code is 0x61, b is 0x62, c is 0x63, d is 0x64,
   \n is 0x0a, then '0' is 0x30, 'x' is 0x78, '0' is 0x30,
   so buf starts with "abcd\n0x0".

   The x86 64-bit register layout is partly based on
   http://6.s081.scripts.mit.edu/sp18/x86-64-architecture-guide.html

   More ASCII codes: ' ' is 0x20, 'l' is 0x6c, '0' is 0x30

------------------------------------------
         ATTEMPTING TO SET X

perl -e 'system "./fmtme",
  "\xcc\xcb\xff\xff\x07\x00%d%n%x%x%x%x\n"'
buf (length is 5): ÌËÿÿ
x is 1 (0x1) (@ address 0x7ffffcbcc)
------------------------------------------

   This doesn't work here (note that the \x00 byte ends the string
   before any format specifier is seen), but on a machine that
   passes arguments on the stack, this kind of input would pull
   the address out of buf and use that as the address to write
   the number of characters written so far.

   So the attacker controls:
    - the address written to and
    - the value
      written there.

**** Related Problems

Q: What would happen if an attacker could control
   the format string used for scanf?

   Then the attacker could write into arbitrary parts of memory!

Q: Could using sprintf cause a problem too?

   Sure, since the attacker can quote format character specifications
   that are later used in printf or fprintf...
   This is what is happening in fmtme.c

Q: What if strings are stored in an external file that isn't protected?

   Then an attacker could change that file, and thus get the
   application to use strings that contain format specifiers.

Q: What if a user's locale (e.g., country) tells where
   (i.e., in what directory) language-specific files are stored?

   Then an attacker could force the application to use their
   directory of files, which could have format strings.

*** Other Kinds of Injection Attacks

Q: Are there any other attacks where we should not trust user input?

   Yes (see below)

------------------------------------------
    OTHER KINDS OF INJECTION ATTACKS

What other kinds of attacks might use inputs?




------------------------------------------

   ...
   - Buffer overflows are certainly one
     (where the amount of text read or numbers read
      controls the output)
   - Identification and Authentication Failures (A07),
     when passwords or login ids are used without checking...
   - Server-Side Request Forgery (A10),
     when a web app fetches a remote resource
     without validating the user-supplied URL
     (e.g., using file:///etc/passwd as a URL)
   - Commands that execute user inputs
     (as in a command processor, interpreter, or compiler)

Q: Could this affect programs that act on user inputs?

   Yes, if the action is to pass the user input
   to a command interpreter.

Q: Could that be an interpreter you write yourself?

   Yes!

** Conventional Tools for Preventing Injection Attacks

*** Do's and Don'ts from the 24 Deadly Sins Book

------------------------------------------
        PREVENTION TECHNIQUES

From "24 Deadly Sins" book (p. 
18, 27-28):

- Don't use string concatenation
        to build SQL
- Do use parameterized queries
        to build SQL
- Do check the input for validity
        at the server
- Do use regular expressions
        to parse input
- Don't check input (for validity)
        ONLY at the client
- Don't simply strip out "bad words"
- Don't connect to the DB
        using a highly privileged account
------------------------------------------

Q: Will these guidelines guarantee freedom from injection attacks?

   No, but they might reduce their frequency.

*** Prevention Recommendations from OWASP

------------------------------------------
     PREVENTING INJECTION ATTACKS

From OWASP A03 Injection
(owasp.org/Top10/A03_2021-Injection/)

"The preferred option is to use a safe API,
 which avoids using the interpreter"

"Use positive server-side input validation.
 This is not a complete defense ..."

"For residual dynamic queries,
 escape special characters using
 the specific escape syntax
 for that interpreter."
------------------------------------------

Q: Why should using "any interpreter" be avoided?

   Because it gives the attacker power to do arbitrary things!

   Notes from OWASP A03:

   "Even when parameterized, stored procedures can still introduce
    SQL injection if PL/SQL or T-SQL concatenates queries and data
    or executes hostile data with EXECUTE IMMEDIATE or exec()."

   "SQL structures such as table names, column names, and so on
    cannot be escaped, and thus user-supplied structure names
    are dangerous.
    This is a common issue in report-writing software."

Q: Will these recommendations prevent injection attacks?

   Not necessarily.
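
   The advice to build SQL with parameterized queries instead of
   string concatenation can be illustrated with a minimal sketch.
   It is not taken from the book or from OWASP: it assumes Python's
   standard sqlite3 module and a made-up in-memory accounts table
   (the column names, the sample rows, and the helper names
   find_unsafe and find_safe are all hypothetical).
   It feeds the hostile input ' or '1'='1 to both styles of query.

```python
import sqlite3

# Hypothetical in-memory table standing in for the accounts table
# used in the OWASP scenario (schema and rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (custID TEXT, owner TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("1", "alice"), ("2", "bob"), ("3", "carol")],
)


def find_unsafe(cust_id):
    """DON'T: concatenate untrusted input into the SQL text."""
    query = "SELECT * FROM accounts WHERE custID='" + cust_id + "'"
    return conn.execute(query).fetchall()


def find_safe(cust_id):
    """DO: pass untrusted input as a bound parameter (data only)."""
    return conn.execute(
        "SELECT * FROM accounts WHERE custID=?", (cust_id,)
    ).fetchall()


hostile = "' or '1'='1"
unsafe_rows = find_unsafe(hostile)  # WHERE clause is always true
safe_rows = find_safe(hostile)      # no custID equals that string
print(len(unsafe_rows), len(safe_rows))  # prints: 3 0
```

   In the concatenated version the quote characters in the input
   change the structure of the query, so every row comes back.
   In the parameterized version the same input is only compared,
   as a string, against custID, so nothing matches.
   As the OWASP notes above say, this does not help when the
   user-supplied text is a table or column name.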