Remote debugging with LLDB

The other day I was working on a project in Xcode and was getting fed up with it crashing and just not behaving.  So I set out on a mission to figure out how to remote debug an iOS app.  The secret to it all is LLDB, the LLVM debugger.  LLDB is now the default debugger in Xcode (and has been for a while) and is a pretty powerful debugger, complete with scripting in Python and many other hidden gems.

To follow along you will need:

  • A jailbroken iDevice set up for development
  • Developer Tools (from Xcode) installed on a Mac

At a high level this approach works by running a little server on the iDevice and then connecting to it remotely from your Mac.  To begin, SSH into your iDevice and find some program of interest (such as an iOS app you may be developing).  Then start the debug server on your iDevice.

iPhone:/Applications/FieldTest.app root# /Developer/usr/bin/debugserver localhost:12345 ./FieldTest
debugserver-189 for armv7.
Listening to port 12345...

Now on your Mac, we launch LLDB and then connect to the remote session.

[dean@simba ~]$ lldb
(lldb) platform select remote-ios
  Platform: remote-ios
 Connected: no
  SDK Path: "/Users/dean/Library/Developer/Xcode/iOS DeviceSupport/6.0.1 (10A523)"
 SDK Roots: [ 0] "/Users/dean/Library/Developer/Xcode/iOS DeviceSupport/5.1.1 (9B206)"
 SDK Roots: [ 1] "/Users/dean/Library/Developer/Xcode/iOS DeviceSupport/6.0.1 (10A523)"
(lldb) process connect connect://192.168.1.20:12345
Process 2237 stopped
* thread #1: tid = 0x1603, 0x2fe7a028 dyld`_dyld_start, stop reason = signal SIGSTOP
    frame #0: 0x2fe7a028 dyld`_dyld_start
dyld`_dyld_start:
-> 0x2fe7a028:  mov    r8, sp
   0x2fe7a02c:  sub    sp, sp, #16
   0x2fe7a030:  bic    sp, sp, #7
   0x2fe7a034:  ldr    r3, [pc, #112]            ; _dyld_start + 132
(lldb)

At this point you now have a remote connection to the process being debugged and can use LLDB as you would normally.  Note that this is the exact same way Xcode connects to an app being debugged, so anything you can do in Xcode should be possible here.
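If you end up doing this a lot, the two LLDB commands can be handed to lldb on the command line instead of typed each time.  Here is a minimal Python sketch that builds such a command line.  It assumes your lldb accepts the -o (one-line command) option; the helper name lldb_remote_argv is made up, and the host and port are just the example values from the session above.

```python
def lldb_remote_argv(host, port, platform="remote-ios"):
    """Build an lldb command line that selects a platform and connects.

    Hypothetical helper: it only assembles the argument list, it does
    not launch anything.
    """
    return [
        "lldb",
        "-o", "platform select {}".format(platform),
        "-o", "process connect connect://{}:{}".format(host, port),
    ]


if __name__ == "__main__":
    # Example values taken from the session above.
    print(" ".join(lldb_remote_argv("192.168.1.20", 12345)))
```

The resulting list can be passed to subprocess.call() once debugserver is listening on the device.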

Enjoy and happy hacking!

Botnets and Butterflies - Unpacking Mariposa

Lately I’ve spent most of my time trying to finish writing my thesis, which looks at taking a visual approach to program comprehension.  As a component of the evaluation I decided it would be interesting to look at an unpacked malware sample and I ended up going with a sample of Mariposa.  The MD5 for the sample is 3e3f7d8873985de888ce320092ed99c5 and I’ve uploaded it to malware.lu.  For this post I thought it’d be fun to go through the process of unpacking the malware since it includes a couple of anti-debugging techniques.

WARNING: If you decide to follow along, do so at your OWN risk.  This is live malware, be smart about what you do.

Step 1: Use Protection

Before doing anything else, set up an ISOLATED malware analysis environment.  I’ve done this using VMs on a dedicated machine.  VMware is quite popular, but you can also use VirtualBox or pretty much any other current solution (I used KVM on a *nix box).

Step 2: Setting Up

For this post we’ll be using three tools: OllyDbg, ImpREC, and Lord PE.  If you’re not familiar with these I highly recommend checking them out.  Also, I did this using Windows XP SP3.  I know Mariposa doesn’t run on SP2 or lower, and I haven’t tested on Win7.

Step 3: Get the Party Started

Now that we’re ready to start looking at the malware, first open it up in Olly.  Upon opening the executable, Olly will do some analysis and eventually you’ll be presented with the CPU Window open to the entry point.

OllyDbg - Mariposa Open

At this point we can begin working our way through the code.

The first batch of code (up until address 0x41D4AF) isn’t very exciting; in fact the authors throw in a bunch of SSE/FPU code to intentionally try to catch us off guard.  You’ll then see a jump (through a JMP EAX) to the next block of code.

chunk_2

The next chunk of code starts out by placing the value 0x41D000 in ECX and then XORs every byte up until 0x41D4C0 with the value 0xCA1A51E5.  This is the first decryption loop that unpacks the next stage of the unpacker.  After this we jump to 0x41D047, which at first glance looks like a war zone!
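To make the decryption loop concrete, here is a rough Python sketch of the same idea applied to a copy of the bytes.  Since the key is a 32-bit value, the sketch assumes it is applied four bytes at a time in little-endian order; that detail is my assumption and not something read off the sample.  The function name is made up.

```python
import struct

KEY = 0xCA1A51E5  # key value from the loop described above


def xor_decrypt(buf, key=KEY):
    """XOR a buffer with a 32-bit key, one little-endian dword at a time.

    Assumption: the key is applied per dword.  Any trailing bytes that
    do not fill a whole dword are left untouched.
    """
    out = bytearray(buf)
    whole = len(out) - len(out) % 4
    for off in range(0, whole, 4):
        (word,) = struct.unpack_from("<I", out, off)
        struct.pack_into("<I", out, off, word ^ key)
    return bytes(out)
```

Because XOR is its own inverse, running the function twice gives back the original bytes, which is a handy sanity check.  The region 0x41D000 to 0x41D4C0 is 0x4C0 bytes long, so it divides evenly into dwords.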

Step 4: Anti-Everything

This next region of code is where a lot of the fun happens.  When you first jump to it, Olly doesn’t display code; just the bytes.  Have no fear though, you can beat it! Start by first telling Olly to analyze the region of code (Ctrl+A), then scroll down a bit and you should see something similar to the following screenshot.

chunk_3

Next, highlight the code in the range 0x41D047 to 0x41D076 and then tell Olly to forget about the analysis it just did (Ctrl+BkSp).

chunk_4

Once you’ve done this you’ll see some code appear.  The first jump skips over a bunch of invalid instructions (opcode 0xFF) that are intended to try and throw off the debugger.  Technically they did because we had to help Olly figure out what it was looking at.

chunk_5

Now follow the JMP to 0x41D057.  The next few lines push an address on the stack, clear EAX, push FS:[0], then store the stack pointer at FS:[0].  What does all this do?  Try to trick your debugger, of course!  The address pushed on the stack is used as an exception handler (read about Structured Exception Handling) which in normal execution would be triggered by trying to evaluate an invalid opcode.  However, when we run under a debugger it will catch the exception and we’re stuck.  We can avoid this by patching the binary so that the byte at 0x41D069 is a NOP (opcode 0x90).

The next issue we encounter is more anti-disassembly trickery.  Looking at the hex values for the instruction listed at 0x41D06A we see they are 0xFF2B and the disassembled command is JMP FAR DS:[EBX].  But the address DS:[EBX] is not code at this point, so that’s clearly wrong.  The trick here is to help Olly by setting the 0xFF byte at 0x41D06A to another NOP.  Doing this, you will see that Mariposa once again tries to trick us using the invalid opcode trap.

chunk_6

The next issue we encounter is the three instructions beginning at 0x41D078.  The opcode 0xFF we know is invalid, the STC (set carry) is fine, and the ADD ESP, 7C8 is valid but will result in an access violation if executed.  The easiest solution here is to once again set everything to NOPs.
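In Olly the NOP patches are done by hand, but the same operation is easy to express against a copy of the memory region.  The following is only a sketch: the helper name and the idea of working on a bytearray snapshot are mine, and `base` is whatever address the first byte of your snapshot corresponds to.

```python
NOP = 0x90  # x86 single-byte no-op


def nop_range(image, base, start, end):
    """Overwrite the bytes at addresses start..end (inclusive) with NOPs.

    `image` is a bytearray snapshot whose first byte lives at address
    `base`.  The snapshot is modified in place and also returned.
    """
    lo, hi = start - base, end - base
    if lo < 0 or hi >= len(image) or lo > hi:
        raise ValueError("address range falls outside the snapshot")
    image[lo:hi + 1] = bytes([NOP]) * (hi - lo + 1)
    return image
```

For example, nop_range(snapshot, 0x41D000, 0x41D069, 0x41D06A) would cover the two single-byte patches described above in one call.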

After this the malware then goes and loads the Kernel32.dll library followed by requesting the address of the VirtualProtect() function.  With the address of VirtualProtect() in EAX, the malware then sets 0x1D000 bytes with base at 0x400000 to be EXECUTE_READWRITE.

chunk_7

Following this, a call is made to find the address of OutputDebugStringA().  This function is used to display a string in the debugger and will only return non-zero if successful.  Since we see that an address is pushed on the stack and then a RETN follows immediately after (and the address is just the instruction), we can actually just set the region 0x41D0F3 to 0x41D101 to NOPs.  For the next part, once again use the analyze/unanalyze trick to see the code and jump down to 0x41D113.

chunk_8

Now we’re at the last anti-debugging trick!  This last trick tries to catch us with the trap flag (in EFLAGS) set.  The easiest way around it is to execute the code until 0x41D121, so set a breakpoint there and hit F9 to run the code.  Notice that here, if the trap flag were set, we’d be adding 0x100 to the address 0x41D128, which would be invalid.  Since we ran the code, a zero is added and we’re all good.
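The trap flag is bit 8 of EFLAGS, which is where the 0x100 comes from.  As a rough model of the check (not a transcription of the sample’s instructions, and with a made-up function name), the flag is masked out of EFLAGS and added to the address:

```python
TRAP_FLAG = 0x100  # bit 8 of EFLAGS, set while single-stepping


def checked_target(eflags, base=0x41D128):
    """Model of the trick: the trap flag bit is added to the base address.

    With the flag clear the result is `base`; with it set the result is
    off by 0x100, which is the invalid case described above.
    """
    return base + (eflags & TRAP_FLAG)
```

Running the code (F9) instead of single-stepping keeps the flag clear, so the address comes out unchanged.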

Step 5: Unpack Everything

The next batch of code gets right to work on unpacking everything for us.  All we need to do is run through and let it work its magic.  Make a note of the regions that are unpacked because one of those is the actual IAT.  Finally, there is a giant loop starting at 0x41D238 that reconstructs the IAT for us.  Set a breakpoint at 0x41D2DB and run to there.

chunk_9

Step 6: Finding The OEP

The last thing we need to do in Olly is identify the original entry point (OEP).  We need this for when we reconstruct the executable.  Generally speaking, the OEP can be identified by looking for either a JMP to an address in a register or a RETN to an address pushed on the stack.  In our case we see the value 0x4100A2 is placed in EDI and then jumped to.  This is the OEP.

Step 7: Dumping the Process

Next we need to dump the process so that we can reconstruct the executable.  Leave Olly running and open up Lord PE.  In Lord PE select the process from the top pane, then right click and choose “dump full”.

chunk_10

This will let you save a copy of the process to file for later use.

Step 8: Rebuilding The IAT

The last step before we can try running our unpacked malware is to rebuild the IAT.  The IAT is used at runtime to resolve addresses to functions in shared libraries.  We can do this using the ImpREC (Import REConstructor) tool.  Open up ImpREC and select the running process (same as the last step) from the drop down.

chunk_11

Here is where we need the OEP and the address of the IAT.  As mentioned earlier, the OEP is at 0x4100A2, which is a relative virtual address (RVA) of 0x100A2.  Also, you should have kept track of the decryption loops and you’ll notice the IAT is located at RVA 0x16000 with size 0xEC.  Fill this in, then hit “Get Imports”.  Now just select “Fix Dump” and point it to the dumped copy from Lord PE.  If all went well, at this point you should have successfully unpacked the malware.
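The RVA is simply the virtual address minus the image base.  A tiny sketch, assuming the image base of 0x400000 that showed up in the VirtualProtect() call earlier (the helper name is hypothetical):

```python
IMAGE_BASE = 0x400000  # assumed from the VirtualProtect() base address


def va_to_rva(va, image_base=IMAGE_BASE):
    """Convert a virtual address to an RVA relative to the image base."""
    if va < image_base:
        raise ValueError("address is below the image base")
    return va - image_base


if __name__ == "__main__":
    print(hex(va_to_rva(0x4100A2)))  # OEP from Step 6
```

The same subtraction turns an IAT address of 0x416000 into the RVA 0x16000 that ImpREC asks for.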

Step 9: Testing The Malware

The easiest way to see if it worked is to run it.  One of the things that this malware does is try to connect to a C&C server (it’s a botnet after all!).  So, if you have Wireshark available, open it up then run your malware.  You should see something like the following in Wireshark if you were successful.

chunk_12

There you have it folks, a crash course on unpacking malware!  For me one of the most fun parts of unpacking is solving the anti-debugging mechanisms.  If you’d like more info on the various mechanisms out there, here are a few resources:

Until next time, stay protected; use a VM :)

Fuzzy iOS Messages!

A while ago I came across a post about fuzzing with a new data flow language called Pythonect.  When I read about it I thought it sounded like a pretty nifty language, so I decided to try using it to fuzz the iMessage interface in the iOS Messages app.

The first part of this task is to come up with a way to send messages to an iOS device using the iMessage service.  Luckily the new Messages app on OS X 10.8 has support for AppleScript and you can send messages through it.

#!/usr/bin/osascript

on run argv
    set theMessage to (item 1 of argv)

    tell application "Messages"
         send theMessage to buddy "BUDDY_NAME"
    end tell
end run

In this script we tell the Messages application to send a message that was given as an argument to the buddy BUDDY_NAME.  When you use it be sure to replace BUDDY_NAME with the correct buddy name you are using.  Also, I saved the script and named it send_msg.

From here it’s quite easy to use Pythonect to do some fuzzing.  For example, the following script will send groups of 4 and 8 A’s, B’s, C’s, and D’s.

['A', 'B', 'C', 'D'] -> [_ * n for n in [4, 8]] -> os.system("./send_msg " + _)
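If the Pythonect syntax is unfamiliar, the data flow above can be read as two nested loops.  Here is a plain Python sketch of just the payload generation; the function name is made up, and Pythonect may process items concurrently, so the ordering it produces could differ from this.

```python
def payloads(chars=("A", "B", "C", "D"), lengths=(4, 8)):
    """Yield each character repeated to each of the given lengths."""
    for char in chars:
        for length in lengths:
            yield char * length


if __name__ == "__main__":
    for message in payloads():
        print(message)
```

Each generated string is what ends up appended to the "./send_msg " command in the final stage of the pipeline.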

So what else can we do with Pythonect?  Well, for starters you can increase the number of characters and messages sent, effectively DoSing the device.  You could also mix and match characters to see what happens.

I haven’t had much time to play with this but I’ve found that running the following command seems to crash the device.

['A', 'B', 'C', 'D', '*', '+', '\\', '/'] -> [_ * n for n in [100, 200, 500, 1000, 2000, 5000, 10000, 20000, 50000, 100000, 200000, 500000, 1000000, 2000000, 5000000]] -> os.system("./send_msg " + _)

And, this one causes the actual name of the app to be displayed.

['*'] -> os.system("./send_msg " + _)
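One thing worth keeping in mind when interpreting results like these: os.system() hands the whole command string to the shell, so characters such as * and \ are interpreted by the shell before send_msg ever sees them.  I have not checked whether that accounts for the behaviour above.  The sketch below (Python 3, with made-up helper names) shows two ways to make sure the message reaches the script byte for byte.

```python
import shlex


def send_argv(message, script="./send_msg"):
    """Argument list for subprocess.call(); no shell is involved."""
    return [script, message]


def send_shell(message, script="./send_msg"):
    """Shell command string with the message quoted for sh."""
    return script + " " + shlex.quote(message)


if __name__ == "__main__":
    print(send_argv("*"))
    print(send_shell("*"))
```

Comparing what arrives on the device with and without quoting would help separate shell effects from anything the Messages app itself is doing.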

So there is clearly something going on here, definitely stay tuned for what lurks within!

Rails - Hex Rays Plugin Contest 2012

This year I decided to try submitting to the annual Hex Rays plugin contest.  I’m pleased to announce my plugin, Rails.

Rails is a plugin that simplifies the task of working with multiple instances of IDA Pro.  There are three main advantages to Rails.  First, you won’t go insane trying to work with several instances at once.  Second, your project databases remain uncluttered from the addition of linked libraries and other bits of code.  And third, you don’t need to continuously reverse the same libraries over and over again.

The plugin is pretty straightforward to use.  Once you’ve opened up a database, just go to Edit->Plugins->Rails and enable it.  This will cause a new panel to appear in IDA which lists any other open instances of IDA that are using Rails as well as output from Rails.  With it you can select a function and then see the associated comments or jump to its definition wherever that may be.  Another handy feature is the ability to see the list of open instances and just jump to them by double clicking their name in the list.  For a demo check out the video below.

If you’d like to work with the code it is available on GitHub at https://github.com/lightbulbone/rails.  Note that the plugin currently only builds on Mac OS X; however, I will (very soon) make a build script for Windows.

I Spy ChatKit!

In the last post I talked about starting to investigate MobileSMS (Messages app on iOS) and concluded with the mystery of the missing ChatKit.  I’m pleased to say that ChatKit wasn’t missing, it’s just hiding!

Admittedly I spent way too much time staring at the ChatKit.framework folder wondering where that stupid binary went.  Everything I’d read (mainly otool and IDA) told me this framework was being loaded but it wasn’t anywhere to be found on the file system.  I even went as far as writing a little script that identified every binary on the device that linked to this framework; it had to be somewhere.  Needless to say, I was pretty confused.
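For anyone who wants to try the same kind of search, here is a rough Python sketch.  It is not the script I used; it is a simplified stand-in that walks a directory tree, keeps files that start with a Mach-O (or fat) magic number, and checks whether the framework path appears anywhere in the file.  The function name and the brute-force substring search are both just for illustration.

```python
import os

# First four bytes of thin 32-bit Mach-O files (either byte order)
# and of fat binaries.
MACHO_MAGICS = {b"\xce\xfa\xed\xfe", b"\xfe\xed\xfa\xce", b"\xca\xfe\xba\xbe"}


def find_linkers(root, needle):
    """Return sorted paths of Mach-O files under root that contain needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # unreadable file or dangling symlink
            if data[:4] in MACHO_MAGICS and needle in data:
                hits.append(path)
    return sorted(hits)
```

A call such as find_linkers("/Applications", b"ChatKit.framework/ChatKit") would list the candidates; a real tool would parse the load commands rather than search raw bytes.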

Well, onward and upward!  If you run MobileSMS under GDB and have a peek at the loaded libraries you will indeed see that ChatKit is there with an address and everything.

lux0r:/Applications/MobileSMS.app root# gdb ./MobileSMS 
...
(gdb) b UIApplicationMain
Function "UIApplicationMain" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (UIApplicationMain) pending.
(gdb) r
...
Breakpoint 1, 0x31f988a6 in UIApplicationMain ()
(gdb) info sharedlibrary
...
Num Basename                Type Address         Reason | | Source     
  | |                          | |                    | | | |          
  1 dyld                       - 0x2fe00000        dyld Y Y /usr/lib/dyld at 0x2fe00000 (offset 0x0) with prefix "__dyld_"
  2 MobileSMS                  - 0x1000            exec Y Y /private/var/stash/Applications/MobileSMS.app/MobileSMS (offset 0x0)
  3 Foundation                 F 0x37dff000        dyld Y Y /System/Library/Frameworks/Foundation.framework/Foundation at 0x37dff000 (offset 0x48b000)
                                               (objfile is) [memory object "/System/Library/Frameworks/Foundation.framework/Foundation" at 0x37dff000]
  4 UIKit                      F 0x31f67000        dyld Y Y /System/Library/Frameworks/UIKit.framework/UIKit at 0x31f67000 (offset 0x48b000)
                                               (objfile is) [memory object "/System/Library/Frameworks/UIKit.framework/UIKit" at 0x31f67000]
  5 IMDPersistence             F 0x377c2000        dyld Y Y /System/Library/PrivateFrameworks/IMCore.framework/Frameworks/IMDPersistence.framework/IMDPersistence at 0x377c2000 (offset 0x48b000)
                                               (objfile is) [memory object "/System/Library/PrivateFrameworks/IMCore.framework/Frameworks/IMDPersistence.framework/IMDPersistence" at 0x377c2000]
  6 AddressBook                F 0x36aa5000        dyld Y Y /System/Library/Frameworks/AddressBook.framework/AddressBook at 0x36aa5000 (offset 0x48b000)
                                               (objfile is) [memory object "/System/Library/Frameworks/AddressBook.framework/AddressBook" at 0x36aa5000]
  7 AddressBookUI              F 0x365e2000        dyld Y Y /System/Library/Frameworks/AddressBookUI.framework/AddressBookUI at 0x365e2000 (offset 0x48b000)
                                               (objfile is) [memory object "/System/Library/Frameworks/AddressBookUI.framework/AddressBookUI" at 0x365e2000]
  8 ChatKit                    F 0x32d3a000        dyld Y Y /System/Library/PrivateFrameworks/ChatKit.framework/ChatKit at 0x32d3a000 (offset 0x48b000)
                                               (objfile is) [memory object "/System/Library/PrivateFrameworks/ChatKit.framework/ChatKit" at 0x32d3a000]
...    

And if you have a peek at the address listed (0x32d3a000) you’ll even find a valid Mach-O header.

(gdb) x /7w 0x32d3a000
0x32d3a000:	0xfeedface	0x0000000c	0x00000009	0x00000006
0x32d3a010:	0x00000030	0x0000160c	0x801000b5

This thing was definitely coming from somewhere and it wasn’t the “normal” place in the file system.

After digging through the various plist files hoping to find a clue, and running my script to find binaries linking against ChatKit many times, I decided it was time to try and catch the loader in the act.  Once again load up MobileSMS in GDB, but this time before starting the program set a breakpoint on dlopen(), the function responsible for opening dynamic libraries.

Now run the program and watch the first parameter to dlopen().  A quick way to do this is to attach a command in GDB to the breakpoint.

lux0r:/Applications/MobileSMS.app root# gdb -q ./MobileSMS 
Reading symbols for shared libraries . done
(gdb) b dlopen
Function "dlopen" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (dlopen) pending.
(gdb) command 1
Type commands for when breakpoint 1 is hit, one per line.
End with a line saying just "end".
>x /s $r0
>end

This will cause GDB to interpret the value in register R0 (first parameter of a function) as a pointer to a string and print the corresponding string.  Continue along until you see the path to ChatKit printed out.  This is great!  The framework is loaded, but seriously, where is this thing coming from?  While I previously knew about the existence of the dlopen() function, I’ve never really used it myself so I didn’t know much about the second parameter or how it works.

Well, it turns out the second parameter to dlopen() is used to tell it how to proceed.  Generally speaking, the second parameter of dlopen() is used to convey whether or not to use lazy binding and how symbols from the library should be exported.  It turns out that dlopen() can also double as a mechanism to check if a library has been loaded and, if so, get a handle to it (check out the man page).  You do this by specifying RTLD_NOLOAD.

So, back in our GDB session print out that second parameter (value in R1) passed to dlopen() for ChatKit.

Breakpoint 1, 0x3162957c in dlopen ()
0x32ca67c4:	 "/System/Library/PrivateFrameworks/ChatKit.framework/ChatKit"
(gdb) p /x $r1
$1 = 0x10

Alright, we have 0x10… great! What’s that mean?  Time to go source diving! From the man page for dlopen() we know the names given to the values we can pass in.  So it follows that some combination of those values should equate to 0x10.  And surely enough that is true.

First, head on over to opensource.apple.com and grab the latest version of the dyld package.  Note that while that link is to the packages for Mac OS 10.8 the implementations used in iOS are very similar (if not the same). Once you’ve got that unpacked, do a search for one of the symbols listed in the dlopen() man page (I chose RTLD_LAZY).

[dean@simba dyld-210.2.3]$ grep -rn RTLD_LAZY . | grep -v unit
...
./include/dlfcn.h:65:#define RTLD_LAZY	0x1
...

So we know the symbols accepted by dlopen() are listed in dlfcn.h which isn’t that surprising since that is the file you need to include for dlopen().

#if !defined(_POSIX_C_SOURCE) || defined(_DARWIN_C_SOURCE)
#define RTLD_NOLOAD     0x10
#define RTLD_NODELETE   0x80
#define RTLD_FIRST      0x100   /* Mac OS X 10.5 and later */

/*
 * Special handle arguments for dlsym().
 */
#define RTLD_NEXT       ((void *) -1)   /* Search subsequent objects. */
#define RTLD_DEFAULT    ((void *) -2)   /* Use default search algorithm. */
#define RTLD_SELF       ((void *) -3)   /* Search this and subsequent objects (Mac OS X 10.5 and later) */
#define RTLD_MAIN_ONLY  ((void *) -5)   /* Search main executable only (Mac OS X 10.5 and later) */
#endif /* not POSIX */

This chunk of code is only available in non-POSIX environments, which MobileSMS is! So here we see that the value 0x10 equates to the symbol RTLD_NOLOAD.  Which the man page says means:

RTLD_NOLOAD   —  The specified image is not loaded.  However, a valid handle is returned if the image already exists in the process. This provides a way to query if an image is already loaded.  The handle returned is ref-counted, so you eventually need a corresponding call to dlclose()

Alright, we know for a fact that MobileSMS is not actually loading ChatKit, it’s just checking to make sure it already has been loaded!
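Since the mode argument is a bit mask, decoding it can be automated.  Here is a small Python sketch with a made-up function name.  The values for RTLD_LAZY, RTLD_NOLOAD, RTLD_NODELETE and RTLD_FIRST come from the header excerpts above; the values for RTLD_NOW, RTLD_LOCAL and RTLD_GLOBAL are from my recollection of the same dlfcn.h, so double check them against your copy.

```python
RTLD_FLAGS = {
    0x1: "RTLD_LAZY",
    0x2: "RTLD_NOW",       # assumed, not shown in the excerpt above
    0x4: "RTLD_LOCAL",     # assumed, not shown in the excerpt above
    0x8: "RTLD_GLOBAL",    # assumed, not shown in the excerpt above
    0x10: "RTLD_NOLOAD",
    0x80: "RTLD_NODELETE",
    0x100: "RTLD_FIRST",
}


def decode_dlopen_mode(mode):
    """Return the names of the flag bits set in a dlopen() mode value."""
    names = [name for bit, name in sorted(RTLD_FLAGS.items()) if mode & bit]
    leftover = mode & ~sum(RTLD_FLAGS)  # bits with no known name
    if leftover:
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    print(decode_dlopen_mode(0x10))  # the value seen in R1
```

For the value we saw in R1, the only bit set is the one for RTLD_NOLOAD.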

To solve this mystery of the missing ChatKit we need to consult the implementation of dlopen() which can be found in the dyld source. Rather than walking through the entire dyld codebase I’ll highlight the important parts.

When we call dlopen() using RTLD_NOLOAD the loader will essentially just verify that the specified image has been loaded and, if so, return a handle to it.  To do this dyld goes through a series of phases and at each one checks to see if some permutation of the given path name exists.  Eventually it gets to the point where it will decide the specified path must be part of the dyld cache.

The dyld cache is present on iOS and contains a variety of images in it.  You can find it at /System/Library/Caches/com.apple.dyld/dyld_shared_cache_armv7 on your device.  The cache is loaded early on in the initialization of dyld.

DYLD Cache (iOS ARMv7)

To verify that ChatKit was in this cache I opened it up in an awesome program called Synalyze It and created a smaller grammar to parse it.  You can see this in the screenshot above.
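If you don’t have Synalyze It handy, a few lines of Python can list the images in the cache as well.  The sketch below is based on my understanding of the cache layout from dyld_cache_format.h in the dyld source: a header that starts with a 16-byte magic followed by mappingOffset, mappingCount, imagesOffset and imagesCount, and 32-byte image records holding an address and a file offset to the path.  Treat that layout as an assumption and check it against the dyld version you are looking at; the function name is made up.

```python
import struct

# Assumed layouts (little-endian), see the caveat above.
HEADER = struct.Struct("<16sIIII")    # magic, mappingOffset, mappingCount,
                                      # imagesOffset, imagesCount
IMAGE_INFO = struct.Struct("<QQQII")  # address, modTime, inode,
                                      # pathFileOffset, pad


def list_cached_images(cache):
    """Return (load address, path) for every image listed in the cache."""
    magic, _moff, _mcount, img_off, img_count = HEADER.unpack_from(cache, 0)
    if not magic.startswith(b"dyld_v1"):
        raise ValueError("not a dyld shared cache")
    images = []
    for index in range(img_count):
        address, _mtime, _inode, path_off, _pad = IMAGE_INFO.unpack_from(
            cache, img_off + index * IMAGE_INFO.size)
        end = cache.index(b"\0", path_off)  # paths are NUL-terminated
        images.append((address, cache[path_off:end].decode("ascii")))
    return images
```

Reading the cache file into memory and filtering the result for "ChatKit" should then show the framework together with its load address.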

So there we have it folks! The ChatKit wasn’t missing after all, it was just being loaded from the cache rather than through the filesystem.

Reversing iOS Applications (Part 2)

In the first post of this series we talked about how to get an app from the App Store into a reversible state.  Essentially we had to run the app inside a debugger and dump the contents of memory to a file which was then used to patch the original (encrypted) binary.

After writing that post I started to work through Kik in IDA when I got a text message from a friend.  It then occurred to me, why bother with Kik when I can (in theory) target Apple’s iMessage service?  So I’ve swapped out Kik and replaced it with the default Messages app found on each and every iOS device.

With the new app selected I dutifully copied it from my iPod onto my machine and promptly ran strings expecting to see a bunch of gibberish returned.  To my surprise, there was human readable text!

[dean@simba MobileSMS]$ strings MobileSMS
...
message_guid
chat_identifier
IMDSpotlight
Unable to find a conversation for Message [%@] found row ID [%d] and group ID [%@]
No Message GUID in Spotlight URL [%@].  I have no idea what to show you.
Asked to _showSMSConversationAndMessageForSearchURL: [%@]
Asked to showConversationAndMessageForSearchURL: [%@]
MailAutosaveIdentifier
...

Well, that’s pretty cool!  As you can see we clearly have readable strings containing various debug messages along with all the info required for Objective-C.  For the uninitiated (like myself), it turns out that the default apps shipping on iDevices are not encrypted.  I haven’t looked into why this is the case, but I suspect it has something to do with how code signing is implemented.  So we have an unencrypted app that loads into IDA without any complaining whatsoever.

Before digging deeper using IDA it is a good idea to see what libraries and frameworks this app uses.  As usual, this can be done using otool as follows.

[dean@simba MobileSMS]$ otool -VL MobileSMS
MobileSMS:
...
	/System/Library/PrivateFrameworks/IMCore.framework/Frameworks/IMDPersistence.framework/IMDPersistence (compatibility version 1.0.0, current version 800.0.0)
	time stamp 2 Wed Dec 31 16:00:02 1969
	/System/Library/PrivateFrameworks/ChatKit.framework/ChatKit (compatibility version 1.0.0, current version 1.0.0)
	time stamp 2 Wed Dec 31 16:00:02 1969
...
	/System/Library/PrivateFrameworks/Conference.framework/Conference (compatibility version 1.0.0, current version 1.0.0)
	time stamp 2 Wed Dec 31 16:00:02 1969
	/System/Library/PrivateFrameworks/IMCore.framework/IMCore (compatibility version 1.0.0, current version 800.0.0)
	time stamp 2 Wed Dec 31 16:00:02 1969
	/System/Library/PrivateFrameworks/FTClientServices.framework/FTClientServices (compatibility version 1.0.0, current version 800.0.0)
	time stamp 2 Wed Dec 31 16:00:02 1969
...

Among the usual suspects there are a few interestingly named frameworks linked.  The one that caught my eye the most was ChatKit.  When I looked at Messages in IDA I also found what seemed like an unusually high number of references to ChatKit as well and after a bit of digging I started to suspect that ChatKit is actually the framework that is responsible for sending/receiving messages.  With that suspicion let’s dig a little into ChatKit.

Once again, I dutifully got the path to ChatKit from otool and went to copy it to my machine.  

lux0r:~ root# scp /System/Library/PrivateFrameworks/ChatKit.framework/ChatKit dean@192.168.1.11:~/
Password:
/System/Library/PrivateFrameworks/ChatKit.framework/ChatKit: No such file or directory

No such file or directory? What the heck? Upon convincing myself that I do indeed have the correct path I was pretty stumped.  Otool clearly states that MobileSMS (the Messages app) links against this framework and IDA shows me the exact same information.  Even some of the debug strings gathered earlier reference it!  Needless to say this definitely ticked me off.  All I wanted to do was disassemble and inspect ChatKit in peace, but iOS wasn’t having any of that.

Stay tuned for the next instalment of this series where we solve the case of the missing ChatKit!

Beginners Fun With iMessage

This is a quick detour from my series on reversing iPhone applications.  I’ve been looking at the Messages app and just found out that there are hidden settings!

The settings themselves are pretty fun (and super helpful) since they allow you to enable logging of iMessages.  You can access them from within the Settings app (under Messages).

iMessageDebug

You don’t need a jailbroken device to activate these; however, I believe you need one in order to access the logs (haven’t confirmed this yet).  The logs are found at:

/private/var/mobile/Library/Logs/CrashReporter/iMessage

If you want to try these out, all you need to do is install the configuration file below.  Installation is as simple as clicking the link and letting Safari do the rest!

iMessageDebug.mobileconfig

Lastly, note that similar configuration files exist for other applications that come with the device.

Reversing iOS Applications (Part 1)

Recently I acquired a 4th gen iPod Touch for reversing, so before tackling something seemingly impossible I thought I’d start with reversing an application.  In this post I’m going to focus on what I thought would be super easy: loading an application into IDA Pro.  The app I chose to play with is Kik (http://www.kik.com/), mostly because it looked interesting and I’d never used it before.

I’m pretty new to reversing iOS (spent way more time on OS X) and up until now had done very little reading about the security features implemented.  So for some, all of what I’m about to say may seem obvious, but for us people new to iOS it’s probably pretty helpful.

Kik (Encrypted)

Figure 1: Opening Kik in IDA Pro

Since our goal is to get a reversible app into IDA Pro the first step is to acquire the binary.  Originally, being super naive, I just downloaded the app in iTunes and opened it up in IDA Pro.  Well, that didn’t work so well.  Looking at Figure 1 you’ll see that the code produced by IDA is so clearly wrong that we know we need to take a different approach.  Not only this, but when opening the file IDA will tell you it is encrypted and that its output is probably going to be useless.

Alright, so it turns out Apple has decided to encrypt the binary.  

I didn’t read much about the encryption they use, but apparently it is some variant of FairPlay.  In reality though, since I have a handy-dandy jailbroken iPod Touch sitting right here, it doesn’t matter.  Apple was kind enough to give us a nicely working copy of GDB and Cydia was kind enough to give us access to the device over SSH.  Note that the version of GDB available in Cydia is busted; for instructions on how to get a working one installed check out http://pod2g-ios.blogspot.ca/2012/02/working-gnu-debugger-on-ios-43.html.  Another super useful utility is otool, available through Cydia.

At this point we will start by statically analyzing the Kik binary a bit.  First, let’s use otool to see what architectures are in the binary (remember a Mach-O file can be what is known as a “fat binary”).

lux0r:/private/var/mobile/Applications/90E2C7FC-AD60-4F6B-940D-1EC8CC198560/Kik.app root# otool -arch all -Vh Kik
Kik (architecture armv6):
Mach header
      magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
   MH_MAGIC     ARM         V6  0x00     EXECUTE    32       3652   NOUNDEFS DYLDLINK TWOLEVEL
Kik (architecture cputype (12) cpusubtype (9)):
Mach header
      magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
   MH_MAGIC     ARM          9  0x00     EXECUTE    32       3652   NOUNDEFS DYLDLINK TWOLEVEL

Here we see that there are two binaries included in this fat binary: one for ARMv6, and one for ARMv7.  We’re interested in the ARMv7 stuff, but the Kik developers have apparently also included support for older devices.
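For reference, the fat header that otool is reading here is simple enough to parse by hand: a big-endian magic number and slice count, followed by one 20-byte record per architecture.  The Python below is a minimal sketch of that with a made-up function name; it ignores the 64-bit fat variant.

```python
import struct

FAT_MAGIC = 0xCAFEBABE         # stored big-endian on disk
CPU_SUBTYPE_MASK = 0x00FFFFFF  # top byte carries capability bits


def parse_fat_header(data):
    """Return one dict per architecture slice in a fat Mach-O file."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat Mach-O file")
    slices = []
    for index in range(count):
        cputype, cpusubtype, offset, size, _align = struct.unpack_from(
            ">5I", data, 8 + 20 * index)
        slices.append({
            "cputype": cputype,                          # 12 is ARM
            "cpusubtype": cpusubtype & CPU_SUBTYPE_MASK,  # 6 is v6, 9 is v7
            "offset": offset,                             # slice start in file
            "size": size,
        })
    return slices
```

The offset field matters later on: offsets reported inside a slice, such as cryptoff, are relative to the start of that slice rather than to the start of the fat file.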

Next we need to figure out what is actually encrypted in the binary.  For a full discussion on the Mach-O format check out Apple’s article, but basically the file is made up of three parts: a header, a list of load commands, and then the data.  To find out what part of the binary is encrypted we can use otool to print out the load commands and look at the LC_ENCRYPTION_INFO entry.

lux0r:/private/var/mobile/Applications/90E2C7FC-AD60-4F6B-940D-1EC8CC198560/Kik.app root# otool -arch all -Vl Kik | grep -A5 LC_ENCRYP
...
          cmd LC_ENCRYPTION_INFO
      cmdsize 20
 cryptoff  4096
 cryptsize 724992
 cryptid   1
...

This entry is telling us a couple of important pieces of information.  First, the cryptoff field is telling us that the first byte of encrypted data is 4096 bytes into the file.  Second, the cryptsize field is telling us that 724992 bytes of the file (starting at cryptoff) are encrypted.  And third, cryptid is telling us that the file is encrypted (we’ll come back to this shortly).
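If you want these numbers in a script rather than reading them off the screen, the otool output is easy to scrape.  A small sketch, assuming the text layout shown above (the function name is made up):

```python
import re

FIELD = re.compile(r"(cryptoff|cryptsize|cryptid)\s+(\d+)")


def encryption_info(otool_output):
    """Extract cryptoff, cryptsize and cryptid from `otool -l` text.

    Returns one dict per LC_ENCRYPTION_INFO command found, so a fat
    binary yields one entry per architecture.
    """
    infos = []
    for block in otool_output.split("LC_ENCRYPTION_INFO")[1:]:
        infos.append({name: int(value) for name, value in FIELD.findall(block)})
    return infos
```

Feeding it the output above gives a cryptoff of 4096, a cryptsize of 724992 and a cryptid of 1.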

So now we know what portion of the binary is encrypted, how do we decrypt it?

As I said earlier, I didn’t look much into what encryption is used because it really doesn’t matter.  The wonderful thing about computers is that in order for something meaningful to occur the CPU must receive unencrypted instructions, so all we need to do is let the iPod do its thing and dump the instructions.  This decryption occurs during the load phase, so once the binary is mapped into memory it has been decrypted.  Therefore we can just use GDB to dump the memory to a file and patch the binary accordingly.

In order to dump the memory we first need to know what address range we should be grabbing.  Since we know that the encryption starts 4096 bytes into the file we just need to figure out where that is in memory and then work from there.  We can find this out by inspecting a couple other load commands present in the binary.

lux0r:/private/var/mobile/Applications/90E2C7FC-AD60-4F6B-940D-1EC8CC198560/Kik.app root# otool -arch all -Vl Kik
...
Kik (architecture cputype (12) cpusubtype (9)):
Load command 0
      cmd LC_SEGMENT
  cmdsize 56
  segname __PAGEZERO
   vmaddr 0x00000000
   vmsize 0x00001000
  fileoff 0
 filesize 0
  maxprot ---
 initprot ---
   nsects 0
    flags (none)
Load command 1
      cmd LC_SEGMENT
  cmdsize 464
  segname __TEXT
   vmaddr 0x00001000
   vmsize 0x000b2000
  fileoff 0
 filesize 729088
  maxprot r-x
 initprot r-x
   nsects 6
    flags (none)
...

So we see in the LC_SEGMENT for the __TEXT segment that file offset 0 (the start of the file) is mapped to virtual address 0x1000.  For the curious, if you look at the load command for __PAGEZERO you’ll see why __TEXT starts where it does.

Alright, this is great! We now know that the beginning of the file is mapped to 0x1000 and that the decrypted data we’re after starts at 0x2000 (remember it was 4096 bytes into the file).  Now comes the easy part: fire up GDB and dump the memory!

lux0r:/private/var/mobile/Applications/90E2C7FC-AD60-4F6B-940D-1EC8CC198560/Kik.app root# gdb -quiet ./Kik
Reading symbols for shared libraries .. done
(gdb) b UIApplicationMain
Function "UIApplicationMain" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (UIApplicationMain) pending.
(gdb) r
Starting program: /private/var/mobile/Applications/90E2C7FC-AD60-4F6B-940D-1EC8CC198560/Kik.app/Kik 
Removing symbols for unused shared libraries . done
Reading symbols for shared libraries ...+................................................................................................................................................ done
Breakpoint 1 at 0x31d348a6
Pending breakpoint 1 - "UIApplicationMain" resolved

Breakpoint 1, 0x31d348a6 in UIApplicationMain ()
(gdb) dump binary memory mem_dump.kik 0x2000 (0x2000 + 724992)
(gdb) q
The program is running.  Exit anyway? (y or n) y
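
As a sanity check on the range passed to dump binary memory, the arithmetic can be written out.  This is just a sketch of the reasoning above; it assumes the image is loaded at the address named in its load commands (no slide), and the helper name encrypted_vm_range is made up for illustration.

```python
def encrypted_vm_range(vmaddr, fileoff, cryptoff, cryptsize):
    """Map the encrypted file range onto virtual addresses.

    vmaddr/fileoff come from the __TEXT LC_SEGMENT; cryptoff/cryptsize
    come from LC_ENCRYPTION_INFO.  Assumes the segment is not slid.
    """
    start = vmaddr + (cryptoff - fileoff)
    return start, start + cryptsize


start, end = encrypted_vm_range(vmaddr=0x1000, fileoff=0,
                                cryptoff=4096, cryptsize=724992)
print(hex(start), hex(end))  # prints: 0x2000 0xb3000
```

Note that the end address, 0xb3000, is exactly vmaddr + vmsize for __TEXT (0x1000 + 0xb2000), so the encrypted region runs right up to the end of the segment.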

BAM! We got our data in a file named mem_dump.kik, score! Now just copy it over to your desktop machine and we’re almost there! While you’re at it, it’s a good idea to copy over the Kik binary too so we can patch it.

To make our lives easier, we’ll extract the ARMv7 binary from within the universal Kik binary.  This is pretty simple: just use lipo.

dean@BigBertha:~/Reversing/Apps/Kik_iOS $ lipo -thin armv7 -output patched_kik Kik_Fat 
dean@BigBertha:~/Reversing/Apps/Kik_iOS $ otool -arch all -Vh patched_kik 
patched_kik:
Mach header
      magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
   MH_MAGIC     ARM         V7  0x00     EXECUTE    32       3652   NOUNDEFS DYLDLINK TWOLEVEL

Let’s now just see what happens when we use class-dump on this unpatched binary.

dean@BigBertha:~/Reversing/Apps/Kik_iOS $ class-dump patched_kik 
/*
 *     Generated by class-dump 3.3.4 (64 bit).
 *
 *     class-dump is Copyright (C) 1997-1998, 2000-2001, 2004-2011 by Steve Nygard.
 */

#pragma mark -

/*
 * File: patched_kik
 * UUID: F88022A0-B96C-305F-9BDC-9D7FC2D2C76C
 * Arch: arm v7 (armv7)
 *
 *       Objective-C Garbage Collection: Unsupported
 *       This file is encrypted:
 *           cryptid: 0x00000001, cryptoff: 0x00001000, cryptsize: 0x000b1000
 */

Well, that’s not very helpful! Time to patch this thing.

To patch the file we need to do two things.  First we need to copy the decrypted data into the binary and, second, we need to disable the encryption load command.  Copying the data over can be done simply enough with the dd shell command.

dean@BigBertha:~/Reversing/Apps/Kik_iOS $ dd bs=1 seek=4096 conv=notrunc if=mem_dump.kik of=patched_kik 
724992+0 records in
724992+0 records out
724992 bytes transferred in 1.340430 secs (540865 bytes/sec)
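
For anyone who would rather not use dd (with bs=1 it copies a byte at a time), here is a rough Python equivalent of the same in-place overwrite.  The function name splice_into_file is my own, and the bounds check is an extra safety net that dd itself doesn't perform.

```python
def splice_into_file(target_path, payload, offset):
    """Overwrite len(payload) bytes of target_path starting at offset.

    Opening with "r+b" updates the file in place without truncating it,
    which is the same effect as dd's conv=notrunc.
    """
    with open(target_path, "r+b") as target:
        target.seek(0, 2)                      # jump to the end to get the size
        if offset + len(payload) > target.tell():
            # Extra check (not something dd does): refuse to grow the file.
            raise ValueError("payload would run past the end of the file")
        target.seek(offset)
        target.write(payload)
```

With the files from above this would be used as splice_into_file("patched_kik", open("mem_dump.kik", "rb").read(), 4096).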

Disabling the encryption load command is a little more involved, but still pretty easy.  All we need to do is set the cryptid field to 0x0.  This can be done using your favourite hex editor.  To find the address you can either use otool or, as I did, use a fun little tool called MachOView.
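
If you would rather script this step than reach for a hex editor, something like the sketch below should do it.  It assumes cryptid is a 32-bit little-endian value (true for this ARM binary) and that you pass in the file offset of the cryptid field itself, as found with otool or MachOView; the function name clear_cryptid is hypothetical.

```python
import struct


def clear_cryptid(path, cryptid_offset):
    """Zero the 4-byte cryptid field at cryptid_offset; return the old value."""
    with open(path, "r+b") as binary:          # update in place, no truncation
        binary.seek(cryptid_offset)
        (old_value,) = struct.unpack("<I", binary.read(4))
        binary.seek(cryptid_offset)
        binary.write(struct.pack("<I", 0))
    return old_value
```

Returning the old value makes it easy to confirm you hit the right spot: for an encrypted binary it should come back as 1.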

Figure 2: Patched Kik binary opened in MachOView

In Figure 2 you can see that the offset to the cryptid field is 0x848, so just go ahead and set that to 0x0.  Next, try class-dump again.

dean@BigBertha:~/Reversing/Apps/Kik_iOS $ class-dump patched_kik
/*
 *     Generated by class-dump 3.3.4 (64 bit).
 *
 *     class-dump is Copyright (C) 1997-1998, 2000-2001, 2004-2011 by Steve Nygard.
 */

#pragma mark Named Structures

struct CGAffineTransform {
    float _field1;
    float _field2;
    float _field3;
    float _field4;
    float _field5;
    float _field6;
};

struct CGPoint {
    float x;
    float y;
};
...

The output from class-dump should go on for a really long time; it turns out Kik is pretty huge!  Finally, we feel pretty confident in our efforts and it’s time to try opening it in IDA.  When loading the binary make sure you switch the processor to ARM (it defaults to x86).

Figure 3: Patched Kik binary opened in IDA Pro

There you have it, folks: Figure 3 shows the patched (and decrypted) Kik binary opened up in IDA Pro, ready to be reversed.

In my next post I’ll dive more into this app; as I said in the beginning, I’m most curious about how its network stuff is done.
