iOS Crash Symbolication for Dummies, Part 1
Many developers use Bugsee for its great crash reporting capabilities. In fact, Bugsee crash reporting has recently been ranked the highest among all iOS crash reporting services when it comes to accuracy and the amount of detail in the report. Bugsee doesn't stop there, however: it also presents video of user actions, console logs and network traffic that preceded the crash.
In this series of posts we are going to focus on the crash log itself, explain the magic behind it and show how to properly set it up.
The first post in the series is an introductory one.
What is symbolication?
To answer that question we must briefly touch on the build process itself. Regardless of the language our project is written in (be that Objective-C, Swift or any other), the build process translates our human-readable code into binary machine code. Consider the following buggy code (can you spot the bug?).
// Shared array used by the functions below.
static NSArray *array;
void initialize() {
    array = @[@"one", @"two", @"three"];
}
NSString* getElementFromArray(int index) {
    return array[index];
}
void printAllElements() {
    for (int i = 0; i <= 3; i++) {
        NSLog(@"%@", getElementFromArray(i));
    }
}
After the build, it will eventually become something like this:
0x100117dec: stp x29, x30, [sp, #-16]! ; <--- Start of the initialize() method
<...skipped...>
0x100117e9c: ldp x29, x30, [sp], #16
0x100117ea0: ret
0x100117ea4: bl 0x10022d83c
0x100117ea8: stp x29, x30, [sp, #-16]! ; <--- Start of the printAllElements() method
0x100117eac: mov x29, sp
0x100117eb0: sub sp, sp, #32
0x100117eb4: stur wzr, [x29, #-4]
0x100117eb8: ldur w8, [x29, #-4]
0x100117ebc: cmp w8, #3
0x100117ec0: b.gt 0x100117f08
0x100117ec4: ldur w0, [x29, #-4]
0x100117ec8: bl 0x100117f14 ; <---- this is where it calls getElementFromArray()
0x100117ecc: mov x29, x29
0x100117ed0: bl 0x10022d668
<...skipped...>
0x100117f0c: ldp x29, x30, [sp], #16
0x100117f10: ret
0x100117f14: stp x29, x30, [sp, #-16]! ; <--- Start of getElementFromArray() method
0x100117f18: mov x29, sp
0x100117f1c: sub sp, sp, #16
0x100117f20: adrp x8, 436
0x100117f24: add x8, x8, #2520
0x100117f28: adrp x9, 452
0x100117f2c: add x9, x9, #1512
0x100117f30: stur w0, [x29, #-4]
0x100117f34: ldr x9, [x9]
0x100117f38: ldursw x2, [x29, #-4]
0x100117f3c: ldr x1, [x8]
0x100117f40: mov x0, x9
0x100117f44: bl 0x10022d608 ; <--- Here we send a message to NSArray to retrieve that element
0x100117f48: mov sp, x29
0x100117f4c: ldp x29, x30, [sp], #16
0x100117f50: ret
As you can see from this example, the build process got rid of all the symbols (variable and method names). It also no longer knows anything about the layout of our code, such as the number of spaces we put to separate the functions; all that information is lost. So when a crash occurs (and it will occur; after all, we access elements beyond the bounds of that array), if we don't have symbolication properly set up, this is the only crash information we will end up with:
NSRangeException: *** -[__NSArrayI objectAtIndex:]: index 3 beyond bounds [0 .. 2]
0 CoreFoundation 0x1857A51B8
1 libobjc.A.dylib 0x1841DC55C
2 CoreFoundation 0x1856807F4
3 MyApplication 0x100117f48
4 MyApplication 0x100117ecc
5 ...
This is pretty raw, and not very useful. We know it failed in some method inside the CoreFoundation system framework, which was in turn called from some method in libobjc.A.dylib, which was in turn called from another method in CoreFoundation, which was in turn called from our application (finally!). But what is 0x100117f48? Where exactly is it? Which file, function or line number does it correspond to? That is exactly where symbolication comes in.
Symbolication is the process of translating the return addresses back into human-readable method names, file names and line numbers.
Successful symbolication will result in the following report instead:
NSRangeException: *** -[__NSArrayI objectAtIndex:]: index 3 beyond bounds [0 .. 2]
0 CoreFoundation __exceptionPreprocess + 124
1 libobjc.A.dylib objc_exception_throw + 52
2 CoreFoundation -[__NSArrayI objectAtIndex:] + 180
3 MyApplication getElementFromArray (MyFile.m:22)
4 MyApplication printAllElements (MyFile.m:27)
Now it's pretty obvious that the crash was caused by some improper array access at line 22 of MyFile.m, which happens to be within the getElementFromArray method. And if we need more context, we can easily see that it was called by printAllElements at line 27 of the same file.
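The address-to-function half of this lookup can be sketched in a few lines of Python. This is only an illustration, not how any particular symbolication tool is implemented: the symbol table is written by hand from the function start addresses in the disassembly listing (a real tool would read it from the debug information), the symbolicate helper is a hypothetical name, and the addresses are assumed to need no adjustment for where the binary was loaded in memory. Mapping an address to a file and line number requires additional line tables from the debug information, which the sketch leaves out.

```python
from bisect import bisect_right

# Function start addresses copied from the disassembly listing.
# Hand-written for illustration; a real table would come from debug info.
SYMBOLS = [
    (0x100117dec, "initialize"),
    (0x100117ea8, "printAllElements"),
    (0x100117f14, "getElementFromArray"),
]
STARTS = [start for start, _ in SYMBOLS]


def symbolicate(address):
    """Return (function name, offset) for an address, or None if unknown.

    Picks the last function that starts at or before the address.
    Function end addresses are not modeled, so anything past the last
    function is attributed to it.
    """
    index = bisect_right(STARTS, address) - 1
    if index < 0:
        return None
    start, name = SYMBOLS[index]
    return name, address - start


# The two application frames from the raw crash report.
for frame in (0x100117f48, 0x100117ecc):
    name, offset = symbolicate(frame)
    print(f"{frame:#x} -> {name} + {offset}")
```

For the two application frames in the raw report this prints getElementFromArray + 52 and printAllElements + 36, the same function-plus-offset form that the symbolicated report uses for the system frames.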
What is a dSYM file?
Luckily for us, Xcode can be instructed to keep a lot of the data that is lost during the build process. It can put it inside the application itself, but that is not a good idea. We do not want to ship our application with all this extra debugging information, as it would make it very easy for our competitors and hackers to reverse engineer the app. We would like to have it generated, but kept out of the App Store. That's exactly what a dSYM file is all about. During the build process, Xcode strips all the debug information from the main executable file and puts it inside a special file called a dSYM. This helps keep our executable small and easier to distribute to happy customers.
If our application is using frameworks, the product folder will have a separate dSYM file generated for each framework built. Eventually all of them are needed if we want to cover our bases and be able to symbolicate a crash in every possible location in our app.
Needless to say, a dSYM file generated while building a specific version of the application can only be used to symbolicate crashes from that specific version.
dSYM files are identified by a universally unique identifier (UUID), which changes every time we modify and rebuild our code, and that ID is what is used to match a symbol file to a specific crash. A dSYM may be associated with more than one UUID, as it may contain debug information for more than one architecture.
The UUID of a dSYM can be easily retrieved using the dwarfdump command:
$ dwarfdump -u MyApplication.app.dSYM
UUID: 9F665FD6-E70C-3EB9-8622-34FD9EC002CA (armv7) MyApplication.app.dSYM
UUID: 8C2F9BB8-BB3F-37FE-A83E-7F2FF7B98889 (arm64) MyApplication.app.dSYM
The dSYM above has debug information for both the armv7 and arm64 flavors of our application; each flavor has its own UUID.
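If we want to check UUIDs from a script rather than by eye, the output of dwarfdump -u is easy to parse. The following Python sketch works on the captured text shown above and assumes every relevant line follows the "UUID: <id> (<arch>) <path>" shape of that sample; the function name parse_dwarfdump_uuids is made up for this illustration.

```python
import re

# Matches lines shaped like the sample output:
#   UUID: <36-character id> (<architecture>) <path>
UUID_LINE = re.compile(r"^UUID: ([0-9A-Fa-f-]{36}) \((\w+)\) (.+)$")


def parse_dwarfdump_uuids(output):
    """Map architecture name to UUID for each UUID line in the output."""
    uuids = {}
    for line in output.splitlines():
        match = UUID_LINE.match(line.strip())
        if match:
            uuid, arch, _path = match.groups()
            uuids[arch] = uuid.upper()
    return uuids


# Sample text copied from the dwarfdump run shown in the post.
sample = """\
UUID: 9F665FD6-E70C-3EB9-8622-34FD9EC002CA (armv7) MyApplication.app.dSYM
UUID: 8C2F9BB8-BB3F-37FE-A83E-7F2FF7B98889 (arm64) MyApplication.app.dSYM
"""

for arch, uuid in parse_dwarfdump_uuids(sample).items():
    print(arch, uuid)
```

The resulting mapping can then be compared against the UUID recorded for the crashing binary to confirm that a given dSYM really belongs to that build and architecture.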
These dSYM files can and should be manually stored for future symbolication of crashes in the production build. Alternatively, they can be uploaded to a crash reporting service like Bugsee, where they will be put in a vault and eventually used for processing crashes for that specific build. Typically, a special build phase is added to the build process that is responsible for uploading dSYM files to the vault.
What happens during an iOS crash?
During a crash, the following information is collected on the device:
- Crash/exception type and an exception specific message (if and when available)
- Stack trace for each thread (in raw form, the list of these unreadable return addresses that we saw before)
- List of all images (user and system frameworks and extensions) loaded by the application. Each one has a unique UUID to help match it to the right dSYM file.
- Other information about the specific build, device, time of the crash, etc. These are less relevant to the symbolication process, but important nevertheless.
This information is sent for processing to the crash reporting service, where it will be matched with the proper dSYM files that were already uploaded at build time, or that will be uploaded manually at a later time. The symbolication process happens on the server and produces a nice, human-readable crash report that can either be viewed through a web dashboard or downloaded as a file. The report will typically include the items listed above (basic info, crash/exception details and, if all the stars are aligned and all symbol files were properly uploaded and processed, a symbolicated stack trace).
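The matching step can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the shape of the crash payload, its field names, the MyFramework image and its placeholder UUID are hypothetical, and this is not a description of how Bugsee or any other service is implemented. Only the arm64 UUID is taken from the dwarfdump example earlier in the post.

```python
# Hypothetical crash payload; field names are illustrative, not a real format.
crash = {
    "exception": "NSRangeException",
    "images": [
        {"name": "MyApplication", "uuid": "8C2F9BB8-BB3F-37FE-A83E-7F2FF7B98889"},
        # Placeholder UUID for a made-up framework with no uploaded dSYM.
        {"name": "MyFramework", "uuid": "00000000-0000-0000-0000-000000000001"},
    ],
}

# UUIDs of the dSYM files uploaded so far (the "vault").
vault = {"8C2F9BB8-BB3F-37FE-A83E-7F2FF7B98889"}


def split_by_symbol_availability(images, uploaded_uuids):
    """Split image names into those with a matching dSYM and those without."""
    ready, missing = [], []
    for image in images:
        target = ready if image["uuid"].upper() in uploaded_uuids else missing
        target.append(image["name"])
    return ready, missing


ready, missing = split_by_symbol_availability(crash["images"], vault)
print("can symbolicate:", ready)
print("waiting for dSYM:", missing)
```

Frames that belong to an image in the second list stay as raw addresses until the matching dSYM is uploaded.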
That is what a typical crash reporting service provides. Bugsee provides much more than that. To name a few things, Bugsee reports also include an interactive player that can play the following in a synchronized manner:
- Video of the screen and user interactions that preceded the crash
- Network traffic, with complete request and response headers and body
- System and application traces (disk space, CPU load, top view and window names, etc.)
- Your custom traces
- Console logs
This gives much more context, which is of tremendous help when trying to debug an elusive issue that only happens for customers in the field.
In the next post of the series, we will dive deeper into the symbolication process itself and show how to manually symbolicate an address or a full Apple crash report.