Debugging a short-lived Mac OS X application
Yesterday I had to debug a Mac OS X command-line program that segfaulted immediately after starting. The program is 'rdnssd' (Recursive DNS Servers discovery Daemon, http://rdnssd.linkfanel.net/). 'rdnssd' implements the client part of RFC 5006 - IPv6 Router Advertisement Option for DNS Configuration. This option lets an IPv6 router send out DNS server IP addresses as part of its Router Advertisement messages, helping clients find a DNS server without the need for DHCP or local configuration.
The 'rdnssd' source compiled without issues, but when started, the program stopped immediately with a segfault:
# sudo /usr/local/sbin/rdnssd -f
[1] 1243 segmentation fault sudo /usr/local/sbin/rdnssd -f
On Linux I'm used to strace (and Solaris has truss) to trace the system calls of a process, and I hoped something similar would point me to the error. Mac OS X 10.5/10.6 has several very useful tools built on top of DTrace. Apple Technical Note 2124 lists a good part of these tools.
A run of the process under dtrace only showed that the process died after forking, with no additional information. errinfo also didn't show any suspicious failing syscalls. But opensnoop (which traces the files opened by processes) showed that Mac OS X actually writes a crash report for the failing process, with additional information:
# opensnoop | grep rdnssd
0 1446 rdnssd 3 /dev/urandom
0 1446 rdnssd 3 /dev/dtracehelper
0 1446 rdnssd 3 /dev/urandom
0 1446 rdnssd 3 /dev/urandom
0 1446 rdnssd 3 /usr/share/locale/en_US.UTF-8/LC_CTYPE
0 1448 rdnssd -1 /etc/sysinfo.conf
0 1447 taskgated 4 /usr/local/sbin/rdnssd
0 1446 rdnssd 3 /usr/local/var/run/rdnssd.pid
0 1449 ReportCrash 5 /usr/local/sbin/rdnssd
0 1449 ReportCrash 5 /Developer/source/ipv6/ndisc6-1.0.1/rdnssd/rdnssd.o
0 1449 ReportCrash 5 /Developer/source/ipv6/ndisc6-1.0.1/rdnssd/icmp.o
0 1449 ReportCrash 5 /Developer/source/ipv6/ndisc6-1.0.1/rdnssd/rdnssd.o
0 1449 ReportCrash 5 /Developer/source/ipv6/ndisc6-1.0.1/rdnssd
0 1449 ReportCrash 5 /Developer/source/ipv6/ndisc6-1.0.1/rdnssd/icmp.o
0 1449 ReportCrash -1 /Developer/source/ipv6/ndisc6-1.0.1/rdnssd/../compat/libcompat.a(ppoll.o)
0 1449 ReportCrash 5 /Library/Logs/DiagnosticReports/rdnssd_2011-03-10-190618_localhost.crash
0 1449 ReportCrash -1 /Library/Logs/DiagnosticReports/rdnssd_2011-03-10-190618_localhost.crash
0 1449 ReportCrash 6 /Library/Logs/DiagnosticReports/rdnssd_2011-03-10-190618-1_localhost.crash
0 1449 ReportCrash 5 /Library/Application Support/CrashReporter/rdnssd_1472A851-7F63-5DAA-BB39-7AD15590893E.plist
0 1449 ReportCrash 5 /Library/Application Support/CrashReporter/rdnssd_1472A851-7F63-5DAA-BB39-7AD15590893E.plist
0 1449 ReportCrash 5 /Library/Application Support/CrashReporter/rdnssd_1472A851-7F63-5DAA-BB39-7AD15590893E.plist
0 1449 ReportCrash 5 /Library/Application Support/CrashReporter/rdnssd_1472A851-7F63-5DAA-BB39-7AD15590893E.plist
89 1450 mdworker 6 /usr/local/var/run/rdnssd.pid
89 1450 mdworker 6 /usr/local/var/run/rdnssd.pid
0 34 mds 57 /Library/Logs/DiagnosticReports/rdnssd_2011-03-10-190618-1_localhost.crash
89 34 mds -1 /Library/Logs/DiagnosticReports/rdnssd_2011-03-10-190618-1_localhost.crash
0 34 mds 57 /Library/Logs/DiagnosticReports/rdnssd_2011-03-10-190618_localhost.crash
89 34 mds -1 /Library/Logs/DiagnosticReports/rdnssd_2011-03-10-190618_localhost.crash
89 1450 mdworker -1 /Library/Application Support/CrashReporter/rdnssd_1472A851-7F63-5DAA-BB39-7AD15590893E.plist
0 34 mds 7 /Library/Application Support/CrashReporter/rdnssd_1472A851-7F63-5DAA-BB39-7AD15590893E.plist
89 34 mds -1 /Library/Application Support/CrashReporter/rdnssd_1472A851-7F63-5DAA-BB39-7AD15590893E.plist
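Most of this output is noise from Spotlight (mds, mdworker); the interesting rows are the .crash files that ReportCrash opened successfully. As a rough sketch, a small filter could pick those out. The function name crash_reports is made up for this example, and it assumes the usual opensnoop column order (UID PID COMM FD PATH), where a negative FD means the open failed:

```shell
# Hypothetical helper (not part of opensnoop or Mac OS X).
# Reads opensnoop output on stdin, assumed columns: UID PID COMM FD PATH.
# Prints each .crash file that ReportCrash opened successfully, once.
crash_reports() {
    awk '$3 == "ReportCrash" && $4 >= 0 && /\.crash$/ {
        path = $0
        # Strip the first four columns; the rest is the path (may contain spaces).
        sub(/^ *[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ +/, "", path)
        if (!seen[path]++) print path
    }'
}
```

It could be fed from saved output, for example crash_reports < opensnoop.log (opensnoop.log being a hypothetical capture file), or placed at the end of the opensnoop pipe instead of grep.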
A look into the crash report file revealed the information I was looking for:
more /Library/Logs/DiagnosticReports/rdnssd_2011-03-10-190618_localhost.crash
Process:         rdnssd [1446]
Path:            /usr/local/sbin/rdnssd
Identifier:      rdnssd
Version:         ??? (???)
Code Type:       X86-64 (Native)
Parent Process:  zsh [1161]

Date/Time:       2011-03-10 19:06:17.998 +0100
OS Version:      Mac OS X 10.6.6 (10J567)
Report Version:  6

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000008
Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   rdnssd   0x00000001000023d2 ppoll + 66
1   rdnssd   0x0000000100001fc2 main + 2418 (rdnssd.c:377)
2   rdnssd   0x00000001000010a4 start + 52

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000000  rbx: 0x00000000ffffffff  rcx: 0x00007fff80b1ea02  rdx: 0x0000000000000000
  rdi: 0x0000000000000003  rsi: 0x00007fff5fbffa4c  rbp: 0x00007fff5fbff910  rsp: 0x00007fff5fbff8d0
   r8: 0x00007fff70125f0c   r9: 0x0000000000000000  r10: 0x00007fff5fbffa4c  r11: 0x0000000000000202
  r12: 0x0000000000000000  r13: 0x0000000000000001  r14: 0x00007fff5fbffa30  r15: 0x00007fff5fbff8dc
  rip: 0x00000001000023d2  rfl: 0x0000000000010246  cr2: 0x0000000000000008

Binary Images:
       0x100000000 -        0x100002ff7 +rdnssd ??? (???) /usr/local/sbin/rdnssd
    0x7fff5fc00000 -     0x7fff5fc3bdef  dyld 132.1 (???) /usr/lib/dyld
    0x7fff80abc000 -     0x7fff80c7dfff  libSystem.B.dylib 125.2.1 (compatibility 1.0.0) <71E6D4C9-F945-6EC2-998C-D61AD590DAB6> /usr/lib/libSystem.B.dylib
    0x7fff82f2c000 -     0x7fff82f30ff7  libmathCommon.A.dylib 315.0.0 (compatibility 1.0.0) <95718673-FEEE-B6ED-B127-BCDBDB60D4E5> /usr/lib/system/libmathCommon.A.dylib
    0x7fffffe00000 -     0x7fffffe01fff  libSystem.B.dylib ??? (???) <71E6D4C9-F945-6EC2-998C-D61AD590DAB6> /usr/lib/libSystem.B.dylib
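Only a handful of lines in such a report matter for a first diagnosis: the exception type and codes, and the backtrace of the crashed thread. A minimal sketch of a filter for just those lines follows. The function name crash_summary is invented for this example, and it assumes the report layout shown above (an "Exception ..." header block and a "Thread N Crashed:" section whose frames start with a frame number):

```shell
# Hypothetical helper (not an Apple tool).
# Prints the exception lines and the crashed-thread backtrace
# from the crash report file given as the first argument.
crash_summary() {
    awk '
        /^Exception (Type|Codes):/ { print; next }
        /^Thread [0-9]+ Crashed:/  { in_trace = 1; print; next }
        in_trace && /^[0-9]+ /     { print; next }    # one backtrace frame
        in_trace                   { in_trace = 0 }   # any other line ends the backtrace
    ' "$1"
}
```

Called as crash_summary /Library/Logs/DiagnosticReports/rdnssd_2011-03-10-190618_localhost.crash, it should reduce the report to the two exception lines and the three frames of thread 0.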
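The culprit is in the source file rdnssd.c, line 377, in the call to the function 'ppoll'. The reason was a bad memory access (EXC_BAD_ACCESS): the faulting address 0x0000000000000008 is just above zero, which usually points to a NULL pointer being dereferenced at a small offset.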
This should be all the information needed to find and fix the issue.
Stay tuned…