This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

how to troubleshoot a crash?

0

Hi,

Is there a good writeup on how to troubleshoot Wireshark crashes, preferably covering all major operating systems?

I had Wireshark running, NOT capturing live traffic but analyzing a pcap file (less than 1MB). It was doing fine for a while, then it crashed. I suspect the crash may be caused by an error in the Lua dissector I wrote, but I have no idea what the error is.

I didn't start from a console, so no info there. I did get the crash report, but it makes no sense to me, as seen below.

My questions: what evidence can we collect after a crash, and how do we make sense of it, without having 10 years of experience in this field?

partial crash report:

Process:         Wireshark [7020]
Path:            /Applications/Wireshark.app/Contents/MacOS/Wireshark
Identifier:      org.wireshark.Wireshark
Version:         1.11.3-1864-geef0fa6 (1.11.3-1864-geef0fa6)
Code Type:       X86-64 (Native)
Parent Process:  launchd [181]
Responsible:     Wireshark [7020]
User ID:         502

Date/Time:       2014-04-03 17:12:29.978 -0500
OS Version:      Mac OS X 10.9.2 (13C64)
Report Version:  11
Anonymous UUID:  3C1BACD2-261C-95A3-7B77-F6E256111417

Sleep/Wake UUID: 4CC28C83-FB62-4ACB-A214-181F8BE906E8

Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Exception Type:  EXC_BREAKPOINT (SIGTRAP)
Exception Codes: 0x0000000000000002, 0x0000000000000000

Application Specific Information: wireshark 1.11.3-1864-geef0fa6 (wireshark-1.11.3-rc1-1864-geef0fa6-dirty from unknown)

Compiled (64-bit) with Qt 5.2.1 with GLib 2.36.0, with libpcap, with libz 1.2.3, without POSIX capabilities, with SMI 0.4.8, without c-ares, without ADNS, with Lua 5.1, without Python, with GnuTLS 2.12.19, with Gcrypt 1.5.0, with MIT Kerberos, with GeoIP, without PortAudio, with AirPcap.

Running on Mac OS X 10.9.2, build 13C64 (Darwin 13.1.0), without locale, with libpcap version 1.3.0 - Apple version 41, with libz 1.2.5, GnuTLS 2.12.19, Gcrypt 1.5.0, without AirPcap. Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz

Built using llvm-gcc 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00).

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libglib-2.0.0.dylib             0x000000010811005b g_logv + 1371
1   libglib-2.0.0.dylib             0x00000001081102dd g_log + 381
2   libwireshark.0.dylib            0x00000001035d7289 proto_register_field_init + 57
3   libwireshark.0.dylib            0x00000001035d7fd1 proto_register_field_array + 161
4   libwireshark.0.dylib            0x00000001040feb33 Proto_commit + 579
5   libwireshark.0.dylib            0x0000000104114652 wslua_init + 1058
6   org.wireshark.Wireshark         0x0000000103199c17 main + 1879
7   org.wireshark.Wireshark         0x0000000103132cb4 start + 52

asked 09 Apr '14, 16:37

YXI
accept rate: 0%

edited 09 Apr '14, 17:09

Hadriel


2 Answers:

2

I suppose I should actually try to answer the question being asked. :)

I don't know if there's a good write-up on troubleshooting Wireshark crashes in particular, but I'll try to give the 30-second simplified overview. Ultimately though, this requires some basic understanding of programming, and the C language in particular, because you'd have to look into the source code itself.

Basically what I do is start with the exception type and the crash dump's stack trace (sometimes called a backtrace).

This is the exception type:

Exception Type:  EXC_BREAKPOINT (SIGTRAP)

That tells you a SIGTRAP was issued, either on purpose or due to an exception. In this case it's not that helpful, because you could guess that from the stack trace... but sometimes it is useful, for divide by zero, memory errors, etc.

This is the stack trace:

0   libglib-2.0.0.dylib             0x000000010811005b g_logv + 1371
1   libglib-2.0.0.dylib             0x00000001081102dd g_log + 381
2   libwireshark.0.dylib            0x00000001035d7289 proto_register_field_init + 57
3   libwireshark.0.dylib            0x00000001035d7fd1 proto_register_field_array + 161
4   libwireshark.0.dylib            0x00000001040feb33 Proto_commit + 579
5   libwireshark.0.dylib            0x0000000104114652 wslua_init + 1058
6   org.wireshark.Wireshark         0x0000000103199c17 main + 1879
7   org.wireshark.Wireshark         0x0000000103132cb4 start + 52

What a stack trace tells you is the set of functions, most recent first, that were called and are still in the call stack. I.e., it stopped in g_logv(), which it got to from within g_log(), which it got to from within proto_register_field_init(), etc. (Note that it's not all the functions the program executed, just the ones still on the stack, meaning the ones that have not yet returned and been popped off the stack.)

Those are literally the function names from within the source code - if the program was compiled with full debug symbols, you'd even see source code filenames and such. The hex numbers to the left of the function names are the program counter addresses, which won't be helpful.

The names on the left tell you which program/library each function belongs to. "libglib-2.0.0.dylib" is the Glib library, while "libwireshark.0.dylib" is the core Wireshark library, and "org.wireshark.Wireshark" is the Wireshark application itself (or really a symbolic link to it). The "dylib" extension tells you it's a dynamic library on Mac (similar to a "DLL" dynamic-link library on Windows, and a "so" shared object on Linux).

So using the stack trace you gave, I can tell the crash happens during the internal registering of Lua fields, because I know that's what Proto_commit() is doing when it calls proto_register_field_array().

Technically the "crash" occurs in g_logv(), in the sense that it's the function that raised the SIGTRAP, but that's just a Glib function called by g_log(), another Glib function, which was apparently called by proto_register_field_init(). I say "apparently" because the call must come from a macro or an inlined function: there is no direct call to g_log() inside proto_register_field_init().
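To make that reading order concrete, here is a minimal Python sketch. It is not part of Wireshark, and `parse_frames`, `call_chain`, and `first_frame_outside` are names made up for this illustration. It assumes every frame line has the five whitespace-separated parts seen in this report, which would not hold for an image name containing spaces.

```python
import re

# Frame lines copied from the crash report above.
TRACE = """\
0   libglib-2.0.0.dylib             0x000000010811005b g_logv + 1371
1   libglib-2.0.0.dylib             0x00000001081102dd g_log + 381
2   libwireshark.0.dylib            0x00000001035d7289 proto_register_field_init + 57
3   libwireshark.0.dylib            0x00000001035d7fd1 proto_register_field_array + 161
4   libwireshark.0.dylib            0x00000001040feb33 Proto_commit + 579
5   libwireshark.0.dylib            0x0000000104114652 wslua_init + 1058
6   org.wireshark.Wireshark         0x0000000103199c17 main + 1879
7   org.wireshark.Wireshark         0x0000000103132cb4 start + 52
"""

# Groups: frame index, image (library or app), address, function, byte offset.
FRAME_RE = re.compile(
    r"^(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\S+)\s+\+\s+(\d+)$"
)


def parse_frames(text):
    """Turn the frame lines into dicts; lines that do not match are skipped."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            index, image, address, function, offset = match.groups()
            frames.append({
                "index": int(index),
                "image": image,
                "address": address,
                "function": function,
                "offset": int(offset),
            })
    return frames


def call_chain(frames):
    """Function names from the outermost caller down to where it stopped."""
    ordered = sorted(frames, key=lambda f: f["index"], reverse=True)
    return [f["function"] for f in ordered]


def first_frame_outside(frames, image_prefix="libglib"):
    """Innermost frame whose image does not start with image_prefix."""
    for frame in sorted(frames, key=lambda f: f["index"]):
        if not frame["image"].startswith(image_prefix):
            return frame
    return None


if __name__ == "__main__":
    frames = parse_frames(TRACE)
    print(" -> ".join(call_chain(frames)))
    culprit = first_frame_outside(frames)
    print("first frame outside Glib:", culprit["function"], "in", culprit["image"])
```

For this trace the chain starts at start and main and ends at g_logv, and the first frame outside Glib is proto_register_field_init in libwireshark.0.dylib, which is where you would start reading the source.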

With full debug symbols it's a lot easier to figure out exactly where it happened, because you'd get the source code line numbers too. But ultimately it's not really telling you why it crashed, in the sense of what was the root cause - it's more telling you how it crashed.

My guess is it's actually "g_error()" being called, on purpose, inside proto_register_field_init()... because it encountered something that is a programming error and needs to be fixed. g_error() is a macro, not a function, so it won't show up in the stack trace. (A macro is expanded by the C pre-processor before the code is compiled, so only the functions it calls end up in the binary.) In fact proto_register_field_init() doesn't even directly call g_error(), so it could be getting called via one of the other macros used in proto_register_field_init(), or via some function that the compiler chose to inline. (By "inline" I mean it didn't keep the function separate, but instead moved its logic into the calling function, to increase performance.)

The fact that it crashes is really a C-code programming error, not a Lua one - the Wireshark Lua API C-code tries to prevent actual crashes for bad Lua scripts, by validating the script's input before calling the rest of the C-code.

So it's probably some incorrect Lua ProtoField in the script that isn't being properly verified by the C-code. Of course ultimately the Lua script itself is probably wrong too, or maybe it's a fine Lua script and there's some bug in the Wireshark Lua API code. (It could just be a bug - that field registration code was changed recently... although more recently than your version I think... 1.11.3-rc1-1864 isn't very recent.)

answered 09 Apr '14, 18:50

Hadriel
accept rate: 18%

Thanks. That is a very helpful crash course (no pun intended) with an example.

Your mention of my build being old prompted me to check the build info in the crash report. Then I realized that this is not the crash report for what happened yesterday. However, when I went to ~/Library/Logs/DiagnosticReports yesterday, there was only one crash report, the one I uploaded. It was from last week! Why were there no crash reports for the crashes that happened yesterday? Could they be found somewhere else on my Mac?

You also mentioned that "if the program was compiled with full debug symbols" ... I thought the Wireshark binaries we download are already compiled with debugging turned on. That's not the case in the nightly builds? What do we have to do to have the full debug symbols?

Thanks.

(10 Apr '14, 08:53) YXI

You asked:

Could they be found somewhere else on my Mac?

They might be in /Library/Logs/DiagnosticReports (note the lack of a ~ tilde), but they really should be where you said. You can also view them in the Console app instead of in a terminal - i.e., Applications->Utilities->Console.
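If you'd rather check both locations in one go, something like the following Python sketch could do it. It's only an illustration: `find_reports` is a made-up helper, and it assumes the report file names start with the process name ("Wireshark"), which is how they are usually named but worth confirming by looking in the directory.

```python
from pathlib import Path

# The two locations mentioned above: per-user first, then system-wide.
REPORT_DIRS = [
    Path.home() / "Library" / "Logs" / "DiagnosticReports",
    Path("/Library/Logs/DiagnosticReports"),
]


def find_reports(dirs, prefix="Wireshark"):
    """Return report files whose name starts with prefix, newest first.

    The prefix is an assumption about how the reports are named.
    Directories that do not exist or cannot be read are skipped.
    """
    found = []
    for directory in dirs:
        directory = Path(directory)
        if not directory.is_dir():
            continue
        try:
            entries = list(directory.iterdir())
        except PermissionError:
            continue
        for path in entries:
            if path.is_file() and path.name.startswith(prefix):
                found.append(path)
    # Sort by modification time so the most recent crash comes first.
    return sorted(found, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for report in find_reports(REPORT_DIRS):
        print(report)
```

If that prints nothing for yesterday's crashes either, then the reports most likely weren't written at all, rather than being written somewhere unexpected.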


What do we have to do to have the full debug symbols?

You have to compile it yourself. I get full debug symbols with the one I compile locally.

Most programs aren't released with full debug symbols, because it increases the size of the program quite a bit. (and commercial products don't do it so they don't reveal internal stuff)

(10 Apr '14, 10:55) Hadriel

BTW, please submit this as a bug - Wireshark really isn't supposed to crash even with a wrong Lua script.

You don't need to include your whole script - just the portion that creates the ProtoFields, i.e. the part that does this type of thing:

local foo = ProtoField.new(...)
local bar = ProtoField.uint16(...)
(10 Apr '14, 12:12) Hadriel

To compile it with full debug symbols, do you simply change the Makefile? I see in the Makefile there are two flag sets: CFLAG = -g ... CFLAG_FOR_BUILD = ... So do I just add "-g" to CFLAG_FOR_BUILD?

I'm not going to submit a bug now as I don't have a relevant crash report. The crash report I have is from last week (I couldn't find another one even following your new advice) and I have since changed my ProtoFields quite a bit. I will submit a bug report if Wireshark crashes again after the rebuild with debug symbols, with Wireshark launched through gdb. That way you guys will have some stuff to work with. I will probably do this on Linux. If it crashes on Linux, it should also crash on Mac and Windows, right (assuming they all come from exactly the same version of the source code)?

(10 Apr '14, 12:50) YXI

You asked:

 So do I just add "-g" to CFLAG_FOR_BUILD?

No, you shouldn't have to. The Makefile is auto-generated by the configure script, so as long as you ran autogen.sh followed by configure, then just running make after that should work.


If it crashes in Linux, it should also do it in Mac and Windows, right?

Not necessarily, no. It might not even crash consistently every time in Mac... for example if a Lua object being garbage collected is causing the issue, then it's a roll of the dice. (though I don't think that's the case here)

(10 Apr '14, 13:01) Hadriel

I just read the official INSTALL instructions that came in the source tarball, and there was no mention of running autogen.sh before running ./configure.
So is it the act of running autogen.sh before configure that ensures the debug symbols will be added during the compile?

(14 Apr '14, 09:14) YXI

I don't know which exact script is necessary to eventually get the Makefiles to generate full debug symbols. autogen.sh runs libtool, automake, and autoconf, which are GNU tools for figuring out platform-specific stuff and generating the appropriate inputs to the rest of the make process (and thus eventually the Makefiles generated by configure).

If I recall right, just running configure might be ok, because I seem to recall it will actually run autogen.sh itself if it doesn't find the right info already. Personally I always run autogen.sh myself instead of only running configure, because I'm used to doing it. :)

(14 Apr '14, 10:41) Hadriel

Yeah, I think autogen.sh is probably not needed. I just scanned through the configure script and it seems to have the autoconf and automake stuff in there already.

(14 Apr '14, 11:47) YXI

FYI: The default CFLAGS from a ./configure include -g -O2 (debug symbols plus optimization).

Sometimes when debugging with gdb, it's useful to disable optimization: i.e., use -g -O0 so that the code being debugged matches the source code.

This can be done easily as follows:

CFLAGS='-g -O0' ./configure [your options]

Note that disabling optimization can change how the code behaves: for example, a build with -O0 might not crash where the -O2 build does.

Also: when -O2 is used, certain additional compile time checks are performed.

======

re: use of autogen.sh

I believe that downloading and building from the Wireshark "source tarball" does not require that ./autogen.sh be run.

If the Wireshark sources are downloaded from the git repository, ./autogen.sh will need to be run before running ./configure

As noted previously, any further source changes requiring ./autogen.sh to be rerun will be automatically handled when doing a make.

(14 Apr '14, 16:17) Bill Meier ♦♦

1

Generally you're not expected to debug a crash yourself - I mean it's great if you can, but I don't think it's a big deal if you don't. Just submit a new bug at bugs.wireshark.org, with the info you provided above.

Usually it helps to have the capture file that caused the crash, but in this case it's not a capture file - it's something wrong happening inside Wireshark during startup, when it's trying to internally register the ProtoFields from your Lua script. (After all the Lua scripts are loaded and executed, Wireshark internally registers the protocols and fields created by the scripts)

It may well be something incorrect in your Lua script, but it still shouldn't crash. It should just generate an error and continue; preferably it should fail while loading your Lua script, before even trying to register the field(s) internally.

So to help debug this you need to attach the Lua script itself to the bug ticket - or some minimal version of the script with just enough to cause the crash. Like even just the portion of the script that creates ProtoFields will probably be enough to figure it out - you don't need to include the dissector function or anything else really.

answered 09 Apr '14, 17:24

Hadriel
accept rate: 18%