This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Wireshark, tshark - Out of memory problem

Hi all, I'm using Wireshark and tshark to display data. From searching the Internet, I understand that running Wireshark continuously, day after day, can cause two problems:

  • Temp data written to disk: we have a workaround, using an option to write into 5 files of the default size. When a 6th file would be generated, it replaces the 1st file.
  • Wireshark uses a lot of memory and terminates itself when there is not enough memory left. We cannot use dumpcap, because it does not support our protocol (CAMEL messages) the way Wireshark does.

So, is there any way to fix this problem, or do I have to modify the code? I don't want to modify the memory allocation in the code, because it could cause many other problems that I cannot control. But if it is the only way, do you have any suggestions or experience to share? Thanks a lot.

asked 22 Sep '13, 19:46

hoangsonk49

P.S.: I don't want to save the data to a file. I just process the information in real time; after that it can be thrown away, so there is no need to keep it.

(22 Sep '13, 19:48) hoangsonk49

because it does not support our protocol camel message as wireshark. So, is there any way to fix this problem or I have to modify in the code.

What are you trying to do exactly? Maybe there is an alternative.

(26 Sep '13, 02:17) Kurt Knochner ♦

I have a CAMEL protocol in which "camel.local" is the field I need. When a message arrives, wireshark/tshark dissects it to get the value of "camel.local", and that value is sent to a server via a socket automatically by our code. The problem is that our system needs to run in real time, day after day, without stopping, while we keep running into out-of-memory problems with wireshark/tshark. We don't want to display or store anything. After the value is sent, everything can be thrown away. That is our goal.

(26 Sep '13, 02:59) hoangsonk49

4 Answers:

I have a camel protocol in which "camel.local" is what I need. When a message comes, wireshark/tshark can dissect to get the value of "camel.local" and send it to server via socket automatically by source code.

O.K. here comes a proposed solution, at least I would do it this way ;-) Some ideas have already been mentioned in comments.

Proposed solution

Write a "management application" that handles everything. That application does the following steps in an endless loop.

  1. mkfifo /tmp/dumpcap (needs to be done only once, not within the loop!)
  2. spawn: tshark -ni /tmp/dumpcap -T fields -e camel.local | send_data
  3. spawn and wait for exit: dumpcap -ni eth0 -f "host x.x.x.x and port xyz" -a duration:500 -P -w - > /tmp/dumpcap
  4. After exit of dumpcap, go to step 2.

Here is how it works.

  • We create a named pipe (FIFO) on Linux
  • We let tshark read from that named pipe, filter the data and pipe STDOUT to an application that sends the output to your backend database. The application send_data needs to be written as well!
  • spawn dumpcap and let the management application wait for the exit of its child.
  • dumpcap will run for 500 seconds (-a duration) and write the captured data in libpcap format (option -P - IMPORTANT) to the named pipe, from where tshark reads as soon as there is data.
  • As soon as dumpcap stops (after 500 seconds), tshark will be terminated as well, as it detects the EOF on the named pipe
  • Now you'll have to repeat these steps in a loop
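
The loop described above could be sketched as a small wrapper like the one below. This is only a sketch under assumptions, not a tested management application: the command-line flags are copied from the answer, while the function names, the "./send_data" program path and the use of Python are illustrative choices. Note that opening the FIFO for writing blocks until tshark has opened it for reading, so a tshark that fails to start would hang the wrapper.

```python
import os
import subprocess

FIFO = "/tmp/dumpcap"  # named pipe path used in the answer


def tshark_cmd(fifo=FIFO, fields=("camel.local",)):
    """Build the tshark command line from the answer (reads from the FIFO)."""
    cmd = ["tshark", "-ni", fifo, "-T", "fields"]
    for field in fields:
        cmd += ["-e", field]
    return cmd


def dumpcap_cmd(iface="eth0", capture_filter="host x.x.x.x and port xyz",
                duration=500):
    """Build the dumpcap command line from the answer (pcap data to stdout).

    The capture filter is the placeholder from the answer and must be
    replaced with a real one.
    """
    return ["dumpcap", "-ni", iface, "-f", capture_filter,
            "-a", "duration:%d" % duration, "-P", "-w", "-"]


def run_cycle(send_data_cmd):
    """Run one tshark/dumpcap cycle and wait until both have finished."""
    tshark = subprocess.Popen(tshark_cmd(), stdout=subprocess.PIPE)
    sender = subprocess.Popen(send_data_cmd, stdin=tshark.stdout)
    tshark.stdout.close()  # the sender now owns the read end of the pipe
    # Blocks until tshark has opened the FIFO for reading.
    with open(FIFO, "wb") as fifo:
        subprocess.call(dumpcap_cmd(), stdout=fifo)
    # Closing the FIFO gives tshark its EOF, so it terminates as well.
    tshark.wait()
    sender.wait()


def main():
    """Create the FIFO once, then repeat the capture cycle forever."""
    if not os.path.exists(FIFO):
        os.mkfifo(FIFO)
    while True:
        run_cycle(["./send_data"])  # hypothetical sender program
```

Calling main() starts the endless loop. Reducing the duration argument shortens each cycle if 500 seconds of traffic turns out to be too much for tshark.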

This solves your problem because

  • dumpcap does not use a temp file
  • dumpcap and tshark do not run very long, hence no resource problem.
  • tshark needs to handle only the traffic of 500 seconds, which should be possible. If not, just reduce the amount of seconds that dumpcap writes data.

Caveats

  • Instead of dumpcap, tshark now uses a temp file. However, if you ensure that there is enough space for 500 seconds of data (or less), this should be no problem. Try to limit the amount of data as much as possible (see capture filter above: host x.x.x.x and port xyz).
  • There is a short time gap where you may miss some packets. It's the moment where tshark and dumpcap need to be restarted. However, that's a rather short time interval, and if I had to choose between a nonexistent solution and this one, guess what ;-)). Furthermore, you might be able to work around that little gap if you start two instances of tshark/dumpcap with an overlap of a few seconds (start the second instance shortly before the first one terminates), so you won't miss packets. However, you will then get duplicate camel.local data. You might be able to filter those duplicates in your backend, maybe by using the IP ID, as it will be reported identically by both tshark processes while they see the same frames.

tshark -ni /tmp/dumpcap -T fields -e ip.id -e camel.local | send_data
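
A duplicate filter for the overlapping-instances idea could look roughly like this. It is a sketch under assumptions: it expects the two tab-separated columns that the command above prints (tab is the default separator of -T fields), and the names RecentKeys and unique_records are made up for illustration. Because the IP ID is only 16 bits wide and wraps around, the sketch keys on the (ip.id, camel.local) pair and remembers only a bounded number of recent pairs.

```python
from collections import OrderedDict


class RecentKeys:
    """Remember the last `capacity` keys so that duplicates can be dropped."""

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self._seen = OrderedDict()

    def is_new(self, key):
        """Return True the first time a key is seen within the window."""
        if key in self._seen:
            self._seen.move_to_end(key)
            return False
        self._seen[key] = True
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest key
        return True


def unique_records(lines, recent):
    """Yield (ip_id, camel_local) pairs that were not seen recently."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2 or not parts[1]:
            continue  # malformed line or packet without a camel.local value
        key = (parts[0], parts[1])
        if recent.is_new(key):
            yield key
```

A sender that receives the merged output of both tshark instances could iterate over unique_records(sys.stdin, RecentKeys()) and forward only the yielded values.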

This sounds too good to be true. Does it really work?

Well, I did not write a management wrapper. I tested it manually and yes it works on Ubuntu 12.04 with tshark 1.10.

Will it work on Windows?

I have no idea. Probably yes as there are named pipes on Windows as well. There is no native mkfifo command, but there are other ways to create a named pipe on Windows (google or your local Linux/Windows hero will tell you).

Have fun!

Regards
Kurt

answered 26 Sep '13, 11:13

Kurt Knochner ♦

edited 26 Sep '13, 11:35

You could still use dumpcap for the actual capture process, and then use a batch script to run tshark/wireshark on the completed files to do the analysis you want, either remembering which files were already processed or deleting the ones that are complete.
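
A batch step along those lines might look like the sketch below. It assumes that dumpcap writes its files into one directory, that the newest file is the one still being written, and that camel.local is the field of interest; the function names are illustrative, and deleting each file after processing is just one of the two bookkeeping options mentioned above.

```python
import glob
import os
import subprocess


def completed_files(directory, pattern="*.pcap"):
    """Return capture files oldest first, excluding the newest one,
    which dumpcap is assumed to be still writing."""
    paths = glob.glob(os.path.join(directory, pattern))
    paths.sort(key=os.path.getmtime)
    return paths[:-1]


def process_and_delete(path, field="camel.local"):
    """Dissect one finished file with tshark, then remove the file."""
    output = subprocess.check_output(
        ["tshark", "-n", "-r", path, "-T", "fields", "-e", field])
    os.remove(path)
    return output.decode().splitlines()
```

Since every file is handled by a fresh, short-lived tshark process, the memory used for dissection is released each time that process exits.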

answered 22 Sep '13, 22:42

Jasper ♦♦

Actually it is quite complicated, and it could affect our performance. I have thought about that before: using tshark to extract the data and then a Java program to read a text file. But it caused some problems, because our system works in real time with high performance. Anyway, thanks for your suggestion; I should try it again to see if it works well. By the way, do you think code modification is a good way to eliminate the problem completely?

(23 Sep '13, 18:19) hoangsonk49

While I do not know exactly how much work it will actually be, my guess is that code modification is a huge task that will most likely require a lot of work and coordination with the other developers. So I'm not sure if it is possible in a short time frame to solve your current problem with that.

(23 Sep '13, 22:34) Jasper ♦♦

@Jasper: I have some questions: if we use dumpcap, can it solve the memory problem? Dumpcap's memory usage does not grow, does it? Dumpcap just saves the information into a pcap file and does not store anything else, so it can run on a live network, is that right?

(26 Sep '13, 03:37) hoangsonk49

Yes, correct. I did a test once and wrote 34GByte into a single file without any memory problems.

(26 Sep '13, 03:39) Jasper ♦♦

What version are you using? If you are rotating files then the memory usage problem "should" not be a problem: each time Wireshark goes to a new file the memory used from the last one should be freed. If you're not using the latest release (1.10.2) you might want to upgrade.

If you're already on 1.10.2 you might want to try a buildbot build--a fair amount of work has gone into fixing memory leaks (although I thought most of them were not run time leaks).

answered 23 Sep '13, 06:59

JeffMorriss ♦

Are you sure memory is released at file rotation? I thought we retained state, to be able to reassemble packets that end up in two different files, for instance.

(23 Sep '13, 08:44) Anders ♦

Oh boy, maybe I'm completely losing my mind... I was pretty sure we did NOT do that (and that we released memory each time we closed a file). I don't know for sure...

(23 Sep '13, 12:27) JeffMorriss ♦

As far as I know, file rotation only avoids the disk space problem and cannot solve the RAM problem; please correct me if I'm wrong. By the way, I'm using the latest version of Wireshark.

(23 Sep '13, 18:24) hoangsonk49

Is it possible to add all information to the proto tree with a NULL value, so that nothing is stored in the proto tree? I don't need to display anything, so do you think the memory problem could be solved this way?

(25 Sep '13, 18:55) hoangsonk49

Nope. Dissectors also save lots of stuff in conversation tables and defragmentation tables.

It's a major undertaking to resolve this, hopefully the memory allocator changes currently underway might help.

If it was easy to do it would have been done already.

(26 Sep '13, 00:32) grahamb ♦

You broke my heart, grahamb, but thanks for your information :-). Hope that the memory allocator can help

(26 Sep '13, 00:52) hoangsonk49

Hi all, about 9 months after the last comment, I have some experience to share. After nearly 6 months of running, our service is still working without any memory problem. Some related information:

  • Command: nohup tshark -i 5 -P -w /tmp/Log.pcap -b filesize:655350|split -b 655350000 -a 10 - /tmp/log/call_log- &
  • I changed the code to make sure that once a Log_xxx.pcap has been processed, it is renamed to Log_xxx.pcap.bak and call_log becomes call_log_xxx.bak (so that a cron job can remove them).
  • Each Log.pcap reaches the size limit (655350, which tshark's -b filesize option interprets as kB) in about 12 minutes. The service is running in real time.
  • Memory of server: 8 GB
  • OS: CentOS 5.8
  • I use top to check memory every day: it is never greater than 15%.
  • A cronjob to remove *.bak every 5 days.

Before running this service, I was afraid of the memory problem, but thank God, we are still alive. The service has not been stopped in 6 months and is still working. In theory it should run into the problem, but so far I have been lucky. Maybe it will get into trouble in the future, but 6 months of flawless running is a good result. So sometimes practice is quite different from theory. I'm sharing this information to encourage anyone who is afraid of the memory problem (just like I was before :-)). Just try, all theory is grey, but, the glad golden tree of life is green :-)

answered 19 Jun '14, 02:51

hoangsonk49

So, you are saying that tshark is running for 6 months, without restarting it a single time and you are not running into the out of memory problem?

Just try, all theory is grey, but, the glad golden tree of life is green :-)

Why 'theory'? Wasn't it you who reported a out of memory problem? ;-))

(19 Jun '14, 10:10) Kurt Knochner ♦

So, you are saying that tshark is running for 6 months, without restarting it a single time and you are not running into the out of memory problem?

Yes, I have not met any trouble of memory. It is running without any crash.

Why 'theory'? Wasn't it you who reported a out of memory problem? ;-))

I'm not the person who reported a memory problem, but before working with tshark I searched the Internet and saw many warnings about memory problems. It is even reported as a known bug. That means if I run tshark for a long time on a lot of data, sooner or later I would hit the memory problem. I don't have much experience with Wireshark, so it is "theory" to me :-)

(19 Jun '14, 21:26) hoangsonk49

It even has a report as a known bug. It means if I run tshark for a long time and big data, soon or late,

Yes, and it will/should happen, so the question is why it does not happen in your environment.

What do you see on the line where tshark is listening (I guess eth0)?

Is that pre-filtered traffic, or the whole traffic on that link, including IP, UDP, TCP, HTTP, SMTP, etc. (actually all protocols)? If it's the latter, you should run into a memory problem sooner or later, as almost every dissector creates at least an entry in the conversation hash tables. Some dissectors also add data to a conversation (e.g. HTTP). So, the hash table will keep growing as long as tshark is running. There should be other data structures as well in certain dissectors, which would increase the memory usage.

At least that's my understanding of the dissection engine. So, if you don't see any increase in memory usage after running tshark for 6 months, my understanding of the dissection engine might be wrong. I've started a new question about this 'issue'.

http://ask.wireshark.org/questions/34035/tshark-memory-usage

You are welcome to add your experience, especially about the following parts:

  • do you see an increased memory usage of the tshark process?
  • can you confirm that it's the same tshark process running for 6 months (same PID)?
  • do you pre-filter the traffic on eth0 (switch port/TAP filtering)?
(22 Jun '14, 08:27) Kurt Knochner ♦
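
One way to collect evidence for the first two questions would be to log the resident memory of the tshark process together with its PID at regular intervals, for example from a cron job. The sketch below is an assumption-laden illustration for Linux only: it reads the VmRSS line from /proc/<pid>/status, and the function names and log format are made up.

```python
import time


def parse_vmrss_kb(status_text):
    """Extract the resident set size in kB from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # no VmRSS line found


def read_vmrss_kb(pid):
    """Read the current resident set size of a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as status_file:
        return parse_vmrss_kb(status_file.read())


def log_line(pid, rss_kb, now=None):
    """Format one log entry; `now` defaults to the current time."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return "%s pid=%d rss_kb=%s" % (stamp, pid, rss_kb)
```

If the logged PID stays the same for months while rss_kb stays flat, that would support the observation reported above. A changing PID would mean tshark had been restarted in between.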