This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

What does processing speed of Dissectors depend on?


Hi guys, I'm comparing the rate at which packets arrive with the rate at which the dissectors process them, in order to see how they differ. I run tshark on Windows and browse YouTube to generate network traffic:

tshark -i 1 -P -w D:/sonnh.pcap -b filesize:1000 -b files:4

I changed the code to print out the number of incoming packets (which are written to the .pcap file by dumpcap) and the number of outgoing packets (which are dissected together by the dissector). Note that dumpcap doesn't write packets one by one: it captures a group of packets (e.g. 10 packets at a time) and writes the group to the .pcap file, and the dissector then also takes a group of packets from the file to dissect. As I saw in the log, the number of packets in the dumpcap group and in the dissector group are different:

    Line    Number of Incoming - Dissected packets
     1.     17:16
     2.     11:12
     3.     7:7
     4.     19:17
     5.     17:14
     6.     230:235
     7.     2012:637
     8.     89:839
     9.     2444:37
    10.     1:500
    11.     92:55
    12.     21:16
    13.     0:18

In theory, if tshark works well, the total number of incoming packets and the total number of dissected packets should be identical (100% dissected, 0% dropped). The number of incoming packets varies from line to line because the network rate varies; that is OK, nothing wrong. But I wonder why the number of dissected packets is so different from the number of incoming packets. As can be seen in line 1:

    Line    Number of Incoming - Dissected packets
     1.     17:16

One packet is left waiting for the dissector, and it is dissected in line 2:

    Line    Number of Incoming - Dissected packets
     2.     11:12

So, in total, the sums are identical. But in line 9:

    Line    Number of Incoming - Dissected packets
     9.     2444:37

Only 37 packets are dissected versus 2444 incoming. That is not the real capacity of the dissector, because in line 8:

    Line    Number of Incoming - Dissected packets
     8.     89:839

The dissector handles 839 packets versus 89 incoming. So the dissector is able to handle a large number of packets, but why does it dissect only 37 packets in line 9? From that point, I have some questions:

  1. Why does the number of dissected packets vary every time?
  2. What does the number of dissected packets depend on? (What causes the difference in the number of dissected packets?)
  3. Why doesn't it take all incoming packets?
  4. Does this mean the processing speed of Dissectors is less than Dumpcap (I don't think so because we don't have enough evidence)? If yes, is there any way to bring the number of dissected packets closer to the number of incoming packets? Sorry for asking in such code-level detail, but I hope someone on this forum who works as a developer can help me with their experience.

If you are an expert or just have any idea, suggestion, or experience with this, please help me answer. Thank you so much.
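
The running-total reasoning above (line 1 leaves one packet pending, line 2 clears it) can be applied to the whole log. The following is a minimal sketch in C, not part of the original measurement code: the struct, the array, and the helper `backlog_after` are hypothetical names, and the data is simply the 13 pairs from the log in this question.

```c
#include <assert.h>
#include <stddef.h>

/* One log line: packets written by dumpcap vs. packets dissected by tshark. */
struct log_pair {
    long incoming;
    long dissected;
};

/* The 13 incoming:dissected pairs copied from the log in the question. */
const struct log_pair question_log[] = {
    {17, 16}, {11, 12}, {7, 7}, {19, 17}, {17, 14}, {230, 235},
    {2012, 637}, {89, 839}, {2444, 37}, {1, 500}, {92, 55}, {21, 16}, {0, 18}
};
const size_t question_log_len = sizeof question_log / sizeof question_log[0];

/*
 * Hypothetical helper: number of packets written but not yet dissected
 * after the first `upto` log lines, i.e. cumulative incoming minus
 * cumulative dissected.
 */
long backlog_after(const struct log_pair *log, size_t len, size_t upto)
{
    long backlog = 0;

    assert(log != NULL && upto <= len);
    for (size_t i = 0; i < upto; i++)
        backlog += log[i].incoming - log[i].dissected;
    return backlog;
}
```

With the pairs above, the backlog is 1 after line 1, 0 after line 2, 0 again after line 6, 3032 after line 9, and 2557 at the end of the excerpt. Tracking the backlog rather than the per-line counts makes it easier to see whether tshark eventually catches up.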

asked 14 Oct '13, 19:20


hoangsonk49

edited 15 Oct '13, 02:54


2 Answers:


I'm not sure your experiment is going to be useful, because you're probably getting thrown off by side effects that can have a significant impact on your measurements. Keep in mind that, while dumpcap is writing to file, tshark is reading that file at the same time. Meaning: you have file I/O from two processes - one is writing, one is reading, so there is going to be some serialization of who accesses the file when. Also, as you noticed, dumpcap will often buffer frames in memory and write them to disk in a bunch, which means that tshark will not be able to read them as soon as they really arrived.

All in all, dissecting packets may be slower than writing them to disk, but I doubt it. File I/O is usually a lot slower than in-memory processing, so I guess your test is basically a measurement of how fast your capture file is written and read again, biased by the serialization of access to the file.

Now, I'll sit and wait for Guy's answer - this is probably more in his area of expertise :-)

answered 14 Oct '13, 22:24


Jasper ♦♦

edited 14 Oct '13, 22:25

Hi Jasper, I agree with you about

while dumpcap is writing to file, tshark is reading that file at the same time

We can consider them to be running at the same time, but they actually follow an order: write-read-write-read... At least in my experiment, I'm able to see this in the log.

dumpcap will often buffer frames in memory and write them to disk in a bunch, which means that tshark will not be able to read them as soon as they really arrived

In question 4, I also say that I don't have enough evidence to say which is faster. But that doesn't matter, because my objective is not measurement. My concern is why the dissector takes a different number of packets each time. For example, in line 9 there are 2444 incoming packets, but the dissector takes only about 37 packets to dissect, even though it is able to do more, as it dissects 839 packets in line 8. In lines 1 and 2 the numbers of incoming and outgoing packets are similar, but sometimes they are very different. It would make sense to me if dumpcap wrote x packets into the pcap file and the dissector then took y packets to dissect, with y = x or as close to x as possible. I'm trying to find out what determines y and how to make it close to x. And one more question: how do I start debug mode to print out g_log (for both tshark and dumpcap)? Thanks

(14 Oct '13, 23:54) hoangsonk49

Could the reason be related to the pipe? As far as I can tell, the number of outgoing packets is read from a message on the pipe, whose header has a 1-byte indicator and 3 bytes for the message length:

/* convert header values */
pipe_convert_header((guchar*)header,4,indicator,&required);

--> This gives "required": the number of bytes to read

/* read the actual block data */
newly = pipe_read_bytes(pipe_fd, msg, required, err_msg);

--> This gives "msg": the value read from those "required" bytes. It is also the number of packets which are going to be dissected

(15 Oct '13, 03:00) hoangsonk49
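
The header layout described in the comment above (1 byte of indicator, 3 bytes of message length) can be sketched as follows. This is a hypothetical stand-in, not the actual `pipe_convert_header()` from the Wireshark sources: the name `decode_pipe_header` is invented for illustration, and the big-endian byte order of the length field is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical decoder for a 4-byte message header:
 *   byte 0     : message indicator (type of message)
 *   bytes 1..3 : length of the message body in bytes,
 *                assumed here to be a 24-bit big-endian integer
 */
void decode_pipe_header(const unsigned char *header, size_t header_len,
                        char *indicator, int *body_len)
{
    assert(header != NULL && header_len == 4);
    assert(indicator != NULL && body_len != NULL);

    *indicator = (char)header[0];
    *body_len = ((int)header[1] << 16) | ((int)header[2] << 8) | (int)header[3];
}
```

In this layout the length field only says how many bytes of message body follow. If the message carries a packet count, that count would be inside the body, so the header alone does not determine how many packets are handed over.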

Keep in mind that dumpcap only alerts tshark to there being additional packets every once in a while (every 500 msec: see the DUMPCAP_UPD_TIME macro in dumpcap.c). So you may have times when tshark does less work than dumpcap simply because tshark is waiting for packets. Theoretically tshark should catch up after the next time tick.

(15 Oct '13, 10:30) JeffMorriss ♦
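
The effect of a periodic update can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Wireshark code: the function `batches_per_tick` and its parameters are hypothetical, and `tick_ms` merely plays the role that DUMPCAP_UPD_TIME plays in the comment above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model: the capture side counts arriving packets continuously but
 * only reports them to the reader once every `tick_ms` milliseconds.
 * arrivals[i] is the number of packets that arrived during millisecond i.
 * The size of each reported batch is stored in batches[].
 * Returns the number of batches written.
 */
size_t batches_per_tick(const int *arrivals, size_t n_ms, size_t tick_ms,
                        int *batches, size_t max_batches)
{
    size_t n_batches = 0;
    int pending = 0;

    assert(arrivals != NULL && batches != NULL && tick_ms > 0);
    for (size_t ms = 0; ms < n_ms; ms++) {
        pending += arrivals[ms];
        /* Tick boundary: report everything accumulated since the last tick. */
        if ((ms + 1) % tick_ms == 0 && n_batches < max_batches) {
            batches[n_batches++] = pending;
            pending = 0;
        }
    }
    return n_batches;
}
```

In this model the batch sizes seen by the reader follow the burstiness of the traffic and the tick length, regardless of how fast the reader could process packets. A shorter tick yields more, smaller batches for the same arrivals.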

Thanks for your comment. As I understand it, DUMPCAP_UPD_TIME is there to avoid overloading slow displays. During this interval, dumpcap does nothing and waits for tshark and the display, right? Now, if I don't care about what is printed on the display, should I increase DUMPCAP_UPD_TIME (for example to 750 ms or 1 s) so that dumpcap leaves a "longer delay" for tshark to do its work? Is there any problem with that? Thanks.

(15 Oct '13, 18:55) hoangsonk49

I have just tried DUMPCAP_UPD_TIME = 1000 ms and DUMPCAP_UPD_TIME = 100 ms. With DUMPCAP_UPD_TIME = 1000, the number of dissected packets increases a lot, but the number of packets arriving at dumpcap each time increases even more rapidly (I use a speed limiter set to 2000 kB/s to make sure the network rate stays stable at about 2 MB/s). The same happens with DUMPCAP_UPD_TIME = 100. So overall, we still have the problem.

So you may have times when tshark does less work than dumpcap simply because tshark is waiting for packets

In line 9, 2444 packets had already been written by dumpcap and THEN only 37 packets were dissected by tshark (even though 2444 packets were ready for dissection and tshark did not need to wait for anything), while the dissector can dissect 839 packets, as it did in line 8.

(16 Oct '13, 00:51) hoangsonk49

During this interval, dumpcap does nothing and waits for tshark and the display, right? Now, if I don't care about what is printed on the display, should I increase DUMPCAP_UPD_TIME (for example to 750 ms or 1 s) so that dumpcap leaves a "longer delay" for tshark to do its work?

Actually I think you're going for the opposite: you want tshark to dissect packets as soon as they're available, right? So there is less variability between how many packets dumpcap processes and how many tshark processes. To do that, decrease DUMPCAP_UPD_TIME to a small number.

(16 Oct '13, 06:23) JeffMorriss ♦

I have tried 2 cases, DUMPCAP_UPD_TIME = 1000 ms and DUMPCAP_UPD_TIME = 100 ms; let's see the difference. With DUMPCAP_UPD_TIME = 1000 ms, the number of incoming packets is 8000 ~ 19000 versus 1000 ~ 2000 outgoing packets. With DUMPCAP_UPD_TIME = 100 ms, the number of incoming packets is 2000 ~ 8000 versus 300 ~ 500 outgoing packets. So there is less variability between how many packets dumpcap processes and how many tshark processes, but if we focus on the ratio of outgoing to incoming packets, the performance may actually decrease: when we stop tshark (by using -a duration:120), only dumpcap stops while tshark is still running, and the delay is still long.

(17 Oct '13, 19:36) hoangsonk49


Does this mean the processing speed of Dissectors is less than Dumpcap (i don't think so because we don't have enough evidence)?

Well, just by applying logic, I would answer: Yes

Reason: Both dumpcap and tshark have the same amount of I/O work (if we ignore file system caches for the moment). So, that amount of time is the same (dumpcap writing data, tshark reading the same amount of data). Then tshark has a lot of additional work due to the dissection of frames. So, yes dissecting packets with tshark will always be slower than just writing the packets to disk with dumpcap. This is due to the way dumpcap 'delivers' packets to tshark: through a file that both of them use.

Thus, I believe the specific problem you have found is a structural problem, and I don't know if there is a general or an easy solution for it. There will always be situations where dumpcap is ahead of tshark, due to the time tshark needs to dissect the frames. It gets worse the longer tshark runs, due to the larger lists and hash tables tshark needs to fill, search and possibly reorganize (hash table collisions - although I did not check where exactly that might happen!!). File system caching might help, as the I/O work of tshark (reading what dumpcap just wrote) has a much lower impact than for dumpcap (writing new data), but still....

I changed the code to print out the number of incoming packets (which are written to the .pcap file by dumpcap) and the number of outgoing packets (which are dissected together by the dissector).

Can you please post the code change, so we can check if the changes are appropriate to measure what you are trying to measure ;-))

Regards
Kurt

answered 15 Oct '13, 05:58


Kurt Knochner ♦

edited 15 Oct '13, 07:12

So, yes dissecting packets with tshark will always be slower than just writing the packets to disk with dumpcap

OK, I agree with you that this is the logical answer. If that were the cause, I would expect the delay between incoming and outgoing packets to keep increasing, but in my experiment there is no delay, and the problem only happens at high incoming rates. That's why I have to print out and compare the number of incoming and outgoing packets. And after seeing the log, maybe the important thing is not processing speed but how the dissector takes packets to dissect. As in the block quote:

 8.     89:839
 9.     2444:37

Why does the dissector take only 37 packets in line 9 when it is able to do more, as it has just done in line 8? Now it is not a problem of speed; it is a problem of how the number of packets to be dissected is decided. If it always behaved as in line 8, the problem could be solved.

Can you please post the code change, so we can check if the changes are appropriate to measure what you are trying to measure ;-))

Of course, here you are. In dumpcap.c, function capture_loop_start(), I print out:

print_log_1(global_ld.packet_count);

In tshark.c, function capture_input_new_packets(), I print out:

print_log_2(to_read,packet_count);

Please correct me if I did something wrong. Thanks.

(15 Oct '13, 18:40) hoangsonk49

and what's the code of print_log_1() and print_log_2()?

(16 Oct '13, 07:12) Kurt Knochner ♦

Here is the print code. Both print functions open the same file and append new values.

tshark.c, function capture_input_new_packets()

fprintf(sOut,"\nTo_read: %d/%u",to_read,packet_count);

dumpcap.c, function capture_loop_start()

fprintf(sOut,"\nPacket_count: %u",packet_count);

packet_count in tshark.c is the ID of the latest packet since tshark's last processing pass, while packet_count in dumpcap.c is the ID of the latest packet written to the pcap file by dumpcap.

(16 Oct '13, 18:35) hoangsonk49

Both print functions open the same file and append new values.

O.K. I'm sorry to be annoying but you did not post the whole code.

If you reopen a file and write things to it, couldn't there be a bug in that code that leads to a mismatch of tshark and dumpcap packet count?

Without the full code of print_log_1() and print_log_2(), it is impossible to check (or even understand) your results.

(17 Oct '13, 01:47) Kurt Knochner ♦

Ok, sorry, here is the code:

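/* Called from dumpcap.c: append the current packet count to the log. */
void print_log_1(gint packet_count)
{
    FILE *sOut;
    sOut = fopen("D:/asonnhx.txt","ab");
    if (sOut == NULL)
        return;    /* log file could not be opened; skip this entry */
    fprintf(sOut,"\nPacket_count: %u",packet_count);
    fclose(sOut);
}

/* Called from tshark.c: append the number of packets to read and the count. */
void print_log_2(int to_read, guint16 t)
{
    FILE *sOut;
    sOut = fopen("D:/asonnhx.txt","ab");
    if (sOut == NULL)
        return;    /* log file could not be opened; skip this entry */
    fprintf(sOut,"\nTo_read: %d/%u",to_read,t);
    fclose(sOut);
}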

Thanks.

(17 Oct '13, 02:14) hoangsonk49

Hm.. the same file name? Isn't there a race condition if both (tshark and dumpcap) try to open and write to the file at the same time?

(17 Oct '13, 02:23) Kurt Knochner ♦

No, there isn't. I use the same file name just to see the order of the output. Sometimes the order is not what I expect, but most of the output comes in the expected order: incoming-outgoing-incoming-outgoing...

(17 Oct '13, 03:29) hoangsonk49
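
One way to reduce the risk raised above is to build each log record completely in memory, write it with a single call, and tag it with its source and a timestamp so that records from two separate files can be merged later. The following is a minimal sketch, not the code used in this thread: `format_log_record` and `append_log_record` are hypothetical names, and `time_ms` is a caller-supplied timestamp whose source is platform-specific and not shown. Whether concurrent appends from two processes can still interleave depends on the operating system and C runtime, so this sketch makes no guarantee about that.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build one complete, newline-terminated record.
 * `source` names the writer (e.g. "dumpcap" or "tshark").
 * Returns the record length, or -1 if it does not fit in buf.
 */
int format_log_record(char *buf, size_t buf_size, const char *source,
                      unsigned long time_ms, unsigned int packet_count)
{
    int n;

    assert(buf != NULL && source != NULL && buf_size > 0);
    n = snprintf(buf, buf_size, "%s t=%lu packets=%u\n",
                 source, time_ms, packet_count);
    return (n < 0 || (size_t)n >= buf_size) ? -1 : n;
}

/*
 * Hypothetical helper: append the record with a single fwrite() and
 * check every step. Returns 0 on success, -1 on failure.
 */
int append_log_record(const char *path, const char *record)
{
    FILE *out;
    size_t len;
    int ok;

    assert(path != NULL && record != NULL);
    out = fopen(path, "ab");
    if (out == NULL)
        return -1;
    len = strlen(record);
    ok = (fwrite(record, 1, len, out) == len);
    if (fclose(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Giving each process its own log path and merging the two files by timestamp afterwards avoids sharing a file between dumpcap and tshark at all.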