I am attempting to capture approximately 20 Mbit/s of RADIUS traffic continuously with tshark. If I capture packets with tshark on CentOS 6.5, between 4% and 66% of packets are dropped. If I do the same thing on CentOS 7, it never reports any dropped packets. I've actually tried to get it to drop packets by doing crazy stuff like outputting large amounts of traffic to XML. As far as I can tell it is not dropping packets. My question is: does CentOS 7 have some sort of feature that makes dropping packets impossible? Or is it dropping packets and not telling me? As an example, I execute commands like this:
For the first command, CentOS 6 reports 4% dropped packets and CentOS 7 reports none. For the second command, CentOS 6 reports 66% dropped packets but CentOS 7 reports none. Note that both machines are running tshark 1.12.7 compiled from source. Linux versions for CentOS 6 and 7:
Libpcap versions for CentOS 6 and 7:
Hardware:
Both capture on a VMXNET3 10G optical connection. Same hard disk.
asked 27 Aug '15, 20:29 MikeKulls edited 27 Aug '15, 22:09
One Answer:
No, but it has two features that make dropping packets far less likely:
Libpcap uses
Linux 2.4, I think, introduced the "turbopacket" mechanism (that's what the "T" in "TPACKET" stands for - "turbo"), which provides a memory-mapped buffer shared by the kernel and userland. With that, fewer copies are needed when delivering packets, and the packet-reading loop in userland can process multiple packets per wakeup (to wait for packets to arrive, userland makes a
In some 3.x version (3.6?),
answered 28 Aug '15, 16:23 Guy Harris ♦♦
Thanks, very knowledgeable answer. If you are on serverfault and can be bothered cutting and pasting your answer to the same question there, I can mark it as the correct answer there too. http://serverfault.com/questions/717161/tshark-packet-capture-performance-centos-6-v-centos-7 (29 Aug '15, 01:35) MikeKulls
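The shared-buffer idea in the answer above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyRing`, `produce`, and `drain` are hypothetical names, the "kernel" and "userland" sides are plain Python method calls, and nothing here is libpcap or kernel code. It shows two things the answer describes: one consumer wakeup can hand back several packets, and packets are only lost when the ring has no free slot.

```python
# Toy model of a fixed-size packet ring shared by a producer ("kernel")
# and a consumer ("userland"). All names are hypothetical.

KERNEL = 0  # slot is free and may be filled by the producer
USER = 1    # slot holds a packet waiting for the consumer


class ToyRing:
    def __init__(self, slots):
        self.status = [KERNEL] * slots
        self.data = [None] * slots
        self.head = 0      # next slot the producer will fill
        self.tail = 0      # next slot the consumer will read
        self.dropped = 0   # packets lost because the ring was full

    def produce(self, packet):
        """Store one packet, or count a drop if no slot is free."""
        if self.status[self.head] != KERNEL:
            self.dropped += 1
            return False
        self.data[self.head] = packet
        self.status[self.head] = USER
        self.head = (self.head + 1) % len(self.status)
        return True

    def drain(self):
        """One consumer wakeup: return every packet that is ready."""
        batch = []
        while self.status[self.tail] == USER:
            batch.append(self.data[self.tail])
            self.status[self.tail] = KERNEL  # give the slot back
            self.tail = (self.tail + 1) % len(self.status)
        return batch


ring = ToyRing(slots=4)
accepted = [ring.produce(n) for n in range(6)]  # burst larger than the ring
print(accepted, ring.dropped)
print(ring.drain())  # a single wakeup returns the whole batch
```

In this model a larger ring, or a consumer that drains more often, lowers the drop count for the same burst, which is the general reason a bigger shared buffer makes drops less likely.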
Being a libpcap core developer helps here. :-)
Done. (29 Aug '15, 11:20) Guy Harris ♦♦
What Linux kernel and libpcap versions do the two systems have? Is the hardware similar (CPU, memory size and speed, disk speed, etc.)?
I've updated the question with the details. Both of these are virtual machines running on the same largish physical server. One has twice as much RAM and twice as many CPUs assigned. I could assign more and test again. I'm not sure it will make much difference.
I just tried with 4 CPUs and 8 GB of RAM and got the same result. I am thinking it must be a feature of CentOS/Red Hat 7, as it is absolutely rock solid on 7. I cannot make it drop packets no matter how hard I try.
There are some improvements in libpcap and the kernels between the versions, but I don't know if they would make such a big difference. I manage 500 mb/s on a fresh Ubuntu system. Perhaps it's related to the virtualization support between the versions. I think system calls may be expensive.
Have you taken a look at the captures to see if there really are no packets missing in RH7? It may be that the kernel simply isn't reporting the drops to libpcap anymore...
Jeff, I am not 100% sure but I did get some interesting results. I got it to capture 100,000 packets to XML. It took over 60 seconds to complete the capture, but the time difference in the file from the first to the last packet was 10 seconds. This had me a bit stumped. It's like it buffered up the 100,000 packets somewhere. There are only two options for that: memory or disk. Maybe it writes a binary file first and then converts it? I guess that would make sense.
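The comparison above (wall-clock time for the run versus the time span covered by the packets) can also be made directly on a capture file. The following is a minimal sketch, assuming the file is in classic pcap format rather than pcapng; tshark of this era writes pcapng by default, so the file would need to be written with something like `-F pcap`. The function name `pcap_summary` is hypothetical.

```python
import struct

# Magic bytes as they appear on disk -> (byte order, timestamp fraction divisor)
_MAGICS = {
    b"\xd4\xc3\xb2\xa1": ("<", 1e6),  # little-endian, microseconds
    b"\xa1\xb2\xc3\xd4": (">", 1e6),  # big-endian, microseconds
    b"\x4d\x3c\xb2\xa1": ("<", 1e9),  # little-endian, nanoseconds
    b"\xa1\xb2\x3c\x4d": (">", 1e9),  # big-endian, nanoseconds
}


def pcap_summary(data):
    """Return (packet_count, first_ts, last_ts) for classic pcap bytes."""
    if data[:4] not in _MAGICS:
        raise ValueError("not a classic pcap file (pcapng is not handled)")
    endian, divisor = _MAGICS[data[:4]]
    offset = 24  # skip the 24-byte global header
    count, first_ts, last_ts = 0, None, None
    while offset + 16 <= len(data):
        # Per-packet header: seconds, fraction, captured length, original length
        sec, frac, incl_len, _orig_len = struct.unpack_from(endian + "IIII", data, offset)
        if offset + 16 + incl_len > len(data):
            break  # truncated final record, do not count it
        ts = sec + frac / divisor
        if first_ts is None:
            first_ts = ts
        last_ts = ts
        count += 1
        offset += 16 + incl_len
    return count, first_ts, last_ts


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:
        n, first, last = pcap_summary(fh.read())
    span = (last - first) if n else 0.0
    print(f"{n} packets spanning {span:.3f} s")
```

A packet count and time span do not prove that nothing was dropped, but a span much shorter than the run time is consistent with the packets having been buffered somewhere before being converted.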
I should add that you've hit the nail on the head there. Is it not dropping packets, or not reporting dropped packets? That has given me the solution, I believe. I just need to ramp up the traffic until it does report drops. Then I'll know at what point it starts dropping and whether or not it is keeping up with 20 Mbit/s.
If you do something such as
tshark -i ens224 -c 100000 -w /tmp/delme.pcap
TShark runs dumpcap to do the capture and write the file, and pretty much all it does itself is report the arrival of packets.
If you do something such as
tshark -i ens224 -c 100000 -T pdml > /tmp/delme.xml
TShark runs dumpcap to do the capture and write to a temporary file, and it reads the capture file as packets arrive and writes out PDML to the file (which you should probably not call ".pcap", as it's not a pcap or pcapng file).
Yep, that explains what I am seeing. We have a pretty interesting use case for tshark here: capturing all RADIUS messages from 4 million or more devices and storing it in big data. Converting the pcap files to text and processing them takes longer than 60 seconds per 60 seconds of capture, so we capture in 60-second chunks and use multiple cores to convert the files to text and extract the data we need.
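The chunk-and-convert workflow described in the last comment could be sketched roughly as follows. This is not the poster's actual tooling: the function names are hypothetical, and the extracted fields (`frame.time_epoch`, `radius.code`) are illustrative choices only. It assumes `tshark` is on the PATH and that the 60-second capture files already exist. Threads are enough to keep several cores busy here because each conversion runs in its own `tshark` process.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(pcap_path, fields=("frame.time_epoch", "radius.code")):
    """Build a tshark command that reads one capture file and prints fields."""
    cmd = ["tshark", "-r", str(pcap_path), "-T", "fields"]
    for field in fields:
        cmd += ["-e", field]
    return cmd


def convert_one(pcap_path):
    """Convert one capture file to text next to it; return the output path."""
    out_path = Path(pcap_path).with_suffix(".txt")
    with open(out_path, "w") as out:
        subprocess.run(build_command(pcap_path), stdout=out, check=True)
    return out_path


def convert_all(pcap_paths, worker=convert_one, max_workers=4):
    """Run the worker over all files, several at a time, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, pcap_paths))


if __name__ == "__main__":
    import sys
    for result in convert_all(sys.argv[1:]):
        print(result)
```

If each conversion takes longer than the 60 seconds of traffic it covers, `max_workers` has to be large enough that the backlog of unconverted files does not grow over time.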