This started while trying to figure out why loading a web page from Apache over HTTP was so very slow when going through a pinhole from an untrusted network to a trusted one. I thought it was Smoothwall related (see my post at http://community.smoothwall.org/forum/viewtopic.php?p=297864#p297864), but I am no longer so sure, and that drove me to Wireshark and tcpdump.

What I am seeing led to my main question: why does Wireshark show 3 packets while tcpdump shows only 1 big packet? Is something combining them? Why do the IDs run in order in both Wireshark and tcpdump (39771, 39772, 39773), and then Wireshark goes to 700x while tcpdump goes to 39774? Ugh, I am out of my league here.

Wireshark on the wire shows 3 packets (IDs 7006, 7007, 7008), each with a Total Length of 1500. On the router running tcpdump, however, I see ONE packet, ID 39774, with a size of 4420, which causes the router to generate an ICMP Destination Unreachable (Fragmentation needed).
If Debian is the source machine, the router generates a great many ICMP Fragmentation Needed packets. If Windows XP is the source machine, the transfer is very fast and there are no ICMPs. vmware-tools and the vmxnet3 driver are installed on everything (Debian, Windows, router).

Edit, 9-May-2011: While the issue was not related to Wireshark or tcpdump, it was the strange packets I was seeing that sent me here to ask “why”, which may have got me looking in the right place.

asked 05 May ‘11, 08:53 gregg edited 08 May ‘11, 09:17
3 Answers:
So, it looks like I found a fix: VMware_Networking_Speed_Issue

Cause: There is a bug in the VMware network adapters or their drivers related to "segmentation offload" (TSO and GSO). I have heard one mention that the checksum feature may also have problems.

Solution: Turning off TSO/GSO on the guest OSes is the fool-proof solution. You could try installing a newer OS or changing the adapter type (e.g. E1000, VMxnet2), but that seems to work only some of the time.

Linux solution (worked for me)
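As a rough sketch of what turning the offloads off can look like on a Linux guest (not necessarily the exact commands from the linked article): segmentation offload is commonly switched off with `ethtool -K`. The interface name `eth0` and the helper names below are assumptions for illustration, and actually applying the change needs root and the `ethtool` utility.

```python
import subprocess


def offload_off_commands(iface="eth0", features=("tso", "gso")):
    """Build one `ethtool -K <iface> <feature> off` command per feature."""
    return [["ethtool", "-K", iface, feature, "off"] for feature in features]


def disable_offloads(iface="eth0", dry_run=True):
    """Print each command; execute it only when dry_run is False (needs root)."""
    for cmd in offload_off_commands(iface):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # eth0 is an assumed interface name; dry_run=True only prints the commands.
    disable_offloads("eth0", dry_run=True)
```

Running `ethtool -k eth0` (lowercase `k`) before and after lists the current offload settings, which is a simple way to confirm the change took effect.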
NB: This Linux fix does not persist across reboots. The article describes how you might make it persistent.

Windows solution (did not try; I never had a problem with WinXP)
answered 08 May ‘11, 09:08 gregg edited 08 May ‘11, 09:11
When capturing on the same device that is sending the packets, you will often see outbound packets larger than the MTU because the capturing software, tcpdump in this case, sees the packet before it is packetized for the wire. You may also encounter packets with fewer than the minimum number of bytes required because the padding has not been added yet.

As for the IDs, are you perhaps looking at the IP header's identifier field and comparing it with Wireshark's relative sequence numbers? If you want to see the identifier, you can add a custom column to display it using ip.id as the display filter.

As for the fragmentation issue, what's the path MTU? Most likely it's lower than 1500 bytes, thus the ICMP destination unreachable message.

answered 05 May '11, 09:45 cmaynard ♦♦

Yes, the MTU is 1500, which is why the ICMP does make sense. The tcpdump ID I was going by: 10:52:04.077647 IP (tos 0x0, ttl 64, id >> 39774 << and in Wireshark: Internet Protocol -> Identification. The two matched until I hit the 3 vs 1 packet. Adding ip.id as you mentioned showed what I was seeing in a more convenient place. There are 3 virtual adapters across 3 virtual machines in play:
- 1 on the web server
- 1 on the Wireshark machine
- 1 on the router
Wireshark records 3 packets, each 1500, coming out of the web server, but the router only sees one big one coming in. (05 May '11, 10:15) gregg

Also, your comment about capturing on the same device that is sending the packets answers this, I think: [bad hdr length 16 - too short, < 20][|icmp] (which I was originally ignoring since it didn't seem relevant to my real question), but it's good to know! (05 May '11, 10:19) gregg

Where does the ICMP destination unreachable message originate from? Is it possible that you are attempting to send a big 4420-byte packet, and that, due to PMTUD as described in RFC 1191, the VM responds with the ICMP destination unreachable message and a next-hop MTU of 1500? Fragmentation would then naturally occur as a result, but at a lower layer, so tcpdump never captures the fragments. (05 May '11, 10:57) cmaynard ♦♦

The router is originating the ICMP message. Something is attempting to send a big packet, specifically the Debian systems. I started using netcat to just push data around for testing, and if a Debian system is the source, I get Fragmentation Needed ICMPs on nearly every packet. This does not happen when the source is Windows (XP in this case). If Debian is the source, the transfer takes a couple of minutes. If Windows is the source, the transfer takes 2-3 seconds. I'm still working through this, but running out of ideas. (07 May '11, 22:00) gregg

Thanks for helping think about the problem! (08 May '11, 09:20) gregg
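The router behaviour discussed in these comments can be sketched as the forwarding decision that RFC 1191 relies on. This is only an illustration: the function name and return strings are made up here, and it assumes the oversized packet arrives with the Don't Fragment bit set, which the captures quoted above do not show explicitly.

```python
def forward_decision(total_length, next_hop_mtu=1500, dont_fragment=True):
    """Sketch of an IPv4 router's choice for a packet of the given total length."""
    if total_length <= next_hop_mtu:
        return "forward"
    if dont_fragment:
        # ICMP type 3 (Destination Unreachable), code 4 (Fragmentation needed and DF set);
        # RFC 1191 has the router report the next-hop MTU in the message.
        return f"drop, send ICMP type 3 code 4 (next-hop MTU {next_hop_mtu})"
    return "fragment and forward"


for length in (1500, 4420):
    print(length, "->", forward_decision(length))
```

Under these assumptions a 1500-byte packet is simply forwarded, while a 4420-byte packet is dropped and answered with the Fragmentation needed ICMP described above.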
Let me get this straight: you have a client (192.168.5.199) connected to the Smoothwall (gate) through a virtual switch, vSwitch-X, and the Smoothwall (gate) is connected through a second interface to the server (10.0.0.16) via vSwitch-Y. When you try to open a web page on the client, on the web server you can see 3 outgoing packets in Wireshark, while tcpdump on the server interface of the Smoothwall shows you 1 incoming packet from the web server. Or are you running tcpdump on the client-facing interface?

The behavior is weird, but it looks like one of the systems is somehow trying to create jumbo frames out of the 3 separate packets, and Smoothwall does not like it.

The "[bad hdr length 16 - too short, < 20]" part of the output is due to the fact that the ICMP message contains part of the original packet that caused the unreachable message. tcpdump interprets this payload and sees a packet that is cut off in the middle of the TCP header (which should be at least 20 bytes long, but only 16 bytes are present).

answered 05 May '11, 16:17 SYN-bit ♦♦

All of your assumptions are correct re: .199 client -> vSwitch-X -> Smoothwall -> vSwitch-Y -> .16 server. Wireshark is on a separate machine on vSwitch-Y. tcpdump is on the server interface of the Smoothwall (vSwitch-Y facing). In testing, I'm discovering that the direction does not matter. I also discovered that if the source is a Debian system, the problems occur (ICMP on nearly every packet). Windows systems as the source are very fast, with no ICMPs. The destination machine does not matter: debian->debian or debian->win (slow), win->win or win->debian (fast!). (07 May '11, 22:13) gregg

Also, if Wireshark is on vSwitch-X, I see tons of TCP Previous segment lost, TCP Dup ACK and TCP Out-Of-Order. (07 May '11, 22:17) gregg

Thanks for helping think about the problem! (08 May '11, 09:21) gregg
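The sizes reported in the question do line up with "one big packet versus three separate ones". As a quick check, assuming a 20-byte IPv4 header and a 20-byte TCP header with no options (an assumption, since the header lengths are not shown in the captures), one 4420-byte packet carries exactly the payload of three 1500-byte packets:

```python
IP_HEADER = 20   # bytes, assumed: IPv4 header without options
TCP_HEADER = 20  # bytes, assumed: TCP header without options


def segment_lengths(total_length, mtu=1500):
    """Return the IP total length of each packet after splitting into MTU-sized segments."""
    headers = IP_HEADER + TCP_HEADER
    payload = total_length - headers   # TCP payload carried by the big packet
    mss = mtu - headers                # payload that fits in one MTU-sized packet
    lengths = []
    while payload > 0:
        chunk = min(mss, payload)
        lengths.append(headers + chunk)
        payload -= chunk
    return lengths


print(segment_lengths(4420))  # [1500, 1500, 1500]
```

That is, 4420 = 40 + 3 × 1460, which fits the picture of the same data being seen once as a single large packet and once as three full-sized ones.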
I forgot to mention that everything is inside an ESXi 4.0 server running on a Dell PowerEdge 2900 with a Broadcom NetXtreme II BCM5708 1000Base-T card. Someone thought it might be “an over-enthusiastic 1Gb NIC”.