We are troubleshooting connectivity issues on a layer 2 connection between sites across a provider. Users experience application hangs and timeouts when their traffic passes over this specific link. Packet captures do NOT show packet DROPS, but they do show RETRANSMISSIONS. It is the strangest thing. Complete details here. I would have put the pcaps here, but I couldn't figure out how to attach them. Any assistance in explaining this behavior is appreciated. Thanks, Tarek
asked 16 Apr '12, 14:19 tarjall edited 16 Apr '12, 17:47 Guy Harris ♦♦
2 Answers:
I had only a few minutes to look at your trace, but what happens in it is not that uncommon (I looked at the server trace). Frame 72 is a real retransmission (not a duplicate frame, since the IP Identification changes), so the question is why the server retransmits when there is no loss. The answer is that a timer on the server ran out because no acknowledgment came back in time. When you look at the timings you'll see that the client usually acknowledges incoming packets within a few milliseconds, but frame 71 isn't acknowledged before the server becomes impatient and retransmits. This happens again, on a larger scale, between frames 88 and 93. Frame 90 even contains a sort of complaint by the client that it got data twice: if you look at the TCP options you'll see a SACK range that lies within the range that is already fully acknowledged (6061-6273, which is covered by the ACK of 6273). That is how the client tells the server that it received a needless retransmission. I have no idea why the client has trouble acknowledging data in time; that is something the trace doesn't tell. It just looks like the network is working fine, but somehow the client and server stacks seem to ignore each other's packets sometimes.
answered 16 Apr '12, 17:00 Jasper ♦♦ edited 17 Apr '12, 04:45
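To illustrate the "complaint" in frame 90: a SACK block that lies entirely at or below the cumulative ACK reports data the receiver already had, which is the duplicate-SACK (D-SACK) case described in RFC 2883. Below is a minimal sketch of that check. The function name `is_dsack` is made up for illustration, it only covers the "first SACK block is covered by the ACK" case, and it assumes relative sequence numbers as Wireshark displays them (no 32-bit wraparound handling).

```python
def is_dsack(ack: int, sack_left: int, sack_right: int) -> bool:
    """Return True if the first SACK block reports data that is already
    covered by the cumulative ACK (the simple D-SACK case).

    Assumption: relative sequence numbers, so wraparound is ignored.
    """
    # The block must be non-empty and end at or before the cumulative ACK.
    return sack_left < sack_right <= ack


# Values quoted in the answer: SACK 6061-6273 together with ACK 6273.
print(is_dsack(ack=6273, sack_left=6061, sack_right=6273))  # True

# A normal SACK block sits above the ACK (a hole remains to be filled).
print(is_dsack(ack=6061, sack_left=6273, sack_right=6500))  # False
```

With the numbers from the trace the check is true, which matches the reading that the client is reporting a segment it received twice.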
Retransmissions, missing acknowledgments and other problems without explicit dropped packets can be a symptom of a mismatched duplex setting on a link somewhere in the transmission path. As long as Wireshark itself is listening on a properly configured link (full duplex or half-duplex), you might not see dropped or corrupted packets. If it were the server's packet that is being blocked, that would not explain the client complaining about getting a packet twice. That complaint could occur, however, if it is the initial ACK from the client that is sometimes blocked before it reaches Wireshark. This is possible if the client-to-server path is blocked by another packet en route from the server to the client. The cause would be that the client side thinks the link is full duplex while the server side thinks it is half-duplex. The problem link would then be somewhere between Wireshark and the client.
answered 19 Apr '12, 11:52 inetdog
I've made some progress on this but still need help. Apparently specific packets are being modified after leaving one end of this connection, and before arriving at the other end.
If you look at packet 88 in the outlookclient capture (receiving end), at the bottom of the hex view you see 08 bf e0. Look at the same packet in the 3750 capture (sender side of the packet) and it's all zeros.
Has anyone ever seen behavior similar to this, or does anyone know what might cause it? The packets are going across a provider that is doing QnQ and MPLS, but no inspection.
I don't think the packet got modified on its way. Yes, the bytes ARE different, BUT you captured the data at the client side ON the client. That means that Wireshark (or whatever capturing software you used) grabbed the packet before the NIC finalized it for the wire. You'll also notice that most of your outgoing TCP checksums are incorrect at the client. That is a sign that the capture was taken on a system that is itself part of the communication.
To be able to see what really happens you'll have to capture with additional PCs, not ON the PCs/Servers that are part of the problem.
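As a rough illustration of the checksum point above: when a capture is taken on the sending host, the checksum field often still holds a placeholder because the NIC fills in the real value later, so verification fails for outgoing packets only. The sketch below shows how a TCP checksum is verified using the standard Internet checksum (RFC 1071) over the IPv4 pseudo-header plus the TCP segment. The function names and the sample addresses are illustrative assumptions, not anything taken from the traces in this thread.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Verify the checksum of a raw TCP segment (header + payload) over IPv4."""
    pseudo = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 6, len(segment))  # zero, protocol 6 = TCP, length
    )
    # With a correct checksum in place, the whole thing sums to zero.
    return internet_checksum(pseudo + segment) == 0


# Hypothetical segment: 20-byte TCP header with the checksum field left at 0,
# roughly what a capture on the sender can look like before the NIC fills it in.
seg = struct.pack("!HHIIBBHHH", 1234, 80, 1, 0, 5 << 4, 0x10, 65535, 0, 0) + b"hi"
print(tcp_checksum_ok("10.0.0.1", "10.0.0.2", seg))  # False
```

If only the packets sent by the capturing host fail this check while received packets pass, that points at the capture location rather than at corruption on the path.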