Hi everyone. I have a persistent problem between my local machine and an external HTTP server. Every time I try to download a page the connection resets and I have to retry to fetch the remaining bytes. In Wireshark I see TCP dup ACKs, retransmissions, and RSTs for every segment!
Can somebody help me to debug it?
Thanks a lot!
asked 12 Dec '16, 11:11
retagged 20 Dec '16, 20:08
As usual for “hard to solve” problems, this one involves a combination of behaviours in various devices.
A. The TCP connection from the client ends at the load balancer. A different connection is made from the load balancer to the “real” server.
B. The first “server” ACK at the start and the final Resets all originate from the load balancer.
C. The second ACK and all data packets originate from the “real” server.
D. The real server sends the full response to the load balancer very quickly.
E. The load balancer buffers the full response and takes responsibility for delivering the data to the client.
F. The load balancer is ignoring the client’s Receive Window value of 5888 and transmitting the full 10 KB response in one large burst – having 10 KB of “packets in flight”. Could the load balancer be using the Receive Window value in the client’s SYN packet rather than the client’s GET packet, then applying the client’s Window Scale factor (x 128)?
G. A device in the path is dropping packets that exceed the client’s Receive Window (but can “catch” one or two packets in excess). We’ll call this the “dropping device” and it could be a router, firewall, etc.
H. From “web2-iana-nosack-full-bis” we can guess that this “dropping device” is within 1.3 ms of the client (i.e., an RTT of less than 2.6ms).
I. The failed and successful transactions seem to indicate that the load balancer knows when the “dropping device” has dropped packets, because only then does it enter the “Reset everything” state.
If the “dropping device” notifies the load balancer, what form might such a notification take? ICMP? Reset?
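The arithmetic behind hypothesis F can be sanity-checked directly. With a Window Scale factor of 128 (a shift count of 7), the client's advertised 5888 bytes corresponds to a raw on-wire Window field of 46 in the GET packet. The SYN value of 5840 below is a hypothetical example (a common Linux initial window), not taken from the capture:

```shell
# A Window Scale factor of 128 means a shift count of 7 (2^7 = 128).
SCALE=128

# The client's advertised Receive Window of 5888 bytes implies a raw
# TCP Window field of 46 in the GET packet:
echo $((5888 / SCALE))    # 46

# If the load balancer instead took a Window value from the SYN
# (where the field is never scaled) - say a typical 5840 - and then
# applied the scale factor, it would compute a far larger window:
echo $((5840 * SCALE))    # 747520 bytes
```

A misreading like this would explain why the burst so grossly exceeds the client's real window.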
The first hypothesis was related to the separate connections: client to load balancer, then load balancer to server. However, the additional capture file uploaded by @huguei, "web2-iana-nosack-full-bis", contained successful transactions that provided evidence against it. Just for information and discussion, I've included the diagram for this first hypothesis at the end of this post.
The second hypothesis is now the one I believe to have the most chance of being closer to the truth.
The HTTP header information tells us:
The HTTP header does not ask for persistent connections, that is, does not contain:
POSSIBLE FIXES OR WORKAROUNDS:
The apparent “root cause” is that the server (load balancer) appears to be ignoring the client’s advertised Receive Window of 5888 bytes. It is transmitting the whole response of 10 KB in a single packet burst.
The correct fix for this would need to be made at the IANA server or load balancer.
We don’t have the ability to do anything at the server/load balancer end. All we can do is try some configuration changes at the client:
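A sketch of two such client-side changes on a Linux client (the route and device names are placeholders for your system's default route; run as root, and note these are diagnostics rather than fixes):

```shell
# Workaround 1: cap the client's initial receive window so the burst
# fits. initrwnd is only supported on newer kernels; "default via
# 192.0.2.1 dev eth0" is a placeholder for your actual default route.
ip route change default via 192.0.2.1 dev eth0 initrwnd 10

# Workaround 2: disable TCP window scaling entirely, so the Receive
# Window value cannot be misinterpreted through a scale factor.
sysctl -w net.ipv4.tcp_window_scaling=0
```

Disabling window scaling affects all TCP connections from the client, so re-enable it once the test is complete.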
PROBLEMS AT THE IANA SERVER END:
The load balancer and/or server appears to be ignoring the client’s advertised Receive Window and transmits a burst of in-flight data that significantly exceeds it.
Both the load balancer and the server appear to transmit their own ACK to the client’s GET request. We observe both ACKs arriving at the client.
We observe an unnecessary retransmission of the last smaller data packet of the response (when SACK is enabled). This arrives just 10ms or less after the original. It occurs for different files retrieved from the same site.
Here are the Wireshark packets lists, with notations, for each of the capture files:
Finally, the diagram for the First Hypothesis:
answered 17 Dec '16, 22:31
edited 19 Dec '16, 14:38
Building on the other answers, there must be something in the path reacting to all of your ACK packets with a TCP RST. I just tried connecting to the same server and did not have the same issue. I assume our paths towards the server are different but share the last part, so it must be something in your packets. I closely examined your packets and see that all the incoming packets have
Do you have a QoS policy in your network in such a way that packets leave your network with DSCP markings? Maybe there is a device near or at IANA that does not like your markings after the 3-way handshake.
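If such markings are suspected, one quick test is to clear the DSCP field on egress so the packets arrive with default best-effort marking. A sketch assuming a Linux client with iptables (run as root; remove the rule after testing):

```shell
# Zero the DSCP field on every packet leaving this host.
iptables -t mangle -A POSTROUTING -j DSCP --set-dscp 0

# To remove the rule once the test is done:
# iptables -t mangle -D POSTROUTING -j DSCP --set-dscp 0
```

If the resets disappear with DSCP cleared, a device near or at IANA is likely dropping or resetting on the markings.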
answered 15 Dec '16, 04:38
The client seems to have a bug in the SACK implementation.
1424 bytes (sequence numbers 7265-8689) got lost after frame #16. The first duplicate acknowledgement for this missing block is issued at frame #18, which also selectively acknowledges bytes 8689 to 10137. The second duplicate acknowledgement is issued at frame #20, which again selectively acknowledges bytes 8689 to 10562. Once frame #21 arrives, a third duplicate acknowledgement is transmitted which has two SACK blocks: 8689 to 10562 and 10137 to 10562.
Even though these segments are contiguous, they are represented as two different blocks.
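The byte counts above can be cross-checked directly from the sequence numbers (a quick sanity check, not new information from the capture):

```shell
# The lost block spans sequence numbers 7265 to 8689:
echo $((8689 - 7265))     # 1424 bytes, matching the missing segment

# The selectively acknowledged ranges reported back by the client:
echo $((10137 - 8689))    # 1448 bytes in the first SACKed segment
echo $((10562 - 10137))   # 425 bytes in the smaller final segment
```

The 425-byte tail is consistent with the smaller last data packet of the response mentioned elsewhere in this thread.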
answered 12 Dec '16, 12:09
Frame #21 is a retransmission of #19 - which arrives just 10.47ms later. There is no apparent indication of why this data is retransmitted.
Frame #22 is not just a SACK, it is a D-SACK (Duplicate SACK), which is an extended way of using the SACK mechanism to report that unnecessary duplicate data has been received. This extension is documented in RFC 2883.
The Resets appear one RTT after the client sends the D-SACK.
My guess would be that the server doesn't understand D-SACKs at all and issues the Resets in response to this one. The RFC specifies that the duplicated data should be in the first SACK block. This is correct in this case, the first SACK block is the sequence numbers for #19 and #21. The second SACK block is the larger range that is still being selectively ACKed (the gap between #15 and #17).
Disabling SACKs would be a workaround for this. The "proper" fix would be to stop the unnecessary retransmissions. Note that #21 (IP ID = 37041) is a genuine retransmission of #19 (IP ID 37040).
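Assuming the client is Linux, the SACK workaround is a standard sysctl. It disables SACK (and with it D-SACK) for all new connections, so treat it as a diagnostic step:

```shell
# Disable SACK for new TCP connections; run as root.
sysctl -w net.ipv4.tcp_sack=0

# Re-enable once the test is complete:
# sysctl -w net.ipv4.tcp_sack=1
```

If the resets stop with SACK off, that strengthens the theory that the server side mishandles the D-SACK.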
The RTT seems to be just under 142ms but the Resets appear just 134ms after the D-SACK. Perhaps Packet_Vlad is onto something?
answered 14 Dec '16, 02:57
edited 15 Dec '16, 18:36
(Sorry this isn't an answer but I wanted to post my comment here to let everybody know!)
I tried to change the initrwnd but it is not possible on my server (CentOS 5.11, with a 2.6.18 Linux kernel). So I tried the second workaround of @Philst, disabling the TCP window scaling option, and it works! I can download the entire page in a single connection every time I test it. Here's a capture with 3 attempts:
So, I just wanted to thank everyone for your time and help... and @Philst for his great answer. I'll surely keep debugging it and let the server side know about their load balancer problem!
answered 19 Dec '16, 06:21
I think the problem is due to different MSS sizes negotiated between the client and the load balancer (1460) and between the load balancer and the server (1448), and the load balancer not being very robust in handling
You can try disabling tcp_timestamps via
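Assuming a Linux client, the standard knob for this is:

```shell
# Disable TCP timestamps for new connections; run as root.
sysctl -w net.ipv4.tcp_timestamps=0

# Re-enable afterwards:
# sysctl -w net.ipv4.tcp_timestamps=1
```

Without the timestamp option, each segment carries 12 fewer bytes of TCP options, which changes the effective segment sizes on the wire.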
Here is the picture that kind of explains what is happening in the first trace you provided.
The three-way handshake (that we see in your trace) shows the Linux client (RHEL) asking for the timestamp option, but the 'server' does not allow it. So the net MSS on this TCP session is 1460 bytes.
answered 17 Dec '16, 05:48
edited 17 Dec '16, 08:39