Why is server (CentOS 6) sending RST after TCP three-way handshake?

Question

Hello,

My server frequently responds the way the question title describes (at other times everything works just fine).

My topology looks something like this (simplified; I'm not the network admin):

Mobile web (ajax based) clients
    |
    |
FW (NAT/PAT)    
    |
    |
Internet <--> Eudeamon FW <--> Cisco ACE LB <--> Web server farm (CentOS 6)

According to Apache stats there are still free workers/threads to receive requests and CPU/RAM usage is quite normal.

Please take a look at the following image (from a tcpdump capture on one of the web servers, opened in Wireshark) and share any ideas about what the issue might be. I've been struggling with this for several weeks now.

[Image: Wireshark screenshot of the tcpdump capture]

Capture file uploaded here: https://www.cloudshark.org/captures/4a87072e66c5

Answer 1

A couple of things in this trace look out of the ordinary to me:

Retransmission of the SYN/ACK
As others have already mentioned, the ACK of the 3-way handshake seems to get lost on the way to the server. Since we are capturing on the server, it was suggested that the ACK went south somewhere in the TCP/IP stack. That does not seem plausible. After some googling I found another occurrence of this issue, so this could be "normal" behavior under certain circumstances, and a bit more googling turned up an article describing it. In short, Linux has an option, TCP_DEFER_ACCEPT, that tells the TCP/IP stack not to pass a new session to the application on the ACK of the 3-way handshake, but to wait for incoming data and then present the new session to the application together with that data, which reduces context switching on the server. As a result, the SYN/ACK is retransmitted, because the server basically ignores the ACK packet. This happens only when the first data does not follow the SYN/ACK quickly, which is the case in this trace file.
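
To make the option concrete, here is a minimal Python sketch of a listener that sets TCP_DEFER_ACCEPT. The helper name make_deferred_listener and the 5-second value are illustrative assumptions, not settings taken from the server in this trace. The option is Linux-specific, so the sketch simply skips it on other platforms.

```python
import socket

def make_deferred_listener(host="127.0.0.1", port=0, defer_seconds=5):
    """Return a listening TCP socket; on Linux, ask the kernel to defer accept()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # TCP_DEFER_ACCEPT is only exposed on Linux; elsewhere the attribute is missing.
    opt = getattr(socket, "TCP_DEFER_ACCEPT", None)
    if opt is not None:
        # The value is a timeout in seconds (assumed 5 here for illustration).
        sock.setsockopt(socket.IPPROTO_TCP, opt, defer_seconds)
    sock.bind((host, port))
    sock.listen(16)
    return sock

if __name__ == "__main__":
    listener = make_deferred_listener()
    print("listening on", listener.getsockname())
    listener.close()
```

With such a listener, a client that connects and stays silent should not be handed to accept() until it sends data, and a capture on the server would be expected to show the SYN/ACK being resent in the meantime.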

Long delay before first HTTP request
There is a gap of about 1.5 seconds between the ACK and the first GET request from the client. Is the client opening multiple connections to be used later when needed? Is the ACE load balancer opening connections to the server before actually needing them? (I don't know whether the ACE offers such functionality.) A trace on both the client side and the server side of the ACE would help determine the cause of this delay.

Wrong ACK number on HTTP request
Then there is the ACK number on the HTTP request, which is not correct: it should have been 2962498563, as in frame 72, but instead it is 2106390967. This does not ACK the SYN/ACK, so I assume the server discards this packet because it does not complete the 3-way handshake as it should (taking into account that it discarded the original ACK in the first place because of TCP_DEFER_ACCEPT). The server sends a TCP RST in response to every packet that does not carry the right ACK and tries to establish a correct session by retransmitting the SYN/ACK until the timer set in the TCP_DEFER_ACCEPT option runs out.
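
The check the server is assumed to be making here can be written down in a few lines. This is only a sketch: the function names are made up, and the SYN/ACK sequence number is inferred as one less than the correct ACK quoted from frame 72, assuming Wireshark is showing absolute sequence numbers.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits

def expected_handshake_ack(synack_seq):
    """ACK number that acknowledges a SYN/ACK: its sequence number plus one."""
    return (synack_seq + 1) % MOD

def completes_handshake(synack_seq, ack):
    """True if this ACK number acknowledges the server's SYN/ACK."""
    return ack == expected_handshake_ack(synack_seq)

# Assumption: inferred from the correct ACK (2962498563) seen in frame 72.
SYNACK_SEQ = 2962498562

if __name__ == "__main__":
    print(completes_handshake(SYNACK_SEQ, 2962498563))  # ACK from frame 72
    print(completes_handshake(SYNACK_SEQ, 2106390967))  # ACK on the GET
```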

Next troubleshooting steps could be:

Confirm retransmission of the SYN/ACK
Open a telnet session to the server port and don't send a request. Look at the trace: do you see the same retransmission behavior of the SYN/ACK?
Test whether the SYN/ACK retransmission is triggering a bug on the ACE
The retransmission of the SYN/ACK could be triggering a bug on the Cisco ACE that makes it send the wrong sequence number. Test this by manually telnetting to the VIP address/port on the ACE, waiting for the retransmission of the SYN/ACK, and then sending an HTTP request. Observe the ACK of the request on the server.
Analyze more instances of this issue
In all other cases, is the ACK in the first request incorrect as well? Does it also happen sometimes when there is no retransmission of the SYN/ACK? If you like, you can post a larger trace file with a couple of problematic sessions and we can look at the pattern some more.
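
If telnet is not handy, the first two tests can be scripted. The following is a minimal sketch, not a definitive tool: the function name, the 5-second idle time and the request headers are assumptions chosen for illustration. It connects, stays silent long enough for a SYN/ACK retransmission to show up in a parallel capture, and only then sends a GET.

```python
import socket
import time

def idle_then_request(host, port, idle_seconds=5.0, path="/", timeout=10.0):
    """Connect, stay silent for idle_seconds, then send one HTTP GET.

    Returns the first chunk of the response (possibly empty bytes).
    """
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake is complete from the client's point of view here.
        # Stay idle so a deferring server has time to resend its SYN/ACK.
        time.sleep(idle_seconds)
        sock.sendall(request)
        return sock.recv(4096)
```

Point it first at a web server directly and then at the VIP on the ACE while tcpdump runs on the server. If the server answers with a RST, recv() is likely to raise ConnectionResetError instead of returning data, which is itself a useful observation.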

Answer 2

As the SEQ of the TCP RST does not seem to match the SEQ and ACK of the 3-way handshake, I would like to see the real trace file to look at the SEQ and ACK of the HTTP request. I would also like to look at the IP TTL to see whether an intermediate device might be sending the TCP RST, and at the ip.id to help in the analysis. In short, no good analysis can be done (at least not by me) based on the screenshot alone. Too much important information is missing.

Answer 3

It looks like the server connection is never opened. That's why the server is sending those resets.

In frame 71, the server sends its SYN-ACK, but then, according to the screenshot you provided, resends it in frame 73. Yet according to the capture, the server also appears to have received the ACK from the client.

Was this capture taken directly on a specific web server?

The SYN-ACK being resent and the client's ACK apparently being ignored by the server suggest that the capture was taken outside the web server farm, likely on the FW or LB.

And there seem to be some retransmissions as well.

So if the capture was not taken on a specific web server and there are retransmissions, it's likely that frame 72 is never received by any server and is actually dropped, but you just don't see that. Therefore, when the client sends its HTTP GET, the server doesn't accept it because the connection isn't open yet. That's why in frame 80 the server resends the SYN-ACK in another attempt to open the connection.

So I would verify that you're capturing data on a specific web server, if you can, and also look into those retransmissions. What is causing them? Since you're not the network admin, this is something you can probably bring up with that person.

As Jaap mentioned, a capture you can share would help to take a better look at this, but I think that's what is happening here.

answered 09 Nov '16, 07:41

jeantunis

@jeantunis It looks to me like the (2nd) client ACK made its way to the server NIC, but somehow that ACK never climbs up the TCP stack.

Capture was taken directly on a specific web server.

I have shared the needed (original) capture made by using tcpdump at https://www.cloudshark.org/captures/4a87072e66c5

(09 Nov '16, 08:26) diazdw

Hm, if the capture was taken on the web server, then the problem may be inside the system, after the capture point.

I say that because I came to the same finding as @jeantunis.

(09 Nov '16, 09:11) Christian_R

This is a very interesting capture. It seems that we're on the server side (on the server itself, actually). I think so because of: 1) the timing analysis of the 3-way handshake; 2) the TTL of outgoing packets being 64; 3) IP packets of 2960 bytes in size (in the working connection sample), which means we're capturing before the NIC does LSO.
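
The arithmetic behind point 3 can be sketched as follows. It assumes a 1500-byte MTU and 40 bytes of IP plus TCP headers with no options, neither of which is confirmed by the capture, and the function name is made up for the example.

```python
def wire_packets_after_lso(ip_total_len, mtu=1500, header_len=40):
    """Estimate how many on-wire packets one captured IP packet becomes.

    Assumes 20 bytes of IP header plus 20 bytes of TCP header, no options.
    """
    mss = mtu - header_len              # payload that fits in one wire packet
    payload = ip_total_len - header_len
    # Ceiling division; a packet without payload still counts as one packet.
    return max(1, -(-payload // mss))

if __name__ == "__main__":
    # 2960 = 40 bytes of headers + 2 * 1460 bytes of payload
    print(wire_packets_after_lso(2960))
```

A result greater than 1 means the captured packet is larger than anything that could cross a link with that MTU, so it must have been recorded before the NIC segmented it.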

But in that case, how could an ACK (frame 72) that we have already seen in the capture be dropped? Only somewhere up the server's IP stack, after the capture point.

And two more points. Packet 74 (GET) has a TTL of 59, not 127 as packet 72 had. Packet 74 also has the wrong ACK of 2106390967, whereas the initial ACK in packet 72 was 2962498563. It looks like these two packets come from different sources. How could that be?
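
A common rule of thumb for reading such TTL values is to round up to the nearest typical initial TTL. The sketch below applies that heuristic; the list of initial values (64, 128, 255) is an assumption about common operating system defaults, and any device that rewrites TTLs would defeat it.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)  # assumed typical OS defaults

def guess_origin(observed_ttl):
    """Return (assumed initial TTL, estimated hop count) for an observed TTL."""
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial, initial - observed_ttl
    raise ValueError("TTL must be between 0 and 255")

if __name__ == "__main__":
    for ttl in (127, 59, 64):
        print(ttl, guess_origin(ttl))
```

Under this heuristic the TTL of 127 and the TTL of 59 point to senders with different initial TTLs at different distances, which fits the suspicion that the two packets were not generated by the same device.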

Why does the client (which one of the two?) keep sending its requests over and over after receiving RSTs? Probably it does not see, or does not process, the RST packets. It will not process RSTs that fall outside its window bounds, so it is expecting to see a different SEQ.
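
The window check mentioned here can be sketched roughly like this. It is a simplified version of the classic acceptability test for an established connection (newer stacks are stricter, and the zero-window special case is ignored). The function name is made up and the numbers in the demo are hypothetical, not taken from the capture.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits

def rst_in_window(seg_seq, rcv_nxt, rcv_wnd):
    """True if the RST's sequence number lies in [rcv_nxt, rcv_nxt + rcv_wnd)."""
    # Modular subtraction keeps the comparison correct across wraparound.
    return (seg_seq - rcv_nxt) % MOD < rcv_wnd

if __name__ == "__main__":
    # Hypothetical values: receiver expects 1000 next, with a 100-byte window.
    print(rst_in_window(1000, 1000, 100))
    print(rst_in_window(5000, 1000, 100))
```

A RST whose SEQ belongs to a different connection's numbering would fail this test on the client, and the client would keep retransmitting as if nothing had arrived.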

(10 Nov '16, 02:46) Packet_vlad

Just looked at the working stream. It contains the same TTL transition, probably some proxy is involved on the path.

(10 Nov '16, 02:50) Packet_vlad

@packet_vlad: it seems there is some kind of virtual environment. The lost ACK: yes, it is strange. Maybe it is some kind of driver issue.

(10 Nov '16, 02:55) Christian_R

@Christian_R is correct. There does seem to be some sort of virtual environment with VSS Monitoring. And in that environment, you need to be careful how and where you capture data.

The second thing is around the Retransmission Timeout and IP ID. If you look at frame 74, originally sent at 161.2 seconds, we would expect a retransmission anywhere between 1 and 3 seconds, depending on TCP implementation, if there's no response from the server. The retransmission happens about 1.5 seconds later at 162.6. With the backoff algorithm, we should expect a retransmission after 3 seconds, then 6 seconds, then 12 seconds, and so on. And the IP IDs should increment accordingly.
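
The expected schedule can be approximated with a small sketch. The 1.5-second initial timeout is read off the numbers in this comment and the function name is hypothetical. Real stacks derive the timeout from measured round-trip times, so the result is only a rough guide.

```python
def backoff_schedule(first_sent, initial_rto, retries):
    """Expected retransmission timestamps when the timeout doubles each time."""
    times = []
    t, rto = first_sent, initial_rto
    for _ in range(retries):
        t += rto
        times.append(round(t, 1))  # capture timestamps are quoted to 0.1 s
        rto *= 2                   # exponential backoff
    return times

if __name__ == "__main__":
    # Frame 74 was first sent at 161.2 s; assume an initial timeout of 1.5 s.
    print(backoff_schedule(161.2, 1.5, 4))
```

Comparing the computed timestamps with the frames actually present in the capture shows which retransmission, if any, is missing.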

Everything happens the way it should until after the retransmission in frame 83 at 165.5 with IP ID = 3334. The next frame we see from the client is frame 85 at 182.5 with IP ID = 3336.

What happened to the frame that should have occurred around 171.5 (or so) with IP ID = 3335? That doesn't exist. The packet capture doesn't have it. And there could be a number of reasons for this.
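
Spotting such holes by eye gets tedious in a larger trace, so here is a minimal sketch that lists skipped IP IDs. It assumes the IDs were exported in capture order from a single sender that increments the ID by one per packet. A sender that shares its counter with other traffic will show gaps that are not losses, and the function name is made up.

```python
def missing_ip_ids(seen_ids):
    """Return the IP IDs skipped between consecutive packets (16-bit wrap)."""
    missing = []
    for prev, cur in zip(seen_ids, seen_ids[1:]):
        gap = (cur - prev) % 65536
        # Every ID strictly between prev and cur was never captured.
        missing.extend((prev + k) % 65536 for k in range(1, gap))
    return missing

if __name__ == "__main__":
    # IP IDs of frames 83 and 85 as quoted in this comment
    print(missing_ip_ids([3334, 3336]))
```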

So wherever this capture was taken, whether on a physical or virtual server, it doesn't look like you can completely rely on the capture taken there.

My suggestion would be to 1) capture as close to the server as possible, but not on it, and 2) capture at multiple points.

Based on the diagram you showed, you have a FW and ACE boxes that are manipulating the packets, and that's just what we know. There could be other things we don't know. You want to be able to trace a stream of packets going from the edge of your network across the FW, ACE and anything else all the way to the server.

Last, I could be wrong, but I don't think frame 72 was seen by the capture and then got dropped on its way up the stack. That's unlikely to happen for just one client and one TCP connection. I also don't think it's a NIC driver issue because other communication between the client and server is happening without any problems earlier in the capture.

If this is a physical server, and you are sure the issue is there, you should narrow your focus to that box with a profiling tool along with tcpdump. But don't just capture the communication between the server and client -- capture everything to see what's happening to other clients as well. That could clue you in to whether this is server-related (like any recent changes to the server farm), network-related (like any recent network changes) or something else entirely.

(10 Nov '16, 08:52) jeantunis
