Web server prematurely closes connection and not sure why.. Client : 10.x.x.6 Server: 10.x.x.99 (Host OS: Win 2003 Std Edition SP2)
Capture from Client
No. Time Source Source Port Destination Dest Port Protocol Length Info
199039 2014-08-28 12:29:21.743519 10.x.x.6 19018 10.x.x.99 8080 TCP 62 19018 > 8080 [SYN] Seq=0 Win=4380 Len=0 MSS=1460
199044 2014-08-28 12:29:21.745172 10.x.x.99 8080 10.x.x.6 19018 TCP 64 8080 > 19018 [SYN, ACK] Seq=0 Ack=1 Win=16384 Len=0 MSS=1460
199045 2014-08-28 12:29:21.745176 10.x.x.6 19018 10.x.x.99 8080 TCP 58 19018 > 8080 [ACK] Seq=1 Ack=1 Win=4380 Len=0
199213 2014-08-28 12:29:22.719074 10.x.x.6 19018 10.x.x.99 8080 HTTP 489 GET /VIS_504_xxxv1/-/next?Action_5e6ed7a6b1bf436bb453dfdf845f6227=success.filled&MODE=GVP8 HTTP/1.1
199214 2014-08-28 12:29:22.729915 10.x.x.99 8080 10.x.x.6 19018 TCP 1518 [TCP segment of a reassembled PDU]
199216 2014-08-28 12:29:22.729934 10.x.x.99 8080 10.x.x.6 19018 TCP 1518 [TCP segment of a reassembled PDU]
199219 2014-08-28 12:29:22.730333 10.x.x.6 19018 10.x.x.99 8080 TCP 58 19018 > 8080 [ACK] Seq=432 Ack=2921 Win=65535 Len=0
199220 2014-08-28 12:29:22.731974 10.x.x.99 8080 10.x.x.6 19018 HTTP/XML 853 HTTP/1.1 200 OK
199228 2014-08-28 12:29:22.982290 10.x.x.6 19018 10.x.x.99 8080 TCP 58 19018 > 8080 [ACK] Seq=432 Ack=3716 Win=4380 Len=0
199679 2014-08-28 12:29:43.243733 10.x.x.6 19018 10.x.x.99 8080 HTTP 527 GET /VIS_504_xxxv1/-/next?MODE=GVP8&Action_50439cce58b7476387d6bfdf4e5c01ff_e91fe4004370445db983b7ec06a14a58=error.input.noinput HTTP/1.1
199680 2014-08-28 12:29:43.254350 10.x.x.99 8080 10.x.x.6 19018 TCP 64 8080 > 19018 [FIN, ACK] Seq=3716 Ack=901 Win=64635 Len=0
199681 2014-08-28 12:29:43.254355 10.x.x.99 8080 10.x.x.6 19018 TCP 64 8080 > 19018 [RST, ACK] Seq=3717 Ack=901 Win=0 Len=0
Capture from Server
No. Time Source Source Port Destination Dest Port Protocol Length Info
125233 2014-08-28 12:29:21.407191000 10.x.x.6 19018 10.x.x.99 8080 TCP 60 19018 > 8080 [SYN] Seq=0 Win=4380 Len=0 MSS=1460
125234 2014-08-28 12:29:21.407265000 10.x.x.99 8080 10.x.x.6 19018 TCP 58 8080 > 19018 [SYN, ACK] Seq=0 Ack=1 Win=16384 Len=0 MSS=1460
125235 2014-08-28 12:29:21.409137000 10.x.x.6 19018 10.x.x.99 8080 TCP 60 19018 > 8080 [ACK] Seq=1 Ack=1 Win=4380 Len=0
125917 2014-08-28 12:29:22.382409000 10.x.x.6 19018 10.x.x.99 8080 HTTP 485 GET /VIS_504_xxxv1/-/next?Action_5e6ed7a6b1bf436bb453dfdf845f6227=success.filled&MODE=GVP8 HTTP/1.1
125922 2014-08-28 12:29:22.391445000 10.x.x.99 8080 10.x.x.6 19018 TCP 2974 [TCP segment of a reassembled PDU]
125926 2014-08-28 12:29:22.393651000 10.x.x.6 19018 10.x.x.99 8080 TCP 60 19018 > 8080 [ACK] Seq=432 Ack=2921 Win=65535 Len=0
125927 2014-08-28 12:29:22.393663000 10.x.x.99 8080 10.x.x.6 19018 HTTP/XML 849 HTTP/1.1 200 OK
126064 2014-08-28 12:29:22.645648000 10.x.x.6 19018 10.x.x.99 8080 TCP 60 19018 > 8080 [ACK] Seq=432 Ack=3716 Win=4380 Len=0
137092 2014-08-28 12:29:42.906730000 10.x.x.6 19018 10.x.x.99 8080 HTTP 523 GET /VIS_504_xxxv1/-/next?MODE=GVP8&Action_50439cce58b7476387d6bfdf4e5c01ff_e91fe4004370445db983b7ec06a14a58=error.input.noinput HTTP/1.1
137096 2014-08-28 12:29:42.916900000 10.x.x.99 8080 10.x.x.6 19018 TCP 54 8080 > 19018 [FIN, ACK] Seq=3716 Ack=901 Win=64635 Len=0
137097 2014-08-28 12:29:42.916929000 10.x.x.99 8080 10.x.x.6 19018 TCP 54 8080 > 19018 [RST, ACK] Seq=3717 Ack=901 Win=0 Len=0
The second GET request got a [RST, ACK] because the server host sent a [FIN, ACK] as part previous request just after the second GET. Why would the server close the connection while there is still request pending? The server host is running Tomcat webserver, the tomcat log shows 200 OK for the second GET request. Could someone shed some light on this?
asked 02 Sep '14, 13:18
JackTCP
11●1●1●3
accept rate: 0%
I bet the key is that there was ~20 seconds of idle time between the last client ACK and the first segment of the next post. In other words, a ~20s gap between 199679 and 199680 on the client side, and 126064 and 137092 on the serverside.
My guess is that the HTTP Keep-Alive timer expired. I think in Apache, this is 5s by default. Sometimes I see artifacts of this behavior in the error log. Do you see any messages in your web/app logs?
Thank you smp for your comment. I found the timeout in tomcat
<connector port="8080" protocol="HTTP/1.1" connectiontimeout="20000" redirectport="8443"/>
I can increase the timeout to a higher value to 40 or 60 sec but looks like I may have to do more than just increasing the timeout or?, the requesting client is a F5 LB and after I increase the timeout for tomcat if i still get request close to the timeout expiry, most likely it will fail again..? any other options to tackle this? Thank you again.
I also manage F5s for our company, so I am quite familiar with how they work. Is this health monitor traffic?
Maybe I missed it, but I don't believe you have clearly articulated exactly what the problem is. It seems we have an explanation for your question, which is why the connection gets closed before the next request arrives. So perhaps you could re-state your question to help clarify?
Thanks smp. Let me give some background, Its a voice application (GVP) which access VoiceXML (VXML) code hosted on the webserver (Apache Tomcat), GVP can directly talk to webserver but we want to have HA for webserver, so we hosted the VXML code on two Tomcat webservers. To load balance GVP request to webservers we introduced F5. The flow is like this, when a new call arrives ( lets call it as CALL 1 and the life of call would be around 3 mins), GVP sends a initial VXML page fetch via F5 LB to Tomcat 1/2 (lets conside it was sent to Tomcat 1) which returns the page with JSESSIONID, GVP uses the JSESSIONID to communicate with F5 (not necessarily the same port) till the life of CALL 1 and it should always go to tomcat 1 till end of the call. F5 team has setup Cookie persistence that way the request for same JSESSIONID is always sent to same Tomcat. Next call (CALL 2) can go to tomcat 2.
GVP request to F5 would look something like below
main.C:299 01DB013F-10006CBD-00011B54 .\FMUserRequest.cxx:519 GET /VIS_xxxv1/-/next?MODE=GVP8&Action_50439cce58b7476387d6bfdf4e5c01ff_e91fe4004370445db983b7ec06a14a58=error.input.noinput HTTP/1.1
main.C:299 01DB013F-10006CBD-00011B54 .\FMUserRequest.cxx:488 User-Agent: GVPi/GVP Common Lib (Build: 8.1.410.29)
main.C:299 01DB013F-10006CBD-00011B54 .\FMUserRequest.cxx:488 Host: Gxxxxxx.xxxx.example.com:8080,
main.C:299 01DB013F-10006CBD-00011B54 .\FMUserRequest.cxx:488 Accept: /main.C:299 01DB013F-10006CBD-00011B54 .\FMUserRequest.cxx:488 Cache-Control: max-age=0, max-stale=0
main.C:299 01DB013F-10006CBD-00011B54 .\FMUserRequest.cxx:488 Cookie: $Version="0"; JSESSIONID=3FA23FA8441345725BA2CA676D5B206F; $Path=/VIS_xxxxxv1; BIGipServer=1674440970.36895.0000; $Path=/main.C:299 01DB013F-10006CBD-00011B54 .\FMUserRequest.cxx:488 X-Genesys-FMSessionID: 01DB013F-10006CBD-00011B54
main.C:299 .\FMResponseData.cxx:326 HTTP/1.1 200 OKmain.C:299 .\FMResponseData.cxx:345 Server: Apache-Coyote/1.1
Now the problem is between F5 and webserver (tomcat). Tomcat has 20 secs HTTP idle timeout and it closes the connection after 20 seconds if its not active. In my above example (server side capture), Tomcat was about to close the connection (Frame 137096) and a new GET request arrived from F5 few milli sec before connection close (Frame 137092) and it got rejected anyway as the conection is closed. If F5 would have sent the same request to same tomcat using a new connection, it would have been successful, do i need ask my F5 team to see if they can set some timeouts lower than 20 secs that way F5 opens a new connection? let me know your thoughts. Thank you!
It seems you are falling into the common mental mistake of thinking that the F5 is somehow the source of the traffic - it's not. The F5 is simply forwarding what it receives from the real clients. So in the trace you caught, it is the real client - not the F5 - that waited 20 seconds between HTTP requests. The trace clearly shows that the F5 did not initiate the close.
So, the way I see it, you have basic two choices. Either somehow make the clients send HTTP requests at a frequency greater than the Tomcat Keep-Alive (which is not likely), or make Tomcat wait longer between successive HTTP requests from a client. I think increasing the Tomcat Keep-Alive timeout is really your only realistic option.
I agree F5 is forwarding traffic but it can also control how connections are opened and closed. Example: for each request there will be 2 legs. leg 1 - b/w source and F5, leg 2 - b/w F5 and target server. target server has 20 sec connection timeout, we have set the connection timeout as 15 secs on F5 to resolve this issue.