Hi all! I've been scratching my head debugging an issue I have when accessing my SSL server. Very often Firefox stalls on the connection and hangs for a minute or even indefinitely. I analyzed traffic on both sides: one packet gets lost on the way from the server to the client, and this somehow breaks the rest of the connection. What's interesting is that none of the re-transmissions from the server ever reach the client. How can this be possible? I am not seeing any packet loss on this route, and I am certain Wireshark is able to capture all traffic without dropping any packets. The proof is that Firefox stalls, which means it never sees the packet either.
First, here is what the client does: And here is how the server replies:
You can see that the client sends 3 "HTTP/1.1" requests, to which the server replies "304 Not Modified". Now, out of those 3 replies, only 2 reach the client (packet size: 311, packets #102 & #109). One packet is lost. When packet #109 arrives, Wireshark marks it as "[TCP Previous segment not captured] Application Data" because it can tell from the Seq/Ack numbers that one packet wasn't seen. The server then begins re-transmission attempts, packets #155 - 193. None of them appear on the client! How is this even possible?
It happens with about a 1/10 chance per page load. I suspect it could be the NAT in my cable router, as I can't find any other viable explanation. Do you have any ideas?
asked 09 Mar '15, 08:25 vizzah
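The "[TCP Previous segment not captured]" marking mentioned in the question comes from sequence-number bookkeeping. The following is only a rough sketch of that idea, not Wireshark's actual implementation: the function name `find_gaps`, the `(seq, length)` tuples and the byte values are all hypothetical, and sequence-number wraparound is ignored.

```python
from typing import Iterable, List, Tuple

# (sequence number, payload length in bytes) for one direction of one TCP stream
Segment = Tuple[int, int]


def find_gaps(segments: Iterable[Segment], initial_seq: int) -> List[Tuple[int, int]]:
    """Return the byte ranges that were skipped at the moment each segment arrived.

    A segment whose sequence number is higher than the next byte we expected
    means that at least one earlier segment has not been seen yet.
    """
    expected = initial_seq
    gaps: List[Tuple[int, int]] = []
    for seq, length in segments:
        if seq > expected:
            # Bytes expected..seq-1 never showed up before this segment.
            gaps.append((expected, seq))
        # Advance past this segment; never move backwards on old data.
        expected = max(expected, seq + length)
    return gaps


# Hypothetical numbers: three replies of 311 payload bytes each,
# of which the client only sees the first and the third.
client_view = [(1000, 311), (1622, 311)]
print(find_gaps(client_view, initial_seq=1000))
```

With these made-up values the second reply (bytes 1311 to 1621) is reported as missing as soon as the third one arrives, which is the same situation as packet #109 in the client capture.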
2 Answers:
Do you have any security software installed on the client that hooks itself into the TCP stream, like AV software, Endpoint Security, etc.? If so, a malfunction of that software could cause the described behavior. Please try to disable (or better, uninstall) that software and repeat your test.
BTW: Are you sure you've uploaded the correct capture files? Both files on CloudShark are identical if I do a binary comparison, and CloudShark is using the same file for both links (see "more info" -> File on Disk: 93655f5d-46aa-459e-ab3f-b22ce44199fc.cap).
Regards
answered 09 Mar '15, 13:37 Kurt Knochner ♦
Sorry, my bad! I did upload identical files. I used anonymiser software, and it failed on the second file and didn't overwrite it with the new data. There is no firewall, antivirus or other filtering software. My dedicated server runs Ubuntu 12.04 LTS and I, the client, run the same OS, Desktop edition. Here are the proper dumps:
Client(33.153.91.6) -> Server(178.17.163.8): https://www.cloudshark.org/captures/b2960a8d8654
Server(178.17.163.8) -> Client(8.208.192.106): https://www.cloudshark.org/captures/aaeeedd2c919
Please note: 8.208.192.106 and 33.153.91.6 are the same host (it was just anonymised differently, because one is the local LAN IP and the other is the external IP as seen by the server). Look how the server attempts to re-transmit that lost packet: #134, #135 ... #139. It correctly delays every re-transmission, but none of those packets arrive at the client! If it's due to packet loss, it can't be that bad, as other sockets are doing fine. It only happens once per socket, out of the 8 max connections Firefox keeps to the server. I suspect this to be an issue with the ISP or the NAT in my router, but I'd like some pros to confirm it. And what else could this be?!
Many thanks for your help! (09 Mar '15, 14:48) vizzah
If you look at "tcp.stream eq 2", there are some frames in the server file but not in the client file, as you said. I guess we will need more details about your environment:
(09 Mar '15, 15:23) Kurt Knochner ♦
Kurt, Yes, those frames.
Thanks, appreciate your help! (09 Mar '15, 15:44) vizzah
Oh, and when I turn on a PPTP VPN between those hosts, the issue seemingly goes away. (09 Mar '15, 15:46) vizzah
Well, as I said: maybe some IPS or other security device (see my list above) blocks those frames for whatever reason. Hard to troubleshoot if you don't have access to these devices. (09 Mar '15, 17:11) Kurt Knochner ♦
And hard to troubleshoot further without access to the TCP payload (especially the SSL handshake of a full SSL session setup). (09 Mar '15, 17:18) SYN-bit ♦♦
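The comparison discussed in this thread (frames that are in the server file but not in the client file) can be sketched in a few lines. This is a minimal illustration under several assumptions: the `Seg` type, the function name `missing_at_client` and all sample values are hypothetical, both captures are reduced to one TCP stream in the server-to-client direction, and no device in the path rewrites sequence numbers. The tuples could for example be filled from an export of the `frame.number`, `tcp.seq` and `tcp.len` fields of each capture.

```python
from typing import Iterable, List, NamedTuple


class Seg(NamedTuple):
    frame: int    # frame number in its own capture file
    seq: int      # TCP sequence number
    length: int   # TCP payload length in bytes


def missing_at_client(server_sent: Iterable[Seg], client_seen: Iterable[Seg]) -> List[Seg]:
    """List data segments the server sent that never show up in the client capture.

    Segments are matched on (seq, length) only, because IP addresses and
    possibly ports differ between the two capture points (NAT, anonymisation).
    """
    seen = {(s.seq, s.length) for s in client_seen}
    return [s for s in server_sent
            if s.length > 0 and (s.seq, s.length) not in seen]


# Hypothetical data: one original segment and two re-transmissions of it are lost.
server = [Seg(100, 1000, 311), Seg(101, 1311, 311), Seg(102, 1622, 311),
          Seg(134, 1311, 311), Seg(135, 1311, 311)]
client = [Seg(102, 1000, 311), Seg(109, 1622, 311)]

for seg in missing_at_client(server, client):
    print(f"server frame {seg.frame}: seq={seg.seq} len={seg.length} not seen by client")
```

If every entry in the result carries the same sequence number, the original segment and all of its re-transmissions were dropped, which points at something selective in the path and not at random loss.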
I am looking at your traces with the filter So, there is a device in the path between the server and the client that actively blocks this TCP segment. I suspect it to be an IDS or WAF where the content of this TCP segment triggers a detection rule. Too bad you zeroed out the content of the SSL handshake: I suspect that the certificates in the two trace files are not the same, because a device in between does SSL decryption and re-encryption to be able to inspect the traffic. Are you able to share the SSL handshake packets? Or would that expose too much sensitive data?
The fact that passing this traffic through a PPTP tunnel makes the problem go away supports my theory: the decryption/re-encryption can't be done anymore, so the inspecting device does not see the content of the packet and cannot trigger on it.
answered 09 Mar '15, 16:56 SYN-bit ♦♦
Hi SYN-bit, This is very interesting! Thanks for looking into this. I suspected that something like what you describe could be happening, and that was one of the reasons I wanted to discuss this in public with some experts. Again, to me it also doesn't look like random packet loss, as it happens under identical conditions and always in the same "one packet lost and never re-transmitted" way. It doesn't happen always, but very often, as if the inspecting device isn't coping with the load or is simply buggy at times. Yes, I can share a full SSL dump (though I'd rather do it in private). Please let me know if there is a way to contact you outside the forum. Btw, since I control the server and it uses an SSL certificate which I bought, and I see it validated correctly on the client, I wonder if it could still be subject to MITM decryption? (10 Mar '15, 05:48) vizzah
I've checked the certificates on both sides and they match. The public key/signature in the certificate which the browser sees is identical. It doesn't seem that anyone in between can decrypt this SSL traffic.
So the question then turns to whether it could simply be a misconfigured IDS/WAF on the ISP side, which is incorrectly analysing and blocking some packets, or whether it's my cable modem's NAT blocking re-transmissions. I will be investigating more.
PS: If you still think looking at the SSL dump might bring some clues, please let me know and I'll e-mail it to your profile's address. Thanks! (10 Mar '15, 09:06) vizzah
NAT devices can't make a distinction between original packets and retransmissions. They just keep a list of IP/port translations and alter the packets accordingly. So I don't think it is your cable modem. Since it seems your SSL session cannot be decrypted (as I assume you did not share the private key with your ISP), the blocking is done on the encrypted packets, not the HTTP traffic inside the SSL session. It would be interesting to see a couple of occasions of the drops (at different times) to look for a pattern. If you'd like, you may send me those at the address in my profile and I'll have a look at them too. (11 Mar '15, 01:49) SYN-bit ♦♦
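To illustrate the point about NAT made above, here is a deliberately simplified sketch of a port-translation table. It is an assumption-laden toy model, not how any particular cable modem works: the class `NatTable`, its methods, the addresses and the starting port 40000 are all hypothetical, and real devices also track the remote endpoint, connection state and timeouts.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, TCP port)


class NatTable:
    """Toy model of a NAT that only keeps a list of IP/port translations."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        self._out: Dict[Endpoint, int] = {}   # private endpoint -> public port
        self._in: Dict[int, Endpoint] = {}    # public port -> private endpoint
        self._next_port = 40000               # hypothetical start of the port pool

    def outbound(self, src: Endpoint) -> Endpoint:
        """Translate an outgoing packet's source, creating a mapping if needed."""
        if src not in self._out:
            self._out[src] = self._next_port
            self._in[self._next_port] = src
            self._next_port += 1
        return (self.public_ip, self._out[src])

    def inbound(self, dst_port: int, seq: int) -> Optional[Endpoint]:
        """Translate an incoming packet's destination.

        The sequence number is accepted only to show that it plays no role:
        the lookup depends on the port mapping alone.
        """
        return self._in.get(dst_port)


nat = NatTable("203.0.113.1")
public = nat.outbound(("192.168.0.10", 52627))
original = nat.inbound(public[1], seq=1311)
retransmission = nat.inbound(public[1], seq=1311)
print(original == retransmission)
```

In this model an original segment and its re-transmission take exactly the same path through the table, so a device that drops only one specific segment and all of its re-transmissions has to be looking at more than the address and port.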
I think you did do the detective work. A crappy cable could be it. So you could move on to testing from different locations / other access networks.
Please don't use screenshots, as they do not contain the necessary information like sequence and acknowledgement numbers for all frames. We like using Wireshark instead of an image viewer ;-)
If you can supply a capture file (on CloudShark for instance), that would be great. You can anonymize the file with TraceWrangler if you need to remove the IP addresses and/or the TCP payload of the packets.
Thanks for the tips!
I've uploaded anonymized captures.
Server: https://www.cloudshark.org/captures/d552813938bc
Client: https://www.cloudshark.org/captures/bd5ad8cc927b
Packet is lost on the socket client (8.208.192.106) port 52627 --> server: https (443)
This is a log of Firefox sending pipelined HTTP/1.1 requests to the server via TLSv1.2.
Thanks for looking into this.