This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Some detective expert help needed: puzzle with lost packets

Hi all!

I've been scratching my head debugging an issue I have when accessing my SSL server. Very often Firefox stalls on the connection and hangs for a minute or even indefinitely.

I analyzed traffic from both sides: one packet gets lost on the way from the server to the client, and this somehow breaks the rest of the connection.

What's interesting is that none of the subsequent re-transmissions from the server reach the client. How can this be possible? I am not seeing any packet loss on this route, and I am certain Wireshark is able to capture all traffic without dropping any packets. The proof is that Firefox stalls, which means it never sees the packet either.

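First, here is what the client does (screenshot not preserved in this archive):

And here is how the server replies (screenshot not preserved in this archive):

You can see that the client sends 3 "HTTP/1.1" requests, to which the server replies "304 Not Modified".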

Now, out of those 3 replies, only 2 reach the client (packet size:311, packets #102 & #109).

One packet is lost.

When packet #109 arrives, Wireshark marks it as "[TCP Previous segment not captured] Application Data" because it can tell from the Seq/Ack numbers that one segment was never seen.

The server then begins re-transmission attempts: packets #155 to #193.
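To illustrate how a capture tool can infer a missing segment from sequence numbers alone, here is a minimal Python sketch. It is not Wireshark's actual implementation; the function name `find_gaps` and the sample sequence numbers are assumptions chosen for illustration, and sequence-number wraparound is ignored.

```python
def find_gaps(segments, initial_seq):
    """Return (start, end) byte ranges that were skipped in the stream.

    segments: iterable of (seq, length) tuples in arrival order.
    initial_seq: sequence number expected for the first segment.
    """
    gaps = []
    expected = initial_seq
    for seq, length in segments:
        if seq > expected:
            # Bytes expected..seq-1 never showed up before this segment,
            # which is the situation behind a "previous segment not
            # captured" style warning.
            gaps.append((expected, seq))
        # Retransmissions or overlaps must not move the expectation backwards.
        expected = max(expected, seq + length)
    return gaps


# Hypothetical values: two 311-byte replies arrive, the one in between does not.
print(find_gaps([(112, 311), (734, 311)], initial_seq=112))  # [(423, 734)]
```

The gap is reported as soon as a later segment arrives, which is why the warning appears on the packet after the lost one rather than on the lost packet itself.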

None of them appear on the client!

How is this even possible? It happens on roughly 1 in 10 page loads. I suspect it could be the NAT in my cable router, as I can't find any other viable explanation.

Do you have any ideas?

asked 09 Mar '15, 08:25

vizzah

I think you did do the detective work. A bad cable could be it. So move on to testing from different locations / other access networks.

(09 Mar '15, 09:59) Jaap ♦

Please don't use screenshots, as they do not contain the necessary information, such as sequence and acknowledgement numbers, for all frames. We like using Wireshark instead of an image viewer ;-)

If you can supply a capture file (on CloudShark, for instance), that would be great. You can anonymize the file with TraceWrangler if you need to remove the IP addresses and/or the TCP payload of the packets.

(09 Mar '15, 10:10) SYN-bit ♦♦

Thanks for the tips!

I've uploaded anonymized captures.

Server: https://www.cloudshark.org/captures/d552813938bc

Client: https://www.cloudshark.org/captures/bd5ad8cc927b

The packet is lost on the socket client (8.208.192.106) port 52627 --> server: https (443)

This is a log of Firefox sending pipelined HTTP/1.1 requests to the server via TLSv1.2.

Thanks for looking into this.

(09 Mar '15, 12:22) vizzah

2 Answers:

None of them appear on the client!
How is this even possible?

Do you have any security software installed on the client that hooks itself into the TCP stream, like AV software, Endpoint Security, etc.?

If so, a malfunction of that piece of software could cause the described behavior. Please try to disable (or better, uninstall) that software and repeat your test.

BTW: Are you sure you've uploaded the correct capture files? Both files on CloudShark are identical if I do a binary comparison, and CloudShark is using the same file for both links (see "more info" -> File on Disk: 93655f5d-46aa-459e-ab3f-b22ce44199fc.cap)

Regards
Kurt

answered 09 Mar '15, 13:37

Kurt Knochner ♦

Sorry, my bad! I did upload identical files. I used anonymiser software, and it failed on the second file and didn't overwrite it with the new data.

There is no firewall, antivirus, or other filtering software. My dedicated server runs Ubuntu 12.04 LTS, and I, the client, run the same OS, Desktop edition.

Here are proper dumps:

Client(33.153.91.6) -> Server(178.17.163.8): https://www.cloudshark.org/captures/b2960a8d8654

Server(178.17.163.8) -> Client(8.208.192.106): https://www.cloudshark.org/captures/aaeeedd2c919

Please note: 8.208.192.106 and 33.153.91.6 are the same host (it was just anonymised differently because the client capture shows the local LAN IP and the server capture shows the external IP).

Look how the server attempts to re-transmit that lost packet: #134, #135 ... #139. It correctly delays every re-transmission.

But none of those retransmitted packets arrive at the client! If it's due to packet loss, it can't be that bad, as other sockets are doing fine. It only happens once per socket, out of the 8 max connections Firefox keeps to the server.

I suspect this to be an issue with the ISP or the NAT in my router. But I'd like some pros to confirm it. And what else could this be?!
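The growing gaps between retransmissions are what a standard TCP sender is expected to produce, since the retransmission timeout is normally doubled after each unanswered attempt. A rough Python sketch of that schedule follows; the initial timeout and the cap are hypothetical values, as the real ones depend on the measured RTT and the server's kernel settings.

```python
def backoff_schedule(initial_rto, retries, max_rto=120.0):
    """Return the wait in seconds before each retransmission attempt.

    initial_rto and max_rto are illustrative assumptions, not values
    taken from the captures in this thread.
    """
    waits = []
    rto = initial_rto
    for _ in range(retries):
        waits.append(rto)
        # Exponential back-off: double the timeout, up to the cap.
        rto = min(rto * 2, max_rto)
    return waits


print(backoff_schedule(0.25, 6))  # [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
```

Comparing the time deltas between the retransmitted frames in the server capture against a doubling pattern like this is a quick way to confirm that the sender is behaving normally and that the problem lies in the path.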

Many thanks for your help!

(09 Mar '15, 14:48) vizzah

If you look at "tcp.stream eq 2", there are some frames in the server file that are not in the client file, as you said.

I guess we will need more details about your environment:

  • Where and how did you capture the traffic?
  • Are there any (security) devices (firewalls, VPNs, load balancers, WAN accelerators, etc.) between both systems? According to the TTL delta in the client/server SYN, there are quite a few hops between the two (if the TTL was not anonymized). If so, any of these devices could have dropped the frames for whatever reason (for example, an IPS matching a signature as a false positive).
  • TCP offloading seems to be enabled on both systems (see frames > 1500 bytes). Can you try to disable offloading (see ethtool) and then repeat the test?
  • Is iptables enabled on either system?
(09 Mar '15, 15:23) Kurt Knochner ♦
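As a rough illustration of the TTL reasoning above: a hop count can be guessed from the TTL seen in a received packet by assuming the sender started from one of the common initial values. The list of initial TTLs and the sample observed value below are assumptions, and the result is only an estimate.

```python
# Commonly used initial TTL values (assumption: the sender uses one of these).
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimate_hops(observed_ttl):
    """Estimate the hop count from the TTL of a received packet.

    Picks the smallest common initial TTL that is not below the observed
    value and returns the difference.
    """
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial - observed_ttl
    raise ValueError("TTL must be between 0 and 255")


# Hypothetical example: a Linux sender (initial TTL 64) seen with TTL 50.
print(estimate_hops(50))  # 14
```

The estimate breaks down if a device in the path rewrites the TTL or if the capture was anonymized in a way that alters it.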

Kurt,

Yes, those frames.

  1. Captured on the server with: sudo tcpdump -li eth0 -w server_issue -s 65535 host ser.ver.ip.adr and port 443. Captured on the client with Wireshark, default capture settings.

  2. There are no security devices that I am aware of. There are 14 hops between client and server, so yes, it's a bit of a distance (60 ms RTT). The packets are being dropped randomly. The session you see is an https page reload. It could work fine for hours, then it could result in the described issue on almost every load, then work again. It all sounds like ISP/connectivity issues; however, I can't catch any packet loss doing ping/mtr tests at the same time.

  3. I'll read about TCP offloading and experiment with it.

  4. Iptables is running on the server, yes. The client IP isn't in any of the rules, though.

Thanks, appreciate your help!

(09 Mar '15, 15:44) vizzah

Oh, and when I turn on a PPTP VPN between those hosts, the issue seemingly goes away.

(09 Mar '15, 15:46) vizzah

Well, as I said: maybe some IPS or other security device (see my list above) is blocking those frames for whatever reason. That is hard to troubleshoot if you don't have access to these devices.

(09 Mar '15, 17:11) Kurt Knochner ♦

And it is hard to troubleshoot further without access to the TCP payload (especially the SSL handshake of a full SSL session setup).

(09 Mar '15, 17:18) SYN-bit ♦♦

I am looking at your traces with the filter tcp.stream==2 && ip.src==178.0.0.0/8. This is not just random packet loss. As you can see all frames get through, except the ones with seq==423 (and length 311). All frames before that frame get through and all other frames after that frame get through, but all retransmissions of that frame never make it through.
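The comparison described above can be sketched in Python: given the (seq, length) pairs of one direction of a stream as seen in each capture, list the segments the server sent that never show up at the client, together with how many times they were sent. The function name and sample data are hypothetical, and the sketch assumes both captures use the same sequence-number base (for example, relative sequence numbers for the same stream).

```python
from collections import Counter


def undelivered_segments(server_side, client_side):
    """Map each (seq, length) sent by the server but never seen by the
    client to the number of times the server transmitted it."""
    sent = Counter(server_side)
    received = set(client_side)
    return {seg: count for seg, count in sent.items() if seg not in received}


# Hypothetical data shaped like the case in this thread: the segment at
# seq 423 (length 311) is sent three times and never arrives.
server = [(112, 311), (423, 311), (734, 311), (423, 311), (423, 311)]
client = [(112, 311), (734, 311)]
print(undelivered_segments(server, client))  # {(423, 311): 3}
```

If random loss were the cause, one would expect different segments to go missing and at least some retransmissions to get through; a single segment with a high send count and zero arrivals points to something selective in the path.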

So, there is a device in the path between the server and the client that actively blocks this tcp segment. I suspect it to be an IDS or WAF where the content of this TCP segment triggers a detection rule.

Too bad you zeroed out the content of the SSL handshake. I suspect that the certificates in the two trace files are not the same, because there may be a device in between that does SSL decryption and re-encryption in order to inspect the traffic. Are you able to share the SSL handshake packets? Or would that expose too much sensitive data?

The fact that passing this traffic through a PPTP tunnel makes the problem go away supports my theory, as the decryption/re-encryption can't be done anymore so the inspecting device does not see the content of the packet and can not trigger on it.

answered 09 Mar '15, 16:56

SYN-bit ♦♦

Hi SYN-bit,

This is very interesting! Thanks for looking into this. I suspected that something like what you describe could be happening, and that was one of the reasons I wanted to discuss this in public with some experts.

Again, to me it also doesn't look like random packet loss, as it happens under identical circumstances and in exactly the same "one packet lost and never re-transmitted" way. It doesn't happen always, but very often, as if the inspecting device isn't coping with the load or is simply buggy at times.

Yes, I can share the full SSL dump (though I'd rather do it in private). Please let me know if there is a way to contact you outside the forum.

Btw, I control the server, it uses an SSL certificate that I bought, and I see it validated correctly on the client. I wonder whether it could still be subject to MITM decryption?

(10 Mar '15, 05:48) vizzah

I've checked the certificates on both sides and they match. The public key/signature in the certificate that the browser sees is identical.

It doesn't seem that anyone in between can decrypt this SSL traffic.

So the question then turns to whether it could simply be a misconfigured IDS/WAF on the ISP side, which is incorrectly analysing and blocking some packets.

Or whether it's my cable modem's NAT blocking re-transmissions.

I will be investigating more.

PS: If you still think looking at the SSL dump might bring some clues, please let me know and I'll e-mail it to your profile's address. Thanks!

(10 Mar '15, 09:06) vizzah

NAT devices can't make a distinction between original packets and retransmissions. They just keep a list of ip/port translations and alter the packets accordingly. So I don't think it is your cable modem.
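To make the point about NAT concrete, here is a toy Python model of a port-translating NAT. It is a simplified sketch, not how any particular router is implemented; the class name, the addresses (taken from documentation and private ranges), and the starting public port are all hypothetical. The key observation is that the lookup uses only addresses and ports, so an original segment and its retransmission are handled identically.

```python
class SimpleNat:
    """Toy port-translating NAT: state is keyed only by address and port."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet, creating a mapping if needed."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port):
        """Rewrite the destination of an incoming packet, or None if unmapped."""
        return self.inbound.get(public_port)


nat = SimpleNat("203.0.113.1")
nat.translate_out("192.168.0.10", 52627)  # client opens the connection
original = nat.translate_in(40000)        # first copy of a server segment
retransmission = nat.translate_in(40000)  # retransmitted copy, same ports
print(original == retransmission)         # True
```

Because no TCP sequence number enters the lookup, a plain NAT has no basis for passing every other segment on a connection while dropping every copy of one specific segment.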

Since it seems your SSL session can not be decrypted (as I assume you did not share the private key with your ISP), the blocking is done on the encrypted packets, not the HTTP traffic inside the SSL session.

It would be interesting to see a couple of occurrences of the drops (at different times) to look for a pattern. If you'd like, you may send those to the address in my profile and I'll have a look at them too.

(11 Mar '15, 01:49) SYN-bit ♦♦