This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

TCP Retransmissions, reassembled PDU, TCP Out-of-order, issue with slow connectivity

0

Hello everyone, I am troubleshooting an issue with slow and sometimes no connectivity when an end user tries to open a file (PDF) on a file server over the internet. The users have an internet connection; the problem only happens when accessing a specific site to view files. I've confirmed the issue is not on the end user's workstation, as it happens from my PC as well.

Network is as shown : User-> switch -> (in)firewall(out) -> switch -> gateway router

Here are some of the packets: The packets shown are from a trace on the switchport connected to the outside port of the firewall.

No. 1093 | time (Since prev pkt) 0.010808000 | src addr 23.200.30.6 | dst addr 10.10.1.45 | Strm indx 4 | src prt 443 | protocol TCP | info : **[TCP Spurious Retransmission] [TCP segment of a reassembled PDU]**

No. 1838 | time (Since prev pkt) 0.011681000 | src addr 23.200.30.6 | dst addr 10.10.1.45 | Strm indx 4 | src prt 443 | protocol TCP | info : [TCP Spurious Retransmission] 443>56886 [ACK] Seq=990774 Ack=1358 Win=31872 Len=1380 [reassembly error, protocol TCP: New fragment overlaps old data (retransmission?)]

No. 1782 | time (Since prev pkt) 0.038045000 | src addr 23.200.30.6 | dst addr 10.10.1.45 | Strm indx 4 | src prt 443 | protocol TCP | info : [TCP Retransmission] [TCP segment of a reassembled PDU]

No. 283 | time (Since prev pkt) 0.001030000 | src addr 23.200.30.6 | dst addr 10.10.1.45 | Strm indx 4 | src prt 443 | protocol TLSv1.2 | info : [TCP Retransmission] Application Data

No. 358 | time (Since prev pkt) 0.000533000 | src addr 23.200.30.6 | dst addr 10.10.1.45 | Strm indx 4 | src prt 443 | protocol TCP | info : [TCP Out-Of-Order] [TCP segment of a reassembled PDU]

No. 317 | time (Since prev pkt) 0.000001000 | src addr 23.200.30.6 | dst addr 10.10.1.45 | Strm indx 4 | src prt 443 | protocol TLSv1.2 | info : [TCP Out-Of-Order] Application Data

Here is the location of the PCAPS and network diagram: https://drive.google.com/folderview?id=0B41R9G9kL5L-S1JWZ0Z2bTFKN3c&usp=sharing

asked 06 Jan ‘16, 09:25

Lewis Travieso
6113
accept rate: 0%

edited 08 Jan ‘16, 12:53

The packets (as shown by the frame numbers) seem to be in an odd order.

Apart from that there’s not enough info in the text dump to diagnose the issue. A capture file shared in a public spot will help immensely.

(06 Jan ‘16, 09:42) grahamb ♦

@grahamb The packets shown are only a few of the packets that seem to have errors. There are quite a few of them.

Here is a link to the pcaps. I've also included a diagram of what we are working with. I've named the pcaps based on where the traces were taken on the network. https://drive.google.com/folderview?id=0B41R9G9kL5L-S1JWZ0Z2bTFKN3c&usp=sharing

PS. The original post was based on the pcap named “Trace from Outside Firewall Port”

(07 Jan ‘16, 07:45) Lewis Travieso

The sessions in “Inside Firewall” and “My Switchport” show different source ports, which means you did not capture the same session in parallel. With that it’s not possible to compare the sessions in order to figure out if one of the components drops frames. Can you please repeat your test and then post the capture files taken in parallel on all three or four locations?

(09 Jan ‘16, 12:16) Kurt Knochner ♦
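The check described in the comment above (do two capture points contain the same TCP session?) can be sketched in Python. This is only a minimal illustration: the `Flow` type, the function names, and the source port 56901 are hypothetical, and it assumes no NAT sits between the two capture points, so the addresses and ports of a session are identical at both.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    src: str
    sport: int
    dst: str
    dport: int


def canonical(flow):
    """Order the two endpoints so both directions of a session share one key."""
    a = (flow.src, flow.sport)
    b = (flow.dst, flow.dport)
    return (a, b) if a <= b else (b, a)


def common_sessions(capture_a, capture_b):
    """Return the sessions (as canonical endpoint pairs) seen in both captures."""
    return {canonical(f) for f in capture_a} & {canonical(f) for f in capture_b}


# Hypothetical flow lists; port 56901 is invented to show a mismatch.
inside_firewall = [Flow("10.10.1.45", 56886, "23.200.30.6", 443)]
my_switchport = [Flow("10.10.1.45", 56901, "23.200.30.6", 443)]

# Different client source ports mean no session was captured at both points.
print(common_sessions(inside_firewall, my_switchport))  # set()
```

An empty result means the captures cannot be compared packet by packet, which is why the test needs to be repeated with all capture points recording at the same time.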

Packets are arriving out_of_sequence already in the “Trace from Outside Firewall Port” causing a high number of SACK packets flowing out which trigger (spurious) retransmissions.

The ip.ids on the inbound packets look kind of strange as they suggest the packets were already out of order when they left the source.

May I suggest you turn on TCP timestamps in Windows by setting Tcp1323Opts to 3 (https://technet.microsoft.com/en-us/library/cc938205.aspx) so we can get some clues as to when the packets left the source and, while you’re at it, temporarily disable SACK by setting SackOpts to 0 (maybe it confuses some NAT devices on the path).

Also, for the next round of traces, I suggest you additionally capture between your firewall and the gateway router to see the order in which packets arrive from the internet.

(10 Jan '16, 23:10) mrEEde
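The two registry values suggested in the comment above could be prepared as a `.reg` file. The Python sketch below only builds the file text. It assumes the values live under the classic `Tcpip\Parameters` key, and it has not been verified here whether a given Windows version still honours them; the helper name `build_reg_file` is hypothetical.

```python
# Assumed location of the TCP/IP parameters in the Windows registry.
TCPIP_PARAMETERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
)


def build_reg_file(values):
    """Render DWORD values as the text of a .reg file for the key above."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{TCPIP_PARAMETERS_KEY}]",
    ]
    for name, number in values.items():
        # DWORDs are written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{number:08x}')
    return "\r\n".join(lines) + "\r\n"


# Tcp1323Opts=3 and SackOpts=0 are the values suggested in the thread.
reg_text = build_reg_file({"Tcp1323Opts": 3, "SackOpts": 0})
print(reg_text)
```

Saving and importing the text, and rebooting if required, is left to the reader. The change is meant to be temporary, for the duration of the test captures.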

Hi, were you able to solve the issue? I happen to face the exact same issue.

(05 Aug '16, 12:15) zareefaqmar

@zareefaqmar, if you want assistance, you'll have to follow the advice of @mrEEde to the OP regarding the points where you capture at the same time (so that the same session can be observed at both sides of the firewall) and what settings in the Windows TCP stack to use, and post the resulting capture files.

(05 Aug '16, 13:25) sindy

@sindy Hi sindy, thank you for the reply. I have taken the pcaps at both ingress and egress at the same time. I can see that after the client sends the GET, it once again sends a TCP Spurious Retransmission.

As mentioned previously, the issue in real life is that the website loads normally. Only when I try to view the PDF file on the page do I get an error.

The PCAP files are shared below:

https://www.dropbox.com/sh/50djblz04h4m1fq/AABtZ9WdqklVydF6-g-b9o9Ma?dl=0

(06 Aug '16, 14:04) zareefaqmar

One Answer:

3

I feel a bit weird answering a Question based on a capture file provided by a 3rd party, but the world isn't black and white.

The first question would be what kind of "firewall" you use (and @Lewis Travieso, if you are still interested, the same question applies to you). Looking at the captures, I can see that the device appears to filter out the turbulence happening at the WAN side so that it does not affect the LAN side. But obviously the primary goal is not the filtering but either caching or, even more likely, analysing the files as they are downloaded for possible malicious content.

If you apply the display filter tcp.stream == 9 to both files and have a closer look at the packets which Wireshark's TCP analysis marks as unusual in the WAN-side capture, you'll see the following:
While at the WAN side the incoming packet with seq 21240, next seq 22620 (in frame 344) is immediately followed by an incoming packet with seq 44700, next seq 46080 (in frame 346), at the LAN side the "same" packet with seq 21240, next seq 22620 (in frame 348) is immediately followed by a packet with seq 22620, next seq 24000 (in frame 350), after that packet has arrived at the WAN side (in frame 350).
The WAN-side packet in frame 346 has arrived out of sequence, most likely because there are multiple paths with significantly different propagation times between the firewall and the server.
EDIT: as @mrEEde has noted for the capture provided by the OP, the ip.id field suggests that the weird order of WAN-side packets mentioned earlier is not caused by multiple routes between the server and client ends. As the same behaviour can be observed in the capture provided by @zareefaqmar, and lacking any knowledge about the network architecture at the server end, I hereby guess that there is some NAT/firewall/load balancing device there which assigns its own values of ip.id to packets it forwards from the actual server.
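The comparison of the two capture points can be sketched as a small Python check that flags any segment not starting where the highest byte seen so far ended. The frame and sequence numbers are the ones quoted above; the function name and the tuple layout are hypothetical, and the sketch looks at one direction of one stream only.

```python
def find_sequence_jumps(segments):
    """Flag segments that do not start at the highest next_seq seen so far.

    segments: list of (frame, seq, next_seq) tuples in arrival order.
    Returns a list of (frame, expected_seq, actual_seq) tuples.
    """
    jumps = []
    expected = None
    for frame, seq, next_seq in segments:
        if expected is not None and seq != expected:
            jumps.append((frame, expected, seq))
        expected = next_seq if expected is None else max(expected, next_seq)
    return jumps


# Numbers quoted in the answer for tcp.stream == 9.
wan = [(344, 21240, 22620), (346, 44700, 46080)]
lan = [(348, 21240, 22620), (350, 22620, 24000)]

print(find_sequence_jumps(wan))  # [(346, 22620, 44700)]
print(find_sequence_jumps(lan))  # []
```

The WAN side shows the jump from 22620 to 44700, while the LAN side continues in order, which matches the observation that the device reorders the data before forwarding it.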

The "firewall" device at the client end is also responsible for the crash of the TCP session. The last incoming packet from the WAN side to be properly relayed to the LAN side is the one in frame 462 (WAN)/frame 464 (LAN); after that, the firewall sends (inevitably impersonating the remote server) a locally generated RST, although at the WAN side subsequent packets (frames 463 and 464) have arrived from the remote server in the meantime. But the firewall stops talking to the server completely, so after a short while the server starts retransmitting the last packet that was not acknowledged, and it sends a RST itself after still getting no acknowledgement for almost 20 seconds.

So my conclusion is that the "firewall" is actually a more sophisticated security device, which analyses the HTTP responses as it forwards them, and it either does not like or merely does not understand the contents of the pdf file, so it causes the TCP session to fail as a consequence, either intentionally or due to a bug. Notably, Wireshark's File->Export Objects->HTTP didn't recognize the pdf payload either.

The URL (gltopdf) suggests that the pdf file is generated, so maybe its contents are really malformed?

answered 07 Aug '16, 03:13

sindy
6.0k4851
accept rate: 24%

edited 07 Aug '16, 12:55

Hi @sindy, thank you for the lead. The firewall is a Cisco ASA. Upon further inspection of the logs and the firewall configuration, the culprit is the 'http inspection policy', which drops the packets due to a policy violation and a body size mismatch (200). We are currently seeking the principal's advice on fine-tuning the configuration.

(10 Aug '16, 17:13) zareefaqmar

I'm not an expert on http, but as Wireshark was unable to identify the pdf file in the HTTP 200, can you check whether a browser on a PC bypassing the ASA can display the pdf properly? I mean, the ASA is most likely right and there is really something wrong about the way the http body is encoded, so the fix should be done at the server rather than the ASA.

In particular, the HTTP Content-Type header of the 200 OK doesn't mention a multi-part body (it says just Content-Type: application/pdf) but in reality, the pdf part is immediately followed by an HTML one, while the size indicated by Content-Length: 87851 is that of the pdf part alone.

(10 Aug '16, 23:24) sindy

FYI, the pdf itself is fine (Acrobat Reader DC doesn't complain about anything), so only the http appendix to the 200's contents seems to cause the trouble.

(11 Aug '16, 07:26) sindy