This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

TCP Retransmission - Server sends wrong SeqNo?

Hi

I had to analyze network packets because of an application that crashed quite a lot. In the process I stumbled upon TCP retransmissions, so I did a simple setup to start 'troubleshooting'. I connected two laptops running SystemRescueLinux to the same switch (Gigabit switch, Cat6 cables) and opened an SSH connection from one machine to the other, running 'tcpdump -i <netif> -s65535 -w file.pcap' on both client and server.

I analyzed the packets just before the TCP retransmission in both PCAPs, and if I did not make a mistake, the server sends an invalid sequence number, which causes the retransmission...

ClientIP: 10.41.1.87 ServerIP: 10.41.1.88

Client side looks like: (screenshot not preserved in this archive)

Server side looks like: (screenshot not preserved in this archive)

Client-Side starting with packet no 1890:
2018416+5792 --> 2024208 --> Correct!
2024208+7240 --> 2031448 --> Correct!
2031448+4836 --> 2036284 --> WTF? Got 2035792?

Server-Side starting with packet no 1878:
1989456+24616 --> 2014072 --> Correct!
2014072+22212 --> 2036284 --> Seems to match what I calculated on client side, but not what the server actually sent?
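
The arithmetic above can be checked with a small sketch. It assumes a 1448-byte MSS (the segment size mentioned elsewhere in this thread, not a value read from the captures) and that the large server-side packets are split into MSS-sized segments on the wire. The function names are made up for illustration.

```python
MSS = 1448  # assumption: segment size quoted in the thread, not taken from the pcap

def next_seq(seq, length):
    """Sequence number expected after a segment at `seq` with `length` payload bytes."""
    return seq + length

def wire_segments(seq, length, mss=MSS):
    """Split one large (offloaded) packet into the (seq, length) pairs sent on the wire."""
    segments = []
    offset = 0
    while offset < length:
        chunk = min(mss, length - offset)
        segments.append((seq + offset, chunk))
        offset += chunk
    return segments

# Client side, starting with packet no 1890
print(next_seq(2018416, 5792))   # 2024208
print(next_seq(2024208, 7240))   # 2031448
print(next_seq(2031448, 4836))   # 2036284

# Server side: the 22212-byte packet as it would look after segmentation
print(wire_segments(2014072, 22212)[-1])   # (2035792, 492)
```

Under that assumption, 2035792 is the start of the final 492-byte segment of the 22212-byte packet. The value seen on the client would then be consistent with a retransmission of that last segment rather than with a wrong sequence number.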

So,...I'm at a loss here. Is this the network driver? The NIC? The...I don't know? Did I miss something? I suppose it's definitely not the switch in between, at least that's what I'm trying to find out,...but what is it then?

Kind regards

EDIT

Ok, as requested, here are the PCAPs of the TCP stream:
Server: http://ianfe.dyndns.org/server_tcp.pcap
Client: http://ianfe.dyndns.org/client_tcp.pcap

asked 12 Nov '15, 04:37

esc4rg0t

edited 13 Nov '15, 00:02

Hi,
as the captures are encrypted and come from a lab test, would you mind uploading them somewhere in the cloud and posting the links here to make the analysis more comfortable?
BR, Pavel

(12 Nov '15, 08:02) sindy

Or, if paranoid, run them through a sanitization task with TraceWrangler, and tell it to cut payloads after layer 4 :)

(12 Nov '15, 09:11) Jasper ♦♦

2 Answers:

I don't think the packet size or offloading is the cause, at least not the direct one.

The packet with ip.id == 0x8a75 is 24682 octets long including all headers (24616 octets of TCP payload) when sent by the server, but only 4410 octets (4344 of payload) when received by the client. The three packets directly following it (with different IP IDs) carry the rest of its TCP payload, and all four of them are acknowledged by a single ACK, which is seen in the captures from both ends.

The same situation repeats for the immediately following "jumbo packet" with ip.id == 0x8a86, except that this time the client does not send any ACK, and so the retransmission takes place in about 10 ms.

But there is a significant difference between these two "jumbo packets": the second one, which causes the retransmission, carries the PSH (push) flag, which logically can only be set on the last packet of the series.

So I would vote for some issue in buffer handling (or maybe in the ssh client?): when the receiving side (the client) gets the PSH flag, it should hand the buffer contents to the application immediately. In some cases something goes wrong there and takes too long, so the stack cannot send the ACK back in time.

A similar situation can be seen with frames 1651 and 1652 on the client side and 1742 and 1743 on the server side, and at least one more time (use the display filter tcp.analysis.retransmission). In other words, in these captures no packet without the PSH flag needed to be retransmitted.
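
One way to re-check that last observation is to list retransmitted frames that do not carry PSH. The following is a minimal sketch, assuming tshark is installed and on the PATH; the helper names are hypothetical, while the display filter fields (tcp.analysis.retransmission, tcp.flags.push) are standard Wireshark fields.

```python
import subprocess

def retransmissions_without_psh_cmd(pcap_path):
    """Build a tshark command that lists retransmitted frames NOT carrying PSH."""
    display_filter = "tcp.analysis.retransmission && tcp.flags.push == 0"
    return [
        "tshark", "-r", pcap_path,
        "-Y", display_filter,
        "-T", "fields",
        "-e", "frame.number", "-e", "tcp.seq", "-e", "tcp.len",
    ]

def list_frames(pcap_path):
    """Run tshark and return one output line per matching frame."""
    result = subprocess.run(
        retransmissions_without_psh_cmd(pcap_path),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

Calling list_frames("client_tcp.pcap") on a local copy of the capture would be expected to return an empty list if every retransmitted packet carries PSH.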

answered 13 Nov '15, 06:00

sindy

Can you tell me how to send the last packet without the PSH flag? What is the best solution?

(18 Nov '15, 06:49) srinu_bel

This is not the right place to ask. I'd ask that on Stack Overflow, the programmers' Q&A site, because it is an application and/or driver related question, not a Wireshark or network traffic analysis one.

(18 Nov '15, 06:55) sindy

Your e-mail ID, please?

(18 Nov '15, 07:06) srinu_bel

Can you limit the packet size to 1448 bytes instead of 24616 / 22212?

answered 13 Nov '15, 02:28

srinu_bel

Ahm...what exactly do you mean by that? Do I have to make a new trace, or...? As I put the pcap files for download now (see first post)...

(13 Nov '15, 02:32) esc4rg0t

The application server is sending very big TCP segments. Is it possible to limit the size of the TCP segments to 1448 instead of 24616 / 22212?

(13 Nov '15, 02:36) srinu_bel

That is, limit the segment size to 1448 from your application and then take a fresh capture.

(13 Nov '15, 02:39) srinu_bel

I will have to check that, as I just default booted into a Linux live image (SysRescueCD)...

(13 Nov '15, 02:44) esc4rg0t

I don't know if it is a good idea to disable TCP offloading at this point. It seems to be the main part of the question, maybe together with SACK or DSACK. A more precise answer would be possible if you shared a capture with us, as Pavel and Jasper mentioned earlier.

(13 Nov '15, 02:53) Christian_R

Please see original post, I already shared the captures. Or here again: http://ianfe.dyndns.org/server_tcp.pcap http://ianfe.dyndns.org/client_tcp.pcap

Right, I guess that offloading is active, so the NIC segments everything. I could use a tap, but would need to buy one first ;-)

(13 Nov '15, 02:55) esc4rg0t

If you want to use the tap only to confirm that the server's NIC segments the packets autonomously, maybe it would be enough to disable TCP offloading only on the receiving machine?
That way, you should see the packets as they look on the wire, i.e. as the sending side's NIC has modified them. In the same step you could find out which direction causes the issue, if switching TCP offloading off on one end miraculously cured the retransmissions.
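
For reference, a minimal sketch of how the offload settings could be toggled on Linux with ethtool. The thread does not say which commands were actually used; the interface name eth0 and the helper names are placeholders, and the feature names (tso, gso, gro, lro) are the usual ethtool shorthands.

```python
import subprocess

# Offload features commonly involved in oversized packets in captures
OFFLOADS = ("tso", "gso", "gro")

def offload_cmd(interface, state="off", features=OFFLOADS):
    """Build an `ethtool -K` command that switches the given offloads on or off."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    cmd = ["ethtool", "-K", interface]
    for feature in features:
        cmd += [feature, state]
    return cmd

def show_offload_cmd(interface):
    """Build the `ethtool -k` command that prints the current offload settings."""
    return ["ethtool", "-k", interface]

def apply_offload(interface, state="off", features=OFFLOADS):
    """Run the command; needs root privileges and the ethtool binary."""
    subprocess.run(offload_cmd(interface, state, features), check=True)
```

To change only the receive side, as suggested above, something like offload_cmd("eth0", features=("gro", "lro")) would build `ethtool -K eth0 gro off lro off`.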

(13 Nov '15, 03:41) sindy

Yes, maybe I will do another dump today with TCP segmentation offloading disabled... if I find the time :-/ But maybe somebody already has an idea based on the current dump. I just thought it through again and again... The NIC DOES TCP offloading, cool. So one of these "split" packets could be lost... but then I would not calculate the exact same (and perfectly right!) sequence number from both the client-side and the server-side dump, would I? And that number differs from the sequence number the server actually sent...

(13 Nov '15, 03:48) esc4rg0t

Okay, I disabled the offloading on both client AND server, and there are no retransmissions anymore. Whatever that means...

(13 Nov '15, 05:56) esc4rg0t

So if that's the case (what you described above), I would not have to worry about my network infrastructure, and TCP retransmissions from time to time seem to be quite normal... because of things like what's happening here.

(13 Nov '15, 06:13) esc4rg0t

yes, stop worrying about your network and start worrying about the task scheduler of your OS :-D

But for me it is more interesting that the tcp offloading affects the behaviour... maybe it does so indirectly and the application (ssh client in this case) has problems to handle large buffers, not expecting to get 10 pages of text in a single update?

Also, this effect, if it exists, might explain your initial issue:

I had to analyze network packets because of an application that crashed quite a lot.

Or have you already found some other reason for that issue, and the analysis of retransmissions was just a spin-off project?

(13 Nov '15, 07:35) sindy

1) Yes, you are 100% right, sindy. 2) To overcome the buffer problem, I am trying to send smaller PDUs, which reduces the burden on the TCP stack.

Regards

(13 Nov '15, 18:29) srinu_bel

Sindy: Well, I came to the conclusion that the application that crashes seems to be crap, yes... But apart from that, I will put more effort into analyzing this behaviour, try other software, other Linux builds and so on.

Right now I got an indication that it might be SSH, as I have similar behaviour when connecting to a pfSense box over SSH from the same client...

(15 Nov '15, 02:02) esc4rg0t