This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

TCP Retransmission with a delay time of two seconds

0

Hello,

I have a VNC client (192.168.0.66) and a VNC server (192.168.0.10) in my network. The client runs Windows CE 5.0 and the server runs VxWorks. The problem is that the response to a touch click on the client is sometimes delayed, by 500 ms to two seconds.
To analyze the problem, I made a capture with Wireshark. In the capture you can see that the client reports several zero windows to the server (packets 185, 202, 211, 220, ...). The first retransmission starts with packet 239, followed by many ZeroWindow and Out-of-Order packets. The problematic retransmission is packet 297: it retransmits packet 240 after 500 ms.
I am surprised that the server sends the retransmission after 500 ms and not sooner. Can I find an explanation for this in my capture?

Capture

asked 02 Sep '13, 01:43

parfant
11114
accept rate: 0%


3 Answers:

2

Hello, the server did a "fast retransmit" (2) after the third duplicate ACK, but the client still didn't acknowledge the segment. So the server waited for the retransmission timer to pop and 'slow retransmitted' the segment (3). Unfortunately, Wireshark labels the "fast retransmit" as an out-of-order segment, which I think is incorrect. See pmtud-and-retransmission-out-of-order

The question is why WIN_CE does not ACK the packet. It might be because the zero window condition keeps the stack too busy.

The trace shows that the client never offers a window larger than 6 MSS. So to avoid the zero window conditions, you need to speed up your client application (the VNC client) or increase the TCP receive buffer to a more appropriate value.

Here is the link to Microsoft's document: TCP/IP Registry Settings (Windows CE 5.0)

answered 02 Sep '13, 23:18

mrEEde
3.9k152270
accept rate: 20%

edited 03 Sep '13, 22:23

mrEEde is right in pointing out the Fast Retransmission that had already fired (it is in frame 249, by the way).

@parfant: What you have in this specific situation is very interesting:

Please observe that in packets 249 to 253, the server is resending its data packets although the client constantly signals Zero Window. A regular TCP stack must not send data during Zero Window periods, so I'm wondering whether that's really the server misbehaving or some device in between.

Anyway, the zero window periods from your client signal a massive performance problem at the client device in handling network data, which for me is an obvious indicator of the slow behaviour you are analysing.

(03 Sep '13, 00:12) Landi

Thank you for your fast and detailed answers!

@Landi: I agree with you, the client has little performance headroom; it is an ARM PXA270 CPU. Do you also agree with me that the slow client is not the sole problem? The server also sends data packets while the client reports a zero window. These unnecessary data packets are received by the client and generate additional system load, which aggravates the problem. In my opinion, the server shouldn't send data packets while the client reports a zero window. In frame 254 the client reports a zero window and retransmits frame 205. Is the reason for this retransmission that the client is so busy that it hasn't yet handled the ACK for this packet?

(03 Sep '13, 03:50) parfant

The server does not send any new data segments when the zero window condition is reported. The problem is that it treats the zero window packets as duplicate ACKs and triggers fast retransmission. This might not be beneficial for THIS scenario.

What would be the alternative? NOT to fast retransmit - and rely on the retransmission timer to pop.

Would that make it any better? You would not see the zero window condition that often but in the end you'd have to wait for RTO. (Your initial query was about the server retransmitting after 500 ms and higher!)

So the server's decision here was to fast retransmit after the third dup_ack even though the window is closed. For better performing clients - where a zero window condition is only a rare and temporary condition - there is a good chance that the application reads the data while the retransmitted data is in transit and the window might open up.
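To make the mechanism concrete, here is a minimal sketch (not the VxWorks implementation) of a sender that counts every repeated ACK number as a duplicate and fires a fast retransmit on the third one, without looking at the advertised window. The threshold of 3 is the usual one; the ACK stream and the function name are made up for illustration.

```python
DUP_ACK_THRESHOLD = 3  # usual fast-retransmit threshold

def fast_retransmit_points(acks):
    """Return the indices in `acks` at which a fast retransmit would fire.

    `acks` is a list of (ack_number, window) tuples. The window is
    deliberately ignored here, which is the behaviour discussed above:
    zero-window packets repeating the same ACK number count as duplicates.
    """
    points = []
    last_ack = None
    dup_count = 0
    for i, (ack, _window) in enumerate(acks):
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                points.append(i)
        else:
            # New data acknowledged: remember it and reset the counter.
            last_ack = ack
            dup_count = 0
    return points

# Hypothetical ACK stream: one new ACK, then three zero-window repeats.
acks = [(1000, 1426), (1000, 0), (1000, 0), (1000, 0), (2426, 4096)]
print(fast_retransmit_points(acks))
```

With this made-up stream the retransmit fires on the third zero-window packet, even though the receiver has no room for the data at that moment.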

So, to summarize: if you fix the client (increasing the rwin to more than 6 MSS), the problem should go away.

(03 Sep '13, 21:10) mrEEde

1

OK, here is my interpretation of the new trace:

Let's look at frames 67/68. Notice that frame 67 (ip.id==0xc927) arrives before frame 68 (ip.id==0xc926), so those packets are arriving out of order and #68 is NOT a retransmission as Wireshark tells us.
Frame 67 is a naked ACK acknowledging everything up to frame 66. The window size shrank to 20 bytes below the MSS of 1426. That does not impress the server, and within 33 µs it keeps on sending its 5 full-MSS segments + 1 smaller segment, all data that had been sitting on the send_queue.

The client ACKs only part of the first segment (MSS - 20 bytes: tcp.ack==65577) when the server was expecting to see an ACK of 65597. All follow-on ACK numbers stay at 65577 but do not lead to a fast retransmission (due to a mismatch of tcp.ack and tcp.next_seq?). After the retransmission timer pops, only one segment is retransmitted.

The VxWorks TCP stack obviously does not handle "out of segment boundary" acknowledgments and only retransmits one segment off the rxmit-queue after a fixed RTO of ~500 ms. This is why you see increasing RTOs.
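The numbers in that paragraph can be checked with a few lines of arithmetic. This is only a sketch: it assumes the first segment is exactly one MSS long and ends at sequence number 65597, and the function name is made up.

```python
MSS = 1426            # MSS seen in the trace
EXPECTED_ACK = 65597  # ACK the server expected (end of the first full segment)
ACTUAL_ACK = 65577    # ACK the client actually sent

def partial_ack(expected_ack, actual_ack, mss):
    """Return (segment_start, bytes_acked, bytes_unacked) for one segment.

    Assumes the segment is exactly `mss` bytes long and ends at
    `expected_ack`.
    """
    segment_start = expected_ack - mss
    bytes_acked = actual_ack - segment_start
    bytes_unacked = expected_ack - actual_ack
    return segment_start, bytes_acked, bytes_unacked

start, acked, unacked = partial_ack(EXPECTED_ACK, ACTUAL_ACK, MSS)
print(start, acked, unacked)  # 64171 1406 20
```

So the ACK lands 20 bytes short of the segment boundary, which is the "MSS - 20 bytes" mentioned above.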
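A rough sketch of what "one segment per fixed RTO" means for the delay. It assumes that each outstanding segment has to wait for its own full RTO after the previous retransmission and that nothing else recovers the data in between; both are simplifying assumptions, not something read out of the VxWorks stack, and the function name is hypothetical.

```python
def retransmit_schedule(segments, rto_s=0.5):
    """Time in seconds (after the loss) at which each outstanding segment
    is retransmitted when only one segment goes out per fixed RTO."""
    return [round((i + 1) * rto_s, 3) for i in range(segments)]

# 5 full-MSS segments + 1 smaller segment outstanding, RTO of ~500 ms.
print(retransmit_schedule(6))  # [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
```

Under these assumptions the delay grows linearly with the number of segments still sitting in the retransmit queue.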

A note to your comment on using a small windowsize on a weak CPU.

The smaller the receive window is, the more read() activity is generated at the receiver causing a higher CPU load than necessary. So if you really mean to get as efficient as possible (= to save CPU cycles), you should make sure that a single read() can get as much data as possible from the receive buffer.
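As a back-of-the-envelope illustration, assume each read() can drain at most one receive window's worth of data (a simplification). The 64 KiB update size below is a made-up example, not a value from the trace.

```python
import math

def reads_needed(total_bytes, window_bytes):
    """Minimum number of read() calls if each call returns at most
    one receive window's worth of data."""
    return math.ceil(total_bytes / window_bytes)

update = 64 * 1024  # hypothetical size of one screen update
print(reads_needed(update, 8 * 1024))   # 8
print(reads_needed(update, 32 * 1024))  # 2
```

Fewer, larger reads mean fewer system calls and context switches for the same amount of data.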

I'd try to eliminate the "less-than-1-MSS-window-size" scenario that is triggering this sub-optimal behaviour of the VxWorks TCP stack by increasing the receive buffer in WinCE even more. With RTTs going as high as 4.7 ms due to the weak CPU, you would need 58k at 100 Mbps bandwidth; if you're on 1GbE, you need TCP1323Opts enabled to get higher than 64k.

In addition to that, it might be worth disabling SACK (SackOpts = 0) on WinCE to see if this changes the retransmission behaviour at the sender.
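The 58k figure is the bandwidth-delay product (bandwidth times RTT, in bytes). A small sketch to reproduce it; the helper names are made up, and 65535 is the largest window that fits in the 16-bit TCP window field without window scaling.

```python
MAX_UNSCALED_WINDOW = 65535  # 16-bit TCP window field, no window scaling

def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product in bytes."""
    return bandwidth_bps * rtt_s / 8

def needs_window_scaling(bandwidth_bps, rtt_s):
    """True if the BDP does not fit into an unscaled TCP window."""
    return bdp_bytes(bandwidth_bps, rtt_s) > MAX_UNSCALED_WINDOW

rtt = 0.0047  # 4.7 ms, the highest RTT mentioned above
print(round(bdp_bytes(100e6, rtt)))     # 58750 bytes at 100 Mbps
print(needs_window_scaling(100e6, rtt)) # False
print(needs_window_scaling(1e9, rtt))   # True
```

At 100 Mbps the required buffer still fits below 64k; at 1 Gbps it does not, which is where window scaling comes in.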

Regards Matthias

answered 04 Sep '13, 14:00

mrEEde
3.9k152270
accept rate: 20%

edited 04 Sep '13, 20:37

0

I had set the window size to 8k because the network performance for short packets (1 - 2k) is much better on this weak CPU with a small window size.

I set the window size back to 32k and made a new trace:
Capture

There are still ZeroWindow packets, and more and longer retransmissions. The retransmission time of frame 232 is 2.99 seconds!

It is unclear to me why the server waits 500 ms for each retransmission. In the meantime, the client sent window updates to the server, so the server should know that the client can receive new packets.

answered 04 Sep '13, 01:59

parfant
11114
accept rate: 0%

edited 04 Sep '13, 02:00