This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Delayed ack requirements for server


I have a NIC vendor that claims the server is not sending frames soon enough, and that the delayed ACK from the client is timing out, causing large latency.

I've uploaded part of a conversation which shows the delayed acks. See trace:

http://www.cloudshark.org/captures/2d1653abbaeb

Frame 21 is an example where the delayed ACK waits 200 ms before ACKing. Note that this ACK is for frame 19, which had the push bit set. Frame 19 is also not a full-size frame.

Normally we turn off delayed ACK at the client, and this resolves our latency problem. However, in this case the TCP stack is in the NIC card, and the vendor said turning it off is not supported.

My question is: is the server not obeying TCP? Below is what the NIC vendor is saying: "In summary, we believe that this latency phenomenon is a direct result of the TCP/IP stack implementation inside the target. Looking from the outside in, it is hard to say, but it almost appears that there may be some form of deadlock inside the target, as it sends a single segment and then waits for an ACK before transitioning to full transmit. With delayed ACK enabled, the VendorXYZ initiator sends this ACK for every other segment, or when the 200 ms timer expires (note: this is in accordance with the RFC guidelines for delayed ACK). With the target not sending the next frame until the 200 ms timer expires on the initiator side, it is this delay that explains the overall latency."

asked 31 Jul '12, 09:08

gipper


One Answer:


The delay at frame 21 seems to be because the server has nothing more to send, not because it's waiting on a delayed ACK. Frame 1 is a SCSI Read request. In the Packet Details, there is "ExpectedDataTransferLength" of 0x00004000. Translated to decimal, that's 16,384 bytes, which would bring us to sequence number 16,385. (I don't know the SCSI protocol, so I'm guessing at what the "ExpectedDataTransferLength" means, but it seems reasonable that it is the amount of data that will be sent in response to the Read request.) As soon as the client issues the Read request, the server starts sending data.
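The sequence arithmetic above can be sketched quickly (assuming, as the answer does, that ExpectedDataTransferLength is the payload the target will return for the Read):

```python
# Relative sequence arithmetic from the answer above, using Wireshark's
# relative sequence numbers (data starts at sequence number 1).
expected = 0x00004000        # ExpectedDataTransferLength from the Packet Details
print(expected)              # 16384 bytes
print(1 + expected)          # 16385: where the Read data should end
```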

Every packet the server sends has a full-sized (1,460 byte) TCP segment, until frame 19. The fact that frame 19 doesn't contain a full-sized segment and the server doesn't send any more data suggests that #19 is the last packet of the data stream, and the server doesn't have any more to send. At that point, we're up to ACK 16,433. That's not the 16,385 that we calculated earlier, but it's awfully close, and 0x00004000 is a suspiciously round number.
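The 48-byte gap between the two figures is suggestive: it matches the size of one iSCSI Basic Header Segment. That match is an observation, not something the capture proves, but the counts line up:

```python
# Byte counts from the trace. The 48-byte difference equalling the size
# of an iSCSI Basic Header Segment is an assumption worth checking in
# the Packet Details, not a conclusion the numbers alone can prove.
acked = 16433 - 1                  # final relative ACK minus initial sequence
print(acked - 0x4000)              # 48 bytes beyond the 16,384-byte Read payload

full, last = divmod(acked, 1460)   # 1,460-byte MSS seen in the trace
print(full, last)                  # 11 full segments plus a 372-byte one: 12 packets
```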

The last frame before #19 that the client acknowledged was #18, so the client waits for #20 before ACKing. No frame #20 is received, so the client ACKs frame #19 when the delayed ACK timer expires.

Note that the server does NOT resume sending at this point. Almost a full second goes by, then in frame #22 the client issues another SCSI Read request. Only then does the server begin sending data again, and it does so immediately.

So there was a 218 ms delay due to the delayed ACK, but a 981 ms delay waiting for the client to issue another Read request.

There seem to be two things going on here.

  1. The same amount of data is being transferred every time (16,432 bytes), and it happens to cause every transfer to end with one unacknowledged packet, which triggers delayed ACK every time.
  2. The client takes almost a full second to make the next Read request.

If you add up all the delays waiting on delayed ACKs, you get 3.49 seconds. If you add up all the delays waiting for the client to issue a SCSI Read request, you get 7.85 seconds.
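Putting those two totals side by side (figures as read off the capture above):

```python
# Totals from the trace, as reported in the answer.
delayed_ack_total = 3.49    # seconds spent in 200 ms delayed-ACK waits
read_gap_total = 7.85       # seconds waiting on the client's next Read request
print(read_gap_total / delayed_ack_total)   # ~2.25: the Read gaps cost more
```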

answered 31 Jul '12, 12:36

Jim Aragon

edited 31 Jul '12, 15:38

I notice that there is not only a 200 ms wait for the delayed ACK timer at the end of each data transfer, but again after each Read request. After the Read request, the server sends a single packet, waits for an ACK, and then starts sending multiple packets without waiting for an ACK after each one. I wonder if this is TCP Slow Start in action, and if the server parameters could be tuned to start with two packets instead of one, which would eliminate another cause of delay.

(31 Jul '12, 12:53) Jim Aragon
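Jim's tuning suggestion above, sketched for a stack where such a knob exists. On a Linux sender the initial congestion window is settable per route; the gateway and interface below are hypothetical, and the embedded NIC-resident stack in this thread may expose nothing comparable:

```shell
# Sketch only: raise the initial congestion window on a Linux sender so
# the first burst after a request is two segments instead of one.
# (Hypothetical gateway/interface; requires a reasonably modern iproute2.)
ip route change default via 192.0.2.1 dev eth0 initcwnd 2
```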

There are an even number of packets in the data transfer (12). If you could eliminate the single packet at the start, that would also eliminate the single packet at the end. It would eliminate BOTH occurrences of the 200 ms delayed ACK timer delay.

(31 Jul '12, 14:39) Jim Aragon
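The even/odd argument in the comment above can be modeled in a few lines. This is a toy model under one assumption: the client ACKs every second back-to-back segment immediately, and a lone trailing segment waits out the 200 ms timer.

```python
# Toy model of delayed ACK against bursts of back-to-back segments.
# An odd-sized burst leaves one segment unACKed, so the 200 ms timer fires.
def delayed_ack_waits(bursts):
    return sum(1 for burst in bursts if burst % 2)

print(delayed_ack_waits([1, 11]))   # server sends 1, then 11: two timer waits
print(delayed_ack_waits([2, 10]))   # start with 2 instead: no timer waits
```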

Jim

I was looking at the trace again yesterday. I see an issue in how I filtered/saved the trace, which probably blurred the picture.

Here's the new trace: http://www.cloudshark.org/captures/80772c0b0e6b

Now I've saved the first 70 frames only, with no display filter. The picture looks different now that you can see the server has an additional IP, 10.99.27.32. You can now see the delay occurs when the client talks to the server's second IP address for that additional 16,384-byte transfer. Note that you were spot on with the iSCSI 16,384-byte transfer, as the client is using IOmeter with 4 worker processes and an I/O size of 16,384. Let me know your thoughts now that you have the full conversations. I'm not sure why the delayed ACK happens in frame 43. Note that this ACK occurs before the ACK to frame 39, where the PUSH bit occurs.

(01 Aug '12, 06:58) gipper

I don't think the PUSH bit has anything to do with it. Frame 39 is part of a different conversation: different IP address and different port numbers. Frame 43 comes immediately after a Read request. See my comment above about this possibly being TCP Slow Start.

(01 Aug '12, 08:35) Jim Aragon

AFAIR, "slow start" and the like should only happen at the beginning of a TCP connection.

In this capture (ack1) the pattern is seen in every response by the server following the "read" request from the client.

I must say that this pattern is reminiscent of "Nagle's Algorithm" kicking in although I don't quite see if it really fits in this case.

See the Wikipedia entry for Nagle's Algorithm especially the section "Negative effect on non-small writes".

In any case, I do agree that the server (target) behavior seems incorrect.

I would not expect that disabling "delayed ack" should be required.

(01 Aug '12, 09:29) Bill Meier ♦♦

gipper, do you mind naming the vendors (not just the NICs, but also the client OS and the storage vendor)? Maybe it's a known problem with the firmware that can be found by searching Google. :-)

"However, in this case the TCP stack is in the NIC card and the vendor said turning it off is not supported."

Is that an "intelligent" iSCSI HBA? (I don't believe so, as Wireshark should not see any traffic in that case.) If not: how should I interpret the statement above? Why would turning off some "advanced" feature in a NIC driver be unsupported? If you CAN disable it through the GUI, why would using that option void support?

(01 Aug '12, 09:43) Kurt Knochner ♦

The Nagle Algorithm should only apply when the server has less than a full MSS ready to send, which is not the case here. You're probably right about slow start. I know that it should be used at the start of a connection; I'm wondering if some implementations might use it after the connection has been idle for a while, even though the RFC doesn't call for this.

(01 Aug '12, 09:44) Jim Aragon
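For reference, the two knobs debated in this thread look like this on an ordinary host stack (not the NIC-resident stack in question, which apparently exposes neither). TCP_NODELAY disables Nagle; TCP_QUICKACK, which is Linux-only and one-shot, suppresses the delayed ACK on the next segment:

```python
# How a normal host stack exposes the two options discussed above.
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)   # disable Nagle
if hasattr(socket, "TCP_QUICKACK"):                       # Linux only
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))  # non-zero
s.close()
```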