This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

TCP sender behaviour

/PUSH

anybody an idea?

Hi everybody,

during my recent analysis job, analyzing a TCP sender's burst behaviour was a challenging task.

Short form of my question: Does anybody know whether TCP's congestion management algorithms (cwnd, ssthresh, and the congestion avoidance / fast recovery techniques) apply to every single TCP connection regardless of its burstiness?

Example leading to my question: According to the TCP-related RFCs on congestion management / avoidance and extensions such as NewReno, the sending device follows several algorithms to determine its "sending speed", i.e. the number of packets it puts onto the wire before waiting for an ACK.

During a recent analysis I came across a typical downlink from 1 Gig to 100M where the sending servers on the Gig link started by sending loads of packets (50+ packets) at full line rate, which overran the buffers on the access switch.

If I understand the RFC specification of congestion avoidance correctly, every acknowledged packet (during slow start) increases cwnd and with it the number of bytes the sender can put on the wire - limited only by the client's window size.

Key question: Is that algorithm also used when several small request/response packets are exchanged? I can hardly imagine that this is the way it should work - for example, 200 requests and 200 responses with a decent pause between them would push cwnd up to a very high value, resulting in a tsunami-like burst of packets when a big file is requested afterwards...

There are many more detailed questions about the sender's behaviour as a whole, but this one is among the most important to me.

Has anyone dealt with TCP sender behaviour in that depth who could give me a hint?

Regards, Landi

drops cwnd tcp packet congestion

asked 20 Oct '10, 03:04

Landi

edited 09 Nov '10, 02:46

One Answer:

Good question!

Not every TCP session will follow the RFC guidelines for TCP behaviour. Why? Because of the diversity of logic amongst the various applications, stack shims, etc. It's been my experience that MOST application development environments (use whichever acronym you're used to) take vastly different stances on just how much control they take over the TCP stack. Most apps (again, IME) don't get into stack-level controls - they let the stack do what it does best. In these situations you'll see TCP slow start, active window size adverts, CWND, etc. Another thing to consider is the age of the stack. In the OLD days TCP slow start would only be used if the remote end was on a different subnet; now it seems that SS is almost always used in current stacks.

When it comes to the receive buffer (advertised window size), CWND, and the send buffer - the smallest of the three is what TCP uses. You will USUALLY only see the receive buffer (advertised window size) in the capture. You can't see the send buffer in the capture, but sometimes you can derive info related to CWND. To answer part of your question regarding CWND - during slow start it will grow exponentially with each round trip, BUT once it reaches the size of the SEND or RECEIVE buffer the stack is limited by whichever is smallest.
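To make the "smallest of the three" rule concrete, here is a minimal sketch. The function name `usable_window` and the example values are made up for illustration; a real stack tracks these per connection and applies further rules (Nagle, pacing, and so on) that are not modelled here.

```python
def usable_window(cwnd: int, rwnd: int, send_buffer: int,
                  bytes_in_flight: int = 0) -> int:
    """Bytes the sender may still put on the wire right now (illustrative only)."""
    # The effective limit is the smallest of the congestion window,
    # the receiver's advertised window, and the local send buffer.
    limit = min(cwnd, rwnd, send_buffer)
    # Data already sent but not yet ACKed counts against that limit.
    return max(0, limit - bytes_in_flight)


# Early in a connection: cwnd is the bottleneck, not the 2 MB receive window.
print(usable_window(cwnd=14600, rwnd=2 * 1024 * 1024, send_buffer=65536))

# After cwnd has grown large: the advertised receive window is the limit.
print(usable_window(cwnd=4 * 1024 * 1024, rwnd=2 * 1024 * 1024,
                    send_buffer=8 * 1024 * 1024))
```

Only `rwnd` is visible in a capture; `cwnd` and the send buffer have to be inferred from how much data the sender actually keeps in flight.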

Have I muddied the waters enough? To put it simply - the RFCs are nice and informative, but it's rare that they are strictly followed in every implementation. It's also quite possible for programmers (app, kernel, stack, etc.) to control communications directly and avoid all built-in TCP stack controls. MS CIFS is a prime example of this - it follows no rules but its own.

answered 09 Nov '10, 08:16

GeonJay

THANKS a lot for stepping into the deep water where my question is located at ;)

Your answers are 100% clear to me - what I was looking for is a step further into the inside of the TCP stack doing "flow control".

Let me split my question into 2 smaller but more specific ones:

(09 Nov '10, 08:35) Landi

Do you know IF (!) slow start, and later congestion avoidance after reaching ssthresh, also apply and thereby enlarge CWND in a request-response oriented TCP session without loads of traffic?

Example: TCP handshake, and after that the client requests something every 500 ms and the server answers every request with < 1000 bytes. Would that grow CWND to the max. (which is the receive window of the client)?

That would mean the next request for a HUGE file inside that session would result in the server sending a full r-wnd w/o waiting for a single ACK --> Vista 2MB r-wnd --> "Tsunami" :)

(09 Nov '10, 08:38) Landi

2nd: I have heard the term "ACK-clocked" so often: when a server on a Gigabit link sees incoming ACKs roughly every 125 µs (the spacing of full-size frames at 100 Mbit/s), why the hell would it keep sending more and more data only because the receive window of the client (e.g. 2 MByte on Vista) allows it? I have a trace file where the bytes in flight are sky-rocketing and the server (SMB) is still pushing, because the r-wnd ALLOWS it in theory...

Is there a key point I am missing, or is there really nothing else than making the receive window really small to stop those servers from overfilling switch buffers?!

(09 Nov '10, 08:41) Landi
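A rough back-of-the-envelope sketch of why a line-rate burst from the Gigabit side hurts the access switch. Both helper names are made up for illustration, and the model assumes back-to-back full-size frames (1538 bytes on the wire including preamble and inter-frame gap), an initially empty queue, and no other traffic.

```python
def serialization_time_us(frame_bytes: int, rate_bps: float) -> float:
    """Time in microseconds to clock one frame onto a link of the given rate."""
    return frame_bytes * 8 / rate_bps * 1e6


def peak_queue_packets(burst_packets: int, in_rate_bps: float,
                       out_rate_bps: float) -> float:
    """Approximate peak egress queue depth (in packets) for one back-to-back burst."""
    if in_rate_bps <= out_rate_bps:
        return 0.0  # the egress port keeps up, nothing queues
    # While the burst arrives, the egress port only drains the fraction
    # out_rate / in_rate of it; the rest must sit in the buffer.
    return burst_packets * (1 - out_rate_bps / in_rate_bps)


# Frame spacing on the 100 Mbit/s side, which is what paces the returning ACKs.
print(round(serialization_time_us(1538, 100e6), 1), "us per frame at 100 Mbit/s")

# A 50-packet burst arriving at 1 Gbit/s and leaving at 100 Mbit/s.
print(round(peak_queue_packets(50, 1e9, 100e6)), "packets queued at the peak")
```

Under these assumptions about nine out of every ten packets of the burst have to be buffered, so the switch needs room for roughly 45 full-size frames on that one port to avoid drops.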

Both of my answers are based on an RFC-compliant stack. 1) Yes, SS is applied to each connection. Once cwnd reaches SSThresh you switch from SS to CA; the client's advertised RCV window caps what can be in flight either way. If we start with an MSS of 1460, then each received ACK adds one MSS to cwnd, so the window roughly doubles every round trip (1460, 2920, 5840, etc.) until we hit the RECV window. If Window Scaling is used (evidenced by Vista's 2MB r-wnd) then it's fully possible for the server to drop enough data on the wire to fill the advertised RECV window - the 2MB Tsunami that you mentioned. Most workstations can't handle that kind of load being dropped.

(09 Nov '10, 10:07) GeonJay
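The growth described above can be sketched as a toy simulation. It assumes one ACK per full-size segment, no loss, and a sender that always has data to send; `slow_start_rounds` is a made-up name, and real stacks differ (delayed ACKs, larger initial windows, byte counting). It also does not model the RFC 5681 section 4.1 recommendation to reset cwnd to the restart window after an idle period longer than one RTO, which is the rule most relevant to the request/response case.

```python
MSS = 1460  # bytes, as in the example above


def slow_start_rounds(initial_cwnd: int, ssthresh: int, rwnd: int,
                      rounds: int) -> list:
    """Return cwnd in bytes at the start of each round trip (toy model)."""
    cwnd = initial_cwnd
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        # The sender can only fill the smaller of cwnd and the receive window.
        window = min(cwnd, rwnd)
        acked_segments = window // MSS  # assumption: one ACK per segment
        for _ in range(acked_segments):
            if cwnd < ssthresh:
                cwnd += MSS  # slow start: +1 MSS per ACK, doubling per RTT
            else:
                # congestion avoidance: roughly +1 MSS per RTT in total
                cwnd += max(1, MSS * MSS // cwnd)
    return history


# Pure slow start with a very large ssthresh and receive window.
print(slow_start_rounds(MSS, ssthresh=10**9, rwnd=10**9, rounds=4))

# With a low ssthresh the growth flattens once cwnd reaches it.
print(slow_start_rounds(MSS, ssthresh=2 * MSS, rwnd=10**9, rounds=4))
```

In this model the increase comes per ACK, and the doubling per round trip is a consequence of every segment in the window being ACKed.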

2) My friend, this is the world we live in. Should we limit the sale of beer to an individual because they're alcoholic? The servers assume that client can handle the data because the window is so large. We have a large Solaris system here that will, quite often, drop ~50MB of data on the wire to a user laptop. In the trace you can see the large R-WND Advert coming from the client, you then see the HUGE data dump come down the pipe, then you see nothing but ZeroWindow adverts coming from the client for a few cycles. Your option is to artificially limit the Windows on either side.

(09 Nov '10, 10:12) GeonJay
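One way an application can "artificially limit the window" on its own side is to cap the socket receive buffer before connecting. The sketch below uses the standard `SO_RCVBUF` socket option; the helper name `make_capped_socket` and the 64 KB value are illustrative assumptions. How the OS rounds, doubles, or clamps the value, and whether setting it switches off receive-window auto-tuning for that socket, depends on the stack.

```python
import socket


def make_capped_socket(rcvbuf_bytes: int) -> socket.socket:
    """Create a TCP socket whose receive buffer is requested to be rcvbuf_bytes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set this before connect(): the window scale factor is negotiated
    # during the handshake, based on the buffer size at that time.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    return sock


sock = make_capped_socket(64 * 1024)
# The OS reports what it actually granted, which may differ from the request.
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
sock.close()
```

This only helps for applications you control; a system-wide limit has to be configured in the OS itself.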

Another option is to implement QoS on the switch. Rate limiting or policing can help by dropping enough packets to slow TCP down. Although QoS can be complicated (in the Cisco world, you have line card, supervisor, and IOS dependencies to worry about), rate limiting or policing traffic so you don't overrun the receiver can be done, especially if you're not talking about an n-way combination. One question though: is the spiral of death (caused by a torrent of zero windows?) causing performance issues? If there's no contention on the wire, letting it run as fast as possible may not be an issue.

(09 Nov '10, 15:14) hansangb
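For readers unfamiliar with policing, a single-rate token bucket is the usual textbook model. The sketch below is a generic illustration, not any vendor's implementation; the class name, the 100 Mbit/s rate, and the 16000-byte burst allowance are assumptions chosen for the example.

```python
class TokenBucketPolicer:
    """Single-rate token bucket: packets that exceed the bucket are dropped."""

    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last_time = 0.0

    def allow(self, now: float, packet_bytes: int) -> bool:
        # Refill tokens for the time elapsed, capped at the bucket size.
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True   # conforming: forward
        return False      # exceeding: drop, no tokens consumed


def police_burst(packets: int, packet_bytes: int, spacing_s: float,
                 policer: TokenBucketPolicer) -> list:
    """Feed a back-to-back burst through the policer; return pass/drop per packet."""
    return [policer.allow(i * spacing_s, packet_bytes) for i in range(packets)]


# 50 packets of 1500 bytes arriving at Gigabit spacing (12 us apart).
result = police_burst(50, 1500, 12e-6, TokenBucketPolicer(100e6, 16000))
print(sum(result), "forwarded,", result.count(False), "dropped")
```

The first packets of the burst pass on the initial bucket, after which only about one packet in ten conforms. Those drops are what push the sender into fast recovery or slow start.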

@hansangb: QoS on the switch is absolutely no option in this current case. The interesting thing is that the TCP stack of the Vista client is not impressed at all by large packet drops and keeps advertising its huge receive window. This results in a very predictable round of packet drops once the server comes out of fast recovery and/or slow start.

What I have read officially from MS about rate limiting is that the way to handle it is to limit the sending rate via software QoS on the MS servers... I'm pretty interested in how and whether this works...

Thus your point is interesting, thanks a lot for the input.

(15 Nov '10, 03:03) Landi

@GeonJay Thanks for keeping up with my questions. I already feared that the client's r-wnd would be the only lever at present - though (see my comment above) officially server rate limiting should do as well, but is not so easily applicable.

Now I look forward to seeing huge SMB(2) traffic bombs after SMB's initial "smalltalk" blows the cwnd for Vista/7 clients up to > 100k per session :-/

I might post another topic, if something else comes up - will analyse this issue the next month in detail.

THANKS a lot so far

(15 Nov '10, 03:09) Landi
