I am running UltraVNC on Windows 7 as the server and Mac OS X as the client. Both are on the same network, so the effective latency was close to zero; to simulate real-life conditions, I introduced a latency of 100 ms using ipfw on the Mac. Performance then became really bad, so I took a dump, which is uploaded at cloudshark.org. In this graph, the data transfer is capped at about 37 kilobytes per 100 ms, and I want to understand why this is happening. The TCP window size on both the Mac and the Windows 7 machine is much bigger. Why is the sender not utilizing the entire TCP window before waiting for an ACK, or is there something fundamentally wrong in the way I understand it? Another interesting observation: the initially announced TCP window size was 65k for the Mac and 8K for Windows 7, but both get reduced drastically. Why would this happen? asked 19 Nov '12, 07:27 Aditya Patawari |
3 Answers:
From the trace you provided, there is indeed no reason for the sending stack not to send more than ~37k at once, especially since the receive window allows about 4 times that much. After a certain period of time, the sender should start sending bursts close to the receive window advertised by the client if a) the application delivers data fast enough and b) there is no packet loss reducing the send rate. So my guess is that it is either some issue with ipfw or simply an application issue, but that's hard to tell. From a network perspective there is no reason for that behaviour. BTW: I don't see a dramatic reduction of the receive windows; both are > 64k all the time. answered 19 Nov '12, 08:49 Landi
UVNC is capping the data, and this has been verified by them on their forums. Thanks for the help. (21 Nov '12, 06:15) Aditya Patawari
Can you please provide the link to that statement? It would make the whole discussion here useful for others. BTW: You can verify the statement about UVNC by removing the 100 ms delay in ipfw and then posting a new capture file of that connection for the benefit of all. (21 Nov '12, 06:52) Kurt Knochner ♦ |
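To put numbers on the argument above, here is a minimal sketch of the usual "window per round trip" bound. It assumes the ~37 KB burst and the 100 ms delay from the question, and treats the receive window as exactly 4 times the burst, which is only the approximation given in the answer. The function and constant names are made up for illustration.

```python
def window_limited_throughput(window_bytes: int, rtt_ms: int) -> float:
    """Upper bound in bytes/s when at most window_bytes may be unacknowledged per round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return window_bytes * 1000 / rtt_ms


RTT_MS = 100                        # artificial delay added with ipfw (from the question)
OBSERVED_BURST = 37_000             # ~37 KB per round trip (from the question)
ASSUMED_RWND = 4 * OBSERVED_BURST   # assumption: "about 4 times that big"

observed_cap = window_limited_throughput(OBSERVED_BURST, RTT_MS)
rwnd_cap = window_limited_throughput(ASSUMED_RWND, RTT_MS)

print(f"cap with observed burst: {observed_cap:,.0f} bytes/s")
print(f"cap with full receive window: {rwnd_cap:,.0f} bytes/s")
```

Under these assumptions the sender could move roughly four times as much data per round trip as it actually does, which is why the receive window alone does not explain the limit.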
Cloudshark shows a pretty hard limit for the throughput at ~ 40 KByte/s. But if you look at
This shows an average throughput of ~ 1.65 MBit/s for this stream.
There is a different picture as well. There are "clusters" around 100 and 250 KByte/s, with some variation around 750 KByte/s as well. These values are consistent with the IO Graphs of Wireshark (Bits/tick), which show ~1.6 - 1.7 MBit/s. I'm not sure what Cloudshark shows in that graph; however, it is 'kind of' different from what Wireshark shows for the throughput. I would rather rely on the statistics of Wireshark than on the graph of Cloudshark (until I fully understand how that graph was created). Regards answered 19 Nov '12, 09:27 Kurt Knochner ♦ edited 19 Nov '12, 09:28
It's not about the throughput in kByte/s. The problem here is that the burst size of the TCP data being sent does not match the advertised receive window. That's why the performance is of course higher than 37 kByte/s but doesn't come close to the throughput possible on the local LAN. (19 Nov '12, 09:37) Landi
I did check Wireshark. It says 157334 Bytes/sec, which is 15.7k per 100 ms, which is somewhat consistent with Cloudshark. (19 Nov '12, 09:41) Aditya Patawari
Ah, right. I did not realize the "resolution" of 0.1 seconds of the Cloudshark graph. Sorry, and forget what I said ;-) (19 Nov '12, 10:50) Kurt Knochner ♦
Yes, the default time interval for this graph is 0.1 seconds. You can select the "open in editor" option from the graph and change the resolution to 1 second. You can't "save" these settings currently without having a CloudShark appliance. (20 Nov '12, 04:34) cloudshark |
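The confusion in this exchange came down to units and graph resolution, so a small conversion sketch may help. It uses only figures quoted in the thread (157334 Bytes/sec, ~1.65 MBit/s, 0.1 s ticks). The helper names are hypothetical, and 1 MBit is taken as 1,000,000 bits, which is an assumption.

```python
def bytes_per_interval(bytes_per_second: float, interval_ms: int) -> float:
    """Convert a per-second byte rate into bytes per graph tick of interval_ms."""
    return bytes_per_second * interval_ms / 1000


def mbit_to_bytes_per_second(mbit_per_second: float) -> float:
    """Convert MBit/s to bytes/s, assuming 1 MBit = 1,000,000 bits."""
    return mbit_per_second * 1_000_000 / 8


wireshark_rate = 157_334                                  # Bytes/sec reported in the comment
per_tick = bytes_per_interval(wireshark_rate, 100)        # Cloudshark default tick is 0.1 s
io_graph_rate = mbit_to_bytes_per_second(1.65)            # average from the Wireshark IO graph

print(f"{per_tick:.1f} bytes per 100 ms tick")
print(f"{io_graph_rate:.0f} bytes/s from 1.65 MBit/s")
```

A graph with 0.1 s ticks shows values ten times smaller than a per-second figure for the same traffic, which is the mismatch that was cleared up in the comments.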
After the "slow start", the maximum "bytes in flight" is always 35124 (before waiting for an ack). (Retracted: "Not surprisingly, given the 100 ms delay time, this matches pretty closely the '37 kilobytes per 100 ms' mentioned in the question.") Correction: as noted previously, the actual throughput is on the order of 200 KBytes/sec. Given the hard limit seen for "bytes in flight", I would have to believe that there's a "congestion window" mechanism or something similar going on. I would expect the throughput to increase as the artificial ack delay is reduced. answered 19 Nov '12, 10:05 Bill Meier ♦♦ edited 19 Nov '12, 10:17
That cannot be correct, because otherwise you could not get high speeds over WAN links. Or do you mean before congestion avoidance kicks in? Where do you have the information about slow start limiting the congestion window from? (19 Nov '12, 10:42) Landi
I'm only noting the hard (repeated) limit for "bytes in flight" that I see in the capture (35124) and speculating that the reason is a congestion window limit or something similar. I am not suggesting that the "slow start" at the beginning of the connection is related to what follows. (I certainly haven't made a study of the arcane details of TCP "slow start", "congestion windows" of various flavors, etc.) As noted, it would be interesting to see what happens as the artificial delay is reduced. (19 Nov '12, 10:57) Bill Meier ♦♦
Looking at the time-sequence graph, one can also see that there's an additional delay every 90K bytes or so. After looking at the capture in detail, it appears that there's also an application-layer ack going on, wherein the sender waits for a 10-byte reply from the receiver before resuming after every 90K bytes sent. This seems to add an additional 100 ms delay for every 90K bytes sent. So: if 90K takes around 400 ms to send, then this more or less matches the 200 KB/sec throughput observed. (19 Nov '12, 11:08) Bill Meier ♦♦
Hmmm... I wonder what the TCP send buffer size is... (assuming this applies). (@Landi: maybe, in effect, there is an "application issue" ...) (19 Nov '12, 12:00) Bill Meier ♦♦
The TCP send buffer grows exponentially and later linearly after slow start until it either reaches the receive window size or packet loss occurs. That's why I was referring to the send window, aka congestion window, being capped not by stack behaviour but by the amount of data being delivered by the application. This is, from my perspective, the only reasonable explanation for TCP not sending more data in one burst than seen here. (20 Nov '12, 05:04) Landi |
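The "90K in around 400 ms" estimate can be reproduced with a rough model: each burst of at most 35124 bytes costs one 100 ms round trip, and each 90K block costs one extra round trip for the application-layer reply. This is only a sketch under those assumptions. It takes "90K" as 90,000 bytes, ignores transmission time on the LAN, and all names in it are invented for illustration.

```python
import math


def block_transfer_time_ms(block_bytes: int, burst_bytes: int,
                           rtt_ms: int, app_ack_ms: int) -> int:
    """Time to send one block: one RTT per burst plus one application-level ack wait."""
    round_trips = math.ceil(block_bytes / burst_bytes)
    return round_trips * rtt_ms + app_ack_ms


BLOCK_BYTES = 90_000   # assumption: "90K" read as 90,000 bytes
BURST_BYTES = 35_124   # maximum bytes in flight seen in the capture
RTT_MS = 100           # artificial delay from ipfw
APP_ACK_MS = 100       # extra wait for the 10-byte application reply

elapsed_ms = block_transfer_time_ms(BLOCK_BYTES, BURST_BYTES, RTT_MS, APP_ACK_MS)
throughput = BLOCK_BYTES * 1000 / elapsed_ms

print(f"{elapsed_ms} ms per block, about {throughput:,.0f} bytes/s")
```

With these inputs the model gives three bursts plus one application ack per block, which lands in the same range as the roughly 200 KB/sec observed in the capture.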
Could you reproduce the scenario without ipfw and upload the capture to Cloudshark? I wouldn't rule out the stack getting confused by artificially introduced delays and the like.
I created a dump using HTTP instead of VNC at the same 100 ms artificial latency, and there the window size and the amount of data transferred were almost equal. So I am guessing that this is not a problem with ipfw.
So if I understand your comment correctly, you are saying that with HTTP the throughput is fine? That would point even more strongly to an application issue that is not network related.