Hello All, Thanks for reading. I was wondering if you could take a look at these logs and tell me what's happening.
I'll admit straight away I know almost nothing of networking other than the basics like setting up routers and port forwarding. Nothing low level.
The problem is that I get very different transfer speeds throughout the day. The logs show a transfer of about 200KB and it can range from ~2 seconds to 20+ seconds.
Here's what I know / what I've tried:
I've made a client/server program in .NET using TcpClient/TcpListener, and it transfers byte arrays of UTF-8 encoded strings as messages on the network stream object.
The read buffers are set to 1460 bytes (not sure it makes a difference, but as I understand it that's the default packet size for the internet, and it helps in debugging because I can see if there's a delay).
After some research I found that it could be a problem with the packets not getting acknowledged quickly enough, so I followed the Microsoft KB article and set the registry value of TcpAckFrequency to 1.
The server uses non-blocking sockets, and I read there can be a problem with them if the data you're sending in a single write is larger than the buffer on the NIC, so I also set the NonBlockingSendSpecialBuffering registry value to 1.
I've also used setSockOpt to change the send buffer size so I know it's larger than the information being sent.
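For readers who want to reproduce the setup, here is a rough sketch of the buffer settings described above. It is written in Python rather than .NET, and the 256 KB send buffer is an assumed value chosen only because it exceeds the ~200KB message; the function names are hypothetical. The operating system may round or cap the requested size, so the sketch reads the value back instead of trusting it.

```python
import socket

READ_CHUNK = 1460          # read size mentioned in the post
SEND_BUFFER = 256 * 1024   # assumed value, larger than the ~200KB message


def make_socket(send_buffer=SEND_BUFFER):
    """Create a TCP socket and request a larger send buffer."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_buffer)
    return sock


def effective_send_buffer(sock):
    """Return the send buffer size the OS actually applied."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


def read_message(sock, length, chunk=READ_CHUNK):
    """Read exactly `length` bytes, at most `chunk` bytes per recv call."""
    parts = []
    remaining = length
    while remaining > 0:
        data = sock.recv(min(chunk, remaining))
        if not data:
            raise ConnectionError("peer closed before the full message arrived")
        parts.append(data)
        remaining -= len(data)
    return b"".join(parts)
```

Note that the read size only controls how much the application pulls from the stream per call; it does not change how TCP splits the data into segments on the wire.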
All the logs show information from the same code and the same data. Nothing has changed; it's just a different time of day.
The problem always seems to start when a DUP ACK is sent to the server. This seems to put it in a spin and it doesn't recover.
At midday I usually get the average transfer speed (2-7 seconds), in the afternoon I get the slow speeds (20+ seconds) and at night it's all nice and fast (1-2 seconds).
I've tried looking at the logs but other than noticing the DUP ACKs I don't really know what I'm looking for (or if there's something missing that should be there).
So my questions are these: Is there anything I can do about this from my code? Or is it simply traffic on the line causing drops?
Help me Obi-Wan, you're my only hope,
asked 10 Feb '13, 05:05
To me this looks like you've got different line qualities depending on the time of day. I have seen this happen in a couple of cases where the fastest speeds were between midnight and noon, getting worse from noon to midnight. As far as I can tell the reason for this is usually that - even though everybody has their own internet connection - the service providers bundle them at "last mile" connection nodes/hubs. This means that when more people are at home surfing the web, connectivity gets worse for all customers connected to a local hub, while it gets better when they go to sleep or are off to work.
I looked at your traces for a bit and as far as I can tell you suffer from packet loss (notice the "previous segment not captured" messages) in the slow AND average trace, but not the fast one. Your trace files are just excerpts from a larger capture, so unfortunately there is no TCP session handshake in the files to see the setup parameters, but I don't think they will tell us much. So without having done a full scale investigation I'd say your line has different qualities over the course of a day and you'll probably have to live with it. You could try and switch the internet service provider, but often you'll still end up on the same leased line with just a different address from where the invoice is sent.
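To put the timings reported in the question into perspective, here is a quick calculation of the effective throughput. It assumes "about 200KB" means 200,000 bytes and takes the upper bound of each reported range (2, 7 and 20 seconds), so treat the results as rough figures only.

```python
def throughput_kbit_per_s(size_bytes, seconds):
    """Effective throughput in kbit/s for a transfer of size_bytes."""
    return size_bytes * 8 / seconds / 1000


SIZE = 200_000  # assumption: "about 200KB" taken as 200,000 bytes

# Upper bound of each range reported in the question
for label, secs in [("night", 2), ("midday", 7), ("afternoon", 20)]:
    print(f"{label:>9}: {throughput_kbit_per_s(SIZE, secs):.0f} kbit/s")
# prints 800, 229 and 80 kbit/s respectively
```

Even the fast case is well below what a typical broadband line offers, and the slow case is a tenth of that, which is consistent with repeated loss and recovery rather than with a steady bandwidth limit.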
By the way - please let the TcpAckFrequency stay where it was... acknowledging every packet will not help in most cases, and most certainly does not in yours. It's putting additional work on your nodes and line, and even though it's minimal effort you should go back to the usual ACK frequency of every other packet (and the trace is a little awkward to read if you're used to less frequent acknowledges, like I am). My advice would be to leave the stack settings alone until you are sure that you have a symptom that can be fixed with this. If it doesn't help (as it didn't in your case) go back to what it was.
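As a rough illustration of the extra work, this sketch counts the acknowledgements needed for one transfer. It assumes a 200,000-byte message, full 1460-byte segments, and no loss, and it ignores the delayed ACK timer, so it is a back-of-the-envelope estimate rather than what a real stack would do.

```python
import math


def segment_count(size_bytes, mss=1460):
    """Number of full-size segments needed to carry size_bytes."""
    return math.ceil(size_bytes / mss)


def ack_count(segments, ack_frequency):
    """ACKs sent when the receiver acknowledges every ack_frequency-th segment."""
    return math.ceil(segments / ack_frequency)


segments = segment_count(200_000)   # 137 segments
print(ack_count(segments, 1))       # TcpAckFrequency = 1 -> 137 ACKs
print(ack_count(segments, 2))       # every other segment -> 69 ACKs
```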
Oh, and another thing: "The problem always seems to start when a DUP ACK is sent to the server. This seems to put it in a spin and it doesn't recover." - this isn't true. First of all, the problem starts with packet loss (DUP ACKs being a result of that, not the cause) indicated by "previous segment not seen" messages. And it does recover - if you check what segments are missing when something was "not seen" you can also see that the retransmissions coming in are filling up the gap each time. But the next packet loss happens shortly after the previous was recovered from, so it looks like there is constant trouble. But still, it does recover - otherwise, your transfer would never see the end of it at all :-)
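A toy model may make the cause and effect clearer. The sketch below simulates a receiver that only sends cumulative ACKs; it ignores SACK, timers and window handling, and the sequence numbers are made up for illustration. One segment is lost, the following segments trigger duplicate ACKs, and the retransmission closes the gap.

```python
def receiver_acks(arrivals, mss=1460):
    """Return the cumulative ACK number sent after each arriving segment.

    arrivals: starting sequence numbers in arrival order; every segment
    is assumed to carry exactly mss bytes.
    """
    received = set()
    next_expected = 0
    acks = []
    for seq in arrivals:
        received.add(seq)
        # Advance past every segment that is now contiguous
        while next_expected in received:
            next_expected += mss
        acks.append(next_expected)
    return acks


# The segment starting at 2920 is lost and arrives last as a retransmission
arrivals = [0, 1460, 4380, 5840, 2920]
print(receiver_acks(arrivals))
# [1460, 2920, 2920, 2920, 7300]
```

The repeated 2920 values are the duplicate ACKs: the receiver keeps asking for the missing segment. Once the retransmission arrives, the ACK jumps to 7300, which is the recovery.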
answered 10 Feb '13, 17:29
edited 10 Feb '13, 17:35
A possible cause for the packet loss is simply another customer system up-/downloading large amounts of data, thus filling up the available bandwidth (ISO up-/download, P2P traffic, web radio, YouTube, online backup, malware uploading the customer's data/files, etc.). Unless you capture the whole traffic at the router interface, you won't see that, and in your capture file (created at the internal client) you will only see the result: packet loss and slow transfer speed.
Did you check the utilization of your internet link during the times you run into this problem?
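If your router exposes interface byte counters, utilization can be estimated from two readings taken a known interval apart. This is only a sketch: the counter values, the 10-second interval and the 8 Mbit/s link speed below are hypothetical, and how you obtain the counters depends on your router.

```python
def link_utilization(bytes_start, bytes_end, interval_s, link_bit_per_s):
    """Fraction of link capacity used between two byte-counter readings."""
    bits = (bytes_end - bytes_start) * 8
    return bits / interval_s / link_bit_per_s


# Hypothetical readings: 7.5 MB transferred in 10 s on an 8 Mbit/s link
usage = link_utilization(1_000_000, 8_500_000, 10, 8_000_000)
print(f"{usage:.0%}")  # 75%
```

Values close to 100% during the slow periods would point to a saturated link rather than a problem in your program.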
answered 12 Feb '13, 05:10
Kurt Knochner ♦
edited 12 Feb '13, 05:11