This is my second attempt at trying to find out what is going on. My first attempt was here: I'm now fairly convinced that the Retransmission and Dup ACK packets were a red herring. If so, sorry for the confusion - as I said the first time, my knowledge of TCP is not good.
The situation is still this: When I write PC-to-PC I get around 10 MB per second. Android device on WiFi to PC is about half that, and that's OK. But when I try sending from PC to the Android device the transmission rate drops below 1/2 MB per second, which is not good.
I'm trying to use Wireshark to see if it can tell me why this is. Here are two Wireshark captures:
https://www.cloudshark.org/captures/7133e5f96577
https://www.cloudshark.org/captures/47ff5505d1e1
As far as I can see the big difference is that in the PC-to-PC version I'm sending 5 KB and 15 KB packets every 1 ms, while in the PC-to-Android version I'm typically sending 1.5 KB packets every 2 ms.
Is this because the WiFi router (a SonicWall firewall / WiFi router) is telling my PC it doesn't want more data? It does specify an initial window size of only 8 KB, but after that it typically indicates a window size of around 30 KB. Is it the low initial window size that's the problem? Or something else? And is there anything I can do to improve this performance?
Thanks in advance.
asked 13 Oct '15, 06:50 RenniePet
edited 13 Oct '15, 06:52
One Answer:
After reviewing both captures, the slow transfer rate in the PC-to-Android capture is due to the long gaps between packets after the Android device (172.16.31.178) sends a TCP Window Update. If you add a column in Wireshark displaying the time between frames (frame.time_delta), this becomes evident. Then sort the Delta time column from highest to lowest. In the PC-to-PC capture, the largest time delta is about 12.5 ms, and it occurs at the beginning of the data transfer. However, in the PC-to-Android capture, there are multiple instances in which the time delta between frames is around 300 ms! In fact, there are 67 instances in which the time delta between frames exceeds 250 ms (67 * 0.25 s = 16.75 s, at a minimum, in which no data is being transferred between endpoints - no wonder your throughput suffers). What is even more interesting is that the following occurs:
This appears to be an issue at a lower layer - maybe WiFi issues? Can you post a capture which includes all the data (no filters and make sure to include the WiFi frames)? Make sure the WiFi capture is decrypted before posting.
answered 13 Oct '15, 12:11 Amato_C
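The delta-time check described in the answer can also be done outside the Wireshark GUI. The following is only a minimal sketch: it assumes the per-frame timestamps have already been exported as seconds since the start of the capture, and the function names and sample timestamps are hypothetical, not taken from the captures in this thread.

```python
from typing import List, Tuple


def frame_deltas(times: List[float]) -> List[float]:
    """Gap between each frame and the one before it (what frame.time_delta shows)."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def summarize_stalls(times: List[float], threshold: float = 0.25) -> Tuple[int, float]:
    """Count the gaps longer than `threshold` seconds and add up their duration."""
    stalls = [d for d in frame_deltas(times) if d > threshold]
    return len(stalls), sum(stalls)


# Hypothetical timestamps in seconds: steady traffic with two ~300 ms pauses.
sample_times = [0.000, 0.002, 0.004, 0.304, 0.306, 0.308, 0.610, 0.612]

count, idle = summarize_stalls(sample_times)
print(f"{count} gaps over 250 ms, {idle:.3f} s idle in total")

# Lower bound used in the answer: 67 gaps of at least 0.25 s each.
print(f"67 gaps of 0.25 s add up to {67 * 0.25} s")
```

Sorting the output of `frame_deltas` in descending order gives the same view as sorting the Delta time column.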
Thank you very much for your insights.
The "no filters" part I get. How do I include WiFi frames, please?
Also, if you'd be so kind to take a look at my first question, posted a couple of days ago? There I talk about the fact that the app running on the Android device is my own programming, and that I'm using something called Java NIO.
In the Android logcat I'm seeing some messages like this:
10-12 03:25:45.777 21339-21371/com.Merlinia.MMessaging_Test I/System.out﹕ [CDS]EAGAIN or EWOULDBLOCK in Recvfrom
Does any of this ring any bells with you as to what I'm doing wrong? (If it is me that is doing something wrong.)
Thanks again.
"How do I include WiFi frames, please?" I thought you were capturing over WiFi. If you are not, then skip this for now.
Also, in your original post, you asked the following question: "Why does my PC choose to send small packets to the WiFi router and large packets to the other PC?" So I have additional questions:
1. Is the PC-to-PC transfer also made over WiFi?
2. You also mentioned that Android-to-PC transfer speeds are OK, but PC-to-Android is not. Are you using the same PC to send and receive data in these tests?
3. In your PC-to-PC tests, do you use the same PC as in the previous testing? Have you tested PC1 to PC2 and then reversed the transfer, PC2 to PC1?
Hi Amato_C: It seems that the trace is a local PC trace with TCP Offloading enabled. And the WiFi stuff is done by the SonicWall.
So I think the most probable reason for that packet loss is the WiFi network. (But that is just a guess.)
Thanks for your help, both of you.
The Wireshark capture is being done on the same PC as my PC program runs on, i.e., 10.2.2.20. So I guess I'm only seeing the traffic between my PC and the SonicWall WiFi router.
Until I started researching this problem I was assuming the WiFi router was just passing packets back and forth blindly and without intervention, so my PC and the Android device were talking "directly" with each other. But it isn't that simple, is it?
Sorry, my understanding of this is poor.
Answering Amato_C's questions ...
No, they're both on the company's internal network. (Old-fashioned copper.)
Yes, exactly the same setup and the same test programs. (Both programs can be told to send and receive, or just to receive.)
Yes, 10.2.2.20 is my development/test PC in all of the tests.
Yes, it's the same.
Thanks again for your help.
I've now uploaded the unfiltered version of the PC-to-Android capture file. (But I had to delete the filtered version to do it due to my storage quota as a guest on that site.)
https://www.cloudshark.org/captures/d708bfed5490
Let's focus on the PC-to-Android setup. I assume that the PC is connected via a wire to the network (wired Ethernet to the switch/wireless router) and the Android is connected to the network through WiFi. Is this correct?
I assume this setup remains the same during all your testing (Android-to-PC and PC-to-Android).
Since you are capturing the packets on the PC (10.2.2.20), you cannot provide the WiFi capture frames, which is unfortunate. But if you can access the configuration of the WiFi router, here are some things to check:
Fragmentation threshold = 2346
RTS Threshold = 2347
If your Android supports WiFi at 5 GHz, change the WiFi network to 5 GHz. I don't think it is an RF issue since the Android-to-PC speeds are okay, but it never hurts to remove as many variables as possible.
I also looked at your new capture and unfortunately it does not provide any new information.
OK, I think I've figured it out. Amato_C's suggestion of creating a column showing DeltaTime was very revealing.
Things start off running fairly OK except that the Android device keeps reporting a larger and larger window size, until it hits what is apparently its limit at 522176 bytes (1/2 MB!) around packet number 658. Then it starts sending [TCP Window Update] packets at packet number 698. Then everything goes horribly wrong, with long delays of approx. 300 ms. again and again.
So then I realized that the problem is presumably something as banal as my Android app not being able to process the incoming messages as fast as they're arriving, so the input buffer fills up. I've tried running a CPU profiler (Google's Traceview), and if I'm interpreting the results correctly, the Android device is running at 100% CPU, partly in my app, partly in Java NIO, partly doing garbage collection, plus some other minor things.
I'll change my test programs (PC and Android) to use fewer and larger messages, and see if that doesn't fix, or at least significantly alleviate, the performance problem.
Amato_C: First off, sorry for not having seen the comment you posted yesterday, before I posted my last comment.
And thank you very much for your time.
This is driving me CRAZY.
I changed my test programs, and now the Android app is not CPU-bound, and it no longer reports an increasing window size in the ACKs - the window size now varies between 25 and 29 KB.
But the performance problem remains. Throughput is miserable, and there are many, many 300 ms. pauses.
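As a rough sanity check on whether a 25 to 29 KB window could by itself explain the throughput: a sender can have at most one receive window in flight per round trip, so throughput is bounded by window / RTT. The sketch below only illustrates that arithmetic. The window sizes are the ones reported above, but the round-trip times are assumptions (a few milliseconds for a healthy LAN/WiFi path, and 300 ms to model a path where every window is followed by one of the pauses), and the function name is hypothetical.

```python
def max_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on TCP throughput in bytes/s: one receive window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


# Window sizes are from the thread (in KB); the round-trip times are assumptions.
for window_kb in (25, 29):
    for rtt_ms in (2, 300):
        rate = max_throughput(window_kb * 1024, rtt_ms / 1000)
        print(f"window={window_kb} KB, rtt={rtt_ms} ms -> at most {rate / 1e6:.2f} MB/s")
```

Under these assumed numbers the window allows well over 10 MB/s at a 2 ms round trip, but only about 0.1 MB/s if each window is followed by a 300 ms wait, which suggests looking at the pauses rather than at the window size itself.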
As for your recommendations:
WiFi router - Fragmentation threshold = 2346 - yes.
WiFi router - RTS Threshold = 2347 - no, it was also 2346. I changed it to 2347 but that didn't seem to change anything.
Android device - change the WiFi network to 5GHz. - I don't see any setting like that on the two Android devices I'm playing with.
I've uploaded another Wireshark log.
https://www.cloudshark.org/captures/07005a922a42
One thing I think I'm seeing is that as long as the PC is sending packets of 1514 bytes (Len=1448) things are marching along OK. But as soon as the PC sends a packet larger than 1514 things go wrong.
For example, packet numbers 286, 288, 290 and 292 are accepted OK. Then packet 294 is larger than 1514 bytes, and this causes Dup ACKs and a retransmission (packet 302).
And about packet 302, it's apparently a retransmission of packet 294 (Seq=206001), but it's only half the size????
I'd really appreciate it if you have any new suggestions, thanks.
"But as soon as the PC sends a packet larger than 1514 things go wrong. " Well, as posted in your initial thread my suggestion is to turn off TCP Segmentation Offload if this is causing you problems in the communication towards the android device
@mrEEde Yes, I think the same.
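Frames larger than 1514 bytes in a capture taken on the sending PC are typical when segmentation offload is enabled: the capture sees the large segment before the network adapter splits it into MTU-sized frames. A minimal sketch of how such frames could be flagged from a list of frame numbers and lengths follows. It assumes a standard 1500-byte Ethernet MTU plus a 14-byte Ethernet header; the function name is hypothetical, and the length used for frame 294 is made up for illustration rather than read from the capture.

```python
from typing import Iterable, List, Tuple

MTU = 1500                            # assumed standard Ethernet MTU
ETHERNET_HEADER = 14                  # destination MAC + source MAC + EtherType
MAX_WIRE_FRAME = MTU + ETHERNET_HEADER  # 1514 bytes, the size seen in the thread


def oversized_frames(frames: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the (frame_number, length) pairs that could not fit on the wire as one frame."""
    return [(number, length) for number, length in frames if length > MAX_WIRE_FRAME]


# Frame numbers are the ones discussed above; the length of 294 is hypothetical.
sample = [(286, 1514), (288, 1514), (290, 1514), (292, 1514), (294, 2962)]

for number, length in oversized_frames(sample):
    print(f"frame {number}: {length} bytes, larger than {MAX_WIRE_FRAME}")
```

If a check like this flags frames only in the PC-to-Android direction of a capture taken on the PC, that is consistent with the offload suggestion above, though it does not by itself prove that offload is the cause of the stalls.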
Thanks again to all who answered. I've apparently been going around in circles despite having been given a correct suggestion by @mrEEde a couple of days ago. (But I've certainly learned a lot about TCP and Wireshark, so that's good.)