We have a small application that goes out to a web server and pulls down a report (about 57K on average). It works fine on a standalone workstation, where the whole process takes a couple of seconds. Put that application on a Citrix server and you're waiting just over 2 minutes. And you don't even need to go through Citrix: I can RDP to the Citrix server and reliably reproduce the issue over and over from the console. In the app you click Print, which brings down the PDF and opens it in Adobe or Foxit; it doesn't matter which, the result is the same. Running Wireshark on the Citrix server (which is a Win2003 virtual server running on Hyper-V), I can see where and why it hangs, but I'm at a standstill. Can someone help? Here's the relevant info (I think), filtered for just HTTP:
Then boom, the PDF opens and all is right. I'm new at this, so please bear with me. THANK YOU!
asked 03 Mar '12, 08:59 Willmeister edited 03 Mar '12, 09:06 SYN-bit ♦♦
One Answer:
I doubt it has anything to do with the server being a Citrix server. Unfortunately, your trace isn't telling much in terms of timings and TCP sequence/ack behaviour, so it is kinda hard to tell what happens and where the time goes. I guess it is caused by something that is different for the server machine than for a workstation - a firewall, an Intrusion Prevention System, other IP subnet ACLs somewhere, DNS name resolution failures, that kind of thing. If you could create and provide a trace file it might be easier to tell, but I'm not sure if you can do that without disclosing company internals (or sensitive information in general). If you can, you might want to upload it to CloudShark and post the URL so I/we can take a look.
answered 03 Mar '12, 11:47 Jasper ♦♦
I don't know what else to do as this still isn't working, so here it is: http://cloudshark.org/captures/29ddf69aafee I know the answer is in there, I just can't find it. Anything is much appreciated. (15 Mar '12, 16:32) Willmeister
As far as I can tell it looks like your server with the IP 96.44.176.10 has a problem. If you filter your trace on "tcp.stream==0" you can see that packet 13 coming from the client is never ACKed by the server, and is retransmitted a couple of times in packets 15 thru 26, waiting longer and longer each time. In 27 the client gives up and sends a reset packet. Right after that, in 29, the server finally answers and ACKs packet 13, but it's too late. My guess is that there was a big holdup on the path to the server, which is why its answer (in 29) arrives very, very late. Or it was just very busy. (15 Mar '12, 17:00) Jasper ♦♦
To check that you'll need to capture both at the client and at the server at the same time. That way you can compare the traces and tell where the holdup is. Interestingly enough, the communication in stream 2 works just fine as far as I can tell, and the server responds quickly. (15 Mar '12, 17:02) Jasper ♦♦
Wow, I wasn't too far off in my interpretation of what's going on. That 96.44 IP is the vendor of the software that's hanging up. The source in packet 13 is our server, with a small client piece. I gave them the trace earlier and told them to follow the TCP stream from 13. Every time you run a report from this software the trace is the same: we push an XML file to them, then the retransmits start. Their argument is that I'm the only customer with an issue, so it has to be on my side. There's logic there, since I can put the client on nearly anything and it will work. (15 Mar '12, 18:44) Willmeister
But our Citrix servers are virtual 2003 Standard running in 2008R2 Hyper-V. If you run this client from host or virtual, the same thing happens every single time. (15 Mar '12, 18:44) Willmeister
Oh, and thank you. I will try and see if I can get them to do a trace from their side. Right now I'm not even thinking about the virtual Citrix servers, I'm just running this client software and Wireshark on the Hyper-V host trying to narrow it down. The latest NIC drivers helped some, but it's obviously still hanging up. (15 Mar '12, 18:46) Willmeister
Ok, what you need to do is to track down where the packets are delayed. It's still possible it has something to do with your servers, so if you suspect it is still a problem in your network you might want to capture at your ISP uplink to see if you see the same packets as on the server, or if there's already delay inside your network. (16 Mar '12, 00:59) Jasper ♦♦
I don't really suspect it's anything on the network, other than something with the virtual NICs on the Hyper-V. I have other servers, both Citrix and non-Citrix, both Windows 2003 and 2008, sitting on the same switch, same VLAN, and there are no problems. The app works flawlessly. Vmswitch.sys is the virtual NIC driver and it's very suspect to me. Thanks for all your input. (16 Mar '12, 05:55) Willmeister
Got a traffic shaper/prioritizing system in the path somewhere? This kind of thing reminds me of some strange behaviors I have seen in the past when that kind of device was in use. Especially with Citrix servers, those shaping appliances are often deployed to guarantee bandwidth and low latency to terminal session users. That will put your XML traffic in a very low class and delay it, sometimes for ages. (16 Mar '12, 07:51) Jasper ♦♦
The proxy. But I couldn't find anything in the logs. I got the vendor to agree to a packet capture on their end, hoping it yields some info as to why we go unacknowledged. But after that I'll revisit the proxy. (16 Mar '12, 08:10) Willmeister
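The pattern described for stream 0 (a data segment that is never ACKed and is resent after ever longer waits) can also be spotted with a few lines of script once the packet list is exported. Here is a minimal sketch in Python, assuming the packets have been reduced to (time, source, sequence number, payload length) records. The `Segment` layout, the function names and the sample values are all hypothetical and are not taken from the capture linked above.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    time: float   # seconds since the start of the capture
    src: str      # sender IP address
    seq: int      # TCP sequence number
    length: int   # TCP payload length in bytes


def find_retransmissions(segments):
    """Return {(src, seq): [gaps in seconds]} for data segments sent more than once."""
    sent_at = defaultdict(list)
    for seg in segments:
        if seg.length > 0:  # pure ACKs carry no payload, so skip them
            sent_at[(seg.src, seg.seq)].append(seg.time)

    result = {}
    for key, times in sent_at.items():
        if len(times) > 1:
            times.sort()
            # Gap between each transmission and the one before it.
            result[key] = [round(b - a, 3) for a, b in zip(times, times[1:])]
    return result


def is_backing_off(gaps):
    """True if every wait is at least as long as the previous one."""
    return all(later >= earlier for earlier, later in zip(gaps, gaps[1:]))


# Hypothetical sample: one segment is sent, then resent three times.
sample = [
    Segment(0.00, "10.0.0.5", 1000, 512),
    Segment(0.30, "10.0.0.5", 1000, 512),
    Segment(0.90, "10.0.0.5", 1000, 512),
    Segment(2.10, "10.0.0.5", 1000, 512),
    Segment(2.15, "10.0.0.9", 7000, 0),  # late pure ACK from the other side
]

retrans = find_retransmissions(sample)
for (src, seq), gaps in retrans.items():
    state = "backing off" if is_backing_off(gaps) else "irregular"
    print(src, seq, gaps, state)
```

Growing gaps like these only show that the sender is not seeing ACKs; they do not say where on the path the ACKs go missing.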
The proxy was the problem, and I was never going to find that until the vendor put Wireshark on their side in a test environment, had me connect directly to that server, and we captured it all from both sides. We were posting an XML page and apparently not receiving ACKs back, but they were in fact ACKing, over and over again. I enabled verbose logging in the proxy and saw exactly why those were dropping. It turned out that adding an exception for their web servers was not sufficient; I had to disable filtering completely, and then everything worked. Thank you so much for all the help!
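For anyone repeating the two-sided capture, the comparison boils down to finding packets that one side sent and the other side never saw. Below is a rough sketch in Python, assuming each capture has been reduced to a list of (sender, seq, ack) tuples for a single TCP connection, and that sequence numbers are the same on both sides (nothing in between terminates and re-originates the connection). The function name and the sample data are made up for illustration.

```python
from collections import Counter


def lost_in_transit(sent_side, received_side, sender):
    """Return packets from `sender` present in sent_side but missing in received_side.

    Each capture is a list of (src, seq, ack) tuples. A Counter is used so that
    repeated identical packets (retransmissions, duplicate ACKs) are counted
    individually rather than collapsed into one.
    """
    sent = Counter(p for p in sent_side if p[0] == sender)
    received = Counter(p for p in received_side if p[0] == sender)
    return sent - received  # Counter subtraction keeps only positive counts


# Hypothetical data: the server ACKs three times, only one ACK reaches the client.
server_capture = [
    ("client", 1000, 1),
    ("server", 1, 1512),
    ("client", 1000, 1),
    ("server", 1, 1512),
    ("client", 1000, 1),
    ("server", 1, 1512),
]
client_capture = [
    ("client", 1000, 1),
    ("client", 1000, 1),
    ("client", 1000, 1),
    ("server", 1, 1512),
]

missing = lost_in_transit(server_capture, client_capture, "server")
for packet, count in missing.items():
    print(f"{count} packet(s) {packet} left the server but never reached the client")
```

If the missing packets all travel in one direction, as the ACKs did here, the device to look at is whatever sits between the two capture points on that path.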