This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Webservice seems to pause in wireshark and then wake up and respond

Our HTTPS webservice response time has gone from about 3 seconds to 16 seconds. Running Wireshark (which I am new to, so I can't read it efficiently, but I am trying), what I seem to see is that we get an encrypted handshake message from our data vendor (no. 1231). We ACK it in no. 1240. Then it almost seems that our webservice goes to sleep or something (I don't know what), and about 15 seconds later, at no. 1486, it finally sends the request to the data vendor (PSH, ACK). After that all processing works as it should. I don't know why we are waiting 15 seconds before we send the request out to the data vendor. The only thing I really see happening in the intervening time is some communication with our database, but it doesn't look like data being transmitted, mostly just keep-alive traffic. Any help in reading the Wireshark capture to find out why we now have a long wait before sending the request to our data vendor would be greatly appreciated. Is there any way to post the Wireshark capture here? Or any suggestions on what to look at to see why we wait? Thank you.

thank you. here is the url to the capture: http://www.cloudshark.org/captures/ba340820a9f4 note that 10.6.62.125 is our app server, 64.129.51.113 is our data vendor, database is 216.109.70.144

asked 27 Jan '13, 10:14

sgaf

edited 27 Jan '13, 12:41

If you can, please post it at http://www.cloudshark.org and add the URL to your question - please make sure that you only post trace files that do not contain any sensitive data.

(27 Jan '13, 11:16) Jasper ♦♦

One Answer:

I took a look at your trace and filtered on the session starting in frame 1218. You've read it correctly as far as I can tell - your app server takes about 15 seconds after it acked the handshake message before it continues to send data. Question is, why. So I marked frames 1240 and 1486 (last frame before the gap and first frame after), and cleared the filter to see what happens in between.

I guess you (or someone else) were logged into the app server using Microsoft Terminal Services, because there are a lot of packets on TCP port 3389. I filtered them out with "not tcp.port==3389" and kept looking for other, more interesting things. I tried to find connections that start after the first marked frame and finish just before the second, which would indicate that the app server has to do something else first before it can continue using the session to the data vendor.

I ended up with a filter like this: "(ip.src==10.6.62.125 and tcp.flags==2) or frame.marked==1". That shows the two marked frames plus any SYN coming from your app server. The only thing that shows up between the marked frames are three SYN packets to the same IP (168.143.241.184) that do not get answered at all. Filter on "(ip.addr eq 10.6.62.125 and ip.addr eq 168.143.241.184)" and you'll see there is nothing coming back.
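The same idea can be sketched outside Wireshark. The following is only an illustration of the logic behind that filter, not something taken from the capture: the `Packet` tuple, the `unanswered_syns` helper, the timestamps and the port numbers are all made up for the example. It collects SYNs sent by one host and reports the ones that never got a SYN/ACK or RST back.

```python
from typing import NamedTuple

SYN, RST, ACK = 0x02, 0x04, 0x10  # TCP flag bits; tcp.flags==2 means "SYN only"


class Packet(NamedTuple):
    """Hypothetical, simplified packet record (not a Wireshark or library type)."""
    time: float   # seconds since start of capture
    src: str
    dst: str
    sport: int
    dport: int
    flags: int    # raw TCP flags value


def unanswered_syns(packets, client_ip):
    """Return {(dst, dport): [SYN times]} for client SYNs with no SYN/ACK or RST reply."""
    pending = {}      # (client port, server ip, server port) -> SYN timestamps
    answered = set()  # same key, filled in when the server replies
    for p in packets:
        if p.src == client_ip and p.flags == SYN:
            pending.setdefault((p.sport, p.dst, p.dport), []).append(p.time)
        elif p.dst == client_ip:
            is_synack = (p.flags & (SYN | ACK)) == (SYN | ACK)
            is_rst = bool(p.flags & RST)
            if is_synack or is_rst:
                answered.add((p.dport, p.src, p.sport))
    result = {}
    for (sport, dst, dport), times in pending.items():
        if (sport, dst, dport) not in answered:
            result.setdefault((dst, dport), []).extend(times)
    return result


# Made-up sample: one answered handshake and three SYNs that get no reply.
sample = [
    Packet(0.00, "10.6.62.125", "64.129.51.113", 50000, 443, SYN),
    Packet(0.05, "64.129.51.113", "10.6.62.125", 443, 50000, SYN | ACK),
    Packet(1.00, "10.6.62.125", "168.143.241.184", 50001, 80, SYN),
    Packet(4.00, "10.6.62.125", "168.143.241.184", 50001, 80, SYN),
    Packet(10.00, "10.6.62.125", "168.143.241.184", 50001, 80, SYN),
]
print(unanswered_syns(sample, "10.6.62.125"))
```

Retransmitted SYNs reuse the same source port, which is why they end up grouped under one destination here.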

Now for the interesting thing: the third unanswered SYN happens just 1.4 seconds before the TLS session to the data vendor finally continues, which is a bit too obvious to be a coincidence. There is a pretty good chance that these failed connection attempts to 168.143.241.184 are what is holding up the TLS session, so here are your next steps:

  • find out what that IP 168.143.241.184 does, and why your app server wants to connect to it
  • if it should do that, find out why it gets blocked. Three SYNs without a RST or SYN/ACK coming back usually means that there is a Firewall/Packet Filter blocking the SYN, maybe even a local firewall running on that node if someone forgot to open the port
  • if it shouldn't connect to that IP, get the app server fixed
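For the "why does it get blocked" step, a quick probe run from the app server can tell a silently dropped SYN apart from a closed port. This is a minimal sketch using Python's standard `socket` module; the helper name `probe_tcp` and the 3-second timeout are assumptions for the example, and the timeout is the script's own limit, not the operating system's SYN retry schedule.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify one TCP connection attempt (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"          # handshake completed (SYN/ACK came back)
    except ConnectionRefusedError:
        return "refused"           # a RST came back: host reachable, port closed
    except socket.timeout:
        return "no-reply"          # nothing came back in time: likely filtered
    except OSError as exc:
        return f"error: {exc}"     # e.g. no route to host
```

Calling something like `probe_tcp("168.143.241.184", 80)` from the app server and getting "no-reply" would match the picture in the trace, where the SYNs leave but nothing is returned.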

answered 27 Jan '13, 16:26

Jasper ♦♦

(Nice analysis ...)

(27 Jan '13, 16:34) Bill Meier ♦♦

thank you :)

(27 Jan '13, 17:31) Jasper ♦♦

Hi Jasper, thank you for your thorough analysis. That's great information. That IP address is not one we deal with, and we shouldn't be connecting to it. But to see if that solved the issue, I opened the firewall for port 80, and it brought our response time down to about where it should be. Each time I run it, it connects to a different IP address, but always on port 80. Some of the various IPs come back as deploy.akamaitechnologies.com. I don't know if that is something valid (say, trying to validate the SSL certificate of our vendor) or something totally bogus. Thank you so much!!!

(27 Jan '13, 19:01) sgaf

Akamai is a provider that has a lot of different IP addresses as distributed access points to content someone hosts on their infrastructure, so if you see connections to various IPs it isn't that surprising. Question is, what does the app server want from them? Now that you've opened the firewall, you might want to take another capture to look at the information being exchanged. Since it's on port 80, it should be clear text. Maybe a certificate revocation list (CRL) or something like that is checked.

Oh, and please accept my answer with the checkmark next to it to show that it helped ;-)

(28 Jan '13, 01:03) Jasper ♦♦

I did another capture; I just need to analyze it. Thanks again for your help.

(28 Jan '13, 08:19) sgaf

It's doing a GET of /msdownload/update/v3/static/trustedr/en/authrootstl.cab, but I'm not sure which certificate is out of date. Most of the information transferred is not readable.

(29 Jan '13, 13:48) sgaf

As far as I can tell it is an update file for the list of root certificates, and it doesn't have to mean that the existing ones are out of date. Some applications just pull the latest list regularly to be sure to be up-to-date with it.

(29 Jan '13, 16:21) Jasper ♦♦

Who da man??? Jasper's da man! :)

(29 Jan '13, 19:32) hansangb

Also, you can see things like this visually by using "Statistics, Flow Graph" (thanks Sake!). You can see how the various connection startups may be impacting your "real" connection. I presented this exact type of problem in a 2009 Sharkfest session (slow SSH login). Maybe I'll do this as my next blog post!

(29 Jan '13, 19:39) hansangb

Thx Hansang :)

(30 Jan '13, 08:15) Jasper ♦♦