
How many TCP warnings should there be on a cross-datacenter connection


I'm trying to diagnose some issues between an Azure VM and a Google Compute Engine VM. Every now and then, the Azure server reports that it cannot connect over HTTP to the GCE machine. There are no errors logged on the GCE machine.

I ran a packet capture for a while, and if I filter with _ws.expert.severity >= note, over 2% of all packets are flagged. 0.1% of all TCP packets are flagged as retransmissions. Apart from that, there seems to be a repeating pattern of "TCP Previous segment not captured", then "TCP Dup ACK", then "TCP Out-Of-Order". I see those groups of 3 packets repeated all over, with no apparent real effect such as increased http.time.

Does this sound typical? Could the fact that it's a VM under KVM on GCE be causing some confusion here?

asked 11 Feb '15, 16:52


MichaelGG


One Answer:


How many TCP warnings should there be on a cross-datacenter connection

In an ideal world there should be zero, but in the real world there are always errors, no matter what type of connection it is (by the way, you did not mention the link type). Without knowing the link type, I'd say 0.1% retransmissions is more than O.K.

Regarding the rest of your reported problems:

  • Not every problem the Wireshark expert marks as >= "note" is a real problem. You have to look at the flagged packets yourself to classify them.
  • Keep in mind that errors shown in Wireshark are not always real errors on the link. If you capture on a heavily loaded link, the capturing system might drop packets (NIC, OS, etc.) or be unable to write all frames to disk fast enough (disk slower than the network). Those frames are missing only from the capture file, so they create false positives while you analyze it. It is therefore good practice to check whether there were drops during the capture (dumpcap reports this when it stops).
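One way to classify the flagged packets rather than rely on the overall percentage is to count each TCP analysis flag separately. The following is only a rough sketch in Python, assuming tshark is installed and on the PATH. The helper names (tshark_cmd, count_frames, percentage) are made up for illustration, and the capture file name is passed on the command line.

```python
import subprocess
import sys

# Wireshark display filters for the expert items mentioned in the question.
FLAGS = {
    "retransmission": "tcp.analysis.retransmission",
    "previous segment not captured": "tcp.analysis.lost_segment",
    "duplicate ACK": "tcp.analysis.duplicate_ack",
    "out-of-order": "tcp.analysis.out_of_order",
}


def tshark_cmd(capture_file, display_filter):
    """Build a tshark call that prints one frame number per matching frame."""
    return ["tshark", "-r", capture_file, "-Y", display_filter,
            "-T", "fields", "-e", "frame.number"]


def count_frames(capture_file, display_filter):
    """Count the frames in capture_file that match display_filter."""
    result = subprocess.run(tshark_cmd(capture_file, display_filter),
                            capture_output=True, text=True, check=True)
    return sum(1 for line in result.stdout.splitlines() if line.strip())


def percentage(part, total):
    """Return part as a percentage of total (0.0 for an empty capture)."""
    return 0.0 if total == 0 else 100.0 * part / total


# Usage (hypothetical script name): python count_flags.py capture.pcapng
if __name__ == "__main__" and len(sys.argv) == 2:
    capture = sys.argv[1]
    total = count_frames(capture, "tcp")
    for label, display_filter in FLAGS.items():
        n = count_frames(capture, display_filter)
        print(f"{label}: {n} of {total} TCP frames "
              f"({percentage(n, total):.3f}%)")
```

If "previous segment not captured" dominates while retransmissions stay low, that points more toward frames missing from the capture or reordering than toward real loss on the link, but each case still needs a look at the actual packets.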

To troubleshoot your problem, I suggest running dumpcap with ring buffer files (see the dumpcap man page) and with a capture filter for the destination IP address and port 80. Then monitor the error logs of the Azure server (with a script) and stop dumpcap as soon as you see the error messages. Then take a look at the last capture file and try to find failed TCP connections (RST, etc.) and/or HTTP error messages.
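The procedure above could be scripted roughly as follows. This is a minimal sketch, not a tested tool. It assumes a Unix-like capture host with dumpcap on the PATH, and the interface name, IP address, log path and error pattern are placeholders you would have to supply. The filter uses "host" rather than "dst host" so that the replies from the GCE machine are captured as well.

```python
import re
import subprocess
import sys
import time


def build_dumpcap_cmd(interface, dest_ip, out_file,
                      port=80, file_kb=102400, files=10):
    """Build a dumpcap call using a ring buffer of `files` files of `file_kb` kB."""
    capture_filter = f"host {dest_ip} and tcp port {port}"
    return ["dumpcap", "-i", interface, "-f", capture_filter,
            "-b", f"filesize:{file_kb}", "-b", f"files:{files}",
            "-w", out_file]


def follow(path, poll_seconds=1.0):
    """Yield lines appended to `path` after we start watching (like tail -f)."""
    with open(path, "r", errors="replace") as log:
        log.seek(0, 2)  # jump to the end so old entries are ignored
        while True:
            line = log.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)


def capture_until_error(cmd, log_path, pattern):
    """Run dumpcap until a log line matches `pattern`, then stop it."""
    regex = re.compile(pattern)
    proc = subprocess.Popen(cmd)
    try:
        for line in follow(log_path):
            if regex.search(line):
                print("error seen, stopping capture:", line.strip())
                break
    finally:
        # SIGTERM on Unix-like systems, so dumpcap can close the file cleanly.
        proc.terminate()
        proc.wait()


# Usage (all values are placeholders):
#   python capture_on_error.py eth0 203.0.113.10 /var/log/app/error.log "cannot connect"
if __name__ == "__main__" and len(sys.argv) == 5:
    iface, ip, log_path, pattern = sys.argv[1:5]
    capture_until_error(build_dumpcap_cmd(iface, ip, "/tmp/gce.pcapng"),
                        log_path, pattern)
```

dumpcap adds a sequence number and timestamp to each ring buffer file name, so the newest file is the one covering the moment the error was logged. When it stops, dumpcap also prints how many packets were dropped during the capture.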

Regards
Kurt

answered 12 Feb '15, 06:15


Kurt Knochner ♦