This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Potential Packet Loss

Two sites are connected by a VPN over a WAN link. We are replicating our VMware VMs over the link from our head office (Site A) to a hosted service provider (Site B). We have 10MB fibre at our end and they have 100MB at their end.

The VPN runs from our MS TMG 2010 box (edge firewall mode) to the hosted provider's Cisco ASA. We have followed the recommendations for TMG-Cisco ASA VPN settings (encryption, integrity, DH group etc.).

The VPN is stable, no disconnects.

But we are getting some unusual connection behaviour: VMware SRM connection issues (various errors/disconnects) and RDP session hangs, amongst other intermittent things.

So I ran a Wireshark trace on both Virtual Center servers (one at each site) during the same time period so that I could analyse packets leaving one side and arriving at the other side. I am fairly new to Wireshark but will list my findings:

  • We seem to have intermittent packet loss in both directions. I can see packets leaving Site A, for example, and not arriving at Site B, and vice versa.
  • Packets seem to be arriving 'out of order'. Pardon my poor description, but when matching up packets in both captures, I can see packets leaving in sequence, yet on the destination server they are in a different order. Again, this happens in both directions. I am not sure if this is normal behaviour.
  • Packets arriving with a different ACK number? I am not sure if this is possible, but it looks like the same packet has left Site A and arrived at Site B with a different ACK number.

I tried carrying out simultaneous captures on both Virtual Centers and on the TMG box to see if I could pin down exactly where the packet loss is occurring, but I can only see the flow of traffic in one direction on the TMG. I can see traffic coming back from Site B on the TMG, but obviously it is the encrypted VPN traffic, so I am unable to see whether the packet loss is on the TMG or somewhere else.

Can anyone offer any hints or tips which would aid me in nailing this one down? I believe I can capture traffic on my laptop for a non-Windows device (like the Cisco ASA). Would my best bet be to run simultaneous captures on the two VCs, the TMG and the Cisco ASA?

I should point out that we have other VPNs set up over lower-bandwidth, more highly contended links and don't have any connection issues with them.

Many thanks, Steve

asked 06 Aug '13, 09:35

tebers


One Answer:

If I read your question correctly, you made traces on the VM guests and on the TMG box itself. As these boxes process the traffic themselves and may have some optimizations enabled (TCP checksum offload, TCP segmentation offload, etc.), you will not see exactly what is put on the network, because capturing takes place between the IP stack and the NIC driver. It is better to use mirror/SPAN ports to copy the packets found on the network to Wireshark.

I would proceed with one Wireshark system per location:

  1. At Site A I would span the port to which the TMG is connected. You will see the unencrypted data both ways and also the encrypted data both ways (you might want to use a capture filter like "arp or icmp or host <VM site A> or <VPN endpoint site B>").

  2. At Site B I would span both ports of the ASA (the Internet side and the WAN side) and then (if needed) use a capture filter of "arp or icmp or host <VM site B> or <VPN endpoint site A>".
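As a rough illustration of the capture filters in the two steps above, the filter string can be built from the two addresses that matter at each site. This is only a sketch: the function name `build_capture_filter` and the addresses (taken from the documentation ranges 192.0.2.0/24 and 198.51.100.0/24) are placeholders and not values from this thread.

```python
def build_capture_filter(vm_host: str, vpn_peer: str) -> str:
    """Return a libpcap capture filter that keeps ARP, ICMP, and all
    traffic to or from the local VM and the remote VPN endpoint."""
    # "host X" matches packets with X as either source or destination.
    return f"arp or icmp or host {vm_host} or host {vpn_peer}"


# Placeholder addresses (assumptions): substitute the real VM and VPN endpoint IPs.
site_a_filter = build_capture_filter("192.0.2.10", "198.51.100.1")
print(site_a_filter)
```

The resulting string can be pasted into Wireshark's capture filter field, or passed to dumpcap or tshark with the `-f` option.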

Now you can follow the whole flow and see where the packet loss and re-ordering are occurring. Please note that IPsec ESP packets carry a sequence number as well, so you can check the encrypted packets for loss and re-ordering too.
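One way to check the ESP sequence numbers is to export them in arrival order and look for gaps and late arrivals. The sketch below assumes the numbers for a single SPI (one direction of the tunnel) have already been exported to a list, for example with something like `tshark -r capture.pcap -Y esp -T fields -e esp.sequence`. The function name `analyze_esp_sequence` and the sample data are made up for illustration.

```python
def analyze_esp_sequence(seqs):
    """Classify ESP sequence numbers (in arrival order, one SPI only)
    into missing, reordered (arrived late) and duplicate values."""
    missing = set()      # numbers skipped so far and not yet seen
    reordered = []       # numbers that arrived after a higher one
    duplicates = []      # numbers seen more than once
    seen = set()
    highest = None
    for seq in seqs:
        if seq in seen:
            duplicates.append(seq)
            continue
        seen.add(seq)
        if highest is None:
            highest = seq
        elif seq > highest:
            # Everything between the old highest and this one is a gap for now.
            missing.update(range(highest + 1, seq))
            highest = seq
        else:
            # Arrived late: it fills an earlier gap, so it was reordered, not lost.
            reordered.append(seq)
            missing.discard(seq)
    return {
        "missing": sorted(missing),
        "reordered": reordered,
        "duplicates": duplicates,
    }


# Made-up sample: 3 arrives after 4, 6 and 7 never arrive, 8 is seen twice.
result = analyze_esp_sequence([1, 2, 4, 3, 5, 8, 8])
print(result)
```

Running this on the capture taken at each SPAN point shows where in the path the gaps first appear. Sequence numbers restart when the tunnel rekeys and a new SPI is negotiated, so each SPI should be analysed separately.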

answered 06 Aug '13, 22:35

SYN-bit ♦♦

edited 06 Aug '13, 22:38

Thanks for your reply.

Well, the first trace I carried out was just on the VCs, simply because the VCs hold the SRM role and it is SRM connection issues we are seeing.

The second set of traces was on the VCs and the TMG, and that was where I discovered that I was only seeing the flow of traffic in one direction.

And I had disabled checksum offload on all the NICs on the VCs and the TMG before running all the traces.

But regardless of all that, what you have suggested will still be the case. Now... I understand the logic of it perfectly, but being fairly new to Wireshark I am not 100% sure how to put it into practice! So I may come back to you. :)

Thanks again, Steve

(08 Aug '13, 08:25) tebers