NFS and TCP ACKed Unseen Segment

Question

Hi,

Environment consists of VMware ESXi 5.0 (Patch 9) servers and NetApp NAS/Filer.

Experiencing periodic drops of NFS exports from the ESXi hosts. The exports automatically reconnect after 3 minutes.

Storage vendor has reviewed tcpdump captures from the VMware hosts and believes they indicate network issues.

Seeing many "TCP ACKed unseen segment" and "TCP Previous Segment not captured" messages in Wireshark.

Captures uploaded to - https://www.cloudshark.org/captures/a0fdd8dbca3e

tcpdump configuration used -

# tcpdump-uw -i vmk3 -s 1514 -C 1M -W 10 -w /vmfs/volumes/test.pcap
tcpdump-uw: WARNING: SIOCGIFINDEX: Invalid argument
tcpdump-uw: listening on vmk3, link-type EN10MB (Ethernet), capture size 1514 bytes
tcpdump-uw: pcap_loop: recvfrom: Interrupted system call
189449 packets captured
189449 packets received by filter
0 packets dropped by kernel

Thanks for your assistance!

Answer 1

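What you have is massive packet loss in the capture itself, meaning that your capture wasn't able to record all the packets coming in. "TCP ACKed unseen segment" means Wireshark saw an ACK for data that never appeared in the trace; since the receiver acknowledged that data, it did arrive, and only the capture missed it. This comes as no surprise, because intense storage traffic cannot be captured on a normal PC without this kind of loss. You'd need a special high-performance appliance for this kind of thing.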

There are many places where packet loss can occur, with the kernel being only one of them. So even if tcpdump reports "0 packets dropped by kernel", it doesn't mean nothing was dropped.

What you could do is reduce the number of bytes captured per packet (the snap length, set with -s) to 64 or 128 bytes, because in situations like this the payload doesn't matter. You want to look at TCP behavior, and for that 64 bytes are more than enough.
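
As a rough sketch of that suggestion, the capture command from the question could be rerun with only the -s value changed. The interface, output path and remaining options are copied unchanged from the question; the variable names are just for illustration. 128 is probably the safer of the two values: Ethernet (14 bytes) plus IPv4 (20 bytes) plus a TCP header (20 to 60 bytes, depending on options) comes to between 54 and 94 bytes, so a 64-byte snap length can cut off TCP options such as timestamps or SACK blocks.

```shell
# Values taken from the command in the question; adjust as needed.
IFACE=vmk3
OUT=/vmfs/volumes/test.pcap

# Snap length: capture only the first 128 bytes of each packet
# (enough for Ethernet + IPv4 + a full-size TCP header).
SNAPLEN=128

# Same options as the original capture, only -s is different.
CMD="tcpdump-uw -i $IFACE -s $SNAPLEN -C 1M -W 10 -w $OUT"

# Print the command for review; run it on the ESXi host once it looks right.
echo "$CMD"
```

A shorter snap length means less data to copy and write per packet, which should reduce (though not necessarily eliminate) loss in the capture.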