Hi,
I am experiencing slow performance from my VNX NFS server (primarily on reads). When taking a TCP capture on both the VNX (server) and the HP (client) I see many "dup ack" and "out of order" packets. Jumbo frames are configured on both the VNX and the HP server, and I have been assured that jumbo frames are enabled on all switches between the two devices. The only verification I can perform myself is a ping test (I'm not sure whether that is even a valid test):
[user@host ~]# ping -s 9000 10.230.61.77
PING 10.230.61.77 (10.230.61.77) 9000(9028) bytes of data.
9008 bytes from 10.230.61.77: icmp_seq=1 ttl=255 time=0.384 ms
9008 bytes from 10.230.61.77: icmp_seq=2 ttl=255 time=0.380 ms
9008 bytes from 10.230.61.77: icmp_seq=3 ttl=255 time=0.364 ms
Here is a sample of the output from the client and the server. Any ideas what is causing this?
NFS Server (VNX 5300)
No. Time Source Destination Protocol Length Info
163 4.527343 10.230.61.77 10.230.61.9 TCP 62702 [TCP segment of a reassembled PDU]
164 4.527343 10.230.61.77 10.230.61.9 NFS 3098 V3 READ Reply (Call In 162) Len:65536
165 4.527343 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6741 Ack=545641 Win=10183 Len=0 TSval=326998566 TSecr=30912434
166 4.527343 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6741 Ack=563537 Win=10183 Len=0 TSval=326998566 TSecr=30912434
167 4.527343 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6741 Ack=581433 Win=10183 Len=0 TSval=326998567 TSecr=30912434
168 4.527343 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6741 Ack=593413 Win=10183 Len=0 TSval=326998567 TSecr=30912434
169 4.527343 10.230.61.9 10.230.61.77 NFS 230 V3 READ Call (Reply In 243), FH:0x0eefae18 Offset:1572864 Len:65536
170 4.527343 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6905 Ack=611309 Win=10183 Len=0 TSval=326998567 TSecr=30912434
171 4.527343 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6905 Ack=629205 Win=10183 Len=0 TSval=326998567 TSecr=30912434
172 4.527343 10.230.61.77 10.230.61.9 TCP 62702 [TCP segment of a reassembled PDU]
173 4.527343 10.230.61.77 10.230.61.9 NFS 3098 V3 READ Reply (Call In 107) Len:65536
174 4.527343 10.230.61.9 10.230.61.77 TCP 78 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998567 TSecr=30912434 SLE=659081 SRE=668029
175 4.527343 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 174#1] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998567 TSecr=30912434 SLE=659081 SRE=676977
176 4.527343 10.230.61.77 10.230.61.9 TCP 9014 [TCP Out-Of-Order] [TCP segment of a reassembled PDU]
177 4.527343 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 174#2] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998567 TSecr=30912434 SLE=659081 SRE=685925
178 4.527343 10.230.61.77 10.230.61.9 TCP 9014 [TCP Out-Of-Order] [TCP segment of a reassembled PDU]
179 4.527343 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 174#3] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998567 TSecr=30912434 SLE=659081 SRE=694873
180 4.527343 10.230.61.77 10.230.61.9 TCP 3098 [TCP Out-Of-Order] [TCP segment of a reassembled PDU]
181 4.527343 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 174#4] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998567 TSecr=30912434 SLE=659081 SRE=703821
182 4.527343 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 174#5] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998568 TSecr=30912434 SLE=659081 SRE=712769
183 4.527343 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 174#6] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998568 TSecr=30912434 SLE=659081 SRE=721717
184 4.527343 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 174#7] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998568 TSecr=30912434 SLE=659081 SRE=724749
185 4.527343 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 174#8] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998568 TSecr=30912434 SLE=659081 SRE=733697
NFS Client (HP ProLiant DL360p Gen8)
No. Time Source Destination Protocol Length Info
209 4.527618 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6741 Ack=545641 Win=10183 Len=0 TSval=326998566 TSecr=30912434
210 4.527678 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
211 4.527750 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
212 4.527763 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6741 Ack=563537 Win=10183 Len=0 TSval=326998566 TSecr=30912434
213 4.527822 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
214 4.527894 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
215 4.527907 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6741 Ack=581433 Win=10183 Len=0 TSval=326998567 TSecr=30912434
216 4.527978 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
217 4.527990 10.230.61.77 10.230.61.9 NFS 3098 V3 READ Reply (Call In 126) Len:65536
218 4.527998 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6741 Ack=593413 Win=10183 Len=0 TSval=326998567 TSecr=30912434
219 4.528012 10.230.61.9 10.230.61.77 NFS 230 V3 READ Call (Reply In 340), FH:0x0eefae18 Offset:1572864 Len:65536
220 4.528064 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
221 4.528136 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
222 4.528149 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6905 Ack=611309 Win=10183 Len=0 TSval=326998567 TSecr=30912434
223 4.528209 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
224 4.528281 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
225 4.528294 10.230.61.9 10.230.61.77 TCP 66 760 > nfs [ACK] Seq=6905 Ack=629205 Win=10183 Len=0 TSval=326998567 TSecr=30912434
226 4.528353 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
227 4.528426 10.230.61.77 10.230.61.9 TCP 9014 [TCP Previous segment not captured] [TCP segment of a reassembled PDU]
228 4.528438 10.230.61.9 10.230.61.77 TCP 78 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998567 TSecr=30912434 SLE=659081 SRE=668029
229 4.528498 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
230 4.528509 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 228#1] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998567 TSecr=30912434 SLE=659081 SRE=676977
231 4.528570 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
232 4.528580 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 228#2] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998567 TSecr=30912434 SLE=659081 SRE=685925
233 4.528642 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
234 4.528652 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 228#3] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998567 TSecr=30912434 SLE=659081 SRE=694873
235 4.528715 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
236 4.528725 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 228#4] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998567 TSecr=30912434 SLE=659081 SRE=703821
237 4.528787 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
238 4.528798 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 228#5] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998568 TSecr=30912434 SLE=659081 SRE=712769
239 4.528870 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
240 4.528880 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 228#6] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998568 TSecr=30912434 SLE=659081 SRE=721717
241 4.528884 10.230.61.77 10.230.61.9 NFS 3098 V3 READ Reply (Call In 206) Len:65536
242 4.528889 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 228#7] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998568 TSecr=30912434 SLE=659081 SRE=724749
243 4.528959 10.230.61.77 10.230.61.9 TCP 9014 [TCP segment of a reassembled PDU]
244 4.528970 10.230.61.9 10.230.61.77 TCP 78 [TCP Dup ACK 228#8] 760 > nfs [ACK] Seq=6905 Ack=638153 Win=10164 Len=0 TSval=326998568 TSecr=30912434 SLE=659081 SRE=733697
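For what it's worth, these analysis flags can be counted across the whole captures with tshark, which is how I spot-checked them (the filenames below are placeholders for wherever you saved the traces):

# count duplicate ACKs, out-of-order segments and retransmissions in a capture
tshark -r vnx.pcap -Y 'tcp.analysis.duplicate_ack' | wc -l
tshark -r vnx.pcap -Y 'tcp.analysis.out_of_order' | wc -l
tshark -r vnx.pcap -Y 'tcp.analysis.retransmission' | wc -l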
- edit - Akismet thinks I'm pasting spam, so here is my reply:
Packet captures: VNX http://cloudshark.org/captures/1f8ff433f45c, HP server http://cloudshark.org/captures/5d0ae1a8c89f
Here are the new pings. It looks like packets need fragmenting once the payload exceeds 8972 bytes, which is exactly what a 9000-byte MTU allows (8972 bytes of ICMP data + 8-byte ICMP header + 20-byte IP header = 9000). I know the servers are set to an MTU of 9000, and the switches are said to be set to an MTU of 9044.
[user@host ~]# ping -M do -s 8972 10.230.61.77
PING 10.230.61.77 (10.230.61.77) 8972(9000) bytes of data.
8980 bytes from 10.230.61.77: icmp_seq=1 ttl=255 time=0.351 ms
8980 bytes from 10.230.61.77: icmp_seq=2 ttl=255 time=0.367 ms
8980 bytes from 10.230.61.77: icmp_seq=3 ttl=255 time=0.347 ms
[user@host ~]# ping -M do -s 8973 10.230.61.77
PING 10.230.61.77 (10.230.61.77) 8973(9001) bytes of data.
From 10.230.61.9 icmp_seq=1 Frag needed and DF set (mtu = 9000)
From 10.230.61.9 icmp_seq=1 Frag needed and DF set (mtu = 9000)
From 10.230.61.9 icmp_seq=1 Frag needed and DF set (mtu = 9000)
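For anyone repeating this, a quick sweep around the boundary confirms exactly where fragmentation kicks in (payload sizes chosen around the expected 8972-byte limit):

# probe payload sizes around the 9000-byte MTU boundary with DF set
for s in 8970 8971 8972 8973 8974; do
  if ping -M do -c 1 -W 1 -s "$s" 10.230.61.77 >/dev/null 2>&1; then
    echo "payload $s bytes: fits in one frame"
  else
    echo "payload $s bytes: would need fragmentation"
  fi
done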
Thanks very much for the input so far. It is evident that packets are being lost on the path from 10.230.61.77 (VNX) to 10.230.61.9 (HP server). I have obtained the switch configuration for the ports these devices patch into, and I noticed that flow control (pause) is enabled on all of the ports except the VNX's, where it is disabled. Isn't it best to enable flow control throughout?
The setup is as follows: VNX -(10Gb)- switch -(10Gb)- switch -(1Gb)- HP server
EMC VNX:-
Broadcom 10 Gigabit Ethernet Controller
0: fxg-1-0 IRQ: 24
txflowctl=disable rxflowctl=disable
Link: Up
0: fxg-1-1 IRQ: 25
txflowctl=disable rxflowctl=disable
Link: Up
VNX switch port:-
interface port 1/28
pause enable
switchport mode trunk
switchport trunk add 10-11,20-21,4078
flooding dlf enable
flooding mc enable
flooding bc enable
flooding limit 100000
HP switch port:-
interface port 1/8
pause enable
switchport mode trunk
switchport trunk native 11
switchport trunk add 10
HP Server:-
[user@host ~]# ethtool -a eth4
Pause parameters for eth4:
Autonegotiate: on
RX: on
TX: on
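If we do decide to make flow control consistent end to end, my understanding is that the Linux side can be changed with ethtool -A, and ethtool -S exposes per-NIC drop counters that would be worth watching while reproducing the slow reads (interface name as above):

# enable pause frame autonegotiation and RX/TX pause on the client NIC
ethtool -A eth4 autoneg on rx on tx on
# watch NIC-level drop/discard/error counters while reproducing the issue
ethtool -S eth4 | grep -iE 'drop|discard|error'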
asked 03 Jun '13, 19:19 by elfelvin
Could you upload the capture files to CloudShark and post the URLs? That should make them a lot easier to read through (http://cloudshark.org/). The duplicate ACKs here effectively mean that 10.230.61.9 is detecting packet loss on this TCP session and is pushing for a retransmission.
For the ping test, could you try setting the do-not-fragment flag? I believe that's either -M do (Linux) or -D (BSD), depending on your ping implementation.

I cannot add a comment; Akismet thinks I'm pasting spam. I have edited the main post with the details. Thanks.
10.230.61.9 is saying that it has to fragment, which I believe is your HP client, since it's the IP with the false checksum errors in the client-side trace file (typically an artifact of checksum offload rather than real corruption). It might have learned a path MTU from upstream; otherwise I'd check the network card settings and jumbo frame support on that client.
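For example, checking the configured MTU on the client is quick (interface name assumed; substitute yours):

# confirm the client NIC's MTU is really 9000
ip link show eth4
# or
cat /sys/class/net/eth4/mtu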
Given the "edited" in the file names, how have the captures been changed? There are ACKs for missing segments throughout the VNX trace, meaning the traffic actually happened but this capture file didn't record it. There is definitely real packet loss observed from the client too, given those duplicate ACKs, even allowing for the trace files not containing every packet the two systems actually saw.
How does CPU look on these systems during the performance issues? With a large volume of jumbo-frame traffic the processor may be taxed, which could also explain why the local packet capture is missing packets that must have happened (the ACKs for segments not in the trace).
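As a rough check while a slow read is in progress, something along these lines would do (all standard Linux tools, nothing VNX-specific):

# snapshot CPU load on the client during a slow read
top -b -n 1 | head -n 15
# kernel-wide TCP retransmission counters
netstat -s | grep -i retrans
# (tcpdump also prints a "packets dropped by kernel" summary on exit,
# which tells you whether the capture process itself lost packets)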
I still cannot post responses. I have updated my main post.
I have compared a working NFS pcap against my "slow" one and noticed that after packet 30 (i.e. where the export gets mounted) there is no three-way handshake with the NFS server, and the window size appears to be about ~24000, whereas the working capture uses a window size of about ~139000. I am using the same client, but against different VNX servers, to test this scenario.
My question is: why / how is the NFS client still talking to the server even though there is no three-way handshake? Is it possible that because the handshake is missing, the TCP window size was never updated, and that this is what is causing my slow responses?
The capture starts before I mount the export, so there is no chance I missed it.
Any NFS gurus out there?
If you don't have the three-way handshake in the trace, how are you determining the receive window size? You need the handshake to know the window scale factor: the raw window field in every later segment has to be multiplied by 2^scale, and the scale is only exchanged in the SYN packets.
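To illustrate, the raw window field of 10183 seen in your ACKs means very different things depending on the negotiated scale; the scale of 3 below is purely hypothetical, since it only appears in the SYN you don't have:

# effective receive window = raw window field << window scale factor
echo $((10183 << 0))   # 10183 bytes if no scaling was negotiated
echo $((10183 << 3))   # 81464 bytes with a (hypothetical) scale factor of 3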
Was the session ever actually torn down? If you don't have the three-way handshake but see TCP data flowing between the hosts, then either the session already existed before you started the trace or the trace simply did not capture the handshake. Rest assured, if there were no three-way handshake there would be no TCP session, so it definitely happened if you're seeing TCP data segments.
Edit: I'm also going to delete my old answer, since it appears this isn't conclusively answered yet. Can you post the comparison traces, good and bad? The "bad" trace you first uploaded definitely shows packet loss, and finding the source of that loss will hopefully lead to the source of the problem. Moving Wireshark to the switch upstream of the client would be one way to start pinpointing it.
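For example, a capture taken from a mirror/SPAN port on that upstream switch, fed to a machine running tcpdump, could be as simple as this (interface and output filename assumed):

# capture the full NFS conversation, with complete frames, on the mirror port
tcpdump -i eth1 -s 0 -w upstream.pcap host 10.230.61.77 and host 10.230.61.9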
elfelvin - did you resolve this and if so, how? We have a similar issue with our VNX using 10Gb to talk to “HP” boxes.
Hi, sorry for taking so long to reply, but the issue was a buffer on the switch. Basically: VNX 10G -> 10G switch -> 1G switch -> 1G HP interface.
I can't remember the exact switch metric, but looking at the port statistics for the switch port the HP server was patched into, I could see a counter for dropped packets incrementing. We spoke to the vendor and they acknowledged that going from a 10G switch to a 1G switch has this effect: the switch's buffer memory is not large enough to absorb the rate mismatch, so it drops packets.
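The same story is also visible from the host side; on Linux the per-interface statistics can corroborate what the switch counters show (interface name assumed):

# RX/TX byte, packet, error and drop counters for the client NIC
ip -s link show eth4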
Regards, Elvin