Here is a diagram of the network I am on. The problem is with ANY secure or encrypted traffic from ANY computer on the 20.0 network going to the 2008 server on the 30.0 network. I say "secure" because I can transfer big files (100-300 MB) to the 2008 server via Windows copy or FTP, while secure/encrypted connections break (SSL, MAPI and RDP, just to name a few). Someone on an MS forum (Link) with almost the same setup as mine was able to nail down the problem by changing his router's Bandwidth Management setting from Priority to Rate Control (explained below). Making the suggested change in the routers didn't fix my problem, and I am wondering where I need to place the Wireshark computer to get a trace that would be helpful in troubleshooting this issue. Bandwidth Management Rate Control Thank you in advance asked 30 Oct '10, 15:21 net_tech edited 30 Oct '10, 15:22
2 Answers:
This sounds very much like a problem with fragmentation. Because traffic needs to be encapsulated in the VPN protocol (most probably IPsec), an additional header is added to each packet. For packets that are already at maximum size, this creates packets that are too large to be transported over the network. When nothing fancy is done by the VPN routers, the IP layer takes care of this by fragmenting the (too large) IP packets. But if the DF (Don't Fragment) bit is set in the IP header of the packet, the router is not allowed to do that. A VPN router that needs to fragment a packet but is prevented from doing so by the DF bit will send an "ICMP Fragmentation Needed, but DF bit set" message (ICMP type 3 code 4) back to the sender to indicate the problem. If this is the case in your setup, you should be able to see those messages in Wireshark on the Win2008 server. There are a few solutions for this situation:
As for the original question, I would place Wireshark on the Win2008 server, or in between the Win2008 server and the RV042, and start looking for ICMP type 3 code 4 messages. answered 31 Oct '10, 02:07 SYN-bit ♦♦
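To make it concrete what an "ICMP type 3 code 4" message looks like on the wire, here is a minimal Python sketch that checks the first 8 bytes of an ICMP message for that type/code pair and reads the next-hop MTU field defined in RFC 1191. The function name `parse_frag_needed` and the sample bytes are made up for illustration; this is not something Wireshark itself exposes, and some older routers leave the MTU field at 0.

```python
import struct

ICMP_DEST_UNREACHABLE = 3   # ICMP type: Destination Unreachable
ICMP_CODE_FRAG_NEEDED = 4   # ICMP code: Fragmentation Needed and DF set


def parse_frag_needed(icmp_bytes):
    """Return the next-hop MTU if icmp_bytes starts with an ICMP
    type 3 code 4 header, otherwise None.

    Header layout (RFC 1191): type (1 byte), code (1 byte),
    checksum (2 bytes), unused (2 bytes), next-hop MTU (2 bytes).
    """
    if len(icmp_bytes) < 8:
        return None
    icmp_type, code, _checksum, _unused, next_hop_mtu = struct.unpack(
        "!BBHHH", icmp_bytes[:8]
    )
    if icmp_type == ICMP_DEST_UNREACHABLE and code == ICMP_CODE_FRAG_NEEDED:
        return next_hop_mtu
    return None


# Hypothetical sample: a router reporting a next-hop MTU of 1420
# (checksum left at 0 for brevity).
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1420)
print(parse_frag_needed(sample))
```

In Wireshark the same check is normally done with a display filter on the ICMP type and code fields rather than with code.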
What do you mean by "secure/encrypted connections break"? First, I'd take a trace at a client system on the 20 network, look at the Expert Infos for errors, and look at the TCP connection and communication sequence. I'd do this for both the "good" connection and the "broken" connection. Next (or even simultaneously, if possible) I'd capture the traffic at the server to see what it looks like at that point. This way you should be able to determine whether the traffic is having issues along the path, and whether the problem occurs at the client or the server.

Make sure you set your time column to show large gaps in time (View > Time Display Format > Seconds Since Previous Displayed Packet). You might consider applying a few conversation filters to really focus on one connection at a time. Sometimes I'll save the connections in separate trace files using File > Save As, then open a separate instance of Wireshark and compare the two. This should give you a good start: regardless of the cause of the problem, it's best to find out where it's occurring first and act accordingly.

answered 30 Oct '10, 18:45 lchappell ♦

I thought the problem existed only when data was encrypted by the application, but I was wrong. I followed your recommendation and looked at the trace on the client computer on the 20 network. As it turns out, something goes wrong when data travels from the 2008 server on the 30 network to any computer on the 20 network, whereas everything looks normal when data leaves a computer on the 20 network and reaches the 2008 server on the 30 network.

When I copied a file from a client computer on the 20 network to a share on the 2008 server on the 30 network, I had no problems. Next, I grabbed the same file I had just copied to the share and tried to bring it back to the client using the same Windows copy command. I was immediately presented with an error: "An unexpected error is keeping you from copying the file. If you continue to receive this error, you can use the error code to search for help with this problem" Error 0x80900006: Invalid Signature.

Using "tcp.analysis.duplicate_ack" as my filter, I am looking at 2500 duplicate ACK packets. Using "tcp.analysis.lost_segment" as my filter, I am seeing 132 lost segments. Every time I clicked the Retry button, I got hundreds of duplicate ACKs and several lost segments. Can you tell if this is a session or transport layer problem? Once again, thanks for your help!

(30 Oct '10, 21:38) net_tech
SYNbit,
This was exactly what you said: a "problem with fragmentation"! Is MSS also known as MTU, or are they two separate settings? I could not find an MSS setting in the RV042, so I changed MTU from Auto to Manual and set it to 1492 bytes. I assume that in Auto it was 1500? Surprisingly enough, all the communication problems that existed previously were resolved. 30 is a remote network for me, but I will certainly get a trace next time I am physically there. Also, in the trace I took on the client on the 20 network there are 67 ICMP packets with a TTL exceeded message (Fragment reassembly time exceeded).
MSS stands for Maximum Segment Size and is TCP related. MTU stands for Maximum Transmission Unit and is Link Layer related. In the end they are closely related, as the MSS is calculated from the MTU by subtracting the IP and TCP header sizes (20+20). So on a network with MTU=1500, the MSS is 1460.
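The MTU-to-MSS arithmetic can be sketched in a few lines of Python. This assumes IPv4 and TCP headers without options (20 bytes each), as in the explanation above; the function name `mss_from_mtu` is just illustrative.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def mss_from_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented IP packet."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


for mtu in (1500, 1492, 1420):
    print(f"MTU {mtu} -> MSS {mss_from_mtu(mtu)}")
# MTU 1500 -> MSS 1460
# MTU 1492 -> MSS 1452
# MTU 1420 -> MSS 1380
```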
You might need to lower your MTU even more, as IPsec adds more than 8 bytes of headers. An MTU of 1420 is often used to accommodate IPsec on top of GRE, so 1420 is a safe value to use.
Thanks again, I will experiment with the values.
If MTU is set correctly, I should not see any ICMP TTL exceeded messages on the client, right?
If MTU is set correctly, you should not see any "ICMP fragmentation needed, but DF bit set" messages (for traffic between the 20 and 30 network).
I'm sure you should also not see "TTL exceeded, Fragment reassembly time exceeded" for traffic between the 20 and 30 network then.
Is there any logic to why the problem didn't exist with communications to Windows 2003 server running on the same ESXi when there was an MTU mismatch?
Both servers return "Packet needs to be fragmented but DF set" for ping payloads larger than 1472 bytes:
ping -f -l 1473 192.168.30.1 (2003 Server)
Packet needs to be fragmented but DF set
ping -f -l 1473 192.168.30.5 (2008 Server)
Packet needs to be fragmented but DF set
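For reference, the 1472-byte limit in these ping tests follows from the header sizes: the value given to `-l` is the ICMP payload, and an 8-byte ICMP header plus a 20-byte IPv4 header are added on top. A small Python sketch of that arithmetic, assuming an IPv4 header without options (the function names are illustrative):

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def ping_packet_size(payload_bytes):
    """Total IP packet size for a ping with the given payload size."""
    return payload_bytes + ICMP_HEADER_BYTES + IPV4_HEADER_BYTES


def fits_without_fragmentation(payload_bytes, mtu):
    """True if the ping fits in a single packet on a link with this MTU."""
    return ping_packet_size(payload_bytes) <= mtu


print(ping_packet_size(1472))                  # 1500
print(fits_without_fragmentation(1472, 1500))  # True
print(fits_without_fragmentation(1473, 1500))  # False
print(fits_without_fragmentation(1464, 1492))  # True
```

So a failure at 1473 bytes with DF set is consistent with a 1500-byte MTU somewhere on the path; it does not by itself show which hop enforces that limit.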