I have an embedded device running an HTTP server that returns [TCP ZeroWindowProbeAck] packets if and only if the gateway/router that the server lives on has NAT turned on.

More details: The embedded device is using the Z-World Rabbit Web Server and the router is a Digi ConnectPort. The ConnectPort LAN lives on 10.10.6.1. In the ConnectPort there is a setting called Enable Network Address Translation (NAT). When NAT is turned off, the browser session between my PC and the embedded device is fine and normal. However, when I turn NAT on (in the ConnectPort), the web server's response to the PC slows to a crawl.

Using Wireshark I found that when NAT is turned on, the session between my PC and the embedded web server contains many packets flagged [TCP ZeroWindowProbe] and [TCP ZeroWindowProbeAck]. Here's a copy/paste of just a few of those packets (within a span of a few minutes I see hundreds of these zero window packets). Note: 10.10.6.100 is my PC and 10.10.6.106 is the web server.
When I turn off NAT in the ConnectPort the zero window probe packets stop completely. I’m fishing for insight on how the NAT setting in the ConnectPort would cause the web server to send zero window probe acknowledgments to the PC client.

Update (response to Kurt)
Here are more details on the network setup:
Update 2
If and only if NAT is enabled do I get the ZeroWindowProbeAck from the server. Strangely enough, the ZeroWindowProbeAck will occur whether or not the NAT is being hit. In other words, the ZWPA messages occur even when I am connected to the server locally via the LAN (e.g., a web browser on 10.10.6.100 connects directly to the HTTP port at 10.10.6.106). The embedded device, the PC, and the ConnectPort are connected via a simple hub (no smart switch involved).

Update 3
I have uploaded a Wireshark capture to http://www.cloudshark.org/captures/937f5b4667cf (my original capture was greater than 10MB, so I had to remove a few thousand packets, but this truncated version should be enough to show the ZeroWindowProbeAck packets that I am referring to). Note: I wasn’t able to get a capture with the problematic devices while they were connected to the Digi ConnectPort. However, using the Cradlepoint CTR35 I was able to reproduce the problem.

IP Assignments
Here is the list of events that I did during the Wireshark session:
Here is how the nodes are connected

asked 21 Dec ‘12, 07:19 KTM edited 17 Jan ‘13, 09:13
2 Answers:
Enabling NAT on the router will not influence the connection between the client and the embedded device, as they are directly connected. So something else must be of influence here.

Since the seq and ack in the packets are both 1, I assume that these are packets following the TCP three-way handshake. This means the server accepts a connection, but says it has no buffer to receive any data.

What happens when you do enable the NAT, but don't create the port forwarder? Maybe your embedded device gets swamped with traffic from the internet?

answered 21 Dec '12, 12:47 SYN-bit ♦♦

I haven't tried to see what happens when I remove the port forwarder - good idea. Thanks for that. I'm confident that the device isn't getting swamped from the Internet (only certain IPs are allowed to access the LAN from the WAN). (21 Dec '12, 13:25) KTM
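The zero-window condition described in the answer above (the server accepts the connection but advertises no buffer space) can be illustrated with a toy model. This is only a sketch under stated assumptions: the `ReceiveBuffer` class and the 1024-byte capacity are hypothetical and are not taken from the Rabbit Web Server or any real TCP stack.

```python
from dataclasses import dataclass


@dataclass
class ReceiveBuffer:
    """Toy model of one TCP receiver's buffer (not a real TCP stack)."""
    capacity: int      # total buffer size in bytes (hypothetical value)
    buffered: int = 0  # bytes received but not yet read by the application

    def window(self) -> int:
        # The advertised window is the free space left in the buffer.
        return self.capacity - self.buffered

    def receive(self, nbytes: int) -> int:
        # Accept only what fits; return the number of bytes accepted.
        accepted = min(nbytes, self.window())
        self.buffered += accepted
        return accepted

    def app_read(self, nbytes: int) -> int:
        # The application drains data, which reopens the window.
        taken = min(nbytes, self.buffered)
        self.buffered -= taken
        return taken


server = ReceiveBuffer(capacity=1024)
server.receive(1024)                   # buffer is now full, window is 0
probe_accepted = server.receive(1)     # a 1-byte zero window probe is not accepted
window_after_probe = server.window()   # the ACK to the probe still advertises 0
server.app_read(512)                   # application finally reads some data
window_after_read = server.window()    # window reopens
```

As long as the application on the receiving side does not read from its buffer, every probe is answered with another zero-window ACK, which would match the repeated ZeroWindowProbe/ZeroWindowProbeAck pairs in the capture. The model says nothing about why the embedded server stops draining its buffer.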
Answering your update #2: I don't see how NAT will have an influence here, especially if the NAT 'rule' is not being used.

What I could imagine: As soon as you enable 'incoming' NAT (port forwarding) for the server, the router also (silently) enables outgoing NAT (masquerading, hide NAT, you name it) for the server. As soon as that is enabled, the server might start downloading data from the internet (firmware updates, etc.). While it is busy, it might not have enough resources to answer your internal requests appropriately (full buffer -> zero window).

However, without a full trace (pcap file), it's hard to tell what's going on. Can you post the two pcap files (with/without NAT enabled) captured at the client somewhere (one-click file hoster, cloudshark.org, etc.)? BEWARE of the privacy issues in doing so! Can you also try to capture the traffic in front of the server (maybe using a second PC/laptop)?

BTW: Are you sure your 'hub' is a real hub? If so, you should see the traffic from the server to the internet while capturing on the client. Do you? If yes, capturing at the client should be sufficient. Then, please capture the traffic with/without NAT by using the following capture filter:
Regards

answered 21 Dec '12, 13:39 Kurt Knochner ♦ edited 21 Dec '12, 14:00

I didn't/wouldn't think NAT would have an influence, either. I wonder whether, when NAT is enabled, the router continually updates its NAT table (even if no request has been made from the outside)... if so, could that NAT traffic be causing the 10.10.6.106 server to fill up its buffer? I'll look into posting the pcap file and the pre-server-traffic capture. Thanks for the offer to look. @hub - I don't know what you mean by "real". It looks and works like a hub. (21 Dec '12, 13:59) KTM
Well, nowadays switches are sometimes labeled as 'switching' hubs whereas they are really (unmanaged) switches. However, people often just read the hub part ;-) So, it's actually hard to buy a 'real' hub, as they are no longer needed (cheap switch alternatives available) and therefore they are no longer mass-produced. (21 Dec '12, 14:04) Kurt Knochner ♦

Most devices these days are switches, even though they don't say switch on the box. A switch works much differently from a hub and that has a great impact on what you do and don't see on your capturing system. Re-reading your update, you state that the embedded system and the PC are "hub-connected". Do you mean they are attached to a "hub" and the "hub" is connected to the ConnectPort? Or do you mean they are both connected to the LAN ports of the ConnectPort? (21 Dec '12, 14:07) SYN-bit ♦♦

I interpreted his update #2 to my question as a separate hub. However, now I'm no longer sure.. (21 Dec '12, 14:11) Kurt Knochner ♦

@Kurt - Sorry, I didn't think your question through regarding "real" hub. You are correct, I'm using an unmanaged switch (Netgear FS605 (http://goo.gl/6pJ4c) .. on the box it was labeled as a "Switch/Hub"). The PC, the embedded device, and the ConnectPort all connect to the switch/hub. So, in total, there are four devices involved - PC, embedded device, switch/hub, and the ConnectPort. (22 Dec '12, 08:02) KTM

O.K. then a capture in front of the embedded server might help. But let's start with the client captures with/without NAT. Can you upload those somewhere? (22 Dec '12, 09:15) Kurt Knochner ♦

One other thing to check is the ARP entry for the embedded server on your PC: does it point to the embedded device or to the ConnectPort? And what source MAC address does the traffic coming back from the embedded device have in Wireshark? So, with all 4 devices connected and NAT being enabled, it is slow. What happens immediately after you disconnect the ConnectPort (after verifying that it was slow when the ConnectPort is connected)? I agree with @Kurt, traces traces, we want to look at traces :-) (22 Dec '12, 12:36) SYN-bit ♦♦

The holidays will keep me from gathering more data and doing more tests. Hopefully you gents will still be interested in this topic to start off the new year :). Thanks, again. (22 Dec '12, 17:22) KTM

Usually the questioners lose interest in their question/project ;-)) Anyway, I'm looking forward to your updates. (23 Dec '12, 02:18) Kurt Knochner ♦
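The source-MAC check suggested in the comments above (do replies from 10.10.6.106 really come from the embedded device, or from the router?) can be sketched roughly as follows. It assumes you have the raw bytes of an Ethernet II frame, for example exported from the capture. The helper names and all MAC addresses below are hypothetical placeholders, not values from this thread.

```python
from typing import Tuple


def format_mac(raw: bytes) -> str:
    # Render 6 raw bytes as the usual colon-separated hex string.
    return ":".join(f"{b:02x}" for b in raw)


def ethernet_macs(frame: bytes) -> Tuple[str, str]:
    """Return (destination MAC, source MAC) from a raw Ethernet II frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    return format_mac(frame[0:6]), format_mac(frame[6:12])


def sender_of(frame: bytes, server_mac: str, router_mac: str) -> str:
    # Compare the frame's source MAC with the two known devices.
    _, src = ethernet_macs(frame)
    if src == server_mac.lower():
        return "server"
    if src == router_mac.lower():
        return "router"
    return "unknown"


# Hypothetical addresses, for illustration only.
PC_MAC = "02:00:00:00:00:64"
SERVER_MAC = "02:00:00:00:00:6a"
ROUTER_MAC = "02:00:00:00:00:01"

# A made-up frame header: destination = PC, source = router, EtherType = IPv4.
sample = bytes.fromhex("020000000064" "020000000001" "0800")
who = sender_of(sample, SERVER_MAC, ROUTER_MAC)
```

If frames carrying the server's IP address show the router's MAC as their source, the router is answering or relaying on the server's behalf, which would be worth comparing against the PC's ARP entry for 10.10.6.106.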
the same subnet? How is NAT involved?
Just to clarify.
After you enable NAT on the router you get ZeroWindowProbe while you connect to the server internally (not through the NAT device)? And you don't see those ZeroWindowProbe internally when you disable NAT?
If so: Do you have an internal switch, or are your internal systems (client and server) connected to the router via a port of that router (internal switch of the router)?
@Kurt - see Update 2. Thx