This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Help with fault finding dropped connection on Netgear R7000

0

Good day,

I'm a little stuck here. I own a new Netgear R7000. I have installed several beta versions of dd-wrt; the version listed below is the one currently installed. The problem I will describe is version-independent and always present.

Specs:

  • DD-WRT v3.0-r28600M kongac (12/31/15)

  • Linux 3.10.94 #59 SMP Thu Dec 31 09:20:23 CET 2015 armv7l

  • Broadcom BCM4709

The problem is that the WAN/LAN connection freezes randomly many times a day for short periods. It's very annoying and I'm not able to find the reason for it. It's also noticeable when I ssh into the network, because the session disconnects all the time. I searched the log files on the router but there is no error whatsoever. To be sure the problem is located in the router, I set up a "ping" test to several computers inside and outside the network.

The attached capture file is filtered for ping replies > 50 ms. The ping test in this capture was initiated from a laptop with a wireless connection, but I also tried it from a NAS connected directly by LAN, with the same result.

When you look at the capture, all pinged computers actually show the same result, inside or outside the LAN. The response from the router "192.168.1.1" is faster than the others but still way above the 1 ms or less seen for all other replies.

I read some topics about the router having hardware problems etc., but I would really like to pinpoint what exactly causes this issue. Is somebody familiar with the problem?

I don't think it's dd-wrt related but I'm not sure. All versions have the same issue. Even after a clean install and the 30/30/30 reset with basic settings, the problem still occurs.

I hope someone has a brilliant idea or can point me in the right direction for fault finding. It looks like the router just freezes for short periods but no errors are generated. Maybe there is a way to determine with Wireshark whether it's hardware or software related.

Capture file of the ping replies > 50ms

asked 20 Feb '16, 11:04


RFMuser
11337
accept rate: 0%

As you are new to Wireshark: a capture file contains only the raw data of each packet augmented with some limited extra information (timestamp, interface name), but not with results of any previous analysis.

So if you filter only the ping responses from your capture file into a new one, the information about their delay from the matching requests is lost, because Wireshark calculates the delay from the timestamps when processing the capture file, but the file structure does not allow it to store the results to the output file.
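To make the point about lost delays concrete, here is a minimal Python sketch of deriving a response time by pairing each echo request with its reply. The `IcmpEcho` record and the matching key (identifier plus sequence number) are assumptions made for illustration, not a description of Wireshark's internal code, and the sample timestamps are made up.

```python
from typing import Dict, List, NamedTuple, Tuple


class IcmpEcho(NamedTuple):
    timestamp: float  # capture timestamp in seconds
    is_reply: bool    # False = echo request, True = echo reply
    ident: int        # ICMP identifier
    seq: int          # ICMP sequence number


def response_times(packets: List[IcmpEcho]) -> Dict[Tuple[int, int], float]:
    """Return {(ident, seq): reply timestamp - request timestamp}."""
    pending: Dict[Tuple[int, int], float] = {}
    result: Dict[Tuple[int, int], float] = {}
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        key = (pkt.ident, pkt.seq)
        if not pkt.is_reply:
            pending[key] = pkt.timestamp          # remember when the request left
        elif key in pending:
            result[key] = pkt.timestamp - pending.pop(key)
    return result


# Made-up sample: two requests and their replies.
full_capture = [
    IcmpEcho(0.000, False, 1, 1),
    IcmpEcho(0.001, True, 1, 1),
    IcmpEcho(5.000, False, 1, 2),
    IcmpEcho(12.100, True, 1, 2),
]
replies_only = [p for p in full_capture if p.is_reply]

print(response_times(full_capture))  # one delay per request/reply pair
print(response_times(replies_only))  # nothing to match against, so no delays
```

With the requests filtered out, there is no earlier timestamp left to subtract from, which is why the exported file no longer shows the delays.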

Next, delays of 50 ms would definitely not break your ssh session; you need tens of seconds of silence for this to happen.

Third, there may be a hardware problem causing the router to miss packets. This would explain why it happens regardless of the OS version. So I'd recommend installing a tcpdump package on the router itself, running a capture simultaneously on the router and on a PC connected to the internet through it, and then downloading the capture file from the router and comparing what the same TCP session looks like in both captures.
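As a rough illustration of that comparison, the Python sketch below lists packets that appear in the PC capture but not in the router capture. It assumes both captures have already been reduced to simple keys (source IP, destination IP, IPv4 identification), that the router capture was taken on the LAN side so NAT has not rewritten the addresses, and that the sample values are made up.

```python
from typing import Iterable, List, Set, Tuple

# (source IP, destination IP, IPv4 identification field) - an assumed key
PacketKey = Tuple[str, str, int]


def missing_in_second(first: Iterable[PacketKey],
                      second: Iterable[PacketKey]) -> List[PacketKey]:
    """Return keys seen in the first capture but absent from the second."""
    seen: Set[PacketKey] = set(second)
    return [key for key in first if key not in seen]


# Made-up sample data: the router never saw the packet with ID 102.
pc_capture = [
    ("192.168.1.50", "8.8.8.8", 101),
    ("192.168.1.50", "8.8.8.8", 102),
    ("192.168.1.50", "8.8.8.8", 103),
]
router_capture = [
    ("192.168.1.50", "8.8.8.8", 101),
    ("192.168.1.50", "8.8.8.8", 103),
]

for key in missing_in_second(pc_capture, router_capture):
    print("sent by PC but not seen on router:", key)
```

A packet that the PC sent but the router never captured points at the link or the router's receive path; a packet present in both captures but never answered points further upstream.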

(20 Feb '16, 11:18) sindy

As you are new to Wireshark: a capture file contains only the raw data of each packet augmented with some limited extra information (timestamp, interface name), but not with results of any previous analysis. So if you filter only the ping responses from your capture file into a new one, the information about their delay from the matching requests is lost, because Wireshark calculates the delay from the timestamps when processing the capture file, but the file structure does not allow it to store the results to the output file.

I see it now. What would be the best way to share it then without sharing the whole capture file?

Next, delays of 50 ms would definitely not break your ssh session; you need tens of seconds of silence for this to happen.

I just filtered on ping replies > 50 ms, but they can take much longer. I saw replies above 7100 ms. Then it drops the ssh connection. It happens randomly but many times an hour.

Ping reply > 7000ms

Third, there may be a hardware problem causing the router to miss packets. This would explain why it happens regardless of the OS version. So I'd recommend installing a tcpdump package on the router itself, running a capture simultaneously on the router and on a PC connected to the internet through it, and then downloading the capture file from the router and comparing what the same TCP session looks like in both captures.

OK, I have to find out how I can implement this on the DD-WRT router.

(20 Feb '16, 11:59) RFMuser

What would be the best way to share it then without sharing the whole capture file?

It depends. You can apply the display filter icmp, instead of the icmp.resptime > 0.05 you've probably used, to limit the packet list to all ICMP packets, or you can manually enter a list of individual frame numbers when exporting them. In some circumstances you may want to use more sophisticated tools, but for the moment I don't think your issue deserves it.

I have to find out how I can implement this on the DD-WRT router.

I can't help here, I know it is possible with OpenWRT but I have no idea whether your Netgear is supported.

As you mention WLAN, could it be as simple as that the 2.4 GHz band at your place is simply too crowded?

(20 Feb '16, 12:32) sindy

2 Answers:

0

In your original post, you mentioned "from a NAS connected by LAN directly with the same result."

I assume that means you connected the NAS and some other computer directly to the wired Ethernet LAN ports on the Netgear router and then pinged each device. If you haven't done that, please try that first. That will eliminate the WLAN as the culprit, as @sindy suggested.

If you do continue to see the problem, then my suggestion is the following:

  1. Go to the Netgear site and install the latest approved firmware for your router.
  2. Test the ping on both wired and wireless connections
  3. Install DD-WRT
  4. Retest

I know this might seem trivial, but I have experienced some wireless routers not behaving very nicely with DD-WRT installed.

answered 21 Feb '16, 03:50


Amato_C
1.1k142032
accept rate: 14%

Hi, at the moment I'm working remotely. I just set up another test.

The NAS is pinging several computers remotely and locally every 5 seconds. Then I started tcpdump on the NAS and the router.

I already noticed earlier that the pings regularly do not get through the router, but I will check the dump with Wireshark when it's ready.

If nothing works I will try installing the netgear firmware and see what happens.

(21 Feb '16, 17:47) RFMuser

What you wrote suggests that the NAS is connected to the router using wired Ethernet while the rest of the "local" computers are connected using WiFi. Can you provide your network diagram or a description, so that it is clear which machine (NAS, PC1, PC2) is connected to the router using what media, and which machine pings which other machine (including the router) with what result?

E.g.

  • NAS pings the router over wired ethernet only, no loss, maximum response time 5 seconds

  • PC1 pings the router over WiFi, 10 % loss, maximum response time 1 second

  • NAS pings PC1, 5 % loss, maximum response time 10 seconds
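A per-path summary like the one in this example list can be produced with a few lines of Python. This is only a sketch: the `summarize` helper, the path names and the sample values are hypothetical, and a lost ping is represented as `None`.

```python
from typing import Dict, List, Optional, Tuple


def summarize(rtts_ms: List[Optional[float]]) -> Tuple[float, Optional[float]]:
    """Return (loss percentage, maximum response time in ms).

    Each entry is the response time of one ping, or None if it was lost.
    """
    if not rtts_ms:
        raise ValueError("no ping results given")
    answered = [r for r in rtts_ms if r is not None]
    loss = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    return loss, (max(answered) if answered else None)


# Made-up results for three hypothetical paths.
results: Dict[str, List[Optional[float]]] = {
    "NAS -> router (wired)": [0.4, 0.5, 0.4, 0.6],
    "PC1 -> router (WiFi)": [2.0, None, 3.5, 2.2],
    "NAS -> PC1": [1.8, 2.1, None, None],
}

for path, rtts in results.items():
    loss, worst = summarize(rtts)
    print(f"{path}: {loss:.0f} % loss, maximum response time {worst} ms")
```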

(22 Feb '16, 01:09) sindy

Hi Sindy,

Just a simple description now;

NAS is wired to ROUTER

LAPTOP is wireless to ROUTER

ROUTER is WIRED to MODEM

The NAS is set up to ping the laptop, router, modem and an external IP address every 5 seconds.

I did a tcpdump on the NAS and on the router (LAN interface side).

Both tcpdumps show exactly the same. I can see that exactly every 20 minutes all pings going out to the WAN fail!

So the pings to the MODEM and the external IP address fail. Only the pings to the laptop and the router itself are OK.

Attached is a picture of the failed replies to the external IP address and modem, where you can see that the request doesn't even leave the router.

I find it very suspicious that the time interval is so steady at 20 minutes. I didn't set anything special in the router.
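One way to check how regular such an interval really is: group the timestamps of failed pings into bursts and look at the spacing between burst starts. The Python sketch below does that; the `burst_starts` and `intervals` helpers, the 30-second grouping threshold and the sample timestamps are assumptions for illustration only.

```python
from statistics import mean, pstdev
from typing import List


def burst_starts(failure_times: List[float], max_gap: float = 30.0) -> List[float]:
    """Group failures that lie within max_gap seconds of each other into
    bursts and return the start time of each burst."""
    starts: List[float] = []
    previous = None
    for t in sorted(failure_times):
        if previous is None or t - previous > max_gap:
            starts.append(t)
        previous = t
    return starts


def intervals(starts: List[float]) -> List[float]:
    """Return the time between consecutive burst starts."""
    return [b - a for a, b in zip(starts, starts[1:])]


# Made-up timestamps (seconds since capture start) of failed pings.
failures = [100.0, 105.0, 110.0, 1300.0, 1305.0, 2500.0, 2505.0, 2510.0]

gaps = intervals(burst_starts(failures))
print("intervals between bursts (s):", gaps)
print("mean (s):", mean(gaps), "spread (s):", pstdev(gaps))
```

A spread close to zero around a round number such as 1200 seconds suggests a timer somewhere (a lease, a keepalive, a scheduled task) rather than random interference.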

to be continued... Ping test_Wireshark

(22 Feb '16, 20:17) RFMuser

This way we start getting somewhere. You have demonstrated to yourself that both the wired and wireless LAN are fine, and that the issue appears only when routing and/or the WAN interface and connection is involved. (I suppose you use the same IP subnet for wired and wireless LAN so there is no routing between the NAS and the LAPTOP, only switching, right?)

Is the router configured to ask the modem for an IP address at the WAN side using DHCP or have you assigned its IP address manually?

Please ping the modem from the NAS while tcpdumping both LAN and WAN on the router simultaneously. Depending on your tcpdump version, -i lan_if_name -i wan_if_name may be enough, or you may need to run two tcpdump instances, each capturing on one of the interfaces, and then merge them together chronologically.
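Merging chronologically just means interleaving the two packet lists by timestamp. For real capture files Wireshark's mergecap utility can do this; the Python sketch below only illustrates the idea on already-parsed records. The `Record` layout, the interface labels "lan" and "wan", and the sample packets are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# (timestamp in seconds, interface label, one-line summary) - assumed layout
Record = Tuple[float, str, str]


def merge_chronologically(*captures: Iterable[Record]) -> Iterator[Record]:
    """Interleave captures by timestamp; each input must already be sorted."""
    return heapq.merge(*captures, key=lambda rec: rec[0])


# Made-up records from two tcpdump instances on the router.
lan = [
    (10.000, "lan", "echo request NAS -> modem"),
    (15.000, "lan", "echo request NAS -> modem"),
]
wan = [
    (10.001, "wan", "echo request router -> modem"),
    (10.002, "wan", "echo reply modem -> router"),
]

for ts, iface, summary in merge_chronologically(lan, wan):
    print(f"{ts:8.3f} {iface}: {summary}")
```

In the merged view, a request that shows up on the LAN side with no counterpart on the WAN side shortly afterwards never left the router.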

Do not limit the capture to ICMP, so that you can see other traffic (and possible gaps or peaks in it).

I would expect to find one of the following when the gap in ping responses occurs at the LAN side:

  • if the router is a DHCP client, a DHCP lease has expired, and either the router does not ask for a new one fast enough or the modem does not provide a new offer fast enough,

  • there is a burst of some other traffic which squeezes the ICMP out (depending on DD-WRT's traffic shaping settings, if any; not very likely),

  • there is a dead silence at the WAN, or even the tcpdump terminates, saying that the interface is down: that would indicate some hardware problem of the router or anything further (most likely the modem, less likely the router, least likely your ISP's DSLAM or CMTS as the modem does not respond to pings itself).

So you may find out that the modem reboots or re-establishes the connection to the DSLAM/CMTS every 20 minutes.
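To spot a "dead silence" in a capture without scrolling through it, one can scan the packet timestamps for unusually long gaps. A minimal Python sketch, assuming the timestamps have already been extracted into a list; the `silent_gaps` helper, the 5-second threshold and the sample values are made up for illustration.

```python
from typing import List, Tuple


def silent_gaps(timestamps: List[float], threshold: float) -> List[Tuple[float, float]]:
    """Return (start, end) pairs of consecutive packets that are more than
    threshold seconds apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]


# Made-up WAN-side packet timestamps in seconds.
wan_times = [0.0, 0.5, 1.0, 1.4, 9.0, 9.3]

for start, end in silent_gaps(wan_times, threshold=5.0):
    print(f"no packets on WAN between {start} s and {end} s ({end - start:.1f} s)")
```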

(22 Feb '16, 22:26) sindy

0

Hi again, ok I can post legally an answer now ;-)

After the suggestions from Sindy I immediately thought of the modem, which has a built-in switch and router. I set the WAN side of the Netgear router to a static 192.168.178.100 (LAN on 192.168.1.x). The modem has a LAN IP of 192.168.178.1. For some reason (probably forgotten) I had left the DHCP server running on the modem. Although there were no clients connected and the router's address xxx.xxx.xxx.100 was outside the DHCP range, I guess the modem was trying to renew something. I didn't really find a related message in Wireshark, but I didn't look long for it either. I just switched off the DHCP server on the modem and VOILÀ! No more disconnections. The modem's DHCP server was not set to a lease time of 20 minutes (and had no clients), but for some reason there was a very steady disconnection visible in Wireshark.

I have only just started with Wireshark, but what a powerful tool! It gave me a really nice view of what is going on, with the suggestions on the forum as a great addition, of course. Before, I was thinking the router was bad or the dd-wrt software buggy. Looking into the tcpdump file from the router, I found some additional problems which would otherwise have gone unnoticed. I will continue to use it now that I know the basics.

Thanks!

answered 24 Feb '16, 07:29


RFMuser
11337
accept rate: 0%