This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Some websites keep dropping off for a few minutes, while others are ok…

(This is on a home network connected with a broadband router)

I have had this problem for AGES, but intermittently enough that I've not spent too much time investigating. But over the last few days it is happening much more often and is becoming very annoying.

Basically, clicking a link to a website works most of the time, but intermittently it just times out. It's like specific websites go 'offline' for a few minutes and I have to wait until they come back. This happens to so many websites that it's clear the websites themselves are not the problem, and the fact that other websites work at the same time shows it's not a general network outage.

Previously I tried changing my default DNS servers but that made no difference.

There are some websites that this keeps happening to (such as google) while others never seem to have this problem (that I've noticed).

I thought it might be a domain name issue of some type, but I've installed Wireshark on my PC and the name resolution looks fine (i.e. when I compare a failed trace to a successful trace, the name resolves to the same IP address).

The one thing that does happen when the problem occurs though is a lot of TCP Retransmission packets. But I'm not really sure what that indicates in terms of a cause, and not really sure what else to look for.

I'm not doing anything network/broadband intensive, just browsing.

Firstly, does this problem ring any bells?

If not what steps should I take next?

asked 08 Aug '16, 03:42

wireminnow

The retransmissions are likely a consequence rather than the cause; as a symptom, they suggest that the pre-TCP phase (DNS) is not the root cause.

A blind shot: there is likely an MTU problem somewhere in the network, on one of the alternative paths between you and these sites. Large packets in one direction never make it to the destination, regardless of how many times they are retransmitted, because they don't fit into the actual MTU of that section of the path.

To check, try to temporarily reduce MTU at your PC to something like 1000 bytes. If the issue disappears, it makes sense to fine-tune the value at the PC, or to make your home router handle it instead (if possible and if you have more devices connected to it).

For more details, have a look here and search for "braindead" on the page.

(08 Aug '16, 05:07) sindy
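Rather than guessing at a value, the MTU test above can be narrowed down by probing with different packet sizes. What follows is only a rough sketch of that idea, not something from the thread: `probe` is a hypothetical callable you would supply yourself (for example one that sends a don't-fragment ping of the given total size and reports whether a reply came back), and the 576 to 1500 search range is an assumption.

```python
IP_HEADER = 20    # IPv4 header without options, in bytes
ICMP_HEADER = 8   # ICMP echo header, in bytes


def ping_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in a single IPv4 packet of size mtu."""
    return mtu - IP_HEADER - ICMP_HEADER


def largest_passing_mtu(probe, low=576, high=1500):
    """Binary-search the largest packet size in [low, high] that probe() accepts.

    probe is a hypothetical callable: probe(size) -> True if a packet of that
    total size got through unfragmented. Returns None if even `low` fails.
    Assumes that if a size passes, every smaller size passes too.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always makes progress
        if probe(mid):
            low = mid                 # mid fits, so the answer is mid or larger
        else:
            high = mid - 1            # mid is too big
    return low
```

On Windows, a single probe can be done by hand with `ping -f -l <payload> <host>`, where `-f` sets the don't-fragment flag and `-l` is the payload size, i.e. the packet size minus 28 bytes.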

Thanks. I just changed my adapter settings to MTU=1000 for both ipv4 and ipv6, but still getting the problem. Also, I'm losing contact with the mail servers on the same domains at the same time (even though they are often physically different servers).

(08 Aug '16, 06:09) wireminnow

Well, mail servers and web servers aren't much different when it comes down to the TCP session, so if their IP addresses don't differ too much, you may expect the network path between you and those servers to be the same. The domain name is not a good enough guide here, as sharing a domain does not always mean sharing an IP subnet.

Since it is not an MTU problem, some box may be freezing or rebooting on one of the paths. That is definitely something you cannot fix yourself, but it could help to hand the list of IP addresses of the problematic servers to your ISP, as they might be able to identify a common path to them and check what's wrong there.

(08 Aug '16, 06:16) sindy

this has been going on for probably over a year, you'd have thought if something was crashing / freezing someone would have noticed / fixed it by now? Also, isn't the whole point of TCP/IP that it reroutes around network problems (which is why this seems like a weird problem)? Can I check the path using tracert or something like that?

(08 Aug '16, 06:30) wireminnow

this has been going on for probably over a year, you'd have thought if something was crashing / freezing someone would have noticed / fixed it by now?

Yes I would, but I have exactly the same problem with my ISP at home, where it is not an issue of several remote servers but of the whole neighbourhood getting nowhere at all for about 5 minutes every now and then. Maybe someone else has reported it as well, maybe not, but definitely the ISP themselves haven't noticed that.

Also, isn't the whole point of TCP/IP that it reroutes around network problems (which is why this seems like a weird problem)?

No, it is not. The TCP control mechanisms only affect the behaviour of the endpoints, not of the transit nodes. On the other hand, some transit nodes keep track of established TCP sessions so that all packets belonging to a single session would be sent using the same route to avoid issues related to incorrect packet ordering at the receiving side (as different paths often have different travel times)

Can I check the path using tracert or something like that?

What you can do is run tracert several times and see whether the list of intermediate boxes stays the same. If it changes, there may be an issue with one of the routes, and if you are really lucky, you may catch it in the act: the tracert would end before reaching the destination because it hit the broken path while it was broken.

(08 Aug '16, 06:50) sindy
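Comparing repeated tracert runs by eye gets tedious, so here is a minimal sketch of how the comparison could be automated. It assumes you have already copied the hop addresses of each run into a list yourself; the helper name and the addresses in the usage example are made-up placeholders from the documentation ranges, not real trace output.

```python
def first_divergence(run_a, run_b):
    """Return the index of the first hop where two traces differ.

    Each run is a list of hop addresses (strings) in order.
    Returns None when both runs list exactly the same hops.
    """
    for index, (hop_a, hop_b) in enumerate(zip(run_a, run_b)):
        if hop_a != hop_b:
            return index
    if len(run_a) != len(run_b):
        # One trace stopped early: it diverges where the shorter one ends.
        return min(len(run_a), len(run_b))
    return None


# Placeholder data for illustration only.
good = ["192.0.2.1", "198.51.100.7", "203.0.113.9"]
cut_short = ["192.0.2.1", "198.51.100.7"]
print(first_divergence(good, cut_short))
```

A trace that stops short of the destination during an outage would show up as a divergence at the hop where it ended.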

By "transit nodes", I had in mind routers out there in the internet. Load balancers, firewalls, and NATs next to the endpoints often interfere a lot with the TCP sessions passing through them, but they usually do not actively change routing of one session's packets either.

(08 Aug '16, 06:57) sindy

The problem I have with tracert is that it uses ICMP messages, so it isn't guaranteed to be handled the same way as TCP packets by the intermediate hops. I've experienced this in the past with an ISP router whose egress filter blocked TCP IPSEC packets but allowed ICMP through.

I like LFT as it can do traceroutes using TCP or UDP, but it isn't available on Windows.

(08 Aug '16, 07:51) grahamb ♦

Changed MTU=1000 for IPv4 and IPv6

But keep in mind that the minimum MTU for IPv6 is 1280.

(08 Aug '16, 10:44) Christian_R

Changed ipv6 to 1280. Just tried tracert while the website was 'offline' for a few minutes, and the several traces I did were exactly the same as the several traces I did when the website was working. So same route and all successful...

(09 Aug '16, 03:29) wireminnow

That's in step with what @grahamb wrote: ICMP may be routed differently from TCP. Any chance you could use a Linux machine to run LFT as he suggested?

(09 Aug '16, 03:34) sindy

not without finding an old machine and installing linux on it... how useful would LFT be?

(10 Aug '16, 06:56) wireminnow

I have no personal experience with LFT so if it is not clear from the description, @grahamb may have some best practice recommendation.

But as I think of it, what kind of broadband router do you have, cable/VDSL/mobile? Theoretically, the router itself may cause the issue too, but there is little chance to find out as you usually cannot capture at its WAN side, so it is easier to replace it and see whether there is any difference.

(10 Aug '16, 07:15) sindy

As I mentioned previously, lft can trace the path using TCP or UDP (although, oddly, the Ubuntu 14.04 version I have to hand doesn't do UDP), so it should be subject to the same routing as your HTTP requests/responses.

I'm not sure if it will help solve your issue, but it's handy to have in your toolbox if routing issues are suspected.

Usually it has to be run as root (for raw access to the interface) so do something like sudo lft -d 80 server.to.test for a standard http port 80 test.

(10 Aug '16, 07:31) grahamb ♦
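Where lft is not available, a plain TCP connection attempt at least exercises the same protocol and port as the browser, although it shows only whether the far end is reachable, not which hop fails. The sketch below is an assumption-laden illustration rather than a replacement for lft: the function names, the 5 second timeout and the 10 second polling interval are all arbitrary choices.

```python
import socket
import time


def tcp_reachable(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port completes within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name resolution failures.
        return False


def monitor(hosts, port=80, rounds=3, interval=10.0, check=tcp_reachable):
    """Check every host `rounds` times and return (time, host, ok) tuples."""
    results = []
    for round_number in range(rounds):
        for host in hosts:
            results.append((time.strftime("%H:%M:%S"), host, check(host, port)))
        if round_number < rounds - 1:
            time.sleep(interval)   # wait before the next round
    return results
```

A timestamped log of which addresses failed and when is also the kind of list that could be handed to the ISP. Note that a successful connect only proves the small handshake packets got through, so it does not by itself rule out an MTU problem.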

One Answer:

I've seen this on numerous networks.

It all boils down to the MTU size.

The default MTU is 1500 bytes. But if you're running over ADSL (PPPoE) or a similar network, your maximum MTU will be 1492 or less (PPPoE takes 8 bytes, which leaves a TCP MSS of 1452). If you're using IPSEC or another encryption protocol over PPPoE, your MTU will need to be set even smaller (1382 or thereabouts).
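The arithmetic behind these figures can be sketched as follows. This is an illustration added for clarity, not part of the original answer: it assumes plain IPv4 and TCP headers without options, and the commonly quoted PPPoE numbers (an MTU of 1492 and a TCP MSS of 1452). The overhead of IPSEC is left out because it depends on the mode and ciphers in use.

```python
ETHERNET_MTU = 1500   # usual default on Ethernet, in bytes
PPPOE_OVERHEAD = 8    # 6-byte PPPoE header plus 2-byte PPP protocol field
IPV4_HEADER = 20      # IPv4 header without options
TCP_HEADER = 20       # TCP header without options


def link_mtu(base_mtu, overheads):
    """MTU left for IP packets after subtracting each encapsulation overhead."""
    return base_mtu - sum(overheads)


def tcp_mss(mtu):
    """Largest TCP payload per packet for a given MTU (no IP or TCP options)."""
    return mtu - IPV4_HEADER - TCP_HEADER


pppoe_mtu = link_mtu(ETHERNET_MTU, [PPPOE_OVERHEAD])
print(pppoe_mtu, tcp_mss(pppoe_mtu))
```

Any additional tunnel or encryption layer can be modelled by appending its per-packet overhead to the list.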

The reason you can see some sites and not others is that the standard 3-way handshake consists of very small packets, so the session can be established. But when a packet exceeds the maximum MTU your network path allows, the data seems to stop loading. This is especially true for packets flagged DON'T FRAGMENT. Some firewalls and other devices can be set to ignore the DON'T FRAGMENT flag and fragment the packet anyway, and such devices should allow the traffic to get through properly.

FWIW

answered 13 Aug '16, 21:39

wbenton