I recently installed conky (system monitoring application) on my Linux Mint 17.3 PC, and I have since noticed some intermittent odd network behaviour. At random times during the day, I see a sustained spike in upload & download traffic on my eth0 interface (normal wired NIC). It lasts 10-30 minutes, then goes away on its own.
Normally, the conky output for network traffic looks like this: 10KB/s down; 11KB/s up. I'm listening to music on YouTube; so this seems pretty reasonable.
BUT! Sometimes, I happen to glance over & see the traffic graph, and it looks like this:
The white "BAR" is the up/send side of my NIC max'ing out this chart. Yes, it's only 630 KB/s, but that seems pretty dang high, considering that I'm not initiating it. So, I'm naturally wondering "what is using all of that bandwidth?"
I've dug in a bit, and (using a combination of nethogs, netstat, tcpdump, and Wireshark) I have identified that the culprit in all situations is a single connection to a remote host. The connection is either on :80 or :443; and is simply a sustained loop of ACK's (coming from their side) and RST's (coming from me).
Normally, NetHogs looks a bit like this:
Note the Process ID on the left (handy for identifying the source of the traffic)
But when the 'Spike' is happening, the NetHogs output looks like this:
Note the '?' in the PID. Yes, I'm running NetHogs as root. How can it not know??
So, I dump the traffic with
Here's the actual tcpdump output for the extra-geeky:
Anyhow, my (limited) understanding of what's going on with the RST's & ACK's is this:
They say: "ACK", meaning "OK, I got what you said"
I say: "RST", meaning "Right, let's end this conversation"
But even after I 'reset' the convo, they keep ACK'ing me! This goes on for several minutes; then the traffic just dies off after a while. (I haven't timed it; that might be interesting info too.)
A few other pieces of info:
So, all of that to lead up to my questions:
asked 05 Jan '17, 13:36 jonwadsworth
One Answer:
Hello jonwadsworth! This is one of the coolest RST loops that I have ever seen.
Here is what's going on: Your Linux box is trying to reset a connection. You figured that out correctly. For some reason this packet doesn't make it to the destination address. Instead you get these retransmitted ACKs all the time.
From a receiver's point of view it is perfectly normal if data arrives after a reset has been sent. That's usually data in transit that was transmitted by the remote system before it received the RST. This usually happens within a few milliseconds (1 TCP round-trip time, to be precise). In your tracefile we see TCP packets without data arriving for more than 8 seconds. That's odd. Let's take a closer look.
The ACK packets with source address 104.196.229.58 have a TTL of 64 and an IP ID of 0. Both are odd values.
The TTL has an initial value of 255, 128 or 64 (depending on the sender's operating system) and is reduced by one each time a router forwards the packet. A TTL of 64 indicates that the packet originates from a host on your local network. My guess is that your firewall is sending this packet for whatever reason.
Next, the IP ID. This is something like a 16-bit serial number that is used to process fragmented IP packets. Depending on the OS, the IP ID could be incremented by one for each packet in a TCP connection or for each packet sent by the host (independent of the TCP connection). Or it could be just a random number. In your tracefile all ACK packets have the same IP ID of zero. This is another indicator that the packets are not coming from the remote host. If 104.196.229.58 had sent these packets as real retransmissions, we would see different IP IDs.
I assume that your firewall is doing something strange. I am pretty sure that neither the RST packets nor the ACK packets are visible on the external interface of the firewall. It would be greatly appreciated if you could tell us what manufacturer / software / configuration is responsible for this behaviour.
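The TTL and IP ID reasoning above can be sketched in a few lines of Python. This is only an illustration, not a tool that was used in this thread: the function names `estimate_hops` and `all_same_ip_id` are made up for this sketch, and the list of initial TTLs is just the three common defaults mentioned above.

```python
# Common initial TTL values mentioned above (assumption: no other defaults in play).
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimate_hops(observed_ttl):
    """Guess the sender's initial TTL and how many routers the packet crossed.

    Picks the smallest common initial TTL that is >= the observed value.
    """
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial, initial - observed_ttl


def all_same_ip_id(ip_ids):
    """True if several packets share one IP ID, which real retransmissions
    from a remote host would normally not do (heuristic only)."""
    return len(ip_ids) > 1 and len(set(ip_ids)) == 1


print(estimate_hops(64))          # (64, 0): zero router hops, i.e. a local sender
print(estimate_hops(52))          # (64, 12): plausible for a host across the Internet
print(all_same_ip_id([0, 0, 0]))  # True
```

A result of zero hops for a packet that claims to come from a public address is the hint that something on the local network generated it.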
Oh, by the way: Capturing directly on the firewall may or may not give you a correct representation of the traffic leaving the physical interface. The result depends on the order of operations on the firewall. The tracefile will differ depending on whether the filter engine kicks in before or after Wireshark / tcpdump have access to the packet.
Anyway, thank you for one of the most remarkable TCP behaviors that I have seen in a while. Thumbs up.
Good hunting!
answered 07 Jan '17, 02:42 packethunter
Hey Packethunter! Wow, some great info there. Well, it's happening again today, so I have more data! And, it looks like your suspicions are correct.
First -- the Firewall is a Sonicwall TZ200 -- on their low-end of devices; but it has the latest firmware updates, etc. Here's the SonicWall's Packet Monitor output for the address in question, during an 'event':
I'm not sure if this site will display the entire width of that image, so in case you can't see -- the far right columns show that the packets with a "Source IP" of 50.234.5.128 (the ACK's) are actually being GENERATED by the SonicWall! This is super crazy to me -- it's as if the SonicWall is forging/impersonating that device. I suppose that it thinks that it has a good reason for this -- but seems crummy to me.
Then, when my PC sends back the RST, the logs indicate that the SonicWall has DROPPED those packets. Presumably so that the actual server at 50.234.5.128 doesn't get these RST's and wonder WTF is going on.
In case you'd like to have a look -- I have added my PCAP from my local system: https://github.com/jonwadsworth/pcap/blob/master/dump.pcap
And the pcap from the SonicWall: https://github.com/jonwadsworth/pcap/blob/master/packet-c.pcap
OK, so:
1) Any clue why a Firewall would GENERATE ACK's for a remote device?
2) Could something in my configuration be causing this?
3) Any idea how to prevent this from happening in the future?
Thanks again for help in sleuthing this out!!
(09 Jan '17, 12:27) jonwadsworth
So it's happening again... it seems like the SonicWall is detecting this as an RST flood. I'm guessing that this is an overly aggressive TCP flood detection setting on the SonicWall. Here are my TCP settings:
Any ideas? Thanks!! (09 Jan '17, 13:32) jonwadsworth
Is there any setting for ssl or tls content filtering or control? (09 Jan '17, 13:44) Bob Jones
I have seen this behaviour with :80 as well as :443; so I'm inclined to think it's not SSL-specific -- but I'll check & report back (10 Jan '17, 09:28) jonwadsworth
There are 3 tracefiles attached to this question: - Client - FW - Server. Am I right? (10 Jan '17, 11:27) Christian_R
Please try to disable the RFC 5961 enforcement. RFC 5961 addresses a problem in TCP where an attacker might be able to break a session by guessing sequence and acknowledgment numbers within a certain range. This RFC adds a few extra checks to make the attacker's life harder. It is possible that the firewall is a bit overzealous with its checks.
Just a (very) wild guess: Your initial post mentions that the problems started after the installation of a monitoring application. Is it possible that a source port number was re-used? The RST/ACK loop could be explained if the firewall uses stale data from an internal table to perform the RFC 5961 checks. Can you browse through your log files to check if (and when) this combination of source IP, source port, destination IP and destination port was used?
On the other hand: The case can be closed if disabling RFC 5961 enforcement both fixes the problem and leaves you happy. Good hunting (11 Jan '17, 14:10) packethunter
That's right, they are in my Github, linked above. (11 Jan '17, 14:22) jonwadsworth
Hm, from the timing in the SonicWall trace it seems the FW generates the ACKs. But if I remember right (I can't open the traces at the moment), the TTL in every trace and every packet is 64. That looks unexpected to me, as the traces have been captured at different points and the IP addresses suggest that there should be a routing instance in the path. (12 Jan '17, 01:09) Christian_R
In all the tracefiles provided on Github I can only see the RST/ACK loop. To verify or to refute my guess regarding the RFC 5961 implementation working with stale data we need a much longer tracefile. Not sure if you can force this with your toolset:
If the problem disappears by disabling the RFC 5961 check, this thread can be closed. Otherwise I suggest a support call to the firewall manufacturer, as it is clear that the firewall is generating the packets. Good luck (14 Jan '17, 05:18) packethunter
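For readers wondering how an RFC 5961 check can answer a RST with an ACK: the RFC says a RST is only honoured if its sequence number exactly matches the next expected one; a RST that is merely inside the receive window gets a "challenge ACK" instead, and anything outside the window is dropped. Below is a minimal Python sketch of that rule. The function name `classify_rst` and its parameters are made up for illustration, and nothing here is known about how the SonicWall actually implements the check.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap around at 32 bits


def classify_rst(seg_seq, rcv_nxt, rcv_wnd):
    """Decide what to do with an incoming RST, following RFC 5961 section 3.

    seg_seq: sequence number carried by the RST
    rcv_nxt: next sequence number the receiver expects
    rcv_wnd: size of the receive window
    """
    offset = (seg_seq - rcv_nxt) % SEQ_MOD  # distance past rcv_nxt, wrap-safe
    if offset == 0:
        return "reset"          # exact match: tear the connection down
    if offset < rcv_wnd:
        return "challenge_ack"  # in window but not exact: reply with an ACK
    return "drop"               # outside the window: silently discard


# If the checking device holds a stale rcv_nxt (the wild guess above), a
# perfectly valid RST never matches exactly and may keep drawing ACKs:
print(classify_rst(seg_seq=5000, rcv_nxt=5000, rcv_wnd=1000))  # reset
print(classify_rst(seg_seq=5000, rcv_nxt=4800, rcv_wnd=1000))  # challenge_ack
```

Under that assumption, every RST from the client would be answered with another ACK, which matches the shape of the loop in the traces, but it remains a guess until a longer capture confirms it.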
I'd still rather have a look at the actual pcap file in a publicly accessible location...
My first impression of the RST/ACK behavior, without seeing the trace, is that the RSTs are either not getting to the destination, or they are considered invalid by the receiving destination. I have some older products that have some TCP stack issues, and a modern Linux host does not accept RSTs from these devices. Conversely, perhaps an outbound firewall won't forward them because it lost the state (e.g. you probably see RELATED-ESTABLISHED and other such keywords when doing firewall config).
I don't know that tool but I usually use netstat -tnp. Once sockets are in certain states that are not passing data (e.g. Time-Wait), I noticed no process is listed. I wonder if you are seeing that here as well.
Is lsof any better?
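One possible explanation for the missing PID, as far as I know: on Linux, sockets in TIME_WAIT show up in /proc/net/tcp with inode 0, so there is no file descriptor left to map back to a process. The Python sketch below parses that file's line format and picks out such ownerless sockets. The helper names and the `SAMPLE` line are made up for illustration (the sample is hand-written, not taken from the poster's machine), and the address decoding assumes a little-endian host such as x86.

```python
import socket
import struct

# State codes as printed in the "st" column of /proc/net/tcp
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

# Hand-written example line (hypothetical values), TIME_WAIT with inode 0
SAMPLE = ("   1: 0F02000A:D2F4 3AE5C468:01BB 06 00000000:00000000 "
          "03:00000A5F 00000000     0        0 0 3 0000000000000000")


def parse_addr(hex_addr):
    """Turn '3AE5C468:01BB' into ('104.196.229.58', 443) on little-endian hosts."""
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_tcp_line(line):
    """Extract addresses, state and inode from one /proc/net/tcp data line."""
    fields = line.split()
    return {
        "local": parse_addr(fields[1]),
        "remote": parse_addr(fields[2]),
        "state": TCP_STATES.get(fields[3], fields[3]),
        "inode": int(fields[9]),
    }


def sockets_without_owner(lines):
    """Return the sockets whose inode is 0, i.e. no process can be named."""
    return [s for s in map(parse_tcp_line, lines) if s["inode"] == 0]


def read_proc_net_tcp(path="/proc/net/tcp"):
    """Read the data lines (header skipped) from a Linux host."""
    with open(path) as f:
        next(f)  # skip the column header
        return [line for line in f if line.strip()]


print(parse_tcp_line(SAMPLE))
```

On a Linux box, `sockets_without_owner(read_proc_net_tcp())` during a spike would show whether the connection in question is one of these ownerless sockets.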
Thanks for your comments Bob! I have the .pcap file: https://github.com/jonwadsworth/pcap/blob/master/104.pcap
It's pretty big (12MB).... Very interested to hear what you think.
I will try your suggestions of netstat -tnp & checking the firewall the next time this happens!
Just being curious: is there any virtualization software running at the time this happens?
No Virt software running... thanks for asking