This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Why are the ACK packets getting lost?

I have a web server hosted on AWS that sends email via an email server hosted in a separate data center. I can telnet to the email server without problems from a server outside AWS, but when connecting from my AWS web server I see two issues.

  1. The initial connection often gets stuck in a series of retransmissions after an initial spurious retransmission immediately following the SYN/ACK. You can see this at the beginning of the trace. The same thing happens when I send emails: the initial send hangs until a connection is made.

  2. At the end of the trace, frame 13198 on the web server ACKs sequence 8040907, yet the very next frame begins another series of retransmissions. You can see the ACK for 8040907 in both the web-server trace and the email-server trace included below. The ACKs are getting lost, but I'm not sure where.

I'm connecting directly from the public IP of the AWS web server to the mail server, not through a load balancer. I have also opened the inbound ports in the security group to include the port I use to connect to the email server, and I allow all inbound ICMP traffic. I would greatly appreciate any insight.

webserver trace 10.2.2.2 https://www.cloudshark.org/captures/d17da6802b39

email server trace 10.1.1.1 https://www.cloudshark.org/captures/d6c1e750ddf8

asked 22 Dec '16, 22:23

vze80

Did you try again after disabling TCP timestamps?

Please search for how to disable them on your operating system.

(23 Dec '16, 09:15) soochi

One Answer:

Could you please try again after turning off TCP timestamps?

answered 23 Dec '16, 00:32

soochi

TCP timestamps (RFC 1323) are enabled in this TCP implementation. They must be disabled in the OS. On Windows, for example, the command is: netsh int tcp set global timestamps=disabled

(23 Dec '16, 04:25) soochi

Soochi, turning off TCP timestamps on the server indeed cleared up the issue instantly. On Linux the command is "echo 0 > /proc/sys/net/ipv4/tcp_timestamps". To disable them permanently, add "net.ipv4.tcp_timestamps = 0" to /etc/sysctl.conf. I'm not entirely sure what the problem was, but this post and its associated links were helpful in beginning to understand the issue: http://serverfault.com/questions/235965/why-would-a-server-not-send-a-syn-ack-packet-in-response-to-a-syn-packet Thanks.

(25 Dec '16, 19:31) vze80
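
For anyone who wants to confirm that the permanent setting mentioned above is actually present, here is a minimal sketch in Python. It only parses sysctl.conf-style text; the function names and the sample file contents are hypothetical and not taken from the thread, and it assumes the usual "key = value" layout with "#" or ";" comment lines.

```python
def parse_sysctl(text):
    """Parse sysctl.conf-style text into a dict; a later entry overrides an earlier one."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def timestamps_disabled(text):
    """Return True only if the text explicitly sets net.ipv4.tcp_timestamps to 0."""
    return parse_sysctl(text).get("net.ipv4.tcp_timestamps") == "0"


# Hypothetical file contents, for illustration only.
sample = """
# example /etc/sysctl.conf
net.ipv4.ip_forward = 0
net.ipv4.tcp_timestamps = 0
"""

print(timestamps_disabled(sample))
```

On a real host you could pass in the contents of /etc/sysctl.conf. Note that this checks only the configuration file, not the running value in /proc/sys/net/ipv4/tcp_timestamps.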

Glad to hear that the workaround worked!

Comparing the traces email-server_anon and web-client_anon, I observed the following.

1. The timestamp set by the server is changed by some device in the path before it arrives at the client. This can be seen clearly in TCP streams 2-9. All of these streams were reset by the client because the SYN-ACK arrived after 30 seconds!

2. In TCP streams 10-13 the SYN-ACK arrived after 16 seconds! These sessions were also reset by the client.

3. In stream 14 the SYN-ACK arrived after 8 seconds, in stream 15 after 4 seconds, and in stream 16 after 2 seconds.

4. In stream 16 the timestamp from the server is increased by some device. This happens only to SYN-ACK packets, not to later packets.

5. The device that modifies the timestamp seems to sit close to the client. The size of the modification tracks the arrival delay of the corresponding SYN-ACK: I measured a value of around 8000 when the SYN-ACK arrived with a delay of around 30 seconds.

I believe the underlying problem is a delay with some other cause, which should be investigated further.

I also noticed that the client supports an MTU of up to 9k, but by the time the SYN arrives at the server the MSS has been reduced (most probably by the Internet-facing device at the client's location).

Please provide a new trace with timestamps disabled.

(26 Dec '16, 07:56) soochi

Please ignore my previous comment; its statements are wrong. The timestamp is not modified on the path. The server retransmits the packets with different timestamps, and only the last retransmission of each session arrives at the client.

Anyway, it would be interesting to look at a capture with timestamps disabled, since the issue does not occur then. It also seems that the packets are lost from server to client and not in the other direction.

(26 Dec '16, 13:57) soochi
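
The correction above rests on comparing the two traces: each SYN-ACK retransmission leaves the server with a different TSval, so the TSval seen at the client tells you which copy got through. Below is a rough Python sketch of that comparison. All numbers are made up for illustration and are not taken from the captures; it assumes a 1-second initial retransmission timeout that doubles each time and a timestamp clock of 250 ticks per second, and the function name is hypothetical.

```python
# Hypothetical SYN-ACK copies sent by the server for one connection attempt:
# (seconds after the first SYN-ACK, TSval). Illustrative values only, assuming
# a doubling retransmission timer and a 250 ticks/second timestamp clock.
server_sent = [
    (0.0, 1000),
    (1.0, 1250),
    (3.0, 1750),
    (7.0, 2750),
    (15.0, 4750),
    (31.0, 8750),
]

# Hypothetical TSvals of the SYN-ACKs that appear in the client-side trace.
client_seen_tsvals = [8750]


def match_arrivals(sent, seen_tsvals):
    """Return (index, send_time, tsval) for every sent copy that the client saw."""
    seen = set(seen_tsvals)
    return [(i, t, ts) for i, (t, ts) in enumerate(sent) if ts in seen]


arrived = match_arrivals(server_sent, client_seen_tsvals)
lost = len(server_sent) - len(arrived)

print(arrived)
print(f"{lost} of {len(server_sent)} copies never reached the client")
```

With these invented values only the last copy matches. Seen from the client alone, that looks like a SYN-ACK that arrives late with an unexpectedly large timestamp, which is how the first reading of the traces went wrong.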

The client resides on AWS, which has a fairly unusual network architecture. Each instance has a public and a private IP address (each with a different MTU). AWS also blocks all inbound ports for all services (including ICMP) unless they are specifically opened in the security group (firewall). This could cause problems when the MTU changes along the network path (http://docs.aws.amazon.com/redshift/latest/mgmt/connecting-drop-issues.html). One of the first fixes I tried (before disabling TCP timestamps) was allowing ICMP packets to the client, which had no effect. I'm running another capture now and will upload it shortly.

(27 Dec '16, 08:36) vze80
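
The MTU and MSS remarks in this thread come down to simple header arithmetic, sketched here in Python. The MTU values (9001 for a jumbo-frame interface, 1500 for a standard Ethernet path) are assumptions chosen for illustration, not values read from the captures, and the function names are hypothetical. The sketch also assumes IPv4 without IP options.

```python
IPV4_HEADER = 20        # bytes, assuming no IP options
TCP_HEADER = 20         # bytes, fixed part of the TCP header only
TIMESTAMP_OPTION = 12   # 10-byte timestamp option plus 2 bytes of padding


def mss_for_mtu(mtu):
    """MSS a host would typically advertise for a given IPv4 MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def effective_mss(client_mss, server_mss):
    """The connection is limited by the smaller of the two advertised values."""
    return min(client_mss, server_mss)


def payload_per_segment(mss, timestamps=True):
    """Data bytes per full-sized segment; the timestamp option uses part of the MSS."""
    return mss - TIMESTAMP_OPTION if timestamps else mss


jumbo = mss_for_mtu(9001)      # assumed jumbo-frame MTU on the client side
standard = mss_for_mtu(1500)   # assumed standard Ethernet MTU on the path

print(jumbo, standard, effective_mss(jumbo, standard))
print(payload_per_segment(standard), payload_per_segment(standard, timestamps=False))
```

If a device on the path rewrites the MSS in the SYN, the server sees the smaller value even though the client advertised the larger one. That would match the observation that the MSS is already reduced when the SYN reaches the server.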

Successful SMTP/TCP packet capture from client and server

Mail-Server (10.1.1.1) https://www.cloudshark.org/captures/b92ad0f5933c

Mail Client (10.2.2.2) https://www.cloudshark.org/captures/ca3ef01a45e4

Since disabling TCP timestamps on the AWS client, all of the SMTP and MySQL connection hangs seem to have been resolved. There does seem to be an occasional RST from the AWS mail client in the most recent SMTP packet capture linked here, but the connection seems to recover and continue. I'm not sure what would be causing that.

(27 Dec '16, 12:21) vze80

Could you please anonymize the capture so that it still includes the complete TCP header? The packets are cut at 54 bytes, which removes the TCP options.

(27 Dec '16, 13:40) soochi
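
To see why a 54-byte cut removes the options, here is a small Python sketch of the header arithmetic. It assumes untagged Ethernet and IPv4 without IP options; the function name is hypothetical.

```python
ETHERNET_HEADER = 14   # bytes, assuming no VLAN tag
IPV4_HEADER_MIN = 20   # bytes, assuming no IP options
TCP_HEADER_MIN = 20    # bytes, fixed part of the TCP header
TCP_HEADER_MAX = 60    # bytes, fixed part plus up to 40 bytes of options


def tcp_option_bytes_kept(snaplen, ip_header=IPV4_HEADER_MIN):
    """How many bytes of the TCP options area survive a given cut length."""
    options_start = ETHERNET_HEADER + ip_header + TCP_HEADER_MIN
    max_options = TCP_HEADER_MAX - TCP_HEADER_MIN
    return max(0, min(snaplen - options_start, max_options))


# A 54-byte cut ends exactly where the options would begin.
print(tcp_option_bytes_kept(54))

# Cut length that always keeps the full TCP header under these assumptions.
print(ETHERNET_HEADER + IPV4_HEADER_MIN + TCP_HEADER_MAX)
```

Under these assumptions, keeping at least 94 bytes per packet preserves the complete TCP header, including the timestamp option.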

Sorry about that. I'm getting "Access Violations" with TraceWrangler, so I was only able to re-anonymize the mail-server capture. I hope this one is more helpful.

Mail-Server (10.1.1.1) https://www.cloudshark.org/captures/f13d1636d8ee

(27 Dec '16, 18:19) vze80