This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Why is message Time To Live only 1?

About half of my packets are window size 32768 and TTL = 64, which is what I would expect. I'm on a dedicated VLAN for iSCSI. The other half of my packets are window size 524 and TTL = 1. All these packets come from one host. Like I said, this is a dedicated non-routed VLAN. The customer is experiencing latency issues across the board from several hosts. I'll be getting a network capture from a second host today.

analysis

asked 24 May '12, 06:15

gipper
30●12●12●16
accept rate: 0%

edited 25 May '12, 10:41

helloworld
3.1k●4●20●41

...and your question is what exactly? :-)

You should either provide a trace (keep in mind that you might expose sensitive data) via www.cloudshark.org, or tell us more about your observations. For example: what host is sending the TTL 1 packets, what protocol is it, which direction (to client, to server) etc.

(24 May '12, 06:26) Jasper ♦♦

I suggest doing either of these:

- Check your routing infrastructure in the iSCSI "subnet". According to your description it's just a flat net within one VLAN, but never trust assumptions, so better check ;-)
- Figure out the vendor of that device and contact their support about the TTL and the window size. Maybe they already know about this (a bug or something).

As @Jasper said: if you can provide a capture sample, we might be able to help.

Regards
Kurt

(24 May '12, 07:39) Kurt Knochner ♦

Not sure how to attach my trace export file

(24 May '12, 08:02) gipper

You can upload it to cloudshark.org. BEWARE: You cannot delete an uploaded file.
Alternatively, you can use a one-click file hoster (search Google) to upload the file and then post the link. http://netload.in seems to be acceptable in terms of user annoyance and ads.

(24 May '12, 08:49) Kurt Knochner ♦

I uploaded some of the capture. I had to cut it into pieces due to the 10 MB size limit. http://cloudshark.org/captures/2b1890434ed9 Look at frame 1.

(25 May '12, 08:43) gipper

2 Answers:

The trace is a bit problematic since lots of packets are missing from it, and it contains very few big packets. I guess most large packets were coming in too fast and were too big to be captured by your capture device...

Anyway, working with what we have, here's what I think:

The TTL = 1 is uncommon, but a reason could be that the system using it wants to make sure that the packet is not routed, because that isn't exactly something you want to happen when talking to a storage array with as little latency as possible. I just checked with my own ESX servers talking to an iSCSI SAN (using the software iSCSI adapter) to see if it does the same, but it uses a TTL of 64, so it's not a default to have TTL 1.
Regarding the window size: unfortunately the trace contains no TCP session start (SYN, SYN/ACK, ACK), so we do not know if Window Scaling was negotiated. Usually when I see a window hovering around a value of 500 bytes (or in this case, staying dead center on 524 bytes), it is because a scale factor was negotiated, usually a shift count of 8, which means the advertised window size needs to be multiplied by 2^8 = 256. But one can't say for sure unless you capture a session start. Also, it doesn't look like the other node is sending back chunks of only 524 bytes; there are some larger packets as well (above 1000 bytes), even though they are rare in the trace. That rarity is probably due to packet loss on the capture device.
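The arithmetic behind that guess can be sketched in a few lines of Python. The function name `effective_window` is made up for this illustration, and the shift count of 8 is only the assumption discussed above, not something the trace confirms.

```python
def effective_window(raw_window: int, shift_count: int) -> int:
    """Return the real receive window in bytes.

    raw_window  -- the 16-bit value from the TCP header (e.g. 524)
    shift_count -- the window scale shift from the SYN (0 if not negotiated)
    """
    if not 0 <= raw_window <= 0xFFFF:
        raise ValueError("the TCP window field is 16 bits wide")
    if not 0 <= shift_count <= 14:
        raise ValueError("the window scale shift count is limited to 0..14")
    # Scaling is a left shift, i.e. a multiplication by 2**shift_count.
    return raw_window << shift_count


# ASSUMPTION: shift count 8, as guessed above; the trace has no SYN to prove it.
print(effective_window(524, 8))  # 134144 bytes if scaling was negotiated
print(effective_window(524, 0))  # 524 bytes if it was not
```

With a shift count of 8 the 524 in the header would stand for roughly 131 KiB, which would fit the observation that the peer sends segments larger than 524 bytes.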

Preliminary verdict: I doubt the TTL is a problem. Also, the window size is too constant, and the other node is sending more data than it could if the window size were not scaled. But this needs verification with a capture that has no packet loss (a.k.a. "drops"). So you probably need a capable gigabit capture solution.

answered 25 May '12, 09:12

Jasper ♦♦
23.8k●5●51●284
accept rate: 18%

Thanks for the reply. Since I investigated further, I've found some ICMP issues (long reply times), so I'm waiting for the customer to capture a trace of this. I figure I need to work at this layer first and correct the problems there. What capture filter would I use to get the SYN, SYN/ACK, ACK?

(25 May '12, 10:04) gipper

You could use a filter like "tcp[tcpflags] & tcp-syn != 0" to get all packets with a SYN flag. Those are the ones containing the Window Scaling parameters.
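Once the SYN packets are captured, Wireshark shows the scale factor in the TCP options. For illustration only, here is a minimal Python sketch of how that option could be pulled out of a raw TCP header. The function name `window_scale_from_syn` and the sample header (ports, window value) are made up for the example; the option layout (kind 3, length 3, one shift byte) is the standard Window Scale option.

```python
import struct
from typing import Optional

TCP_SYN = 0x02
OPT_END, OPT_NOP, OPT_WSCALE = 0, 1, 3


def window_scale_from_syn(tcp_header: bytes) -> Optional[int]:
    """Return the window scale shift count of a SYN segment, or None."""
    if len(tcp_header) < 20:
        return None
    offset_flags = struct.unpack("!H", tcp_header[12:14])[0]
    header_len = (offset_flags >> 12) * 4      # data offset is in 32-bit words
    if not offset_flags & TCP_SYN:
        return None                            # the option is only valid on SYNs
    options = tcp_header[20:header_len]
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == OPT_END:
            break
        if kind == OPT_NOP:                    # single-byte padding option
            i += 1
            continue
        if i + 1 >= len(options) or options[i + 1] < 2:
            break                              # malformed option, stop parsing
        length = options[i + 1]
        if kind == OPT_WSCALE and length == 3 and i + 2 < len(options):
            return options[i + 2]
        i += length
    return None


# Hypothetical SYN: port 49152 -> 3260, data offset 6 words, SYN flag set,
# followed by the options NOP + Window Scale (shift count 8).
syn = struct.pack("!HHIIHHHH", 49152, 3260, 0, 0, 0x6002, 32768, 0, 0)
syn += bytes([1, 3, 3, 8])
print(window_scale_from_syn(syn))  # 8
```

If the function returns None for both the SYN and the SYN/ACK, scaling was not negotiated and the window values in the trace are to be taken literally.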

(26 May '12, 06:05) Jasper ♦♦

I also believe the trace is problematic, but for other reasons:

If you select tcp.stream eq 0 you see a lot of ACK packets for "unseen" data. Possible causes are:
- the data is coming in too fast and your capture device is dropping it (as @Jasper said).
- your capture setup might be faulty, e.g. capturing only on one physical interface of an aggregated link (LACP / "Adapter Teaming" on your ESX server). How did you capture the data?

If you look at tcp.stream eq 2 and tcp.stream eq 3 there is virtually unidirectional TCP communication for 18 seconds. That's traffic from one src to one dst, with no data the other way round. These packets are mostly ACKs, sometimes SCSI "commands". Where is the data that has been ACKed?
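To make the "ACK for unseen data" idea concrete, here is a rough Python sketch of the check. It is a simplification and not Wireshark's actual algorithm: it handles a single stream, ignores sequence number wraparound and the SYN/FIN flags, and the `Segment` type, the function name and the sample values are all invented for the illustration.

```python
from typing import Dict, List, NamedTuple


class Segment(NamedTuple):
    src: str     # sender label
    dst: str     # receiver label
    seq: int     # sequence number of the first payload byte
    length: int  # payload length in bytes
    ack: int     # acknowledgment number


def acks_for_unseen_data(segments: List[Segment]) -> List[int]:
    """Return indexes of segments that ACK bytes the capture never saw."""
    highest_seen: Dict[str, int] = {}  # sender -> highest seq + length captured
    flagged: List[int] = []
    for index, seg in enumerate(segments):
        peer_end = highest_seen.get(seg.dst)
        if peer_end is not None and seg.ack > peer_end:
            flagged.append(index)
            # Move the baseline so the same gap is reported only once.
            highest_seen[seg.dst] = seg.ack
        end = seg.seq + seg.length
        highest_seen[seg.src] = max(highest_seen.get(seg.src, end), end)
    return flagged


# Made-up stream: the target ACKs 2496 although only bytes up to 1048 were captured.
trace = [
    Segment("initiator", "target", 1000, 48, 5000),
    Segment("target", "initiator", 5000, 0, 1048),
    Segment("target", "initiator", 5000, 0, 2496),
    Segment("initiator", "target", 2496, 48, 5000),
]
print(acks_for_unseen_data(trace))  # [2]
```

Many flagged segments in a stream point to missing packets in the capture rather than to a problem on the wire.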

This leads me to the conclusion that your capture setup might be faulty. Please tell us more about how you captured the data.

Unless we can trust the capture data, no assumptions can be made about the possible problems.

Regarding the TTL: Only one of your VMware servers seems to set TTL=1. Is there any difference in the version of the VMware software running on that host (10.1.10.121)?

Regards
Kurt

answered 25 May '12, 10:03

Kurt Knochner ♦
24.8k●10●39●237
accept rate: 15%

edited 25 May '12, 10:15