Hi, not sure if I am too dumb to get this right, but I am puzzled by this finding: in a 30-second trace I found quite a few packet pairs, each pair belonging to a different socket/stream (and being the only packets for that socket/stream during the whole capture), like this one:
This is not quite clear to me: packet 407 comes with a sequence number of 1. In 408, the reply means: got all data up to seq 1, expect at least seq 2 ... I can't see why that is not a perfect ACK to the preceding packet.

Another question is of course where that sequence number 1 comes from. Wouldn't the SYN-ACK and the following ACK during session establishment already carry sequence number 1? So, as the socket was apparently established before the capture started, the only explanation would be a wraparound of the sequence number. But: for all the suspicious ACK pairs where Wireshark sees ACKs for lost segments, the sequence and ACK numbers have the same values (1, 2)!

If, however, the initial sequence number is 0, the numbers in those packets would be OK, but then Wireshark should flag the first of the packets as "ACKed lost segment" (as the preceding one in that stream is indeed not captured!), shouldn't it?

Any advice welcome.
Uwe

asked 14 Dec '11, 09:48 ufalke
One Answer:
Let's start with the values of SEQ and ACK. By default Wireshark shows relative sequence numbers: it takes the sequence numbers of the SYN and SYN/ACK packets as reference and displays the difference. You can verify this by clicking on the SEQ or ACK field and looking at the 4 bytes in the hex pane; they show a different number. If Wireshark did not see the SYN, the SEQ and ACK of the first packet of the conversation are shown as 1.

Now for the "ACKed lost segment": the SEQ in frame 407 is 1, but its TCP length is 0, so the next expected sequence number is again 1. So when the ACK in frame 408 arrives with a value of 2, at least one frame must have been lost.

This can also be seen in the TCP Timestamp options. The TSER (timestamp echo reply) in frame 408 is the "echo" of the last TSV (timestamp value) seen from the other side. As the TSV in frame 407 is smaller than the TSER in frame 408, there must have been at least one other packet on the network between those two packets.

Then for the grande finale: since these two frames come right after one another in the trace file, the only logical conclusion is that the capture process was not able to capture all packets to disk during the capture.

answered 14 Dec '11, 10:16 SYN-bit ♦♦
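The reasoning in the answer above can be sketched in a few lines of Python. This is only an illustration, not Wireshark's actual analysis code: the `Segment` class, the function names, and all raw numbers are made up for the example, and the base value is chosen so that the first packet shows as 1, as described for a capture that missed the SYN.

```python
from dataclasses import dataclass

SEQ_MOD = 2 ** 32  # TCP sequence numbers and timestamps are 32-bit and wrap


@dataclass
class Segment:
    seq: int           # raw sequence number from the TCP header
    payload_len: int   # TCP payload length in bytes
    syn: bool = False
    fin: bool = False


def relative(raw: int, base: int) -> int:
    """Offset of a raw 32-bit number from a reference value, modulo 2**32."""
    return (raw - base) % SEQ_MOD


def next_expected_seq(seg: Segment) -> int:
    """Next sequence number from this sender; SYN and FIN each consume one."""
    return (seg.seq + seg.payload_len + int(seg.syn) + int(seg.fin)) % SEQ_MOD


def acks_unseen_data(last_seen: Segment, ack: int) -> bool:
    """True if the ACK covers bytes beyond what the capture contains."""
    ahead = (ack - next_expected_seq(last_seen)) % SEQ_MOD
    return 0 < ahead < SEQ_MOD // 2


def echo_is_newer(tsval_seen: int, tsecr_reply: int) -> bool:
    """True if the reply echoes a timestamp later than the last one captured."""
    diff = (tsecr_reply - tsval_seen) % SEQ_MOD
    return 0 < diff < SEQ_MOD // 2


# Hypothetical raw values standing in for frames 407 and 408.
raw_seq_407 = 3_000_000_000
base = (raw_seq_407 - 1) % SEQ_MOD        # no SYN seen: first packet shows as 1
frame_407 = Segment(seq=raw_seq_407, payload_len=0)
raw_ack_408 = raw_seq_407 + 1

print(relative(frame_407.seq, base), relative(raw_ack_408, base))
print(acks_unseen_data(frame_407, raw_ack_408))
print(echo_is_newer(tsval_seen=500_000, tsecr_reply=500_004))
```

With a zero-length segment the next expected sequence number equals the segment's own sequence number, so an ACK one higher points at a byte the capture never saw. The timestamp check is the same comparison applied to TSV and TSER.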
Thank you (knew it was my fault :-). If I got that right, it could only be that tcpdump (which I used to capture) dropped packets. If the network (switch/cables) had dropped anything, we would see retransmits (which we don't), right? The relative ACK number of 2 means that the TCP layer has received something on that socket in between. So: would it be correct to rule out switch or cable problems (that is of interest, as there is a slight suspicion towards a switch)?
Many thanks again, Uwe
(I converted your "answer" to a "comment" as that is the way this site works best, please see the FAQ)
You are right that the switch did not discard the packet between the communicating systems. However, the switch might have discarded the packet on the SPAN port. This happens, for instance, if you create a monitor session of a 100 Mbit/s full-duplex port that is both receiving and sending at more than 50 Mbit/s. The resulting load on the monitor port would then exceed 100 Mbit/s. So if your monitoring device is connected at 100 Mbit/s, you would see drops on the SPAN port of the switch.
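The arithmetic behind the SPAN port example is simply "both directions are mirrored onto one transmit direction". A minimal sketch, with a hypothetical function name and made-up traffic rates:

```python
def span_overload_mbps(rx_mbps: float, tx_mbps: float,
                       monitor_link_mbps: float) -> float:
    """Mirrored traffic (Mbit/s) that does not fit on the monitor port.

    Both directions of the monitored full-duplex port are copied onto the
    single transmit direction of the monitor port, so their rates add up.
    Returns 0.0 when everything fits.
    """
    mirrored = rx_mbps + tx_mbps
    return max(0.0, mirrored - monitor_link_mbps)


# Hypothetical load: 60 Mbit/s in each direction, 100 Mbit/s monitor port.
print(span_overload_mbps(60.0, 60.0, 100.0))
# Same monitored port, but lightly loaded.
print(span_overload_mbps(40.0, 40.0, 100.0))
```

Any positive result means the switch has to drop mirrored frames, even though the real traffic between the endpoints is untouched.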
Thanks for fixing me up :-> The capture was done on a 1GbE NIC on one of the Linux boxes involved (i.e. on an endpoint), not on the switch. So, if anything has dropped packets, it must have been tcpdump. Looking closely at the example, there are just about 48 µs between the two frames. That is about the typical TCP/IP over Ethernet round-trip latency, isn't it? I did not see the overall packet rate, as I filtered for just 4 target hosts. I might capture all traffic instead; maybe the on-the-fly filter for a set of 4 peer IP addresses causes the packet drops?
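One way to test the tcpdump suspicion: when tcpdump exits it prints a short summary including a "packets dropped by kernel" counter. A minimal sketch that parses such a summary follows; the function name is hypothetical and the sample numbers are invented, not taken from the capture discussed here.

```python
import re

# Matches the counter lines tcpdump prints on exit.
SUMMARY_RE = re.compile(
    r"\s*(\d+) packets? (captured|received by filter|dropped by kernel)"
)


def parse_tcpdump_summary(text: str) -> dict:
    """Extract the exit counters from tcpdump's summary text."""
    counts = {}
    for line in text.splitlines():
        match = SUMMARY_RE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


# Invented sample summary, for illustration only.
sample = """\
1234 packets captured
1250 packets received by filter
16 packets dropped by kernel
"""

counts = parse_tcpdump_summary(sample)
print(counts)
if counts.get("dropped by kernel", 0) > 0:
    print("capture is incomplete: the kernel buffer overflowed")
```

A non-zero "dropped by kernel" value would support the conclusion that the capture process, not the network, lost the missing frames.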