Why is a chunked HTTP response not decoded?

Question

I captured the loading of a news website's main page with Wireshark 2.0.1.

The GET request is parsed by Wireshark, but the response is not.

Wireshark shows the following output:

8 [TCP segment of a reassembled PDU] ... (many such lines)
... 
28 [TCP Previous segment not captured] ...
29 [TCP ACKed unseen segment] ...

Then "TCP segment of a reassembled PDU" appears again and again (same pattern as above).

If I select "Follow TCP Stream", Wireshark does decode the conversation, e.g.:

GET / HTTP/1.1
User-Agent: Wget/1.13.4 (linux-gnu)
Accept: */*
Host: newsru.com
Connection: Keep-Alive

HTTP/1.1 200 OK
Server: nginx
Date: Sun, 10 Jan 2016 07:22:44 GMT
Content-Type: text/html; charset=windows-1251
Transfer-Encoding: chunked
Connection: keep-alive
Keep-Alive: timeout=25
Expires: Sun, 10 Jan 2016 07:23:14 GMT
Cache-Control: max-age=30
Serv: ng5

…

(Entire conversation is 163K)

I am under the impression that the problem is related to “Transfer-Encoding: chunked” (no Content-Length is given; the body is sent in chunks).
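For context, here is a minimal sketch of how a chunked body is framed: each chunk starts with its size in hexadecimal on a line of its own, followed by that many bytes and a CRLF, and a zero-size chunk ends the body. The function name `dechunk` and the sample bytes are illustrative and not taken from the capture, and the sketch assumes well-formed input and ignores trailers.

```python
def dechunk(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body (illustrative; ignores trailers)."""
    out = bytearray()
    pos = 0
    while True:
        # The chunk-size line ends with CRLF; the size is hexadecimal.
        eol = body.index(b"\r\n", pos)
        size_field = body[pos:eol].split(b";", 1)[0]  # drop any chunk extension
        size = int(size_field, 16)
        pos = eol + 2
        if size == 0:
            # A zero-size chunk marks the end of the body.
            return bytes(out)
        out += body[pos:pos + size]
        # Skip the chunk data and the CRLF that follows it.
        pos += size + 2


# Hypothetical sample: two chunks of 5 and 6 bytes, then the terminator.
sample = b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
decoded = dechunk(sample)
```

Under this framing, a bare hexadecimal line such as `fc4` right after the headers would announce a chunk of 0xfc4 = 4036 bytes.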

Is this a bug in Wireshark, or have I missed some setting needed to dissect chunked HTTP? I expected to find a setting that controls the maximum length of the HTML payload being dissected, but I do not see one.

The “Reassemble chunked transfer-coded bodies” option is set.

Thanks in advance!

Answer 1

Possibly because your capture is missing some frames from the server?

For example, see frames 27-29. Frame 27 sends data that sets the next expected TCP sequence number to 20380, but frame 28 sends data with sequence number 26172. That leaves 5792 bytes missing, most likely two TCP segments, given that some of the segments carry 2896 bytes of TCP data. Frame 29 shows the client ACKing sequence number 29068, i.e. all data up to and including frame 28, so the client has seen the data, but your capture did not record it.
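The arithmetic above can be checked with a short sketch. The sequence numbers come from the frames described above; the 2896-byte length of frame 28 is inferred from the ACK of 29068, and `find_gaps` is a hypothetical helper, not a Wireshark function.

```python
def find_gaps(segments, next_expected):
    """Scan (frame, seq, payload_len) tuples in capture order.

    Returns (gaps, next_expected), where gaps lists (frame, missing_bytes)
    for every segment that starts beyond the next expected sequence number.
    """
    gaps = []
    for frame, seq, length in segments:
        if seq > next_expected:
            gaps.append((frame, seq - next_expected))
        # Advance only forward, so retransmissions do not move the pointer back.
        next_expected = max(next_expected, seq + length)
    return gaps, next_expected


# After frame 27 the next expected sequence number is 20380.
# Frame 28 starts at 26172 (length inferred from the ACK in frame 29).
gaps, next_expected = find_gaps([(28, 26172, 2896)], next_expected=20380)
missing_segments = gaps[0][1] // 2896
```

Here `gaps` reports 5792 missing bytes before frame 28, which is exactly two 2896-byte segments, and `next_expected` ends at 29068, matching the ACK in frame 29.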

answered 12 Jan '16, 09:07

grahamb ♦

This looks strange. I have just repeated the experiment.

This time I captured from a different machine (in a different country) and got similar results: see the pcap file.

The commands I have used to capture it:

tshark -i eth0 -w tst1.pcap host newsru.com
wget http://www.newsru.com

How can it be that the client has seen the data but Wireshark has not?

I have also compared the index.html file I got with the response shown in the "Follow TCP Stream" window. The conversation captured by Wireshark is bigger than what wget saved, and meld does not show any missing parts.

(12 Jan '16, 10:33) Vitaly R

How can it be that the client has seen the data but Wireshark has not?

The server response is (probably) not standards-compliant. If you replace fc4 with CRLFs, Wireshark is able to parse the response.

Serv: ng5

fc4

Regards
Kurt

(12 Jan ‘16, 10:42) Kurt Knochner ♦

The second capture still seems to be missing packets: see frame 92, which has TCP sequence number 56936 where 55528 was expected. That is another 1408 bytes of payload missing.

The client then acks subsequent frames, extending its window size to accommodate the new data until frame 113 where the client can no longer increase its window size.

Then in frame 200 the server sends the missing segment.

However, things go wrong again in frame 230; the missing segment is sent in frame 272, and the server then resends a lot of the data it transmitted between frames 230 and 272. In frame 307 the client finally ACKs the highest sequence number seen and closes the connection.

So it seems that all the data is there, but due to the dropped and retransmitted frames, Wireshark’s TCP and/or HTTP dechunking reassembly has got a bit confused.
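To illustrate why the byte stream can be complete even though segments arrived late or twice, here is a minimal sketch of sequence-ordered reassembly. It is not Wireshark's algorithm; `reassemble` and the sample segments are made up for illustration, and it assumes that overlapping retransmissions carry identical bytes.

```python
def reassemble(segments, initial_seq):
    """Rebuild a contiguous byte stream from (seq, payload) pairs in any order.

    Returns the bytes from initial_seq up to the first hole that is never filled.
    """
    pending = {}  # seq -> payload
    for seq, payload in segments:
        # Keep the longest payload seen for a given starting sequence number.
        if len(payload) > len(pending.get(seq, b"")):
            pending[seq] = payload

    stream = bytearray()
    next_seq = initial_seq
    for seq in sorted(pending):
        payload = pending[seq]
        if seq > next_seq:
            break  # a hole that no segment filled
        # Drop the part that overlaps bytes we already have.
        overlap = next_seq - seq
        if overlap < len(payload):
            stream += payload[overlap:]
            next_seq = seq + len(payload)
    return bytes(stream)


# Hypothetical capture order: the second segment arrives last, the third is sent twice.
segments = [(1, b"AAAA"), (9, b"CCCC"), (9, b"CCCC"), (5, b"BBBB")]
stream = reassemble(segments, initial_seq=1)
```

The sketch sees every segment before ordering them. A dissector that walks frames in capture order has to decide what to do with a segment before the late one has shown up, which is the harder situation described above.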

(12 Jan ‘16, 11:26) grahamb ♦

@grahamb:
Possibly because your capture is missing some frames from the server?
The capture summary shows the following stats:

Interfaces
Interface:              br0
Dropped packets:        33 (3e+01 %)
Capture filter:         port 80
Link type:              Ethernet
Packet size limit:      262144 bytes

My side question is: what does this dropped-packets statistic mean?
Does it count packets that were dropped only from the capture (so we cannot see them, although they did reach the system),
or packets that were dropped at this interface (so the system has not seen them either)?

(12 Jan ‘16, 12:30) Christian_R

Seems that I found the answer to my side question: https://ask.wireshark.org/questions/2095/what-does-packets-dropped-really-mean

So if I understand it right, this could be the explanation for the missing frames in the trace file.

(12 Jan ‘16, 15:11) Christian_R

Yep. It looks like the root of the problem is in dissectors/packet-tcp.c:

desegment_tcp function:

msp = (struct tcp_multisegment_pdu *)wmem_tree_lookup32_le(tcpd->fwd->multisegment_pdus, seq-1);
…
if (msp && msp->seq <= seq && msp->nxtpdu > seq) {
    /* Continue to defragment the packet */
    …
} else {
    /* This segment was not found in our table, so it doesn't
     * contain a continuation of a higher-level PDU.
     * Call the normal subdissector.
     */
}

When one of the packets does not arrive in order, Wireshark finds the msp structure, but the test msp->nxtpdu > seq fails. As a result, a new dissection starts for the newly arrived packet, and the whole reassembly logic breaks. The missing packet arrives later, but by then it is too late to process it.

Failed part:

desegment_tcp: 3, msp = 0x7f7701617ea0
desegment_tcp: msp->seq: 1, seq: 56936, msp->nxtpdu: 55529
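
Plugging the logged values into the condition shows which clause fails. This is a Python transcription of the quoted `if` test only, with a hypothetical `Msp` tuple standing in for `struct tcp_multisegment_pdu`; it does not reproduce the rest of `desegment_tcp`, and the field comments are my reading of the names.

```python
from typing import NamedTuple, Optional


class Msp(NamedTuple):
    """Stand-in for the two fields of tcp_multisegment_pdu used in the test."""
    seq: int     # assumed: sequence number where the multisegment PDU starts
    nxtpdu: int  # assumed: sequence number where the next PDU should start


def continues_pdu(msp: Optional[Msp], seq: int) -> bool:
    """Mirror of: msp && msp->seq <= seq && msp->nxtpdu > seq."""
    return msp is not None and msp.seq <= seq and msp.nxtpdu > seq


# Values from the debug output above.
msp = Msp(seq=1, nxtpdu=55529)
after_gap = continues_pdu(msp, seq=56936)  # the segment that follows the gap
in_range = continues_pdu(msp, seq=55528)   # hypothetical segment inside the PDU
```

With seq = 56936 the first two clauses hold and only `msp->nxtpdu > seq` is false (55529 is not greater than 56936), so the segment goes down the `else` branch and is handed to the normal subdissector.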

(13 Jan ‘16, 04:34) Vitaly R
