I'm examining results from tcpdump using Wireshark/tshark, and I'm seeing many packets with the info "Continuation or non-HTTP traffic" and many other packets with the info "[TCP segment of a reassembled PDU]". I'm curious what the difference between the two is.

The trace comes from a simulation of client-server interaction using HTTP streaming. Each client initiates an HTTP connection (using GET) and the server proceeds to send back chunked data indefinitely. The size of the content is therefore unknown and cannot be provided in the header.

I'm quite confused because when I compare a "Continuation" packet with a "TCP segment" packet, they look nearly identical (the differences being minor details such as the timestamp). Can anyone shed some light on these two concepts for me?

Thank you,
Eric

Edit: Here is one of my captures. In this particular trace, it looks like the switchover from "reassembled PDU" to "HTTP continuation" starts at number 6054/6055. Note that there are quite a lot of duplicate messages where the difference is just in the port; this is because it is simulating many clients (500 in this one, I believe).

asked 07 Nov '11, 18:31 eHalcyon edited 10 Nov '11, 15:43
2 Answers:
This is merely a result of the TCP protocol preferences, which give you two different views of the same type of data. If you go to your Wireshark Preferences and select the TCP protocol settings, you'll see an option called "Allow subdissector to reassemble TCP streams". Depending on whether it is checked or unchecked, you get either "reassembled PDU" or "continuation" messages in the Info column.

What this setting does is allow Wireshark to look for and combine packets that contain pieces of the same payload and reconstruct it for you (otherwise you'd have to export them one by one and assemble them yourself). This is mostly needed for payload reconstruction. If you're more interested in packet timings etc., it is usually better to disable reassembly to get a clearer view of what happened when.

answered 07 Nov '11, 23:59 Jasper ♦♦
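For anyone who wants to compare the two views from the command line rather than through the Preferences dialog, here is a minimal sketch that only builds the tshark command lines. It assumes `tcp.desegment_tcp_streams` is the preference key behind the checkbox described above (check the preference list in your own version), and `trace.pcap` is just a placeholder file name. `tshark_args` is a hypothetical helper, not part of Wireshark.

```python
def tshark_args(capture_path, reassemble_tcp):
    """Build a tshark argument list that reads a capture with TCP reassembly on or off."""
    value = "TRUE" if reassemble_tcp else "FALSE"
    return [
        "tshark",
        "-r", capture_path,  # read packets from a saved capture file
        # -o overrides a preference for this run only; the key name is an
        # assumption based on the TCP reassembly checkbox described above
        "-o", f"tcp.desegment_tcp_streams:{value}",
    ]


if __name__ == "__main__":
    # "trace.pcap" is a placeholder; substitute your own capture file
    for flag in (True, False):
        print(" ".join(tshark_args("trace.pcap", flag)))
```

Running the two printed commands against the same capture and comparing their output should show which packets change their Info text when the option is toggled.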
In fact there are at least three different issues with reassembly where chunked HTTP transfer encoding is concerned, and you must check your preferences very carefully, especially if you are dealing with an 'endless' server connection that keeps sending chunks of messages.

First, an application-level protocol message, such as an HTTP request, may or may not fit in a single TCP segment. If the HTTP header is big enough to be split across segments (that's rare, but it happens if a site sends lots of cookies and optional X-headers), then you will see two or more packets in the Wireshark capture, period. The same can happen to HTTP response headers, and it mostly does happen to HTTP request/response bodies. Sometimes applications simply send the HTTP headers in one TCP segment and the HTTP body in the next. But please note that those segments have nothing in common with chunks when chunked Transfer-Encoding is used, because that encoding lives at the application level while TCP is the transport level of the OSI model. So even a single "chunk" can span multiple segments.

But that's not the whole story. A single TCP segment usually fits in one Ethernet frame (PDU), but it can be split as well. Most of the time this does not happen, but on some badly configured Windows machines the maximum TCP segment size is bigger than what typical Ethernet switches can handle.

To add more fun, transport-level packets must be ACKed by the endpoint; sometimes the ACK is set within the next TCP data packet, and sometimes it is sent separately, while still on the same HTTP port. So if you try to analyse web application traffic at the TCP level, you'll get loads of useless sh#t most of the time. That's why you should use filters.

To help upper-level protocols collect and filter information, the Wireshark dissectors have a notion of 'reassembling': a higher-level dissector returns a special code meaning 'hey, I need more data to properly dissect this packet', and processing is restarted when more data arrives.
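To make the 'I need more data' idea concrete, here is a rough Python sketch of a parser that is fed TCP payloads one at a time and only produces a chunk once all of its bytes have arrived. It is an illustration of the concept under simplifying assumptions (in-order segments, no trailers, no error handling), not how Wireshark's HTTP dissector is actually written; `try_parse_chunk`, `feed_segments` and the label strings are made up for this example.

```python
NEED_MORE = None  # sentinel meaning "I need more data to dissect this"


def try_parse_chunk(buf):
    """Try to parse one HTTP chunk from the start of buf.

    Returns (payload, bytes_consumed), or NEED_MORE if buf does not
    yet contain a complete chunk.
    """
    eol = buf.find(b"\r\n")
    if eol < 0:
        return NEED_MORE  # the chunk-size line itself is incomplete
    # The size is hexadecimal; anything after ';' is a chunk extension.
    size = int(bytes(buf[:eol]).split(b";")[0], 16)
    end = eol + 2 + size + 2  # size line + payload + trailing CRLF
    if len(buf) < end:
        return NEED_MORE
    return bytes(buf[eol + 2:eol + 2 + size]), end


def feed_segments(segments):
    """Feed TCP payloads in order; return a label per segment and the chunks found."""
    buf = bytearray()
    labels, chunks = [], []
    for seg in segments:
        buf += seg
        completed = 0
        while True:
            parsed = try_parse_chunk(buf)
            if parsed is NEED_MORE:
                break  # wait for the next segment
            payload, used = parsed
            chunks.append(payload)
            del buf[:used]
            completed += 1
        labels.append("chunk complete" if completed else "segment of a reassembled PDU")
    return labels, chunks


if __name__ == "__main__":
    # One 11-byte chunk ("b" in hex) split across three TCP segments.
    labels, chunks = feed_segments([b"b\r\nhel", b"lo wo", b"rld\r\n"])
    print(labels)
    print(chunks)
```

The first two segments cannot be dissected on their own, so they are only labelled as pieces of something larger; the chunk shows up with the third segment. A server that streams forever never sends the zero-size chunk, so a reassembler waiting for the end of the whole body would never finish.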
If you turn off ALL reassembling options for the TCP and HTTP (and SSL) protocols, you'll see the naked packets as they are on the wire. You'll notice the 'Continuation or non-HTTP traffic' message in the Info column when a packet carries data but neither an HTTP request nor an HTTP response header is found within it. All packets without data will be tagged as plain TCP in the Protocol column. Mostly those are ACKs, SYNs and FINs, so you can filter them out.

If you allow TCP to reassemble streams but leave the other options unchecked, the picture won't change much, because upper-level protocols won't request reassembly.

If you allow HTTP to request reassembly of headers spanning multiple segments and of bodies, then you can already filter by application-protocol means. E.g. enter 'http' in the Display filter and you can forget about all the [reassembled PDU] infos - they will all be marked as 'TCP' in the Protocol column.

Now the tricky part - reassembling application-level chunks. If you analyse a protocol that depends on sending data in chunks, e.g. AJAX chat over HTTP, I'd suggest leaving that option unchecked, because reassembly only finishes when you receive the chunk with size '0', which in your case you never would. However, if your application encodes HTTP bodies with gzip and uses chunked encoding just to stream them, you'd better check the chunk reassembly option, otherwise ungzipping will fail.

That was quite a lot of text, but I hope everything is clear now. Also, if you want more advanced filtering options for HTTP responses, you may find it useful to install the following Lua script: Associating HTTP responses to requests in Wireshark. Should you have any questions about it, feel free to ask.

answered 08 Nov '11, 22:57 ShomeaX

All very interesting! However, I am still confused as to why I am seeing both "reassembled PDU" and "HTTP continuation" messages in the same trace when the "Allow subdissector to reassemble TCP streams" option is checked.
Any ideas? (09 Nov '11, 12:00) eHalcyon

Does this by any chance happen after a lost packet? It might also be a bug in the HTTP dissector, but it's hard to tell without the trace file... (09 Nov '11, 14:51) SYN-bit ♦♦

+1 for posting the trace file, but considering the context, the former might be parts of responses with Content-Length set, and the latter just the responses with Transfer-Encoding set to chunked, in case you've unchecked "reassemble chunked-transfer bodies" in the HTTP settings. (09 Nov '11, 16:25) ShomeaX

No, packets do not seem to be lost. Well, I have a few "TCP ACKed lost segment" messages, but there are no retransmissions, so I think that means tcpdump just missed the segment but caught the ACK. But those messages don't come up near where it starts switching from "reassembled PDU" to "continuation". Apologies for my ignorance, but is there a way for me to attach a file here or should I just copy+paste text? (09 Nov '11, 16:54) eHalcyon

Oh, yeah, http://cloudshark.org looks more useful for uploading captures. =) (09 Nov '11, 23:39) ShomeaX

http://cloudshark.org/captures/48a0ca8e0d35 In this particular trace, it looks like the switchover from "reassembled PDU" to "HTTP continuation" starts at number 6054/6055. Note that there are quite a lot of duplicate messages where the difference is just in the port; this is because it is simulating many clients (500 in this one, I believe). (10 Nov '11, 13:14) eHalcyon

From what I see, the actual HTTP response packet is missing from the TCP stream that frame 6055 is part of (ip.addr==192.168.0.169 and ip.addr==192.168.0.170 and tcp.port==60078). Could this be what is missing for reassembly to get started? For example, if I select "ignore packet" on a 200 OK in my Wireshark 1.6.0, I see exactly the same thing: reassembly stops working after a packet loss, which seems pretty obvious to me. (11 Nov '11, 03:56) Landi

Landi, I'm not sure what you mean.
In frame 6055's TCP stream (204), I do see that a packet was lost in the capture (there is an ACK for a lost segment). Is this what you're referring to? If I examine the TCP stream for frame 6054 (459), that one switches from "reassembled PDU" to "Continuation" on frame 15091, well before any lost packets. (14 Nov '11, 12:17) eHalcyon

@eHalcyon: That's not what I see in your capture. tcp.stream == 459 gives me perfect "... Reassembled PDU" statements, including the frame 15091 you mentioned. However, later, in frames 22861 and 23411, there are ACKed lost segments again, and THEN reassembly - as expected - drops out. http://www.imgbox.de/users/public/images/KVIqC7mj5D.png (16 Nov '11, 03:05) Landi

@Landi - huh, when I view it on CloudShark I see what you see. But viewing the same stream in Wireshark, I see this: (16 Nov '11, 12:06) eHalcyon

Maybe a version issue? I don't know which Wireshark version is running behind CloudShark, but maybe you could try the latest version, 1.6.3, and compare with one or two others running as portable apps? (18 Nov '11, 05:31) Landi
Thanks for the response! I tried unchecking the option, which left only "Continuation" messages. However, I am still curious as to why BOTH messages appear when the option is set. That is, when I check "Allow subdissector to reassemble TCP streams", it starts off with "reassembled PDU" but still eventually ends up with "continuation" messages.