This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

UDP reassembly with multiple PDUs per packet

I am writing a dissector for a UDP protocol with the following (rather unfortunate) features:

  • A single PDU may (and in most cases does) span multiple packets.
  • A single header may also span multiple packets
  • A packet may also contain multiple PDUs, both complete and fragmented
  • The length of a PDU is determined by a header field, but an unknown number of bytes must be read before getting to that value, as the header is preceded by a variable-length delimiter
  • There are no sequence numbers or other ways of uniquely identifying a PDU
  • There is no flag indicating whether a PDU will be fragmented, or whether multiple PDUs will appear in a packet, other than by reading the length
  • All communications are between a single sender and receiver

In practice, this means that some assembly must be done before it is even possible to determine how much assembly will be needed to complete a PDU.

I have approached this by using the various tools in reassemble.h. However, I am getting stuck in a few places, and am looking for suggestions.

My dissector essentially looks like this (pseudocode):

dissect_proto(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree) {
    while [bytes remain in tvb from offset]
        if [pdu length is unknown]
            found = fragment_get(pinfo, 0, fragment_table);
            if [fragment was found]
                [loop through found->next and add total_length and data]
                buffer = tvb_new_real_data(data, total_length, total_length);
            else
                buffer = tvb_new_subset(tvb, offset, ...);
        bytes_available = tvb_length(buffer);
        pdu_length = get_pdu_length(buffer, &pdu_offset);
        if [pdu length is known and is smaller than bytes remaining]
            complete = TRUE;
        /* bytes_to_consume is min(pdu_length, bytes_available) */
        pinfo->fragmented = !complete;
        head = fragment_add(tvb, offset, pinfo, 0, fragment_table,
                offset, bytes_to_consume, !complete);
        next_tvb = process_reassembled_data(tvb, offset, pinfo, "Reassembled packet", head, &proto_frag_items, NULL, tree);
        offset += bytes_to_consume;
}

Mostly, I am hung up on how to add fragments properly. I would like to be able to read part of a packet, add any portion of that from a fragmented PDU into the table, and continue looping through. As I read more PDUs out of that packet, I add fragments, marked complete, for those into the table. The last PDU would likely be incomplete, and added as well.

It appears that the fragment table is expecting a single add operation per packet. Is there a correct way to keep track of fragments when there may be multiple per packet?

asked 11 Aug '11, 12:27

sweetpea


2 Answers:

From the looks of it, your protocol is screwed up. UDP is an unreliable datagram protocol, so it guarantees neither delivery nor ordering. Once an inconsistency occurs, your dissector (and any receiver, for that matter) will get out of sync.

Maybe the RTP dissector can be of help: it also runs over UDP and contains reassembly code, although I think it has the benefit of sequence numbers.

answered 12 Aug '11, 01:38

Jaap ♦

The protocol is screwed up. Sadly, that doesn't eliminate the need to dissect and troubleshoot it. The main consolation is that PDUs are generally fairly small, so any dropped or out-of-sequence packets will only affect a small part of the stream.

The RTP dissector seems to depend on conversations and sequence numbers, but it does offer some hints. Setting the partial reassembly flag is one step I was missing. I'm still stumped on multiple PDUs, though.

(15 Aug '11, 15:29) sweetpea

On the first pass through the capture, you'll see the packets in the order in which they appear in the capture file, which is the closest approximation you'll get to the time order. Reassemble them assuming they're in the right order. If they're not, what you reassemble will be what any receiver who got them in the same order would see; if that's bogus, then what the receiver reassembles will be bogus too, so Wireshark will show you that bogosity. (I.e., given the brokenness of the protocol, the brokenness of reassembly will show you the results of the brokenness of the protocol.)

(16 Aug '11, 02:56) Guy Harris ♦♦

Yes. I know I can't do better than the sequence coming into Wireshark. Originally, I implemented all of the reassembly myself, which worked on the first pass, but not on random access when looking at actual frames in Wireshark. I can understand how to parse and display PDUs based on that first pass; what's tricky is finding the correct way to store the information so that it works in random access.

(17 Aug '11, 15:42) sweetpea
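
A minimal, Wireshark-free sketch of one way to make the first-pass result survive random access: on the first visit to a frame, snapshot the running stream position (which PDU the frame's first byte belongs to, and at what offset), and on any later visit read the snapshot instead of the running state. Every name here (`frame_state_t`, `conv_state_t`, `dissect_frame`, `TOY_PDU_LEN`) is hypothetical, and the fixed 4-byte PDU is a toy stand-in for the real length logic. In an actual dissector the snapshot would more likely live in Wireshark's per-packet data (`p_add_proto_data()` / `p_get_proto_data()`), gated on the frame's visited flag; the exact signatures vary between versions, so check the headers for yours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FRAMES  64
#define TOY_PDU_LEN 4   /* toy stand-in: every PDU is exactly 4 bytes */

/* Snapshot taken the first time a frame is dissected. */
typedef struct {
    int      visited;   /* nonzero once the snapshot is valid */
    uint32_t pdu_id;    /* PDU that the frame's first byte belongs to */
    size_t   pdu_off;   /* offset of that byte within the PDU */
} frame_state_t;

typedef struct {
    frame_state_t frames[MAX_FRAMES];
    uint32_t next_id;   /* running state, advanced only on first visits */
    size_t   next_off;
} conv_state_t;

/* Returns the stream position at the start of frame_no. */
static frame_state_t dissect_frame(conv_state_t *c, unsigned frame_no,
                                   size_t payload_len)
{
    frame_state_t *f;

    assert(frame_no < MAX_FRAMES);
    f = &c->frames[frame_no];

    if (!f->visited) {
        /* First pass: record where this frame starts in the stream... */
        f->visited = 1;
        f->pdu_id  = c->next_id;
        f->pdu_off = c->next_off;

        /* ...then advance the running position past this frame's bytes. */
        c->next_off += payload_len;
        c->next_id  += (uint32_t)(c->next_off / TOY_PDU_LEN);
        c->next_off %= TOY_PDU_LEN;
    }
    /* Later visits fall through and reuse the snapshot unchanged. */
    return *f;
}
```

Because `pdu_id` changes with every PDU, it could also serve as a distinct reassembly ID in place of the constant 0, so that several PDUs touching the same packet do not share one fragment-table entry. That is a suggestion to try, not something confirmed in this thread.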

OK, so let's look at the protocol's (mis-)features:

  • A single PDU may (and in most cases does) span multiple packets.

That means you need to have some way of knowing when the PDU ends, i.e. when you're done with reassembly.

  • A single header may also span multiple packets

That means you need to have some way of knowing when the header ends.

  • A packet may also contain multiple PDUs, both complete and fragmented

Again, you need to know when the PDU ends, even if the entire PDU is within one UDP datagram.

  • The length of a PDU is determined by a header field, but an unknown number of bytes must be read before getting to that value, as the header is preceded by a variable-length delimiter

OK, that presumably means the length tells you when the PDU ends. If the delimiter is variable-length, there has to be some way of knowing when the delimiter ends; what is that?

  • There are no sequence numbers or other ways of uniquely identifying a PDU

Which, as Jaap noted, means that the receiver has to assume that the packets are delivered in order; if they're not, it won't work correctly. So if Wireshark doesn't reassemble the packets "correctly" in that case, it's actually correct in the sense that it shows you what a receiver that got the UDP packets in the same order would think it got, even if that's not what the sender intended it to see.

  • There is no flag indicating whether a PDU will be fragmented, or whether multiple PDUs will appear in a packet, other than by reading the length

That's similar to many protocols running atop TCP, so that's not inherently insoluble. You might have to implement something similar to tcp_dissect_pdus() in your dissector.
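
As a rough illustration of that idea outside Wireshark, the sketch below keeps a pending byte buffer per stream, appends each packet's payload, and peels off every PDU whose length can already be determined, leaving any incomplete tail for the next packet. This is not how `tcp_dissect_pdus()` is implemented; it only mirrors the "ask for the length, wait if there aren't enough bytes" pattern. All names (`stream_t`, `stream_feed`, `toy_len`, `count_bytes`) are hypothetical, and `toy_len` uses a one-byte length prefix purely as a stand-in for the real header parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STREAM_MAX 4096

/* Returns the total PDU size once it can be determined, or 0 if more
 * bytes are needed. Called with len >= 1. */
typedef size_t (*pdu_len_fn)(const uint8_t *buf, size_t len);

/* Called once for every complete PDU. */
typedef void (*pdu_cb)(const uint8_t *pdu, size_t len, void *user);

typedef struct {
    uint8_t pending[STREAM_MAX];  /* bytes not yet consumed by a full PDU */
    size_t  used;
} stream_t;

/* Append one packet's payload, then deliver every complete PDU.
 * Returns the number of PDUs delivered, or -1 if the buffer would overflow. */
static int stream_feed(stream_t *s, const uint8_t *data, size_t len,
                       pdu_len_fn get_len, pdu_cb cb, void *user)
{
    size_t off = 0;
    int count = 0;

    assert(s->used <= STREAM_MAX);
    if (len > STREAM_MAX - s->used)
        return -1;
    memcpy(s->pending + s->used, data, len);
    s->used += len;

    for (;;) {
        size_t avail = s->used - off;
        size_t need  = avail ? get_len(s->pending + off, avail) : 0;

        if (need == 0 || need > avail)
            break;                 /* header or body still incomplete */
        cb(s->pending + off, need, user);
        off += need;
        count++;
    }

    /* Keep only the unconsumed tail for the next packet. */
    memmove(s->pending, s->pending + off, s->used - off);
    s->used -= off;
    return count;
}

/* Toy framing for demonstration: one length byte, then that many bytes. */
static size_t toy_len(const uint8_t *buf, size_t len)
{
    (void)len;
    return 1 + (size_t)buf[0];
}

/* Example callback: adds up the sizes of the delivered PDUs. */
static void count_bytes(const uint8_t *pdu, size_t len, void *user)
{
    (void)pdu;
    *(size_t *)user += len;
}
```

Feeding the packets `{2,'a'}`, `{'b',1,'c',3,'d'}` and `{'e','f'}` in order delivers zero, two and one PDUs respectively, which covers the "finish one PDU, carry a whole one, start another" case from the question.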

answered 16 Aug '11, 03:03

Guy Harris ♦♦

edited 17 Aug '11, 16:10

Whether or not the dissector will show PDU reassembly problems experienced by the receiver also depends on where the capture is made. At the sender side all may seem nice and dandy, while at the receiver things may not be...

(16 Aug '11, 05:02) Jaap ♦

For the moment, I'm ignoring packet loss and sequence problems.

(17 Aug '11, 15:54) sweetpea

The delimiter looks like MSG:xxxxx[newline], where xxxxx is 1-5 characters. It is followed by a 2-byte PDU type and a 2-byte PDU length, then the data, then a 2-byte checksum.

Packets often come in a pattern like this:

Packet 1: MSG:xxxxx[newline]
Packet 2: [type][len]
Packet 3: [data][checksum]MSG:xxxxx[newline][type][len][data][checksum]MSG:xxxxx[newline]
Packet 4: [type][len]
Packet 5: [data][checksum]
Packet 6: MSG:xxxxx[newline]

...

(17 Aug '11, 15:57) sweetpea
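
Given that layout, a length function could look roughly like the sketch below. It returns the total PDU size as soon as the delimiter, type and length fields are all present, and otherwise reports that more bytes are needed, which is what lets a header split across packets be handled. Several details are assumptions not stated in the thread: the newline is a single `'\n'`, the length field is big-endian, and it counts only the data bytes (not the header or checksum). The function name and the result codes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PDU_NEED_MORE  0L   /* header incomplete; wait for more bytes */
#define PDU_BAD_SYNC  -1L   /* bytes cannot be the start of a PDU */

/*
 * Layout assumed: "MSG:" + 1-5 chars + '\n' + type(2) + len(2) + data + cksum(2)
 * ASSUMPTIONS: '\n' newline, big-endian length, length covers data only.
 *
 * Returns the total PDU size in bytes, PDU_NEED_MORE, or PDU_BAD_SYNC.
 */
static long pdu_total_length(const uint8_t *buf, size_t len)
{
    static const char magic[] = "MSG:";
    size_t i, n, hdr_end, data_len;

    assert(buf != NULL || len == 0);

    /* Compare as much of "MSG:" as has arrived so far. */
    n = len < 4 ? len : 4;
    if (memcmp(buf, magic, n) != 0)
        return PDU_BAD_SYNC;
    if (len < 4)
        return PDU_NEED_MORE;

    /* Look for the newline that ends the 1-5 character identifier. */
    for (i = 4; i < len; i++) {
        if (buf[i] == '\n')
            break;
        if (i - 4 >= 5)
            return PDU_BAD_SYNC;   /* sixth character without a newline */
    }
    if (i == len)
        return PDU_NEED_MORE;      /* delimiter not finished yet */
    if (i == 4)
        return PDU_BAD_SYNC;       /* empty identifier */

    /* The 2-byte type and 2-byte length follow the newline. */
    hdr_end = i + 1 + 2 + 2;
    if (len < hdr_end)
        return PDU_NEED_MORE;

    data_len = ((size_t)buf[i + 3] << 8) | buf[i + 4];
    return (long)(hdr_end + data_len + 2);   /* + 2-byte checksum */
}
```

With the packet pattern above, calling this on Packet 1 alone yields `PDU_NEED_MORE`; once Packet 2's bytes are appended it returns the full size, and the dissector then knows how many bytes of Packet 3 belong to the first PDU.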