This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

making pcap-ng hadoop friendly

Are there any plans to make the pcap-ng format more friendly to Hadoop and MapReduce? By that I mean being able to "split" a single file cleanly at the cut points. I believe this can be accomplished with pcap-ng, and I am just wondering whether it is in the plans.

asked 08 Mar '12, 04:47

jdbethge

edited 08 Mar '12, 14:32

Guy Harris ♦♦

2 Answers:

The inherent sequential nature of network capture analysis makes it unfit for such processing.

answered 08 Mar '12, 05:33

Jaap ♦

I would disagree, as there seems to be a lot of work currently going into analyzing pcap files using Hadoop MapReduce. See https://labs.ripe.net/Members/wnagele/large-scale-pcap-data-analysis-using-apache-hadoop

(08 Mar '12, 12:11) jdbethge

There are no plans to change the pcap format, as there's not much you can do to change it - you can't add new fields to the file header or the packet header, as that would break all code that reads pcap files. A new file format would need a new magic number, but that amounts to a different file format.
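To illustrate why there is no room for new fields, here is a minimal Python sketch that reads the classic pcap file header (24 bytes) and per-packet header (16 bytes) as fixed-size structures. The helper names `parse_file_header` and `parse_packet_header` are made up for this example, and only the microsecond-resolution magic number is handled.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap magic number (microsecond timestamps)

# Fields after the magic number: version_major, version_minor, thiszone,
# sigfigs, snaplen, linktype. With the 4-byte magic this is exactly 24 bytes.
FILE_HEADER_BODY = "HHiIII"

# Per-packet header: ts_sec, ts_usec, incl_len, orig_len. Exactly 16 bytes.
PACKET_HEADER = "IIII"


def parse_file_header(data):
    """Return (byte-order prefix, header fields) for a pcap file header."""
    if len(data) < 24:
        raise ValueError("file header is 24 bytes; input is too short")
    # The writer's byte order is only discoverable from the magic number.
    for endian in ("<", ">"):
        (magic,) = struct.unpack_from(endian + "I", data, 0)
        if magic == PCAP_MAGIC:
            fields = struct.unpack_from(endian + FILE_HEADER_BODY, data, 4)
            return endian, fields
    raise ValueError("not a classic pcap file (unrecognized magic number)")


def parse_packet_header(data, offset, endian):
    """Return (ts_sec, ts_usec, incl_len, orig_len) at the given offset."""
    return struct.unpack_from(endian + PACKET_HEADER, data, offset)
```

Every reader hard-codes these two sizes, so an extra field would shift everything that follows it; neither header carries a length or type field that would let old code skip what it does not understand.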

pcap-NG is extensible, so there's more that could be done, but, as Jaap notes, there are limits to the amount of parallelism possible when processing a capture file - the proper interpretation of a packet may depend on the contents of previous packets.

answered 08 Mar '12, 11:06

Guy Harris ♦♦

I am not talking about changing pcap (that would be crazy), but adding capability to pcapng.

(08 Mar '12, 12:13) jdbethge

OK, so I edited the question to clarify that.

If the capability in question can be provided by adding a new block type to pcap-NG, it can probably be done; if so, indicate what sort of block that would be.

(The existing pcap-ng format is less likely to be changed in an incompatible fashion, as there's already code out there that reads and writes it; that would involve a new major version number.)
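As a rough illustration of what a split-friendly index might be built from, here is a minimal Python sketch that walks the blocks of a pcap-ng file held in memory and records where each block starts. It relies only on the general block layout (32-bit block type, 32-bit total length, body, trailing copy of the total length) and on the byte-order magic in the Section Header Block. The function name `block_offsets` is made up for this example, and the idea of storing such offsets in a new block type is an assumption, not something the format defines.

```python
import struct

SHB_TYPE = 0x0A0D0D0A          # Section Header Block (same in either byte order)
BYTE_ORDER_MAGIC = 0x1A2B3C4D  # first field of the SHB body


def block_offsets(data):
    """Return [(offset, block_type, total_length), ...] for each block."""
    offsets = []
    pos = 0
    endian = "<"  # replaced as soon as the first SHB is seen
    while pos + 12 <= len(data):
        (block_type,) = struct.unpack_from(endian + "I", data, pos)
        if block_type == SHB_TYPE:
            # Each section declares its own byte order via the magic value.
            (magic,) = struct.unpack_from("<I", data, pos + 8)
            if magic == BYTE_ORDER_MAGIC:
                endian = "<"
            elif struct.unpack_from(">I", data, pos + 8)[0] == BYTE_ORDER_MAGIC:
                endian = ">"
            else:
                raise ValueError("bad byte-order magic at offset %d" % pos)
        (total_len,) = struct.unpack_from(endian + "I", data, pos + 4)
        if total_len < 12 or total_len % 4 or pos + total_len > len(data):
            raise ValueError("bad block length at offset %d" % pos)
        # The length is repeated at the end of the block.
        (trailer,) = struct.unpack_from(endian + "I", data, pos + total_len - 4)
        if trailer != total_len:
            raise ValueError("length mismatch at offset %d" % pos)
        offsets.append((pos, block_type, total_len))
        pos += total_len
    if pos != len(data):
        raise ValueError("trailing bytes after last block")
    return offsets
```

Note that block offsets alone would not make a split self-contained: a worker starting in the middle of a file would still need that section's Section Header Block and Interface Description Blocks to interpret the packet blocks, and, as noted above, the meaning of a packet may depend on earlier packets.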

(08 Mar '12, 14:34) Guy Harris ♦♦