Are there any plans to make the pcap-ng format friendlier to Hadoop and MapReduce, meaning being able to "split" a single file so that each split falls on a record boundary? I believe this can be accomplished with pcap-ng, and I am just wondering whether it is in the plans.
asked 08 Mar '12, 04:47 jdbethge edited 08 Mar '12, 14:32 Guy Harris ♦♦
2 Answers:
The inherent sequential nature of network capture analysis makes it unfit for such processing.
answered 08 Mar '12, 05:33 Jaap ♦
There are no plans to change the pcap format, as there's not much you can do to change it: you can't add new fields to the file header or the packet header, as that would break all code that reads pcap files. A new file format would need a new magic number, but that amounts to a different file format. pcap-NG is extensible, so there's more that could be done but, as Jaap notes, there are limits to the amount of parallelism possible when processing a capture file, because the proper interpretation of a packet may depend on the contents of previous packets.
answered 08 Mar '12, 11:06 Guy Harris ♦♦

I am not talking about changing pcap (that would be crazy), but about adding a capability to pcapng. (08 Mar '12, 12:13) jdbethge

OK, so I edited the question to clarify that. If the capability in question can be provided by adding a new block type to pcap-NG, it can probably be done; if so, indicate what sort of block that would be. (The existing pcap-ng format is less likely to be changed in an incompatible fashion, as there's already code out there that reads and writes it; that would involve a new major version number.) (08 Mar '12, 14:34) Guy Harris ♦♦
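For anyone wondering what "splitting" would have to work with: every pcap-NG block starts with a 32-bit block type and a 32-bit total length, and ends with the same total length repeated. The rough sketch below walks those lengths to list block boundaries. The function name `block_offsets` is made up for illustration, and the code assumes the whole file is in memory and contains a single section (one Section Header Block, one byte order). Note that it has to start at the beginning of the section; a reader dropped at an arbitrary byte offset, as with a Hadoop input split, cannot use this approach as-is.

```python
import struct

SHB_TYPE = 0x0A0D0D0A          # Section Header Block type
BYTE_ORDER_MAGIC = 0x1A2B3C4D  # stored at offset 8 of the Section Header Block


def block_offsets(data: bytes):
    """Yield (offset, block_type, total_length) for each block.

    Hypothetical helper: assumes `data` holds one complete pcap-NG
    section that starts at offset 0.
    """
    if len(data) < 12:
        raise ValueError("too short to hold a pcap-NG block")

    # The byte-order magic tells us how every other field is encoded.
    if struct.unpack_from("<I", data, 8)[0] == BYTE_ORDER_MAGIC:
        endian = "<"
    elif struct.unpack_from(">I", data, 8)[0] == BYTE_ORDER_MAGIC:
        endian = ">"
    else:
        raise ValueError("byte-order magic not found")

    if struct.unpack_from(endian + "I", data, 0)[0] != SHB_TYPE:
        raise ValueError("data does not start with a Section Header Block")

    offset = 0
    while offset + 12 <= len(data):
        block_type, total_len = struct.unpack_from(endian + "II", data, offset)
        # Blocks are at least 12 bytes and padded to a 32-bit boundary.
        if total_len < 12 or total_len % 4 or offset + total_len > len(data):
            raise ValueError(f"bad block length at offset {offset}")
        # The trailing copy of the length must match the leading one.
        trailer = struct.unpack_from(endian + "I", data, offset + total_len - 4)[0]
        if trailer != total_len:
            raise ValueError(f"length mismatch at offset {offset}")
        yield offset, block_type, total_len
        offset += total_len
```

Something like `list(block_offsets(open(path, "rb").read()))` would give the candidate cut points, but knowing where a block starts does not remove the dependency on earlier blocks (interface descriptions, or previous packets) that Jaap and Guy mention.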
I would disagree, as there seems to be a lot of work currently being done on analyzing pcap files using Hadoop MapReduce. See https://labs.ripe.net/Members/wnagele/large-scale-pcap-data-analysis-using-apache-hadoop