This is our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

TLR/DNR - use of magic numbers in capture files - what are they

I was trying to use the file command on various sets of hardware to fully automate data analysis of network performance on Wireshark capture files on Cygwin, CentOS, and Solaris. However, I noticed that one of the Solaris 11.3 (Intel/x86-64) VMs I just built thinks my capture files are just 'data'.

I took a step back and used the capinfos command to determine the file type. However, when looking at the output of "od -x <file>", I noticed that the byte ordering was flipped on a pcapng file.

I had: 0d0a 0a0d 0078

I expected to see: 0a0d 0d0a 7800

The capture file was generated in a CentOS VM hosted on a standard Intel/Windows 7 platform.

So what I would like is a good place to find the specific magic numbers used in the standard set of file formats that Wireshark can understand. I'd rather not go the brute-force route of running Wireshark and capturing a bunch of files. This is more of a scratch-the-itch sort of question, plus some OCD about being able to determine what a file is without having access to 'capinfos'.

Part of me is wondering whether it would be different on big- and little-endian hardware. Google says Solaris will be big- or little-endian depending on whether it runs on SPARC or Intel.

asked 15 Apr '16, 09:20

Brad M


However, I noticed that one of the Solaris 11.3 (Intel/x86-64) VMs I just built thinks my capture files are just 'data'.

By which you probably mean that the file command, or some other code using the library that the Ian Darwin file command uses or that uses the same "magic" file, thinks your capture files are just "data".

byte ordering was flipped on a pcapng file ... Part of me is wondering whether it would be different on big- and little-endian hardware.

Yes. Both pcap and pcapng files are normally written in the byte order of the host that wrote the file. The first 4 bytes of a pcap file are:

  • a1 b2 c3 d4 if the file was written on a big-endian machine and has microsecond-resolution time stamps;
  • d4 c3 b2 a1 if the file was written on a little-endian machine and has microsecond-resolution time stamps;
  • a1 b2 3c 4d if the file was written on a big-endian machine and has nanosecond-resolution time stamps;
  • 4d 3c b2 a1 if the file was written on a little-endian machine and has nanosecond-resolution time stamps.
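As a rough illustration, the four cases above can be turned into a small lookup table. This is only a sketch: the names `PCAP_MAGICS` and `classify_pcap_magic` are made up for this example and are not part of Wireshark or libpcap.

```python
# Hypothetical helper; names and return values are illustrative only.
# Keys are the first 4 bytes of the file exactly as they appear on disk.
PCAP_MAGICS = {
    bytes.fromhex("a1b2c3d4"): ("big", "microsecond"),
    bytes.fromhex("d4c3b2a1"): ("little", "microsecond"),
    bytes.fromhex("a1b23c4d"): ("big", "nanosecond"),
    bytes.fromhex("4d3cb2a1"): ("little", "nanosecond"),
}


def classify_pcap_magic(header: bytes):
    """Return (byte_order, timestamp_resolution) for a pcap magic number.

    `header` is the start of the file, read in binary mode.
    Returns None if the first 4 bytes are not one of the pcap magics.
    """
    return PCAP_MAGICS.get(bytes(header[:4]))
```

To use it, read at least the first 4 bytes of the file in binary mode (`open(path, "rb").read(4)`) and pass them in. Comparing raw bytes, rather than unpacking an integer, avoids any dependence on the byte order of the machine doing the check.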

The first four bytes of a pcapng file are always 0a 0d 0d 0a, whether the file was written by a big-endian or little-endian machine. If your hex dumper is dumping a series of 2-byte words, rather than individual bytes, they would appear as 0d0a 0a0d if you're dumping on a little-endian machine or 0a0d 0d0a if you're dumping on a big-endian machine, but that's a consequence of the way your hex dumper works rather than anything inherent in the file format.
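The word-dumping effect can be reproduced without od. The sketch below (the helper name `as_words` is invented for this example) formats the same four on-disk bytes as 2-byte words under each byte order:

```python
import struct

# The first four bytes of a pcapng file, exactly as stored on disk.
first_four = bytes.fromhex("0a0d0d0a")


def as_words(data: bytes, byte_order: str) -> str:
    """Format data as 2-byte hex words, the way a word-oriented hex dumper
    would show it on a host with the given byte order.

    byte_order is a struct prefix: '<' for little-endian, '>' for big-endian.
    A trailing odd byte, if any, is ignored in this sketch.
    """
    count = len(data) // 2
    words = struct.unpack(f"{byte_order}{count}H", data[: count * 2])
    return " ".join(f"{word:04x}" for word in words)


little_view = as_words(first_four, "<")  # "0d0a 0a0d"
big_view = as_words(first_four, ">")     # "0a0d 0d0a"
```

The bytes in the file are identical in both cases; only the grouping into words changes. Dumping one byte at a time (for example with `od -t x1`) sidesteps the question.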

However, non-pcapng files could also begin with those bytes, so they are not, by themselves, a "magic number" sufficient to identify pcapng files. A valid pcapng file will also have, in the 4 bytes starting at an offset of 8 from the beginning, either:

  • 1a 2b 3c 4d if the file was written on a big-endian machine;
  • 4d 3c 2b 1a if the file was written on a little-endian machine;

so, if you're trying to identify pcapng files, you need to look at both the first 4 bytes and the 4 bytes starting at an offset of 8 from the beginning, and you'll need to check for both the big-endian and little-endian versions, just as you have to do with the magic number for pcap files.
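A minimal sketch of that two-part check follows. The function and constant names are hypothetical, and the check only looks at the 12 bytes described above; it does not validate the rest of the file.

```python
# Hypothetical names; this only checks the two fields discussed above.
PCAPNG_BLOCK_TYPE = bytes.fromhex("0a0d0d0a")  # same on disk for both byte orders
PCAPNG_BYTE_ORDER_MAGICS = {
    bytes.fromhex("1a2b3c4d"): "big",
    bytes.fromhex("4d3c2b1a"): "little",
}


def pcapng_byte_order(header: bytes):
    """Return 'big' or 'little' if header looks like the start of a pcapng
    file, or None otherwise.

    `header` must hold at least the first 12 bytes of the file.
    """
    if len(header) < 12 or header[:4] != PCAPNG_BLOCK_TYPE:
        return None
    # Bytes 4..7 are skipped here; bytes 8..11 carry the byte-order magic.
    return PCAPNG_BYTE_ORDER_MAGICS.get(bytes(header[8:12]))
```

For example, a file whose first 12 bytes are `0a 0d 0d 0a 78 00 00 00 4d 3c 2b 1a` would be reported as little-endian, while a file that merely begins with `0a 0d 0d 0a` but has something else at offset 8 would return None.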

Newer versions of the Ian Darwin file command's magic file can identify pcapng files as well as pcap files.

answered 15 Apr '16, 19:01

Guy Harris ♦♦

Two answers and both are very useful.

I looked at the magic file and the od -x output of the PCAP file to figure out what is what, mostly because I figured I could just update the out-of-date Solaris 11.3 magic file so that the file command would understand that what I am looking at is indeed a PCAP (or derivative) file and go happily on its way. I should have just dumped the bytes and built my own data structure, rather than trying to use the default output of od.

Those four examples should provide some sort of 'clue' to help determine what I am looking at. So far I will only be looking at 'little-endian' files, but I figured that if someone got a wild hair, adding support for big-endian would be useful. I was very surprised that I was able to pkg fetch a pre-built version of Wireshark for Solaris 11.3/Intel.

My scripts will be using the capinfos command and only fall back to the file command where needed. I might end up making a more useful universal standalone magic file for use on systems that don't have Wireshark installed.

While 1a 2b 3c 4d and 4d 3c 2b 1a don't provide any new information that can't be figured out by looking at the first set of numbers, they will be useful to prevent 'snipe' hunts or false positives on PCAP style files.

Thanks both of you for some very good information.

(18 Apr '16, 10:06) Brad M

While 1a 2b 3c 4d and 4d 3c 2b 1a don't provide any new information that can't be figured out by looking at the first set of numbers

Yes, they do:

  • the fact that they prevent false positives on pcapng files means that they do provide new information (that lets you distinguish pcapng files from non-pcapng files that happen to begin with CR LF LF CR, unlikely though that might be);
  • they tell you the byte order of the pcapng file.
(18 Apr '16, 10:22) Guy Harris ♦♦

In short, yes, the PCAP (and I believe also the PCAPNG) file formats differ on big- and little-endian machines. You're right that endianness depends on the CPU, not (just) the OS. See https://wiki.wireshark.org/Development/LibpcapFileFormat for a good explanation of the PCAP file format (pay attention to the section about the magic number in the global header).

For more details on other magic numbers (for other file types) you'd have to look in the wiretap library's source code. You may find looking in the wireshark-mime-package.xml file (in the top directory of the source code) somewhat more convenient; I created that a while ago based on what was in the wiretap library. It's used by freedesktop.org-compliant desktops to show you if a given file is of a particular type. It has file extensions (so the GNOME file browser will tell you that a .pcapng file can be opened by Wireshark) but also magic numbers (so a PCAPNG file named "MyCapture" will also show up as being a Wireshark data file). (Hmmm, it may not be up to date.)

Oh, and in my experience the /etc/magic file on Solaris was always woefully out of date (this is the file that the file command uses to tell you the file type based on the magic number in the file).

answered 15 Apr '16, 09:57

JeffMorriss ♦

question was seen: 4,877 times

last updated: 19 Apr '16, 01:57
