How to filter merged pcaps for dupes? editcap produces an unexpected result.

The pcaps were merged with the default behaviour, "Packets from the input files are merged in chronological order based on each frame's timestamp". The documentation also says "Mergecap assumes that frames within a single capture file are already stored in chronological order", which may be part of the problem; I'm just not sure.

Two pcaps were captured simultaneously using airodump on two separate wifi cards plugged into the same laptop, separated by several meters. This was done to minimize packet loss, which always seems to occur for me.

I used mergecap to merge them.

Now I'm attempting to remove the duplicate packets that they both have in common.

But I potentially have a problem here: different editcap -D values produce different dupe counts. For example, the difference between -D 5 and -D 10 was over 10% (I'm hesitant to commit to a figure from memory; it could have been over 20%), but between -D 15 and -D 20 it was only ~1%.
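
For reference, the workflow was roughly the following; the file names here are just placeholders:

mergecap -w merged.pcap card1.pcap card2.pcap      # merge the two airodump captures
editcap -D 5  merged.pcap dedup5.pcap              # dedupe with the default window
editcap -D 10 merged.pcap dedup10.pcap             # dedupe with a larger window
capinfos -c dedup5.pcap dedup10.pcap               # compare how many packets are left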

So now I need to know: is it normal for APs and clients to produce a lot of dupes? Would removing them be fine, or could it corrupt some data?

I think wifi packets have sequence numbers; is editcap using them? Can it be forced to use only them, together with -D [sequence_max]?

Some help would be appreciated. Maybe someone knows a better method to do this altogether.

asked 09 May '15, 11:41 by dingrite


One Answer:

editcap doesn't look into packets when deduping; it simply calculates an MD5 hash over the full frame content and compares it to the hashes of previous frames.

-D tells editcap how far back it has to look: -D 10 means the current hash is compared to the last 10 hashes (the default is 5). Usually, higher values result in more frames being removed, because the chance of finding a match in the past is higher. But at some point you'll have found all matches, so increasing the -D parameter further makes no difference.

Let's look at an example:

Frame  1: 000FF9E667DE540E52C917D4D5D4B38C
Frame  2: 0062A32CE6D7B9273A7E0E2D96DC71D3
Frame  3: 000FF9E667DE540E52C917D4D5D4B38C
Frame  4: 0B3812725B8E3182EE027B5F5D90ADB0
Frame  5: 017D189BDAB9C46AE4A3FE8C85C02133
Frame  6: 0C1303358A2A7926E3D27F716297042C
Frame  7: 1040E6AC7B0CA4365FB9089BD3EBE635
Frame  8: 111731B71DEC7FF17347BE6D1FC20AA1
Frame  9: 000FF9E667DE540E52C917D4D5D4B38C
Frame 10: 12CD165F1171BB76368CFAC7FED4E276

Frames 1, 3 and 9 have the same hash (i.e. they are identical). With the default history window of 5 you'll have 1 duplicate found (frame 3), because when the hash of frame 3 is calculated, frame 1 is still in the history. Frame 9 will not be removed, because the oldest frame hash in the history window at that point is frame 4, so no match is found.

If you increase -D to 10 frames, both duplicates are found and removed. If you set -D to 15 frames, you'll still only get 2 duplicates, so increasing the parameter further doesn't change the result.
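
To make the window behaviour concrete, here is a rough Python sketch of the same idea (this is not editcap's actual code, just the sliding-window comparison described above, fed with the example hashes):

# Hashes of frames 1..10 from the example above
hashes = [
    "000FF9E667DE540E52C917D4D5D4B38C",  # frame 1
    "0062A32CE6D7B9273A7E0E2D96DC71D3",  # frame 2
    "000FF9E667DE540E52C917D4D5D4B38C",  # frame 3 (same as frame 1)
    "0B3812725B8E3182EE027B5F5D90ADB0",  # frame 4
    "017D189BDAB9C46AE4A3FE8C85C02133",  # frame 5
    "0C1303358A2A7926E3D27F716297042C",  # frame 6
    "1040E6AC7B0CA4365FB9089BD3EBE635",  # frame 7
    "111731B71DEC7FF17347BE6D1FC20AA1",  # frame 8
    "000FF9E667DE540E52C917D4D5D4B38C",  # frame 9 (same as frame 1)
    "12CD165F1171BB76368CFAC7FED4E276",  # frame 10
]

def count_dupes(hashes, window):
    """Count frames whose hash matches one of the previous <window> hashes."""
    history = []   # the most recent hashes, at most <window> entries
    dupes = 0
    for h in hashes:
        if h in history:
            dupes += 1          # editcap would drop this frame
        history.append(h)
        if len(history) > window:
            history.pop(0)      # forget the oldest hash
    return dupes

print(count_dupes(hashes, 5))   # 1 -- only frame 3 is caught
print(count_dupes(hashes, 10))  # 2 -- frames 3 and 9 are caught
print(count_dupes(hashes, 15))  # 2 -- a larger window changes nothing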

answered 10 May '15, 02:42 by Jasper ♦♦

Then how could I eliminate dupes in wifi captures? If editcap can't do it, what can?

(10 May '15, 12:26) dingrite

Depends on how your dupes look. I have no capture file that has this kind of problem, so I don't know what to look for.

(10 May '15, 14:49) Jasper ♦♦

Wifi packets should have a sequence number that goes up to 4095; why doesn't that ensure unique hashes for every packet? And I get a different count of dupes even when I go from -D 15 to -D 20. I'm also pretty sure that even when one card misses some packets, the time frames are so small that the two captures are at most 1-10 packets apart.

(10 May '15, 15:04) dingrite

Then the frames are probably not exact byte-by-byte duplicates; otherwise editcap should be able to find them.

(11 May '15, 02:53) Jasper ♦♦