This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

LUA - how to filter duplicate TCP packets


I stumbled across this thread: https://ask.wireshark.org/questions/38664/how-do-you-filter-for-duplicate-rtp-packets and blatantly copied the LUA script.

My use case is to filter out duplicate TCP packets, which are present in the original trace because the traffic is SPANed at both the sender and the receiver. The SPAN configuration cannot be changed, since it continuously feeds a latency analysis engine, so I need to be able to filter out the redundant traffic in order to do troubleshooting.

For this very specific use case I can filter on TCP sequence number, source and destination port and TCP stream index. Perhaps more must be added but this is not directly related to my question.

The script makes it possible to use a display filter such as "tcpdup.duplicate == true" which will show all TCP packets now identified as duplicate - and it works! :-)

However, I would like to filter out half of these packets, so that I see only one packet of each pair and in effect hide the duplicates. Naturally this is possible with manual filtering on e.g. destination MAC address, but that is not a uniform solution, as it's very specific to one particular TCP session.

Is it possible to make this more general and simply filter out the 2nd copy of each packet, and if so, how?

I realize this mostly comes down to scripting/coding knowledge, which by now is obviously not one of my strong skills, so if you have any pointers or suggestions, please don't hesitate :-)

Here's my script so far:

-- our new Proto object
local tcpdup = Proto("tcpdup","TCP Duplicates Protocol")

-- new fields for our "tcpdup" protocol
-- the purpose for these is so they can be filtered upon
local pf_is_dup = ProtoField.bool("tcpdup.duplicate", "Duplicated")
local pf_dup_frame = ProtoField.framenum("tcpdup.frame", "DupFrame", base.NONE)

-- register the ProtoFields above
tcpdup.fields = { pf_is_dup, pf_dup_frame }

-- some existing fields we need to extract from TCP packets, to determine duplicates
-- all of these must be the same for us to consider two packets duplicates
local f_ip_id = Field.new("ip.id")
local f_tcp_seq = Field.new("tcp.seq")
local f_tcp_srcport = Field.new("tcp.srcport")
local f_tcp_dstport = Field.new("tcp.dstport")
local f_tcp_stream = Field.new("tcp.stream")

-- the table we use to track seen packet #s and seen field info
-- we'll use this as both an array and map table
-- the array portion is indexed by packet number
-- the map portion is keyed by "ip.id:tcp.seq:tcp.srcport:tcp.dstport:tcp.stream"
-- the resultant for both is the same instance of a subtable with the
-- packet numbers of the dups in an array list
local packets = {}

local function generateKey(...)
    local t = { ... }
    return table.concat(t, ':')
end

-- adds the packet's number to both the array and map
-- which is done when we see a particular set of fields for the first time
local function addPacketList(pnum, key)
    local list = { pnum }
    packets[key] = list
    packets[pnum] = list
end

-- adds the packet to the array part, using an existing list of dups
-- also adds the packet's number to the list of dups
local function addPacket(pnum, list)
    -- add this packet's number to the array portion of the big table
    packets[pnum] = list
    -- add this packet's number to the list of dups
    list[#list + 1] = pnum
end

-- whenever a new capture file is opened, we want to reset our table
-- so we hook into the init() routine to do that
function tcpdup.init()
    packets = {}
end

-- some forward "declarations" of helper functions we use in the dissector
local createProtoTree

-- our dissector function
function tcpdup.dissector(tvb, pinfo, tree)
-- first, check if this is a TCP packet, by seeing if it has a tcp.seq
local tcp_seq = select(1, f_tcp_seq())

if not tcp_seq then
    -- not a TCP packet
    return
end

local pnum = pinfo.number

-- see if we've already processed this packet number
local list = packets[pnum]

if not list then
    -- haven't processed this packet
    -- see if the fields match another packet we've seen before
    local ip_id = select(1, f_ip_id())
    local tcp_srcport = select(1, f_tcp_srcport())
    local tcp_dstport = select(1, f_tcp_dstport())
    local tcp_stream = select(1, f_tcp_stream())
    local key = generateKey(tostring(ip_id), tostring(tcp_seq), tostring(tcp_srcport), tostring(tcp_dstport), tostring(tcp_stream))

    list = packets[key]

    if not list then
        -- haven't seen these fields before, so add it as a non-dup (so far)
        addPacketList(pnum, key)
        createProtoTree(pnum, tree)
    else
        -- we haven't processed this packet, but we have seen the same fields
        -- so it's a duplicate.  Add its number to the array and entry...
        addPacket(pnum, list)
        -- and now create its tree
        createProtoTree(pnum, tree, list)
    end
else
    -- we found the packet number already in the table, which means
    -- we've processed it before
    createProtoTree(pnum, tree, list)
end

end

createProtoTree = function (pnum, root, list)
-- add our "protocol"
local tree = root:add(tcpdup)

if not list or #list < 2 then
    -- it's not a duplicate
    tree:add(pf_is_dup, false):set_generated()
else
    tree:add(pf_is_dup, true):set_generated()
    -- now add the other packet numbers as reference tree item fields
    for _, num in ipairs(list) do
        if num ~= pnum then
            tree:add(pf_dup_frame, num):set_generated()
        end
    end
end

end

-- then we register tcpdup as a postdissector
register_postdissector(tcpdup)

asked 08 Aug ‘16, 04:03

NJL

Just a question - have you tried to get rid of the duplicates using Super Deduper?

(08 Aug ‘16, 04:58) sindy

Or editcap which comes with Wireshark?

(08 Aug ‘16, 05:22) grahamb ♦

I tried editcap and used it initially to remove the vast majority of the duplicates. I should have stated that but forgot :-)

The duplicates that are left are from traffic that had to be routed (source and destination on different subnets on the same switch). They have different source and destination MAC addresses, hence editcap does not see them as duplicates even though everything else is identical.

I have not tried Super Deduper; I'll give that a try. However, I could see this Lua script (if I can get it working the way I want) being really useful for other use cases.

I've given it some more thought, and I think what I need is "simply" an order number for each duplicate packet. That way I should be able to use a display filter such as "tcpdup.duplicate_packet_number == 1", which would then display only the 1st packet of each set of duplicates identified.

Any suggestions on how to do it or where to find more information is very welcome :-)

(08 Aug ‘16, 07:03) NJL
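A minimal sketch of what such a field could look like (the tcpdup.duplicate_packet_number name is hypothetical, taken from the comment above, and is not part of the script in the question):

-- hypothetical order-number field: 1 for the first copy seen, 2 for the second, and so on
local pf_dup_order = ProtoField.uint32("tcpdup.duplicate_packet_number", "DuplicatePacketNumber", base.DEC)

-- it would be registered alongside the existing fields
tcpdup.fields = { pf_is_dup, pf_dup_frame, pf_dup_order }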

The point is that a dissector is not called just once per frame but several times: once when the file is loaded, and then at least each time you click a packet in the packet list. So you would have to extend the contents of the list with the frame.number of each copy of the packet, compare the frame.number of the currently dissected packet with all the stored ones, and only mark as duplicates those whose frame.number is higher than the lowest stored one. Or you could even assign order numbers 1 to N to each copy, assuming that on the first pass the frames are dissected in the order they are read, and then calculate the order number to put in the tree in a given dissector run by finding the packet's frame.number in the stored list.

(08 Aug ‘16, 14:39) sindy
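A rough sketch of how that order number could be computed inside createProtoTree, assuming the duplicate list holds the frame numbers of all copies in the order they were first dissected (pf_dup_order is the hypothetical field sketched above):

-- hypothetical: derive this copy's order number from its position in the duplicate list
-- the list is filled in first-pass order, so the position stays stable on later passes
local order = 1
for i, num in ipairs(list) do
    if num == pnum then
        order = i
        break
    end
end
tree:add(pf_dup_order, order):set_generated()

A display filter such as "tcpdup.duplicate_packet_number == 1" would then keep only the first copy of each set.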


One Answer:


Thanks for the help.

A colleague of mine with more coding skills than I have was kind enough to help.

That resulted in the dissector script below.

The script looks for packets which have the following identical fields:

- IP ID
- TCP sequence number
- TCP source port
- TCP destination port
- TCP stream index

The new "TCP Duplicates Protocol" dissector identifies (just as the original) any duplicates by inserting a new boolean field called "Duplicated". If a duplicate exist the duplicate frame number is also inserted in a separate field called "DupFrame". Finally, the first frame of any duplicates are identified with another boolean field called "FirstSeen".

For my use case I can use a display filter like this: "tcpdup.duplicate == false || tcpdup.firstseen == true"

This filters out all but the first copy of each duplicated frame, while still showing the frames that were never duplicated.

I'm sure there are more ways to skin this cat, and most likely also more elegant ways, but for now this serves the purpose.

Hopefully someone else can benefit from this dissector :-)

-- our new Proto object
local tcpdup = Proto("tcpdup","TCP Duplicates Protocol")

-- new fields for our "tcpdup" protocol
-- the purpose for these is so they can be filtered upon
local pf_is_dup = ProtoField.bool("tcpdup.duplicate", "Duplicated")
local pf_dup_frame = ProtoField.framenum("tcpdup.frame", "DupFrame", base.NONE)
local pf_dup_frame_firstseen = ProtoField.bool("tcpdup.firstseen", "FirstSeen")

-- register the ProtoFields above
tcpdup.fields = { pf_is_dup, pf_dup_frame, pf_dup_frame_firstseen }

-- some existing fields we need to extract from TCP packets, to determine duplicates
-- all of these must be the same for us to consider two packets duplicates
local f_ip_id = Field.new("ip.id")
local f_tcp_seq = Field.new("tcp.seq")
local f_tcp_srcport = Field.new("tcp.srcport")
local f_tcp_dstport = Field.new("tcp.dstport")
local f_tcp_stream = Field.new("tcp.stream")

-- the table we use to track seen packet #s and seen field info
-- we'll use this as both an array and map table
-- the array portion is indexed by packet number
-- the map portion is keyed by "ip.id:tcp.seq:tcp.srcport:tcp.dstport:tcp.stream"
-- the resultant for both is the same instance of a subtable with the
-- packet numbers of the dups in an array list
local packets = {}

local function generateKey(...)
    local t = { ... }
    return table.concat(t, ':')
end

-- adds the packet's number to both the array and map
-- which is done when we see a particular set of fields for the first time
local function addPacketList(pnum, key)
    local list = { pnum }
    packets[key] = list
    packets[pnum] = list
end

-- adds the packet to the array part, using an existing list of dups
-- also adds the packet's number to the list of dups
local function addPacket(pnum, list)
    -- add this packet's number to the array portion of the big table
    packets[pnum] = list
    -- add this packet's number to the list of dups
    list[#list + 1] = pnum
end

-- whenever a new capture file is opened, we want to reset our table
-- so we hook into the init() routine to do that
function tcpdup.init()
    packets = {}
end

-- some forward "declarations" of helper functions we use in the dissector
local createProtoTree

-- our dissector function
function tcpdup.dissector(tvb, pinfo, tree)
-- first, check if this is a TCP packet, by seeing if it has a tcp.seq
local tcp_seq = select(1, f_tcp_seq())

if not tcp_seq then
    -- not a TCP packet
    return
end

local pnum = pinfo.number

-- see if we've already processed this packet number
local list = packets[pnum]

if not list then
    -- haven't processed this packet
    -- see if the fields match another packet we've seen before
    local ip_id = select(1, f_ip_id())
    local tcp_seq = select(1, f_tcp_seq())
    local tcp_srcport = select(1, f_tcp_srcport())
    local tcp_dstport = select(1, f_tcp_dstport())
    local tcp_stream = select(1, f_tcp_stream())
    local key = generateKey(tostring(ip_id), tostring(tcp_seq), tostring(tcp_srcport), tostring(tcp_dstport), tostring(tcp_stream))

    list = packets[key]

    if not list then
        -- haven't seen these fields before, so add it as a non-dup (so far)
        addPacketList(pnum, key)
        createProtoTree(pnum, tree)
    else
        -- we haven't processed this packet, but we have seen the same fields
        -- so it's a duplicate.  Add its number to the array and entry...
        addPacket(pnum, list)
        -- and now create its tree
        createProtoTree(pnum, tree, list)
    end
else
    -- we found the packet number already in the table, which means
    -- we've processed it before
    createProtoTree(pnum, tree, list)
end

end

createProtoTree = function (pnum, root, list)
-- add our "protocol"
local tree = root:add(tcpdup)

if not list or #list < 2 then
    -- it's not a duplicate
    tree:add(pf_is_dup, false):set_generated()
else
    tree:add(pf_is_dup, true):set_generated()
    -- now add the other packet numbers as reference tree item fields
    for _, num in ipairs(list) do
        if num < pnum then
            tree:add(pf_dup_frame, num):set_generated()
            tree:add(pf_dup_frame_firstseen, false):set_generated()
        end
        if num > pnum then
            tree:add(pf_dup_frame, num):set_generated()
            tree:add(pf_dup_frame_firstseen, true):set_generated()
        end
    end
end

end

-- then we register tcpdup as a postdissector
register_postdissector(tcpdup)

answered 09 Aug ‘16, 05:54

NJL