This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Split large pcap by VoIP sessions

Hi. I've got a really large dump with plenty of VoIP sessions (over RTP). I want to split it into smaller files, but not by time or by size: I want to store each call session in a separate file. Is that possible with Wireshark, tshark or some other tool? I've tried the Lua script from the examples (https://wiki.wireshark.org/Lua/Examples#Dump_VoIP_calls_into_separate_files), but I'm not sure it really works: nothing happens after the script is executed...

asked 09 Oct '16, 04:40

trixter

edited 09 Oct '16, 04:43

The Lua script you refer to says

require "rex_pcre"
require "luasql.mysql"

so to work, it needs a library for Perl-compatible regular expressions (PCRE) and a library for interfacing with a MySQL database (both seem like overkill to me, but that's another story).

During runtime, the script creates a separate capture file for each VoIP call initiated using SIP and dumps to it all packets which belong to that call, based on the RTP and udptl (t38) dissectors' ability to identify the packet of a signalling protocol which contained the command setting up that particular RTP or udptl stream.

So what surprises me is that you say it does nothing at all; it should at least complain that the libraries are unavailable.

Are your VoIP calls initiated using SIP or using another protocol (because the script doesn't deal with MGCP, H.323, or H.248/MEGACO)?

Is Lua enabled in your Wireshark setup?

If it is, does Wireshark report any Lua-related errors?

If you run tshark, can you see the line "Starting voip.lua script." in its output?

Does the user under which you run Wireshark have enough privileges to write into the destination directory?
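To check the library question in isolation, a minimal sketch like the one below can be run with a plain Lua interpreter or loaded into Wireshark as a script. The helper name check_modules is hypothetical and not part of voip.lua; the sketch only reports which of the two required modules fail to load, under the assumption that they are looked up through the normal require search path.

```lua
-- Hypothetical diagnostic helper, not part of voip.lua.
-- Returns the names of the modules that could not be loaded.
local function check_modules(names)
  local missing = {}
  for _, name in ipairs(names) do
    -- pcall keeps a failed require from aborting the whole script
    local ok = pcall(require, name)
    if not ok then
      missing[#missing + 1] = name
    end
  end
  return missing
end

-- The two modules that voip.lua requires at its top
local missing = check_modules({ "rex_pcre", "luasql.mysql" })
if #missing == 0 then
  print("all required Lua modules loaded")
else
  print("missing Lua modules: " .. table.concat(missing, ", "))
end
```

If a module is listed as missing here, the script cannot work regardless of the other points above.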

(09 Oct '16, 07:08) sindy

OK, I've installed pcre and luasql dependencies, now I'm getting: "Lua: Error During execution of dialog callback: [string "-- voip.lua..."]:11: attempt to index global 'luasql' (a nil value)" when trying to execute this script in Wireshark.

(09 Oct '16, 07:36) trixter

Have you created a database voiper accessible using username voiper and password password in your MySQL before running the script?

After reading the script a bit more carefully, the use of an SQL database still seems like overkill to me, as it works with just a single table of a few columns, so plain Lua tables (one per column) would do. Still, I'd recommend first getting the script running as it is and only then getting rid of the SQL.

Besides, it also seems to me that the regexp is required only to extract a proprietary header x-inin-crn from the SIP messages, so it should be possible to leave it out completely.

And there is also os.clock() used to check whether a packet is still worth handling, so maybe the intention was to use the script only during live capture. Since you seem to be parsing existing files, you should skip this part as well.
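One possible reading of the "attempt to index global 'luasql' (a nil value)" error is that require succeeded but did not define a global named luasql. The sketch below assumes, without having verified it for your installation, that the LuaSQL build returns its driver table from require instead; it captures that return value and falls back to the global if one exists.

```lua
-- Assumption: the installed LuaSQL build returns its driver table from
-- require() rather than defining a global named "luasql".
local ok, driver = pcall(require, "luasql.mysql")

-- Old-style global, if this build happens to set one
local luasql = rawget(_G, "luasql")
if ok and type(driver) == "table" then
  luasql = luasql or driver
end

if luasql then
  print("LuaSQL driver table available")
else
  -- when require fails, pcall returns the error message as second value
  print("LuaSQL not loadable: " .. tostring(driver))
end
```

If this prints that the driver table is available, changing the script's require line to keep the returned table in a local named luasql may be enough to get past line 11.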

(09 Oct '16, 07:53) sindy

Well, it looks like a dead end for me, because I'm not familiar with Lua in any way and can't make even small changes to this script. It's a black box to me :(

(09 Oct '16, 08:03) trixter

The alternative way would be to use MATE, but I assume it would mean just another hue of black to you, and although there is less to learn, there is also less to achieve - namely, MATE won't save the calls to files.

However, what is the ultimate goal of the exercise? 5000 files, each containing a single call, are not convenient to search through manually either.

(09 Oct '16, 08:10) sindy

It's just for the sake of convenience: I've got a pretty powerful computer, but opening such a huge file and manipulating it is still a true pain. I also want to load some sessions into external software for other kinds of processing, but I have to be sure that each file contains a separate session, not just pieces of several sessions split by time or size.

(09 Oct '16, 08:14) trixter

If you have ever programmed anything, Lua is not that complex to learn, and the business logic is quite simple:

  • register your tap to get all SIP, RTP and udptl packets

  • create a new output file with each new SIP Call-ID value you extract from an incoming SIP INVITE packet with no To-tag, and maintain a table mapping the Call-ID values to file names

  • copy each SIP packet bearing that Call-ID to the corresponding file, and if it contains an SDP, create a row in a table frame2callid where the value is the Call-ID or the file handle associated to it and the index is the frame number

  • for each RTP (or udptl) packet, use the rtp.setup-frame value (or its udptl counterpart) as an index into the frame2callid table to learn where (to which file) to copy it, and ignore media packets which don't have that value

  • at the end of the capture, close all output files.

This way you'll save all calls whose initial INVITE is present in that capture; calls which had already been running when the capture started will be ignored. So you'll get just the beginning of calls spanning multiple source files.

Another limitation is that if the telephony engine of Wireshark fails to detect an RTP stream for whatever reason, you miss it too.

And yet another limitation is that if the traffic contains some complex scenarios like call transfers, or if there is just a B2BUA which decouples the Call-IDs between two branches of the same actual call, you'll have to merge several output files together to get everything related into a single file.
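The bookkeeping in the steps above can be sketched in plain Lua, with no Wireshark API involved. Everything here is illustrative: the packets are hypothetical tables standing in for dissected frames, the function name split_calls is made up, and a list of frame numbers stands in for each output file.

```lua
-- Pure-Lua simulation of the bookkeeping only; no Wireshark API is used.
local function split_calls(packets)
  local calls = {}         -- Call-ID -> list of frame numbers (stands in for a file)
  local frame2callid = {}  -- frame number of an SDP-bearing SIP packet -> Call-ID

  for _, p in ipairs(packets) do
    if p.callid then
      -- an initial INVITE (no To-tag) opens a new call
      if p.method == "INVITE" and not p.to_tag and not calls[p.callid] then
        calls[p.callid] = {}
      end
      local call = calls[p.callid]
      if call then
        call[#call + 1] = p.frame
        -- remember SDP-bearing frames so media can be mapped back to the call
        if p.has_sdp then frame2callid[p.frame] = p.callid end
      end
    elseif p.setup_frame then
      -- media packet: look up the call through its setup frame
      local callid = frame2callid[p.setup_frame]
      if callid then
        local call = calls[callid]
        call[#call + 1] = p.frame
      end
    end
  end
  return calls
end

local packets = {
  { frame = 1, callid = "a@host", method = "INVITE", has_sdp = true },
  { frame = 2, callid = "a@host", to_tag = "x1", has_sdp = true },  -- response with SDP
  { frame = 3, setup_frame = 1 },                                    -- RTP set up by frame 1
  { frame = 4, setup_frame = 99 },                                   -- RTP with unknown setup frame
  { frame = 5, callid = "b@host", method = "BYE", to_tag = "y" },    -- call began before the capture
}
local calls = split_calls(packets)
print(table.concat(calls["a@host"], ","))  -- prints 1,2,3
```

Frames 4 and 5 are dropped, which illustrates the first two limitations: media without a known setup frame and calls whose initial INVITE is not in the capture are both ignored.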

(09 Oct '16, 08:56) sindy

2 Answers:

You can try to do that with TraceWrangler, using an "Extraction" task. By default, that task will split your file into sessions based on socket pairs. My guess is that each of your VoIP sessions has one specific socket pair which is different from all others.

answered 09 Oct '16, 04:47

Jasper ♦♦

I've tried it on a sample small dump with 2 RTP sessions, but it "extracted" dozens of files... How do I filter only VoIP traffic for extraction?

(09 Oct '16, 05:43) trixter

Nope. Handling FTP is a rose garden as compared to handling VoIP. VoIP uses one protocol (set) to organize calls, and another protocol to deliver the media. The sockets used by the media are indicated in the application layer of the control/signalling protocol, so TraceWrangler would have to parse the control protocol to control handling of other protocols dynamically.

(09 Oct '16, 05:44) sindy

Okay, I have to admit I'm not that familiar with VoIP captures. In this case TraceWrangler won't be of much help, as it doesn't parse VoIP protocols at this time.

(09 Oct '16, 05:47) Jasper ♦♦

OK, got it. So is there any other solution? Maybe I can use some scripting, like Pyshark? I've already extracted all sessions as a list (CSV) using Wireshark (~5k sessions). It contains "Source Address", "Source Port", "Destination Address", "Destination Port", "SSRC" and some other fields. Is it now possible to extract the corresponding RTP streams, line by line, into separate pcap files?

(09 Oct '16, 05:56) trixter

I've never collected enough motivation to write a Lua listener, and now I know why.

If you are 150 % sure that the SIP part of your VoIP traffic uses solely non-fragmented UDP packets as transport, the Lua code below is what you asked for, except that I haven't tested it on captures containing RTCP or T.38 packets.

Fragmentation of SIP packets, as well as the use of TCP as SIP transport, renders it unusable. The way it is written, the listener always receives only the last fragment of a reassembled SIP PDU, regardless of whether it has been reassembled from IP fragments or TCP segments (or both), because the SIP dissector is invoked only when processing the reassembled transport layer.

To fix this, it would be necessary to send all the IP fragments and TCP segments to the listener, and the listener would have to remember them until they are reassembled. Then, depending on whether the result of the reassembly contained a valid SIP PDU or not, it would either save them to the output file (possibly creating weird negative timestamp deltas if an RTP packet squeezed in between two fragments of a SIP PDU) or just drop them.

Also, bear in mind that the Dumper.new method appends data to existing files, so you have to clean up the output directory before opening the same source capture another time.

-- the output directory may be "hardcoded" this simple way,
-- but if you use command line (tshark) and thus you can set
-- environment variables, use
-- local outputdir = os.getenv("my_output_path")
-- as a way to fetch the path from an environment
-- variable "my_output_path" instead

local outputdir = "c:/Users/your_login/Documents"

-- declare the Lua table for file handles (Call-ID -> Dumper)
local files = {}

-- declare the Lua table of frames containing SDPs (frame number -> Call-ID)
local sdp_frames = {}

-- prepare the field extractors for the individual protocol types which we are tapping
local frame_number_f = Field.new("frame.number")
local rtp_setup_frame_f = Field.new("rtp.setup-frame")
local t38_setup_frame_f = Field.new("t38.setup-frame")
local rtcp_setup_frame_f = Field.new("rtcp.setup-frame")
local sip_callid_f = Field.new("sip.Call-ID")
local sip_method_f = Field.new("sip.Method")
local sip_to_tag_f = Field.new("sip.to.tag")
local sdp_version_f = Field.new("sdp.version")

-- create and register the listener
local tap = Listener.new("ip", "rtp or rtcp or t38 or (sip and !(sip.CSeq.method == REGISTER) and !(sip.CSeq.method == OPTIONS))")

-- declare a common function handling all media-like packets
local function handle_media(setup_frame)
  -- if a setup frame for this media stream has actually been encountered, save the packet
  if sdp_frames[setup_frame] then
    files[sdp_frames[setup_frame]]:dump_current()
  end
end

-- declare the executive body of the tap
function tap.packet(pinfo, tvb, ip)

  -- attempt to extract all signature values
  local frame_number = frame_number_f().value -- safe because frame.number always exists
  local sip_callid = sip_callid_f()
  local sip_method = sip_method_f()
  local sip_to_tag = sip_to_tag_f()
  local sdp_version = sdp_version_f()
  local rtp_setup_frame = rtp_setup_frame_f()
  local rtcp_setup_frame = rtcp_setup_frame_f()
  local t38_setup_frame = t38_setup_frame_f()

  -- handle SIP packets
  if sip_callid then
    -- local, so that the value does not leak into the global environment
    local sip_callid_v = sip_callid.value

    -- check whether the PDU is an initial INVITE, and create a call if it is
    -- and if that call doesn't exist yet (it may exist already because there
    -- was an unauthorized initial INVITE before)
    if sip_method then
      if sip_method.value == "INVITE" and not sip_to_tag and not files[sip_callid_v] then
        files[sip_callid_v] = Dumper.new_for_current(outputdir .. "/" .. tostring(sip_callid) .. ".pcap")
      end
    end

    -- check whether the PDU contains an SDP and if so, add the frame to the list
    -- of those responsible for media stream establishment
    if files[sip_callid_v] and sdp_version then
      sdp_frames[frame_number] = sip_callid_v
    end

    -- finally, if the frame belongs to an existing call, copy it to the output file
    local f_handle = files[sip_callid_v]
    if f_handle then
      f_handle:dump_current()
    end
  end

  -- handle "media" packets
  if rtp_setup_frame then handle_media(rtp_setup_frame.value) end
  if rtcp_setup_frame then handle_media(rtcp_setup_frame.value) end
  if t38_setup_frame then handle_media(t38_setup_frame.value) end

end

-- declare the function to print the progress, not actually necessary
function tap.draw()
end

-- declare what to do after the last packet has been processed
function tap.reset()
  -- close all files at once here, which may be way too late if there are
  -- hundreds of calls, so you may run out of your file handle quota
  for call_id, f_handle in pairs(files) do
    f_handle:flush()
    f_handle:close()
  end
end

answered 09 Oct '16, 14:21

sindy

edited 09 Oct '16, 14:27