This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Question on linebreaks on output from tshark’s -z io,stat option

0

Hello,

I've written a script that reads a capture file with tshark's -z io,stat argument. The goal is to generate statistics for several different display filters in a single pass over the capture file (the capture files are generated automatically with predictable names and timestamps; the script uses tshark to get the stats for that time period and appends them as a line in a .csv file). This was slightly tedious because every added display filter increases the line count of the tshark output, which also changes which line of the output the statistics themselves appear on.

Anyway my question is this:

Right now it looks like this command always puts the statistics on a single, very long line, even when dozens of display filters are used in the query. Is it a safe assumption with current Wireshark/tshark versions (in this example, 1.8.6) that the io,stat printout will put all stats on a single very long line, or will it break the line at some upper limit and continue on a second line? If it does, what is that upper limit? I'm making that one-line assumption at the moment and don't want my scripts to break if they're calling a hundred different display filters.

Related question: is there any way to lose the text art in that output and just pump out a nice clean delimited line of stats in the order requested?

asked 04 Oct '13, 20:53

Quadratic


2 Answers:

0

"Yes I tried to break it last night but it does seem to be always one line. I'm 'relatively' confident that I'm safe there"

You can be safe. I've just checked the source code. There is no limit (besides available RAM), so you can rely on a single line.

See iostat_draw() in tap-iostat.c. The required space for the column data is requested via g_malloc() and the column data itself is printed in small pieces with printf, column by column, so there is not even a large string that needs to be handled internally.

There is also no limit from the OS, at least I cannot imagine one, because if there was a limit you would not be able to pipe large amounts of data via STDOUT/STDIN into another program, which is obviously not the case on any of the current OSes.
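The pipe argument is easy to check empirically. The following is only a minimal sketch: `long_line_length` is a hypothetical helper name, and the 100000-character size is an arbitrary choice, not a figure from tshark. It builds a single line of the requested length, sends it through a pipe, and reports the length the reading side saw.

```shell
# Hypothetical helper: write one line of N characters into a pipe and
# print the length of the line as seen by the reader on the other end.
long_line_length() {
    awk -v n="$1" 'BEGIN { for (i = 0; i < n; i++) printf "x"; print "" }' |
        awk '{ print length($0) }'
}

# Arbitrary size chosen for illustration.
long_line_length 100000
```

If the reported length equals the argument, the line went through the pipe unbroken.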

Regarding the fancy ASCII art: you can convert it to (semicolon-delimited) CSV with this one-liner on Linux and similar OSes.

   tshark -nr input.pcap  -z io,stat,1 -q | grep '<>' | sed 's/ <> */;/' | sed 's/^| *//' | sed 's/ *| */;/g'

It might not be the best and fastest, nor the most elegant regexp, but it works ;-)
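For use inside a script, the same substitutions can be folded into one sed call. This is a sketch, not a tested replacement: `iostat_to_csv` is a hypothetical function name, it assumes the statistics rows are the only lines containing `<>` (the same assumption the grep makes), and it adds one extra substitution to drop the trailing `|` so the record does not end in an empty field.

```shell
# Hypothetical helper: read "tshark -q -z io,stat,..." output on stdin and
# print one semicolon-delimited record per statistics row.
iostat_to_csv() {
    sed -n '/<>/{
        s/ *<> */;/
        s/^| *//
        s/ *| *$//
        s/ *| */;/g
        p
    }'
}

# Usage (input.pcap is a placeholder file name):
#   tshark -nr input.pcap -q -z io,stat,1 | iostat_to_csv
```

A single sed process replaces the grep and the three separate sed calls, which keeps the pipeline shorter when it is embedded in a larger script.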

Regards
Kurt

answered 07 Oct '13, 12:21

Kurt Knochner ♦

edited 07 Oct '13, 12:24

Thanks, yes I'm not familiar enough with WS's source code to know where to look there.

Oh, and for catching the line: as it's just one line, the more efficient way is probably to grab it by line number, since that's directly related to the number of display filters being called. Then just a single 's/|/,/g' to delimit on commas and an 's/[^0-9,]//g' to turn it into the raw stats without all the ASCII junk. The only worry I had was an extra line break breaking that '# of filters -> line to grab' relation.

(07 Oct '13, 14:33) Quadratic
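The line-number approach in the comment above might look roughly like the sketch below. `nth_line_stats` is a hypothetical name, and the line number is passed in by the caller because the thread does not spell out how it is derived from the filter count. Two details are additions beyond the comment and should be treated as assumptions: the interval is split on `<>` so its two bounds do not fuse into one number, and decimal points are kept.

```shell
# Hypothetical helper: print only line N of stdin, reduced to comma-delimited
# numbers. The caller must supply N; an empty N would select the first line.
nth_line_stats() {
    line_no=$1
    # Print the requested line and quit, so the rest of the input is not scanned.
    sed -n "${line_no}{p;q;}" |
        sed -e 's/ *<> */|/' \
            -e 's/|/,/g' \
            -e 's/[^0-9.,]//g' \
            -e 's/^,//' \
            -e 's/,$//'
}

# Usage (file name and line number are placeholders):
#   tshark -nr input.pcap -q -z io,stat,0 | nth_line_stats 12
```

Unlike a pattern match, this depends entirely on the statistics staying on one line at a known position, which is the relation the question is about.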

How about this?

tshark ... | grep '<>' | sed 's/[|<>]//g'

does not create a CSV, but the output is easy to parse, like perl split() on whitespace.

(07 Oct '13, 14:45) Kurt Knochner ♦
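The whitespace-splitting idea above can also be done without Perl. The following awk-based sketch uses a hypothetical function name, `iostat_fields`, and relies on awk's default field splitting as a stand-in for split(); the comma output separator is an arbitrary choice.

```shell
# Hypothetical helper: strip the table characters from the statistics rows,
# let awk split what is left on whitespace, and rejoin the fields with commas.
iostat_fields() {
    awk -v OFS=',' '/<>/ {
        gsub(/[|<>]/, "")   # same character class as the sed expression
        $1 = $1             # rebuild the record so the fields are joined with OFS
        print
    }'
}

# Usage (input.pcap is a placeholder file name):
#   tshark -nr input.pcap -q -z io,stat,1 | iostat_fields
```

Because the match and the cleanup happen in one awk process, no separate grep is needed.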

Would work, though a grep function will never be as efficient as just telling the script what line to read from. My method finds the line as a simple linear function of the number of display filters that the user wants to build stats for.

(07 Oct '13, 15:22) Quadratic

"... as just telling the script what line to read from"

Ah, you're reading the tshark output directly. Well, that's the way I would have done it as well. I thought you were looking for ASCII art free output. Never mind ;-)

(07 Oct '13, 15:28) Kurt Knochner ♦

Yeah, I get the ASCII-free output by doing a perl split function on the | delimiter once I point to the right line. It becomes a CSV later once I sort the stats I actually care about from the line.

(14 Oct '13, 10:01) Quadratic

1

I just tested this on OSX, I was able to create lines with a length of >50000 characters with the "io,stat" option. I assume (without looking at the source code) that there is no limit in tshark itself, but that there might be a limit imposed by the OS.

Regarding the "pretty printing", AFAIK this is hardcoded. You could file an enhancement request to add an option to csv'ify the output.

answered 05 Oct '13, 01:11

SYN-bit ♦♦

Thanks. Yes I tried to break it last night but it does seem to be always one line. I'm 'relatively' confident that I'm safe there.

I've also submitted an enhancement request for the output. It's not really a big deal to account for the variable line count and stats position based on the display filters the user supplies. My feeling, though, is that the output is kind of silly for a human user as well, since you're presented with a long line of ASCII art that wraps several times across the screen. Meanwhile it's not intuitive to grab the stats in a script either, so it kind of misses both audiences a bit.

(05 Oct '13, 18:35) Quadratic