Hello, I've done up a script that reads a capture file with tshark's -z io,stat argument. The goal is to generate statistics for several different display filters in a single pass over the capture file. The capture files are generated automatically with predictable names and timestamps; the script uses tshark to get the stats for that time period and pushes them to a line in a .csv file. This was a slightly tedious effort, because every added display filter increases the line count of the tshark output, which also changes which line of output the statistics themselves appear on.

Anyway, my question is this: right now it looks like this command always puts the statistics onto a single, very long line, even when dozens of display filters are used in the query. Is it a safe assumption with current Wireshark/tshark versions (in this example, 1.8.6) that the io,stat printout will put all stats on a single very long line, or will it break the line at some upper limit and use a second line? If it does, what is that upper limit? I'm making that one-line assumption at the moment and don't want my scripts to break if they're calling a hundred different display filters.

Related question: is there any way we could lose the text art in that output and just pump out a nice clean delimited line of stats in the order requested?

asked 04 Oct '13, 20:53 Quadratic
2 Answers:
You can be safe. I've just checked the source code. There is no limit (besides available RAM), so you can rely on a single line. See

There is also no limit from the OS, at least I cannot imagine one: if there were a limit, you would not be able to pipe large amounts of data via STDOUT/STDIN into another program, which is obviously not the case on any of the current OSes.

Regarding the fancy ASCII art: you can simply convert that to CSV with this one-liner on Linux and similar OSes.
It might not be the best and fastest, nor the most elegant regexp, but it works ;-)

Regards

answered 07 Oct '13, 12:21 Kurt Knochner ♦ edited 07 Oct '13, 12:24
I just tested this on OSX. I was able to create lines with a length of >50000 characters with the "io,stat" option. I assume (without looking at the source code) that there is no limit in tshark itself, but that there might be a limit imposed by the OS.

Regarding the "pretty printing", AFAIK this is hardcoded. You could file an enhancement request to add an option to csv'ify the output.

answered 05 Oct '13, 01:11 SYN-bit ♦♦

Thanks. Yes, I tried to break it last night, but it does seem to be always one line. I'm 'relatively' confident that I'm safe there. I've also submitted an enhancement request for the output. It's not really a big deal to account for the variable line count and stats position based on the user's display filters, but my feeling is that the output is kind of silly for a human user as well: you're presented with a long line of ASCII art that breaks several times across the screen, and it's not intuitive to grab the stats in a script either, so it kind of misses both audiences a bit. (05 Oct '13, 18:35) Quadratic
Thanks, yes I'm not familiar enough with WS's source code to know where to look there.
Oh, and for the line catch: as it's just one line, the more efficient way is probably to grab it by line number, since that is directly related to the number of display filters being called. Then it's just a single 's/|/,/g' to delimit on commas and an 's/[^0-9,]//g' to turn it into the raw stats without all the ASCII junk. The only worry I had was an extra line break breaking that '# of filters -> line to grab' relation.
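For illustration, the two substitutions described above can be sketched in Python rather than sed/Perl. The function name `stats_line_to_csv` and the sample row are hypothetical; the row's layout is an assumption for the example and is not copied from real tshark output.

```python
import re

def stats_line_to_csv(line):
    """Apply the two substitutions to one io,stat data row."""
    with_commas = line.replace("|", ",")                # same effect as s/|/,/g
    digits_only = re.sub(r"[^0-9,]", "", with_commas)   # same effect as s/[^0-9,]//g
    # The outer borders leave an empty field at each end; drop them.
    return digits_only.strip(",")

# Hypothetical data row (assumed layout, two filters with frames and bytes).
sample = "|  0 <> 60 |    100 | 12345 |     50 |  6000 |"
print(stats_line_to_csv(sample))
```

Because everything except digits and commas is removed, the two bounds of the interval column run together in this sample ("0 <> 60" becomes "060"), so this form fits best when only the counters are needed.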
How about this?
does not create a CSV, but the output is easy to parse, like perl split() on whitespace.
Would work, though a grep function will never be as efficient as just telling the script what line to read from. My method finds the line as a simple linear function of the number of display filters that the user wants to build stats for.
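A minimal sketch of the "line as a linear function of the filter count" idea, in Python rather than Perl. The names `stats_line_index` and `pick_stats_line`, the `base` and `per_filter` constants, and the fake output below are all assumptions for illustration; the real offsets depend on the tshark version and have to be measured from actual output. The guard covers the worry about an unexpected extra line break by refusing a line that does not look like a data row.

```python
def stats_line_index(n_filters, base, per_filter):
    """0-based index of the stats row: base + per_filter * n_filters."""
    return base + per_filter * n_filters

def pick_stats_line(output, n_filters, base, per_filter):
    lines = output.splitlines()
    idx = stats_line_index(n_filters, base, per_filter)
    if idx >= len(lines):
        raise ValueError("output is shorter than expected")
    line = lines[idx]
    # Weak sanity check (assumption): a data row starts with '|',
    # contains digits, and contains no letters.
    looks_like_data = (
        line.startswith("|")
        and any(c.isdigit() for c in line)
        and not any(c.isalpha() for c in line)
    )
    if not looks_like_data:
        raise ValueError("line %d does not look like a stats row" % idx)
    return line

# Fake output with an assumed layout: one header line per filter.
fake = "\n".join([
    "=====",
    "| header |",
    "| Col 1: filter a |",
    "|     2: filter b |",
    "|-----|",
    "| 0 <> 60 | 10 | 20 |",
    "=====",
])
print(pick_stats_line(fake, n_filters=2, base=3, per_filter=1))
```

If the layout ever shifted, the guard would raise an error instead of silently writing the wrong line to the .csv file.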
Ah, you're reading the tshark output directly. Well, that's the way I would have done it as well. I thought you were looking for ASCII art free output. Never mind ;-)
Yeah, I get the ASCII-free output by doing a perl split function on the | delimiter once I point to the right line. It becomes a CSV later once I sort the stats I actually care about from the line.
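The split-then-select approach described above might look roughly like this in Python (the thread itself uses Perl's split). The helper names `split_stats_line` and `to_csv_row`, the sample row, and the column positions picked are hypothetical.

```python
import csv
import io

def split_stats_line(line):
    """Split one data row on '|' and trim the padding around each field."""
    # The leading and trailing borders produce an empty first and last element.
    inner = line.strip().split("|")[1:-1]
    return [field.strip() for field in inner]

def to_csv_row(fields, wanted):
    """Build one CSV line from the selected field positions."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow([fields[i] for i in wanted])
    return buf.getvalue()

# Hypothetical data row (assumed layout).
sample = "|  0 <> 60 |    100 | 12345 |     50 |  6000 |"
fields = split_stats_line(sample)
print(fields)
# Keep only the two frame counters in this example.
print(to_csv_row(fields, [1, 3]), end="")
```

Unlike stripping every non-digit character, splitting on the delimiter keeps the interval column intact as a field of its own, so it can be kept or discarded explicitly.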