This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Problem using Tshark to write fields to csv file: Tshark breaks a field value into 2 values if it has a comma character

I have some network traffic in the form of a .pcap file containing around 5000 packets. I have to read some 50 fields from it (such as arrival time, source IP, destination IP, etc.) and load them into a MySQL database.

My approach is to use Tshark to extract the fields I am interested in and write them to a .csv file, then load the contents of the .csv file into the database with a single SQL query. The command looks something like this:

tshark -r my-file.pcap -T fields -e frame.time ... -e ip.src -e ip.dst > outputfile.csv

The problem with this is that tshark does not handle the field values properly when writing them to the .csv file. When a comma (,) occurs within a value, the value is split there: the part before the comma goes into that field and the part after it goes into the next field, so the actual value of the next field is pushed one field further along, and so on.

For example, if there are two fields, frame.time with a value of Jun 23, 2016 08:15:00.844245000 and ip.src with a value of X.X.X.X, then this is what the .csv file looks like:

frame.time        ip.src
Jun 23            2016 08:15:00.844245000        X.X.X.X

and this is how it turns out in the database table:

__________________________________________________
|| frame.time    ||    ip.src                   ||
|| Jun 23        ||    2016 08:15:00.844245000  ||
|| X.X.X.X       ||    NULL                     ||

How do I fix this? Any tips, suggestions, or advice are welcome.

asked 10 Oct '16, 21:31

Jesss

edited 10 Oct '16, 22:06


One Answer:

tshark is not creating two instances of the field. It simply does not treat field values that contain a comma specially, so for the application processing the CSV, a comma inside a value is indistinguishable from a comma that tshark writes as a separator.
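
To illustrate, here is a minimal Python sketch of what a standard CSV parser makes of such a line. The sample line is made up from the values in the question, and it assumes the two fields were written comma-separated with no quoting:

```python
import csv
import io

# Illustrative line: frame.time and ip.src joined by a comma, with no quoting.
unquoted_line = "Jun 23, 2016 08:15:00.844245000,X.X.X.X\n"

# The parser cannot tell the comma inside the timestamp from the separator,
# so it reports three fields instead of two.
fields = next(csv.reader(io.StringIO(unquoted_line)))
print(len(fields))
print(fields)
```

The timestamp ends up spread over the first two fields and the IP address lands in a third one, which matches the shifted columns described in the question.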

Look at the -E command line option in the tshark man page. It lets you change the separator written between fields (separator=), change the character that joins multiple instances of the same field (aggregator=, a comma by default), and ask tshark to put quoting characters around each field (quote=).
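
As a rough sketch of how this could be put together, the Python snippet below builds a tshark argument list using the -E sub-options header=y, separator=, and quote=d, and then checks that a line quoted this way parses into the expected two fields. The helper name `tshark_csv_command`, the file name, and the sample line are assumptions made for illustration; the snippet only builds the command and does not run tshark.

```python
import csv
import io

def tshark_csv_command(pcap_path, fields):
    """Build (but do not run) a tshark command line for quoted CSV output.

    Hypothetical helper for illustration only.
    """
    cmd = ["tshark", "-r", pcap_path, "-T", "fields"]
    for field in fields:
        cmd += ["-e", field]
    # -E sub-options from the tshark man page:
    #   header=y     write the field names as the first line
    #   separator=,  put a comma between fields
    #   quote=d      wrap every field value in double quotes
    cmd += ["-E", "header=y", "-E", "separator=,", "-E", "quote=d"]
    return cmd

cmd = tshark_csv_command("my-file.pcap", ["frame.time", "ip.src", "ip.dst"])
print(" ".join(cmd))

# Illustrative line in the quoted form (values taken from the question).
quoted_line = '"Jun 23, 2016 08:15:00.844245000","X.X.X.X"\n'

# With quoting, the comma inside the timestamp stays part of the first field.
row = next(csv.reader(io.StringIO(quoted_line)))
print(row)
```

The printed command can be run in a shell with its output redirected to the .csv file. Whatever loads the file then has to be told that fields are enclosed in double quotes; for MySQL's LOAD DATA that would be the FIELDS ... ENCLOSED BY clause.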

answered 10 Oct '16, 22:18

sindy