Hi, we have some web services whose packets we'd like to capture using tshark. Some of the service traffic is dissected as XML (we filter it with the -R xml argument) and some as plain data (the -R data argument). We'd like to see the payload using the -T fields argument. With -R data we run the following command:
We'd like to run a similar command to get the actual text of the XML, so as to get output such as:
We tried something like this:
asked 25 May '12, 07:05 aaghili | edited 25 May '12, 08:37 grahamb ♦
3 Answers:
Can you try this:
Please check this question (similar problem).
Regards
answered 25 May '12, 07:08 Kurt Knochner ♦ | edited 25 May '12, 07:24
This is what I get, when I load this page: URL:
Command:
Output:
tshark Version:
Questions:
Regards
answered 25 May '12, 10:53 Kurt Knochner ♦ | edited 25 May '12, 11:08

I get the same result. It's blank. This service does get dissected correctly as XML when I run this command: tshark -i 7 -R xml -V -l. When I run this command I can see the packet. The issue is that the XML is so large that this outputs over 20000 lines. The reason I'd like to use -T fields is so I can get one single, very large line instead of 20000 small lines. Our app reads the standard output of tshark. Thank you for the help! (25 May '12, 11:09) aaghili

The version of tshark is 1.2.10, running on 64-bit Linux. (25 May '12, 11:11) aaghili

Can you upgrade tshark? (25 May '12, 11:20) Kurt Knochner ♦

Sure, where do you download the latest 64-bit Linux version? (25 May '12, 14:40) aaghili

Use your distribution's package manager, or compile your own. Compilation details can be found here. (25 May '12, 14:46) grahamb ♦

OK, I now get what you're getting, but that is not the valid XML that is being sent back. Take a look at your XML text. That's just some tags, not the actual response from the site. Is there a way to see the actual payload of the XML response? (26 May '12, 06:22) aaghili

That's the response of the server. It contains the HTTP response headers and the data (XML). If you remove the HTTP headers (with a script) you will get the XML data. (26 May '12, 09:01) Kurt Knochner ♦

OK, I now get what you're getting, but that is not the valid XML that is being sent back. Take a look at your XML text. That's just some tags, not the actual response from the site. Is there a way to see the actual payload of the XML response?
Here is what the payload is:
<?xml version="1.0" encoding="ISO-8859-1"?><!-- Edited by XMLSpy® --><breakfast_menu> <food><name>Belgian Waffles</name><price>$5.95</price><description>two of our famous Belgian Waffles with plenty of real maple syrup</description> <calories>650</calories></food><food><name>Strawberry Belgian Waffles</name> <price>$7.95</price><description>light Belgian waffles covered with strawberries and whipped cream</description><calories>900</calories></food><food><name>Berry-Berry Belgian Waffles</name> <price>$8.95</price><description>light Belgian waffles covered with an assortment of fresh berries and whipped cream</description><calories>900</calories> </food>
(26 May '12, 09:33) aaghili

Sorry, I'm not sure if I understand. The payload isn't this. (26 May '12, 09:36) aaghili

Extracting the text part does so only for each packet separately. You will have to combine the output to get the whole XML response. I believe what you are looking for is the output of "Follow TCP stream" (Wireshark GUI). If that's the case, there is a new option in tshark 1.7.x that does almost the same:
This will print all data of TCP stream #5 in ASCII format. To figure out which stream contains XML data, you can use this command:
It will print the stream number of every TCP stream that contains XML data. Unfortunately you cannot do this on the fly, during data capture (-i). So you have to capture the data first and then analyze it. Furthermore, I was not yet able to get the output always in ASCII. Sometimes the output of If it's the same for you, I suggest using tcpflow instead. See my answer for this question:
(26 May '12, 11:00) Kurt Knochner ♦

Thank you Kurt! I'll try this next week with some of our internal services and let you know. Al (26 May '12, 15:28) aaghili

O.K. BTW: If you like my answer, you can select it as the right one. See the FAQ. (27 May '12, 01:21) Kurt Knochner ♦

This option doesn't work for me, because the output is sent to a file rather than to standard out, and I need to sniff on an interface. If I run the command as 'tshark -i 2 -z follow,tcp,ascii,1 -q', the output is written to standard out only when I hit Ctrl-C. Our application needs to read the standard out of tshark on the fly. Another way I could get around this problem is to force tshark to dissect all packets as data packets, so that even an XML packet is decoded as a data packet. Is there such an option? I want to run tshark -i 2 -R data and see an XML packet in the output of this command. (27 May '12, 05:49) aaghili

As I said, you can't use If you need that, you have these options
You could tell us more about what you want to achieve. Maybe there is another way to achieve that. (27 May '12, 12:14) Kurt Knochner ♦

If I wanted to use option 1, how would I run tshark so that I get all the packets to reassemble? If I run the following: tshark.exe -i 2 -T fields -e frame.number -e frame.time -e ip.src -e ip.dst -e data -e text port 80, I always get this: </name>,</price>,</description>,</calories>,</food>,</name>,</price>,</description>,</calories>,</food>,</name> and not the whole payload. I need to see the whole payload to reassemble it. Basically what I want is to see the whole XML payload, either as text or as hex. I don't mind reassembling packets. (27 May '12, 12:46) aaghili

O.K. it looks like
Then extract the data from the output. This works on my system. If the content is compressed, tshark will first print the compressed content and then the "Uncompressed entity body". Option -V works as well. It's more data to parse, but possibly easier to parse. You could even try this:
Then look for 'field name="xml.tag"' (27 May '12, 13:10) Kurt Knochner ♦

I checked the options you mentioned. If the -T fields option gave me the whole packet, that would have worked best. I don't think this is a gzip issue. I tried this with other services and I was getting the same result. It seems like it's a bug, because the only thing it prints is the closing XML tags, like this: </food>. Not the opening tag or the actual body of the element, only the closing tag. (27 May '12, 14:29) aaghili

OK, I checked these options and none will be an advantage over just running it like this: tshark -i 2 -R xml -V port 80. Again, the problem is that when this is sent to standard out it puts each tag on a separate line. So a large XML payload could have 30000 lines that we need to process. The nice thing about the -T fields option (if it worked) would be that the payload is sent to standard out in a single line no matter how large it is. That is much easier to process, with better performance. (29 May '12, 09:32) aaghili

Can you tell me where in the code I can look to find out about this issue? Basically I'd like to see where the output of this command is generated: tshark.exe -i 3 -T fields -e frame.number -e frame.time -e ip.src -e ip.dst -e data -e text port 80. Thanks (01 Jun '12, 07:40) aaghili

tshark.c:process_packet() -> print_packet() -> print.c:proto_tree_write_fields() (01 Jun '12, 08:06) Kurt Knochner ♦
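The complaint in the comments above is that verbose (-V) output arrives as thousands of short lines, while the consuming application would prefer one long line per packet. A minimal sketch of a post-processing filter for that follows. It is not a tshark feature, and it assumes that each packet in the verbose text begins with an unindented line of the form "Frame <number> ...", which may differ between tshark versions. The names one_line_per_frame and FRAME_START are hypothetical.

```python
import re
from typing import Iterable, Iterator, List

# Assumption: a new packet starts with an unindented "Frame <number>" line.
# Indented detail lines such as "    Frame Number: 117" do not match.
FRAME_START = re.compile(r"^Frame \d+")


def one_line_per_frame(lines: Iterable[str], sep: str = " ") -> Iterator[str]:
    """Join all output lines that belong to one frame into a single line."""
    current: List[str] = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # skip the blank lines between packets
        if FRAME_START.match(raw) and current:
            # A new frame begins, so the previous one is complete.
            yield sep.join(current)
            current = []
        current.append(text)
    if current:
        # Flush the last frame once the input ends.
        yield sep.join(current)
```

Feeding the function the lines read from tshark's standard output (for example sys.stdin when tshark is piped into the script, ideally with -l so output is flushed per packet) yields one string per frame. Note that a frame is only emitted once the next frame starts or the input ends, so on a live capture the most recent packet is always held back.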
O.K. here is (probably) my last attempt for a solution/explanation ;-) If you run the following command, you will see the XML tags and the XML data.
HOWEVER, there is no field xml.data to print the whole XML structure. You are free to extend wireshark/tshark with such a field. The alternative would be -e text (the 'content-encoded entity body') of the http response.
HOWEVER this does NOT print the full HTTP response. I'm not sure if this works as designed, or if it is a bug (gzip encoding). Please open a bug report, if you think this is a bug. Then, the above command (-R xml) works ONLY while reading from a file, it does NOT work while reading from an interface (-i), at least not on my test system. So, your attempt to collect the whole XML data structure in one line, on-the-fly, will not work. Look at the man page for a possible explanation for this (option -R):
If the packets that did not match the filter "xml" are dropped, then tshark is probably not able to build the required internal state to collect the whole XML data. Can someone with better knowledge about the internals please comment on this?
Regards
answered 01 Jun '12, 09:07 Kurt Knochner ♦ | edited 01 Jun '12, 09:14
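Since the answer above concludes that a one-line XML field is not available on the fly, the fallback mentioned earlier in the thread is to take the complete HTTP response text (for example from a followed TCP stream) and strip the HTTP headers with a script. A minimal sketch of that step follows. It assumes the response is already reassembled, is plain text, and is neither gzip-compressed nor sent with chunked transfer encoding. The function names http_body and xml_on_one_line are hypothetical.

```python
def http_body(response_text: str) -> str:
    """Return everything after the first blank line, which ends the HTTP headers."""
    # HTTP uses CRLF line endings; bare LF is accepted in case they were normalised.
    for separator in ("\r\n\r\n", "\n\n"):
        _headers, found, body = response_text.partition(separator)
        if found:
            return body
    return ""  # no header/body separator seen


def xml_on_one_line(response_text: str) -> str:
    """Strip the HTTP headers and collapse the XML body onto a single line."""
    # Assumption: collapsing runs of whitespace is acceptable for this payload.
    return " ".join(http_body(response_text).split())
```

Collapsing whitespace this way also alters whitespace inside element text, so it only suits payloads where that does not matter.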
Thank you for the response. When I set it up as you said, I don't see any XML text. It's just blank.
This is what I get. Have you tested this, and does it work for you?
Frame 117 (851 bytes on wire, 851 bytes captured) May 25, 2012 10:35:11.582303000 10.203.192.79 10.202.160.99 Transmission Control Protocol, Src Port: cdid (3315), Dst Port: http (80), Seq: 55761, Ack: 1, Len: 797