Is there any better way to dump data? (http)

Question

I'm using tshark to get HTML pages out of a capture file and to check how many of them contain a specific element. Currently I'm using the following parameters:

tshark.exe -r test.pcap -o "tcp.desegment_tcp_streams:TRUE" -R "tcp.stream==13 and http" -T pdml > test.session13.pdml

This gives me the HTML documents inside a data-text-lines element, spread over many field elements in the output PDML, so I need to concatenate the data in those elements to get the whole HTML back together. That's not a big problem, but I wonder: is there a better way, i.e. can tshark itself output the complete HTML?

thanks!

Accepted Answer

How about:

tshark.exe -r test.pcap -o "tcp.desegment_tcp_streams:TRUE" -R "http contains <string>"

answered 21 Dec '10, 06:05

SYN-bit ♦♦

It's more complicated than that, it's really a validation rule (the location of javascript references). So I need to get the plain HTMLs out.

(21 Dec '10, 06:14) r0u1i

Then you might want to take a look at tcpflow (http://www.circlemud.org/~jelson/software/tcpflow/).

(21 Dec '10, 06:18) SYN-bit ♦♦

I'm accepting it as 'NO, there's no better way in tshark' :)

(22 Dec '10, 02:47) r0u1i
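
Since the thread settles on concatenating the field elements by hand, here is a minimal Python sketch of that step. It is not a tshark feature and not the poster's actual script. It assumes each data-text-lines proto element holds one field element per line of the document, with the text in the show attribute and, where present, the raw bytes as hex in the value attribute. The exact PDML layout can differ between Wireshark versions, so check it against your own output first. The function names are made up for this example.

```python
import xml.etree.ElementTree as ET


def field_text(field):
    """Return the text carried by one PDML <field> element.

    Assumption: 'value' (if present) is the hex dump of the line's bytes,
    and 'show' is the human-readable form. Prefer the raw bytes.
    """
    raw_hex = field.get("value")
    if raw_hex:
        try:
            return bytes.fromhex(raw_hex).decode("utf-8", errors="replace")
        except ValueError:
            pass  # not valid hex after all, fall back to 'show'
    return field.get("show", "")


def extract_html_bodies(source):
    """Return one string per data-text-lines element found in the PDML.

    'source' is a file name or a binary file object.
    """
    bodies = []
    # iterparse streams the file, so large PDML dumps need not fit in memory.
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "proto" and elem.get("name") == "data-text-lines":
            bodies.append("".join(field_text(f) for f in elem.iter("field")))
        elif elem.tag == "packet":
            elem.clear()  # drop the finished packet's subtree
    return bodies
```

Calling extract_html_bodies("test.session13.pdml") on the file produced by the command in the question would then give a list of reassembled documents, and the validation rule (for example, where the javascript references sit) can be run on each string.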