
I'm using tshark to extract HTML pages from a capture file and check how many of them contain a specific element. Currently I'm using the following parameters:

tshark.exe -r test.pcap -o "tcp.desegment_tcp_streams:TRUE"  -R " and http" -T pdml > test.session13.pdml

With this, the HTML documents end up inside a data-text-lines element, spread over many field elements in the output PDML, which means I need to concatenate the data in those elements to put the whole HTML back together. That's not a big problem, but I wonder whether there is a better way: can tshark itself output the complete HTML?
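For reference, the concatenation step could be sketched roughly as below. This is a minimal sketch, not tshark functionality. It assumes that each `data-text-lines` proto element holds flat `field` children and that each field's `value` attribute carries the raw bytes as hex; that layout should be checked against the actual PDML output. The function name `extract_html_bodies` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def extract_html_bodies(pdml_source):
    """Yield one reassembled text body per data-text-lines layer.

    pdml_source is a file name or a binary file object containing PDML.
    Assumption: each <field> under <proto name="data-text-lines"> has a
    hex-encoded `value` attribute holding the bytes of one text line.
    """
    # iterparse avoids loading a large PDML file into memory at once.
    for _event, packet in ET.iterparse(pdml_source, events=("end",)):
        if packet.tag != "packet":
            continue
        for proto in packet.iter("proto"):
            if proto.get("name") != "data-text-lines":
                continue
            chunks = []
            # Only direct children, so nested fields are not counted twice.
            for field in proto.findall("field"):
                hex_value = field.get("value")
                if hex_value:
                    chunks.append(bytes.fromhex(hex_value))
            if chunks:
                # The real charset is unknown here; UTF-8 is a guess.
                yield b"".join(chunks).decode("utf-8", errors="replace")
        # Free the packet's children once it has been processed.
        packet.clear()
```

It could then be used as `for html in extract_html_bodies("test.session13.pdml"): ...`, with whatever check is needed applied to each `html` string.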


asked 21 Dec '10, 05:56

r0u1i

edited 21 Dec '10, 06:14

How about:

tshark.exe -r test.pcap -o "tcp.desegment_tcp_streams:TRUE" -R "http contains <string>"

answered 21 Dec '10, 06:05

SYN-bit ♦♦

It's more complicated than that: it's really a validation rule (checking the location of JavaScript references), so I need to get the plain HTML documents out.

(21 Dec '10, 06:14) r0u1i

Then you might want to take a look at tcpflow.


(21 Dec '10, 06:18) SYN-bit ♦♦

I'm accepting it as 'NO, there's no better way in tshark' :)

(22 Dec '10, 02:47) r0u1i