This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Analysing large trace file (~800G)


Hi folks,

I have a problem that I have been trying to track down for weeks now. It seems that only a full trace from the beginning of the CIFS session (the Kerberos ticket exchange) up to the end will help. This will result in a trace of ~800 GB.

Obviously I don't feel like getting a machine with 1 TB of RAM. Is there any way to make WS decode the stream without loading everything into memory? The point is that I can search for the packet where something goes wrong, because I know the contents of that packet.

So what I'd like WS to do is decode the stream up to the packet that I'm looking for, and then let me look back and forth. Is there a way to do this with a reasonably dimensioned machine with, say, 32 GB of RAM?

Thanks Andre

asked 06 Feb '16, 04:57


Alphaphi
accept rate: 0%

edited 07 Feb '16, 05:38


Christian_R

These things come to mind when I read your question:
- You can split the trace into smaller pieces
- You can use a tool like TraceWrangler (https://www.tracewrangler.com) for this
- You can use tshark
- You can process tshark output with Excel
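The suggestions above can be sketched as a few command lines. This is a hedged example, not from the thread: the file names, display filter, and field list are made up for illustration; editcap, tshark, and the split/search approach themselves are standard Wireshark CLI tooling.

```shell
# Split an existing capture into chunks of 1,000,000 packets each
# (editcap appends a counter and timestamp to the output name):
editcap -c 1000000 huge_capture.pcapng chunk.pcapng

# Search each chunk with tshark instead of loading it all into Wireshark.
# The display filter here is just an example (CIFS/SMB2 error responses):
for f in chunk_*.pcapng; do
    echo "== $f =="
    tshark -r "$f" -Y "smb2.nt_status != 0"
done

# Export selected fields as CSV for further analysis in Excel:
tshark -r chunk_00000.pcapng -T fields -E header=y -E separator=, \
    -e frame.number -e frame.time_relative -e ip.src -e ip.dst \
    -e smb2.cmd > chunk_00000.csv
```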

(06 Feb '16, 06:18) Christian_R

800G is a LOT - TraceWrangler is probably not able to handle it, at least not yet, or unless the workstation is a RAM monster.

The question is: why do you need to look at the whole stream? Is there anything that needs to be decrypted or decoded that requires looking at all packets? And is the 800G a single TCP connection? That would be the largest I have ever heard of - what is transferred inside it?

(06 Feb '16, 17:00) Jasper ♦♦

Hi Jasper, this is a CIFS transfer of ~300-400 GB from one NAS storage to another. The copy box reads the files to be transferred from the source and writes everything out to the target, which doubles the volume.

Now I'm not a WS specialist. But I know one, and he tells me that it is necessary to capture the Kerberos ticket exchange at the very beginning of the CIFS stream, and then walk along the whole stream up to where the error occurs (which I suspect to be somewhere in the answers from AD, and I want to prove that). I'm looking for a way to do this without needing an 8-TB-RAM box.

(07 Feb '16, 01:01) Alphaphi

What data rate are we talking about?

And maybe this question gives you a hint as to how others deal with these requirements: https://ask.wireshark.org/questions/47868/looking-on-recommendationsbest-practices-on-ws-deployment-at-large-data-centers

(07 Feb '16, 01:47) Christian_R

The data rate is not too high; the source has only a 1-GBit pipe, so it's at most 2 GBit/s on the copy box.

(07 Feb '16, 04:45) Alphaphi

Can you describe the overall environment for the capture? In particular:

  • will you capture on the copybox itself or on a separate machine?

  • whatever the capturing machine will be, will it have a local disk big enough or will it send the capture file to the NAS over Ethernet as well?

(07 Feb '16, 05:40) sindy

From my point of view: traces with more than 300 MBit/s of bandwidth are at the border of where Wireshark works reliably without some tuning mechanisms. https://ask.wireshark.org/questions/523/the-peak-of-network-flow-rate-that-wireshark-can-deal-with
You should also think about the capabilities of the capture point (SPAN port, TAP, local machine).

(07 Feb '16, 05:46) Christian_R

One Answer:


If by chance you already have the file, I'm afraid you'd have to use tcpdump on a Linux machine to split it into several files, because dumpcap cannot read from a file, and both tshark and Wireshark collect protocol state, so they are likely to run out of RAM on such a huge file.
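For the "file already exists" case, splitting with tcpdump could look like the sketch below. The file names are hypothetical; tcpdump's `-C` option rotates the output file roughly every N million bytes while `-r` reads from an existing capture.

```shell
# Read an existing capture and rewrite it as ~1 GB slices.
# tcpdump appends a running number to each rotated file:
# slice.pcap, slice.pcap1, slice.pcap2, ...
tcpdump -r huge_capture.pcap -w slice.pcap -C 1000
```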

If you are only getting ready for the capture, the easiest way is to tell dumpcap (which you have to use anyway due to RAM limits) to save the data into files no bigger than X megabytes; so -w your_file_name -b filesize:100000 -b files:10000 as parameters to dumpcap will create up to 10000 files of 100 MBytes each. You can then use a batch script to let tshark search all the files for the pattern you know, and then merge the file where it finds the match with the previous one, so that you have enough history before the trigger packet.
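Put together, the capture / search / merge steps might look like this. A sketch only: the interface name, file prefix, display filter, and the particular file numbers in the mergecap line are invented for illustration; the dumpcap ring-buffer options match those given above (filesize is in kB, so 100000 ≈ 100 MB).

```shell
# 1. Capture with dumpcap into a ring of up to 10000 files of ~100 MB each:
dumpcap -i eth0 -w cifs_trace -b filesize:100000 -b files:10000

# 2. Later, search every file for the suspect packet
#    (example filter: SMB2 responses carrying an error status):
for f in cifs_trace_*.pcapng; do
    if tshark -r "$f" -Y "smb2.nt_status != 0" | grep -q .; then
        echo "hit in $f"
    fi
done

# 3. Merge the matching file with its predecessor to keep some
#    history before the trigger packet (file numbers are examples):
mergecap -w context.pcapng cifs_trace_00041_*.pcapng cifs_trace_00042_*.pcapng
```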

answered 06 Feb '16, 07:29


sindy
accept rate: 24%

It seems I missed the key point, which is the encryption. You'll have to try, or wait for someone who knows for sure to answer, whether you can capture the initial key exchange in the first file, then merge the first and third files together and see whether the decryption algorithm is able to re-sync and decrypt the data from the third file despite the gap between them.

(06 Feb '16, 10:04) sindy

It's not an encrypted stream. But as I said, my WS guru tells me it is crucial to process the Kerberos ticket at the very beginning of the session all the way along to where the error occurs. I must find out whether the method you outline will do the trick together with the Kerberos ticket.

And no, I haven't captured the stream yet. At the moment I'm trying to figure out how to do it best (or least painfully). Furthermore, I'll have to capture the stream many times, as the error occurs only once in a while :(

(07 Feb '16, 01:05) Alphaphi