I need to split a large pcap file into separate pcap files, one per TCP flow. I am looking for an application like SplitCap, but one that runs on Linux. tcpflow and tcptrace don't produce pcap files as their output. Each output pcap file should contain a single TCP flow.

If such an application exists, please let me know. A way to split the pcap file using tshark would be very helpful.

asked 07 Dec '12, 08:21

fates


tshark can do that.

tshark -nr input.cap -R "tcp.stream eq 1" -w stream_1.cap

Regards
Kurt

answered 07 Dec '12, 08:31

Kurt Knochner ♦

Thanks for the comment, Kurt. But my input file contains more than one million flows. Are there any other options to do this?

(07 Dec '12, 08:33) fates

And what exactly are you asking for? One million files, one for each stream?

If so, you can run tshark in a loop and use the loop counter in the stream filter and the output file name.

See the following question:

http://ask.wireshark.org/questions/4677/easy-way-to-save-tcp-streams
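
The loop idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (`highest_stream`, `stream_cmd`, `split_streams`) are made up for this sketch, and it defaults to the `-Y` display-filter flag used by tshark 1.10 and later, whereas the `-R` form shown in the answer applies to older releases. The stream indexes are read from tshark's `tcp.stream` field output.

```python
import subprocess
import sys


def highest_stream(field_output):
    """Return the largest tcp.stream index in tshark's field output, or -1 if none."""
    indexes = [int(tok) for tok in field_output.split() if tok.isdigit()]
    return max(indexes, default=-1)


def stream_cmd(pcap, index, filter_flag="-Y"):
    """Build the tshark command that writes one TCP stream to its own file."""
    return ["tshark", "-nr", pcap, filter_flag, f"tcp.stream eq {index}",
            "-w", f"stream_{index}.cap"]


def split_streams(pcap):
    """Run one tshark pass per stream index (hypothetical driver, not tuned for speed)."""
    listing = subprocess.run(
        ["tshark", "-nr", pcap, "-T", "fields", "-e", "tcp.stream"],
        check=True, capture_output=True, text=True).stdout
    # tcp.stream indexes start at 0, so loop from 0 up to the highest index seen.
    for index in range(highest_stream(listing) + 1):
        subprocess.run(stream_cmd(pcap, index), check=True)


if __name__ == "__main__" and len(sys.argv) == 2:
    split_streams(sys.argv[1])
```

Each iteration makes tshark read the whole input file again, so this approach is likely to be slow for captures with a very large number of streams.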

(07 Dec '12, 08:41) Kurt Knochner ♦

An easier and faster method would be this Python script:

http://corelabs.coresecurity.com/index.php?module=Wiki&action=attachment&type=tool&page=Impacket&file=split.py

http://corelabs.coresecurity.com/index.php?module=Wiki&action=view&type=tool&name=Impacket

The script needs pcapy. You can install pcapy on Ubuntu like this:

apt-get install python-pcapy

(07 Dec '12, 08:57) Kurt Knochner ♦

Thanks, Kurt! By the way, I've already tried the Python script. :) But it uses the "Impacket" package, which cannot handle corrupted packets. This is why I'm looking for other solutions.

(07 Dec '12, 09:51) fates

Well, then use tshark.

(07 Dec '12, 09:59) Kurt Knochner ♦

pkt2flow separates packets into flows using only the 4-tuple (source address, source port, destination address, destination port), so the flows can be analyzed further.

The packets are saved in time order without any processing such as TCP reassembly.

The flow timeout is 64 seconds, as suggested by CAIDA.

https://github.com/caesar0301/pkt2flow
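
The grouping described above can be illustrated with a small Python sketch. It is not pkt2flow's actual code: the `Packet` record and function names are hypothetical, and treating both directions of a connection as one flow is an assumption made here for illustration. Only the 4-tuple key and the 64-second timeout come from the description.

```python
from collections import namedtuple

# Hypothetical packet record: timestamp in seconds plus the 4-tuple fields.
Packet = namedtuple("Packet", "ts src sport dst dport")

FLOW_TIMEOUT = 64.0  # seconds, the value quoted in the answer


def flow_key(pkt):
    """Order the two endpoints so both directions share one key (an assumption)."""
    a, b = (pkt.src, pkt.sport), (pkt.dst, pkt.dport)
    return (a, b) if a <= b else (b, a)


def split_flows(packets, timeout=FLOW_TIMEOUT):
    """Group time-ordered packets into flows.

    A packet arriving more than `timeout` seconds after the previous packet
    with the same key starts a new flow. Packets keep their input order.
    """
    flows = []    # list of packet lists, in order of first appearance
    active = {}   # key -> (index into flows, timestamp of last packet)
    for pkt in packets:
        key = flow_key(pkt)
        entry = active.get(key)
        if entry is None or pkt.ts - entry[1] > timeout:
            flows.append([])
            index = len(flows) - 1
        else:
            index = entry[0]
        flows[index].append(pkt)
        active[key] = (index, pkt.ts)
    return flows
```

With this sketch, two packets on the same 4-tuple that are 99 seconds apart end up in different flows, while a reply packet sent one second after the request joins the request's flow.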

answered 25 Dec '12, 04:01

Jamin

The tshark scripts didn't finish within 30 minutes on my 4 GB pcap with about 40 flows. The following Perl script, which drives tcpdump instead, finished in about 90 seconds. Is this a tshark versus tcpdump difference?

#!/usr/bin/perl -w

use strict;

use Data::Dumper;

sub mysystem {
  my ($s, $donothing) = @_;
  chomp $s;
  print "$s\n";
  if ( defined $donothing ) {
    return;
  }
  my $rv = system "$s > cmd.out";
  if ($? == -1) {
    die "failed to execute: $!\n";
  } elsif ($? & 127) {
    die sprintf "child died with signal %d, %s coredump\n",
      ($? & 127),  ($? & 128) ? 'with' : 'without';
  } elsif ($rv) {
    # system() returns the wait status; the exit code is in the high byte.
    my $status = $rv >> 8;
    die "$s exited with status $status\n";
  }
  `cat cmd.out`;
}

# return 4-tuples. the protocol is always tcp.
sub identify_tcp_flows {
  my $pcapfn = shift;

  my %flows;

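  # Read tcpdump's text output; the list form of open avoids shell quoting problems.
  open(my $fh, '-|', 'tcpdump', '-n', '-r', $pcapfn, 'tcp')
    or die "cannot run tcpdump on $pcapfn: $!\n";
  while (<$fh>) {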
    if ( m{
          \A
          (?<timestamp>
            \d{2} :
            \d{2} :
            \d{2} [\.] \d+
          ) \s+
          IP \s+
          (?<src_ip>
            \S+
          )
          [\.]
          (?<src_port>
            \d+
          ) \s+
          > \s+
          (?<dst_ip>
            \S+
          )
          [\.]
          (?<dst_port>
            \d+ | http
          ) : \s+
          Flags \s+
          [\[]
          (?<flags>
            [^\]]+
          )
          [\]] , \s+
          (
            seq \s+ \d+ , \s+
          )?
          (?<ack>
            ack \s+
            (?<ackbytes>
              \d+
            ) , \s+
          ) ?
        }xms
       ) {
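      # Record a flow only if its reverse direction has not been seen yet,
      # so each bidirectional connection gets a single entry.
      if ( ! exists $flows{"$+{dst_ip}:$+{dst_port}-$+{src_ip}:$+{src_port}"} ) {
        $flows{"$+{src_ip}:$+{src_port}-$+{dst_ip}:$+{dst_port}"} = {
          src_ip   => $+{src_ip},
          src_port => $+{src_port},
          dst_ip   => $+{dst_ip},
          dst_port => $+{dst_port},
        };
      }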
    } else {
#      warn "couldn't parse $_";
    }
  }
  \%flows
}

my $pcapfn = $ARGV[0];
my $r_h_flows = identify_tcp_flows $pcapfn;
# Dereference explicitly: calling keys on a hash reference no longer works in current Perl.
for my $f ( keys %$r_h_flows ) {
  mysystem "tcpdump -n -r ${pcapfn} -w $f.pcap \"tcp and host $$r_h_flows{$f}{src_ip} and host $$r_h_flows{$f}{dst_ip} and port $$r_h_flows{$f}{src_port} and port $$r_h_flows{$f}{dst_port} \"";
}
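
For readers who prefer Python, the two steps of the script (collecting unique flows from tcpdump's text output, then building one capture filter per flow) can be sketched as below. This is an illustration only: the function names are hypothetical, the regular expression is a simplified version of the Perl one, and it assumes tcpdump prints numeric ports. It builds the filter strings but does not run tcpdump.

```python
import re

# Matches the start of a tcpdump text line such as
# "12:00:00.000001 IP 10.0.0.1.1234 > 10.0.0.2.80: Flags [S], ..."
LINE_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}\.\d+\s+IP\s+"
    r"(?P<src_ip>\S+)\.(?P<src_port>\d+)\s+>\s+"
    r"(?P<dst_ip>\S+)\.(?P<dst_port>\d+):"
)


def parse_flow(line):
    """Return (src_ip, src_port, dst_ip, dst_port), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return (m["src_ip"], int(m["src_port"]), m["dst_ip"], int(m["dst_port"]))


def bpf_filter(flow):
    """Build the same style of capture filter that the Perl script passes to tcpdump."""
    src_ip, src_port, dst_ip, dst_port = flow
    return (f"tcp and host {src_ip} and host {dst_ip} "
            f"and port {src_port} and port {dst_port}")


def unique_flows(lines):
    """Map each flow to its filter, keeping one entry per bidirectional connection."""
    seen = {}
    for line in lines:
        flow = parse_flow(line)
        if flow is None:
            continue  # unparseable line, skipped like in the Perl script
        reverse = (flow[2], flow[3], flow[0], flow[1])
        if flow not in seen and reverse not in seen:
            seen[flow] = bpf_filter(flow)
    return seen
```

Each resulting filter string could then be passed to `tcpdump -n -r input.pcap -w output.pcap`, one run per flow, as the Perl script does.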

answered 14 Feb '13, 13:51

brucer42

edited 14 Feb '13, 14:14

You can use PcapSplitter, which is part of the PcapPlusPlus suite. It is cross-platform, so it runs on Windows, Linux and Mac OS X, and binary versions are available for several operating systems. It can process large pcap files containing a large number of streams (both TCP and UDP). Use it as follows:

./PcapSplitter -f /path/to/your/file.pcap -o /output/dir -m connection

answered 23 Jul '16, 12:45

seladb
