This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

How much data was sent by cloud webservice?

Hi,

I am running an HTTP RESTful data API using a cloud-based solution. I want to verify the data usage, i.e. traffic out of the cloud (I am only charged for data sent out from the cloud; there is no charge for data going into the cloud).

The reason I am doing this is that I do not trust the billing data from the cloud provider.

So my plan is to use dumpcap to capture all traffic and then analyse the logfiles for a 24h period (the scary bit is that I have four instances serving data and each generates ~15GB of dumpcap logfiles per 24h). This is my capture command:

D:\Progra~1\Wireshark\dumpcap -i \Device\NPF_{I/F DETAILS REMOVED} -b filesize:256000 -b files:600 -w c:\Wireshark\packets.cap

Apart from the quantities of data I am not quite sure how to filter out and only get the traffic going out and in the end get the total GB of data.

Any advice will be welcome!

asked 08 Aug '13, 04:24

APIshark


One Answer:

Apart from the quantities of data I am not quite sure how to filter out and only get the traffic going out and in the end get the total GB of data.

First of all, I'm not sure if Wireshark is the ideal tool for such an analysis (amount of data, etc.).

I would probably go a different route if I had to analyze the traffic. Maybe I would enable SNMP on the servers and poll the interface stats (in/out packets and bytes) for a first, rough overview of the amount of data. The Windows Perfmon tool may also be useful.

Anyway, if you want to do it with Wireshark, you can do it like this.

Do not record the full frame: that generates too much data, and you need only the length of each frame, not its payload. Hence, please call dumpcap with the option -s 100 (snap length). This records only the first 100 bytes of each frame (the IP/TCP/UDP headers plus some payload data, just in case), while the capture file still stores each frame's original length.

After you have finished the capture process, you need a script that walks through all file names and calls tshark with each file. Please use your preferred scripting tool on Windows (Perl, Python, shell, PowerShell, etc.).

tshark -nr input_0001.pcap -q -z conv,ip > output_0001.txt
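As a rough illustration of such a wrapper script, here is a minimal Python sketch. It assumes tshark is on the PATH and that the capture files match `*.cap`. The function names and directory arguments are made up for this example, and only the tshark options come from the command above.

```python
import subprocess
from pathlib import Path


def tshark_command(pcap_path):
    """Build the tshark call shown above for one capture file."""
    return ["tshark", "-nr", str(pcap_path), "-q", "-z", "conv,ip"]


def report_path_for(pcap_path, report_dir):
    """Map a capture file name to a .txt report name in report_dir."""
    return Path(report_dir) / (Path(pcap_path).stem + ".txt")


def run_all(capture_dir, report_dir, pattern="*.cap"):
    """Run tshark once per capture file and save each report as text.

    capture_dir, report_dir and pattern are hypothetical parameters;
    adjust them to wherever dumpcap wrote its files.
    """
    Path(report_dir).mkdir(parents=True, exist_ok=True)
    for pcap in sorted(Path(capture_dir).glob(pattern)):
        result = subprocess.run(
            tshark_command(pcap), capture_output=True, text=True, check=True
        )
        report_path_for(pcap, report_dir).write_text(result.stdout)
```

Calling something like `run_all(r"c:\Wireshark", r"c:\Wireshark\reports")` would then produce one text report per capture file, ready for the parsing step.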

The conv module will print the number of packets and bytes in either direction for each IP conversation. Save this output for each pcap file and then parse the output files with a second script to extract the total amount of data sent by your server(s).

Regards
Kurt

answered 08 Aug '13, 05:48

Kurt Knochner ♦

Hi Kurt,

Thank you for the very helpful answer!

I have collected the data and done what you asked me to do with tshark. I get the following output (I have left out the last 3 columns and replaced some IP info with x):

                           |       <-      | |       ->      | |
                           | Frames  Bytes | | Frames  Bytes | |

168.xx.xx.xx <-> 10.xxx.xxx.26 20418 162450590 168380 229742169
... ...

So the 10.xxx.xxx.26 address is the web service IP from where I want to calculate outgoing data (it sends data to the user) and it appears in both the IP address columns. In the 1st column I look for the .26 address and wherever it appears I sum the data in the 4th (->) column. Similarly, wherever it appears in the 2nd column I sum the data in the 3rd (<-) column.

Can you please confirm this is correct?

However, since this is a Cloud installation there are several internal private addresses (10.) and a couple of public addresses where traffic is sent to. None of these count as external chargeable traffic so I am excluding all those by subtracting that data. For example, the internal address 10.119.242.129 is receiving traffic from the .26 (web service), so in the 1st column I look for the 10.119.242.129 address and wherever it appears I take the data in the 3rd (<-) column and subtract it from the data I calculated going out (first paragraph above). Similarly, wherever it appears in the 2nd column I take the data in the 4th (->) column.
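The accounting described above could be sketched in Python roughly as follows. This is only an illustration, not the script actually used: all addresses and byte counts are made-up placeholders, the 10.0.0.0/8 test stands in for "internal private addresses (10.)", and the sketch skips non-chargeable peers while summing, which gives the same result as summing everything and subtracting them afterwards.

```python
import ipaddress

PRIVATE_NET = ipaddress.ip_network("10.0.0.0/8")


def make_internal_check(extra_excluded=()):
    """Return a test for peers whose traffic is not chargeable."""
    excluded = {ipaddress.ip_address(a) for a in extra_excluded}

    def is_internal(addr):
        ip = ipaddress.ip_address(addr)
        return ip in PRIVATE_NET or ip in excluded

    return is_internal


def chargeable_bytes(rows, server, is_internal):
    """Sum bytes sent by `server` to peers that are not internal.

    rows: iterable of (addr_a, addr_b, bytes_b_to_a, bytes_a_to_b),
    i.e. the "<-" and "->" byte columns of one conversation line.
    """
    total = 0
    for addr_a, addr_b, bytes_b_to_a, bytes_a_to_b in rows:
        if addr_a == server:        # server in 1st column: use "->"
            peer, sent = addr_b, bytes_a_to_b
        elif addr_b == server:      # server in 2nd column: use "<-"
            peer, sent = addr_a, bytes_b_to_a
        else:
            continue
        if not is_internal(peer):
            total += sent
    return total


# Hypothetical data: 10.0.0.26 plays the role of the web service.
rows = [
    ("198.51.100.7", "10.0.0.26", 1000, 5000),
    ("10.0.0.26", "10.0.0.129", 300, 700),    # internal peer, skipped
    ("10.0.0.26", "203.0.113.9", 50, 400),    # excluded public peer
    ("10.0.0.26", "198.51.100.8", 20, 250),
]
check = make_internal_check(extra_excluded=["203.0.113.9"])
print(chargeable_bytes(rows, "10.0.0.26", check))
```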

The result is very close to what I expected.

Many thanks! Anders.

(15 Aug '13, 06:05) APIshark

Can you please confirm this is correct?

The output of conv,ip looks like this (as you have shown in your example).

                               |       <-      | |       ->      | |     Total     |    Relative    |   Duration   |
                               | Frames  Bytes | | Frames  Bytes | | Frames  Bytes |      Start     |              |
10.1.1.1      <-> 20.1.1.1         150  783843       100  121272       250  905115    0,000000000           9,9971

So, there are 121272 bytes from 10.1.1.1 -> 20.1.1.1 and 783843 bytes from 20.1.1.1 -> 10.1.1.1. For that conversation, there is a total of 905115 bytes (in any direction).
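To make that reading of the columns concrete, here is a small Python sketch that parses one conversation line. It assumes the layout shown in this thread (the "<-" columns first, then "->", with plain integer counts). Other tshark versions may format these columns differently, so treat the regular expression as an assumption to check against your own output.

```python
import re

# addr_a <-> addr_b, then frames/bytes for "<-" and frames/bytes for "->"
ROW = re.compile(
    r"^\s*(\S+)\s+<->\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\b"
)


def parse_conv_row(line):
    """Return a dict for one conversation line, or None for other lines."""
    match = ROW.match(line)
    if match is None:
        return None  # header, separator or blank line
    addr_a, addr_b = match.group(1), match.group(2)
    frames_in, bytes_in, frames_out, bytes_out = (
        int(value) for value in match.groups()[2:]
    )
    return {
        "addr_a": addr_a,
        "addr_b": addr_b,
        "bytes_b_to_a": bytes_in,    # "<-" column
        "bytes_a_to_b": bytes_out,   # "->" column
        "frames_b_to_a": frames_in,
        "frames_a_to_b": frames_out,
    }


line = ("10.1.1.1            <-> 20.1.1.1                150    783843"
        "       100         121272     250    905115"
        "     0,000000000         9,9971")
row = parse_conv_row(line)
print(row["bytes_a_to_b"], row["bytes_b_to_a"])
```

For the example line this gives 121272 bytes from 10.1.1.1 to 20.1.1.1 and 783843 bytes in the other direction, matching the explanation above.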

None of these count as external chargeable traffic so I am excluding all those by subtracting that data.

correct.

The result is very close to what I expected.

good.

(15 Aug '13, 07:18) Kurt Knochner ♦

Thanks Again! This has been a very useful and interesting exercise.

(16 Aug '13, 03:15) APIshark

If an answer has solved your issue, please accept the answer for the benefit of other users by clicking the checkmark icon next to the answer. Please read the FAQ for more information.

(16 Aug '13, 03:19) grahamb ♦