Hi, I'd like to know how to capture the Kafka producer buffer length (that is, whatever is passed to the socket and sent to the broker as a single chunk). I tried running the following on a machine with Kafka producers:
Then:
The output looks a bit weird. There are numerous lines with numbers like:
Can you confirm whether these numbers represent the real buffer size, or point me in the right direction? Thanks!
asked 13 Mar '17, 01:25 spektom
2 Answers:
It would help if you could provide an example pcap file (through CloudShark, Dropbox, Google Drive or any other file-sharing service). Looking at the Kafka protocol, I suspect there can be multiple values in one Kafka PDU, and one Kafka PDU can span multiple TCP packets. Because of reassembly, Wireshark (and tshark) gathers the whole PDU and then parses it. As there are multiple values in one PDU, there will also be multiple kafka.bytes_len fields, one for each value. What do you mean by "real buffer size"?
answered 13 Mar '17, 02:33 SYN-bit ♦♦
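To illustrate the reassembly point, here is a minimal Python sketch of how length-prefixed PDUs can be recovered from a TCP stream. It relies on the Kafka wire format prefixing every request and response with a 4-byte big-endian size. The helper names `split_kafka_frames` and `frame`, and the payloads, are made up for illustration; this is not how Wireshark implements its dissector.

```python
import struct


def split_kafka_frames(stream: bytes):
    """Split a byte stream into length-prefixed frames.

    Returns (frames, leftover): each frame is the payload that follows a
    4-byte big-endian size prefix; leftover is any incomplete tail.
    """
    frames = []
    offset = 0
    while len(stream) - offset >= 4:
        (size,) = struct.unpack_from(">i", stream, offset)
        if size < 0 or len(stream) - offset - 4 < size:
            break  # incomplete (or invalid) frame: wait for more TCP segments
        frames.append(stream[offset + 4 : offset + 4 + size])
        offset += 4 + size
    return frames, stream[offset:]


def frame(payload: bytes) -> bytes:
    """Hypothetical helper: prepend the 4-byte size prefix to a payload."""
    return struct.pack(">i", len(payload)) + payload


# Made-up payloads, cut into TCP-like segments at arbitrary offsets.
stream = frame(b"a" * 10) + frame(b"b" * 300)
segments = [stream[:7], stream[7:200], stream[200:]]

buffer = b""
sizes = []
for seg in segments:
    buffer += seg
    frames, buffer = split_kafka_frames(buffer)
    sizes.extend(len(f) for f in frames)

print(sizes)  # [10, 300]
```

The second PDU only becomes available once its last segment arrives, which is why length fields show up only on the frame that completes a PDU.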
The output appears correct: if you examine the attached capture with Wireshark, you will see that certain frames contain multiple instances of the field "kafka.bytes_len", with the values shown in the tshark output (see below). I suggest that you look at the Wireshark Kafka dissection to determine whether there is a field that gives you the information you want ("real buffer size"). (I'm not familiar with the Kafka protocol.)

Partial tshark output from your capture file

Notes:
- In the current version of Wireshark, the preference name is "kafka.tcp.ports".
- The "-d" option (decode as) is not needed, since you are specifying the port in the "-o" option.
- I added "-e frame.number".
- The "kafka.bytes_len" fields are shown only for the frames in which the complete Kafka PDU is reassembled. Again, see the Wireshark dissection.
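If it helps to work with such output programmatically, the following Python sketch groups the "kafka.bytes_len" values by frame. It assumes tshark was run with "-T fields -e frame.number -e kafka.bytes_len" and its default field formatting (tab between columns, comma between repeated occurrences of a field). The function name and the sample text are made up for illustration and are not taken from the capture.

```python
def bytes_len_per_frame(tshark_output: str) -> dict[int, list[int]]:
    """Map frame number -> list of kafka.bytes_len values on that frame.

    Frames with an empty second column (no reassembled Kafka PDU) are skipped.
    """
    result = {}
    for line in tshark_output.splitlines():
        if not line.strip():
            continue
        frame_no, _, values = line.partition("\t")
        lengths = [int(v) for v in values.split(",") if v.strip()]
        if lengths:
            result[int(frame_no)] = lengths
    return result


# Made-up sample in the assumed "frame.number<TAB>v1,v2,..." layout.
sample = "1\t\n2\t120,64,64\n3\t\n4\t512\n"

per_frame = bytes_len_per_frame(sample)
totals = {frame_no: sum(vals) for frame_no, vals in per_frame.items()}

print(per_frame)  # {2: [120, 64, 64], 4: [512]}
print(totals)     # {2: 248, 4: 512}
```

Note that the per-frame sum only adds up the individual byte-array values; it is not the size of the whole PDU.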
answered 13 Mar '17, 09:10 Bill Meier ♦♦ (edited 13 Mar '17, 09:17)
Thanks! I guess the closest I can get is (13 Mar '17, 22:40) spektom
Kafka aggregates data into a buffer (the size is determined by the batch.size and linger.ms parameters), then sends it to a broker. This is what I meant to capture: the size of the buffer that is sent to a broker. I've put a sample (1000 packets) here: https://www.cloudshark.org/captures/e92de4d1daf4
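As a rough illustration of how batch.size and linger.ms interact, here is a deliberately simplified Python model. It assumes a single batch that is flushed either when the next record would not fit or when the linger time has elapsed since the batch was opened. The function name and the record sizes are made up; the real producer batches per partition, may compress, and can put several batches into one request, so sizes seen on the wire will not match this model exactly.

```python
def simulate_batches(records, batch_size, linger_ms):
    """Return the sizes of the batches a simplified producer would flush.

    records: iterable of (arrival_ms, size_bytes), sorted by arrival time.
    """
    batches = []
    current = 0       # bytes in the open batch
    opened_at = None  # arrival time of the first record in the open batch

    def flush():
        nonlocal current, opened_at
        batches.append(current)
        current = 0
        opened_at = None

    for arrival_ms, size in records:
        # Linger expired before this record arrived: the batch was sent.
        if opened_at is not None and arrival_ms - opened_at >= linger_ms:
            flush()
        # Record does not fit into the open batch: send the batch first.
        if opened_at is not None and current + size > batch_size:
            flush()
        if opened_at is None:
            opened_at = arrival_ms
        current += size

    if opened_at is not None:
        flush()
    return batches


# Made-up records: (arrival time in ms, size in bytes).
records = [(0, 400), (1, 400), (2, 400), (3, 100), (50, 100)]
batches = simulate_batches(records, batch_size=1000, linger_ms=10)
print(batches)  # [800, 500, 100]
```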