It's the "shift right" operator. See the pcap-filter(4) man page; look at the "expression" section, which describes the syntax of expressions like that. (The syntax is based on the C programming language syntax for expressions; a number of other languages use a similar syntax.)
But that doesn't explain how that filter expression works.
tcp[12:1]
fetches the value of the byte at an offset of 12 from the beginning of the TCP header. RFC 793 is the specification for TCP, and shows what the TCP header looks like. Section 3.1 of RFC 793 shows the header; the byte at an offset of 12 has, in the upper 4 bits, the "data offset", which indicates how long the TCP header is, in units of 4-byte words.
tcp[12:1] & 0xf0
clears the lower 4 bits (0xf0, in the filter's C-style syntax, is the hex value F0, which has the upper 4 bits set - F being 15, which is 1111 in binary - and the lower 4 bits clear; & is the "bitwise AND" operator, just as in C), so that's the "data offset" in the upper 4 bits.
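To see the mask on a concrete value, here is a small Python sketch. It is only an illustration, not part of the filter; the sample byte 0x58 is made up (a data offset of 5 in the upper 4 bits, with some other bits set in the lower 4).

```python
# Hypothetical value of byte 12 of a TCP header: data offset 5 in the
# upper 4 bits, some unrelated bits set in the lower 4 bits.
byte12 = 0x58

# Bitwise AND with 0xF0 keeps the upper 4 bits and clears the lower 4.
masked = byte12 & 0xF0

print(f"{byte12:08b} & {0xF0:08b} = {masked:08b}")
# prints: 01011000 & 11110000 = 01010000
```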
((tcp[12:1] & 0xf0) >> 2)
takes that value and shifts it right by 2 bits. To convert the value with the "data offset" and 4 bits of zero to a data offset in bytes, you would first shift it right 4 bits, to put the data offset in 4-byte words in the lower 4 bits, and then multiply the result by 4 to convert it from a count of 4-byte words to a count of bytes.
But multiplying by 4 is, for an unsigned value (such as the byte in question), equivalent to shifting left by 2, so that's shifting it right 4 bits and then left by 2 bits. Because the mask has already cleared the lower 4 bits, no set bits are lost along the way, so that's equivalent to a single shift right by 2 bits, and ((tcp[12:1] & 0xf0) >> 2)
is calculating the length of the TCP header, in bytes.
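The equivalence of the two forms can be checked in a few lines of Python. This is just a sketch to confirm the arithmetic; the function names are made up for the illustration.

```python
def header_len_two_steps(byte12: int) -> int:
    """Shift right 4 to get the count of 4-byte words, then multiply by 4."""
    return ((byte12 & 0xF0) >> 4) * 4


def header_len_one_shift(byte12: int) -> int:
    """The shortcut used in the filter: mask, then shift right by 2."""
    return (byte12 & 0xF0) >> 2


# The two forms agree for every possible value of the byte.
assert all(header_len_two_steps(b) == header_len_one_shift(b) for b in range(256))

print(header_len_one_shift(0x50))  # data offset 5 -> prints 20
print(header_len_one_shift(0x80))  # data offset 8 -> prints 32
```

A data offset of 5 (20 bytes) is a TCP header with no options.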
So tcp[((tcp[12:1] & 0xf0) >> 2):4]
fetches the 4 bytes at an offset of "length of TCP header, in bytes" - i.e., the first 4 bytes after the TCP header - as a big-endian number. If those 4 bytes are, in order, the ASCII character "G", the ASCII character "E", the ASCII character "T", and the ASCII character " " (a space) - i.e., the first 4 bytes of an HTTP "GET" request - those 4 bytes, as a big-endian number, would be 0x47455420, so that's checking whether the TCP payload begins with "GET ".
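As a rough illustration, the same check can be mimicked in Python on the raw bytes of a TCP segment. This is a sketch, not how libpcap implements it; the function name and the all-zero sample header are made up, and the segment is assumed to start at the first byte of the TCP header.

```python
def starts_with_get(segment: bytes) -> bool:
    """Mimic comparing tcp[((tcp[12:1] & 0xf0) >> 2):4] against 0x47455420."""
    if len(segment) < 13:
        return False  # byte 12 is not present
    header_len = (segment[12] & 0xF0) >> 2  # TCP header length in bytes
    word = segment[header_len:header_len + 4]
    if len(word) < 4:
        return False  # fewer than 4 payload bytes to compare
    # Interpret the 4 bytes as one big-endian number and compare.
    return int.from_bytes(word, "big") == 0x47455420


# Made-up 20-byte header: all zero except byte 12 (data offset 5).
header = bytearray(20)
header[12] = 0x50

print(hex(int.from_bytes(b"GET ", "big")))                        # prints 0x47455420
print(starts_with_get(bytes(header) + b"GET / HTTP/1.1\r\n"))   # prints True
print(starts_with_get(bytes(header) + b"POST / HTTP/1.1\r\n"))  # prints False
```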
answered 26 Nov '16, 19:12
Guy Harris ♦♦
impressive answer :-)
So basically it does this:
1111 0000 >> 2 becomes 0011 1100
the same as above but with hexadecimal numbers:
0xf0 >> 2 becomes 0x3C
Now I get it.
Thanks for your detailed response.