This is a static archive of our old Q&A Site. Please post any new questions and answers at ask.wireshark.org.

Search for unicode string (UTF-16)

When a frame contains a UTF-16 string such as "select", Wireshark displays it as "s e l e c t". The gap between the characters is the null byte \x00, not a space character.

If I use the display filter:

 frame contains "s e l e c t"

it matches no packets.

So I have to convert the string "select" to hexadecimal manually and run the display filter:

frame contains 73:00:65:00:6c:00:65:00:63:00:74:00

and it's working.
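
For anyone wondering why the first filter fails and the second works, here is a minimal Python sketch of the two byte sequences. It assumes the capture holds "select" as UTF-16LE (as the working hex filter suggests); the helper name to_hex is only for illustration.

```python
def to_hex(data: bytes) -> str:
    """Format bytes as colon-separated hex, the way they appear in the filter."""
    return ":".join(f"{b:02x}" for b in data)

# Bytes of "select" encoded as UTF-16LE (assumed to be what is in the frame)
wire = "select".encode("utf-16-le")

# Bytes of the typed string "s e l e c t", where each gap is a real space (0x20)
typed = "s e l e c t".encode("ascii")

print(to_hex(wire))   # each character is followed by 00
print(to_hex(typed))  # characters are separated by 20
print(typed in wire)  # False: 0x20 never occurs in the UTF-16LE bytes
```

The typed string carries 0x20 between the letters while the frame carries 0x00, so a byte-wise "contains" comparison cannot match.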

I also tried the Find tool (in the toolbar) with Wide (UTF-16) selected and entered "s e l e c t", but it couldn't find the string.

I use Wireshark v2.2.0.

My questions:

  • Is there a simple way to filter for a Unicode string directly, instead of converting the string to a hex string?
  • What should I enter in the Find tool, with Wide (UTF-16) selected, to search for the ASCII text "select" encoded as a Unicode string?

asked 15 Sep '16, 13:39

M-Hassan


One Answer:

That's Unicode in a UTF-16 encoding, i.e. 2 bytes per code unit.

Leave the character encoding selector set to Narrow & Wide and just enter your string with the required characters, i.e. "select". This will search for both UTF-8/ASCII and UTF-16 encodings of the required string.

If you really want to find just UTF-16 encodings of the string, set the encoding selector to Wide, but still just enter the actual characters required.

answered 16 Sep '16, 02:35

grahamb ♦

Thanks. It works and finds the string, but Wireshark highlights the last half of the string; see my sample at http://imgur.com/qJtub28. Is this normal? What are the actual characters to enter for UTF-16?

(16 Sep '16, 04:37) M-Hassan

That wasn't the original question though :-)

Looks like a bug to me when restricting the search to the packet bytes pane, please raise an entry on the Wireshark Bugzilla.

If an answer has solved your issue, please accept the answer for the benefit of other users by clicking the checkmark icon next to the answer. Please read the FAQ for more information.

(16 Sep '16, 05:10) grahamb ♦

What is the UTF-16 form of the string "select" to enter when I set the character encoding selector to Wide (UTF-16), so that I can use the display filter {frame contains "the UTF-16 of the string select"}?

(16 Sep '16, 05:56) M-Hassan

If the data you require is dissected in a field, you can just use the appropriate protocol.field == "mystring" filter.

If the data isn't in a field, as in your example of a TDS "select" statement, then you'll have to manually convert the string to the appropriate UTF-x equivalent, e.g. "select" encoded as UTF-16LE is 73:00:65:00:6c:00:65:00:63:00:74:00 and this would be used as frame contains 73:00:65:00:6c:00:65:00:63:00:74:00.

There is an online converter here, and you likely want to convert from text to UTF-16LE (as used by Windows systems).
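
If you would rather not rely on an online converter, the same conversion can be sketched in a few lines of Python. This is only an illustration: the function name utf16le_contains_filter and its field parameter are made up here, and it assumes the data is UTF-16LE as described above.

```python
def utf16le_contains_filter(text: str, field: str = "frame") -> str:
    """Build a 'contains' display filter matching text encoded as UTF-16LE.

    Hypothetical helper for illustration; it is not part of Wireshark.
    """
    data = text.encode("utf-16-le")                  # 2 bytes per code unit, little-endian
    hex_bytes = ":".join(f"{b:02x}" for b in data)   # colon-separated byte sequence
    return f"{field} contains {hex_bytes}"

print(utf16le_contains_filter("select"))
# frame contains 73:00:65:00:6c:00:65:00:63:00:74:00
```

Paste the printed line into the display filter bar. For big-endian data, "utf-16-be" would be the encoding name to use instead.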

(16 Sep '16, 06:45) grahamb ♦

Thanks for the help. +10 for the UTF-16LE converter tool.

(16 Sep '16, 07:13) M-Hassan
(17 Sep '16, 07:05) M-Hassan