Monday, January 18, 2010

hexBinary Encoding

Recently, I hacked together a wrapper script for reporting job statuses to Hudson. The XML API in Hudson called for "hexBinary" encoded data. I hadn't heard of this before, and couldn't find much in the way of decent examples on Teh Interwebs. From the spec, it turns out to be pretty simple: for each byte in your data, write out its two-character hex value. So if your byte has decimal value 223, write out its hex string: "DF" (the XML Schema spec accepts upper- or lowercase digits). (Aside: this seems like a silly encoding, at least space-wise, since it doubles the size of the data: why not the ubiquitous base64?) I wanted a simple shell script, so the issue was how to do this encoding without pulling in a full-blown scripting language. Fortunately, hexdump has format strings. Unfortunately, its docs aren't great.
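To make the byte-to-hex rule concrete, here's a minimal sketch using the shell's own printf. The function name byte_to_hex is made up for illustration; it takes a decimal byte value and prints the two hex digits.

```shell
# byte_to_hex is a hypothetical helper name, not part of any tool.
# It prints a decimal byte value (0-255) as two uppercase hex digits.
byte_to_hex() {
    printf '%02X' "$1"
}

byte_to_hex 223; echo    # DF
byte_to_hex 10; echo     # 0A
```

The 0 in %02X matters: without the zero padding, small values like 10 would come out as a single digit and the encoded string would no longer be two characters per byte.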

Example of hexBinary encoding using hexdump:

echo "Hello world" | hexdump -v -e '1/1 "%02x"'
48656c6c6f20776f726c640a

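So what the hell is that? -v tells hexdump not to collapse runs of identical output lines into a "*", and -e supplies the format string. hexdump is very particular about the formatting of the -e argument, so be careful with the quotes: the whole argument goes in single quotes, and the printf-style part goes in double quotes inside them. The 1/1 means: apply the following format 1 time, consuming 1 byte each time. The man page makes this sound like the default behaviour, but it isn't: the iteration count does default to 1, but the byte count for %x defaults to 4, so the /1 is not optional. Plain /1 also works, but the 1/1 is very very slightly more readable, IMO. The "%02x" is just a standard-issue printf-style format code. One gotcha: the trailing 0a in the output above is the newline that echo appends, so use printf instead of echo if you don't want it encoded.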
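For use in a script, the one-liner can be wrapped in a small function. This is just a sketch: hex_encode is a name I've picked for illustration, and it assumes hexdump is on the PATH.

```shell
# hex_encode is a hypothetical helper name.
# Reads stdin and writes it as lowercase hex, two digits per byte.
hex_encode() {
    hexdump -v -e '1/1 "%02x"'
}

# printf '%s' avoids the trailing newline that echo would add,
# so no 0a shows up at the end of the output.
printf '%s' "Hello world" | hex_encode; echo
# 48656c6c6f20776f726c64
```

hexdump itself prints no newline after the hex digits, hence the extra echo when running it interactively. Inside command substitution, as in encoded=$(printf '%s' "$data" | hex_encode), it isn't needed.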
