
tr: Illegal byte sequence

$ cat /tmp/foo
ÿs
$ tr -d '\r' < /tmp/foo
tr: Illegal byte sequence
Whoops? Let's take a closer look:
$ od -x /tmp/foo 
0000000      73ff    0a0d
0000004
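So, it's a 0xff byte (shown as "ÿ", which is what 0xff means in ISO-8859-1, but which is never valid in UTF-8), a small "s" (0x73), then a CR (0x0d, which I'm trying to remove) and a newline (0x0a) at the end. Note that od -x prints little-endian 16-bit words, so each pair of bytes appears swapped. Turns out tr on Mac OS X 10.6 refuses input bytes that are not valid in the current locale's character encoding (presumably UTF-8 here). Specifying a different locale seems to help: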
$ LC_CTYPE=C tr -d '\r' < /tmp/foo
ÿs
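
To check the result without relying on what the terminal renders, the steps above can be sketched as a small self-contained test. The helper name strip_cr is hypothetical, the input is recreated with printf from the four bytes in the od dump, and LC_ALL=C is used instead of LC_CTYPE=C on the assumption that LC_ALL might already be set in the environment (it would override LC_CTYPE):

```shell
# Hypothetical helper: delete CR bytes from stdin, byte by byte.
# LC_ALL=C makes tr treat every byte as a single character,
# so bytes that are invalid in UTF-8 (such as 0xff) pass through.
strip_cr() {
    LC_ALL=C tr -d '\r'
}

# Recreate the input from the dump: 0xff, "s", CR, LF.
# Dump the filtered result one byte at a time (-tx1), which avoids
# the byte swapping seen with od -x.
printf '\377s\r\n' | strip_cr | od -An -tx1
```

The dump should list the bytes ff, 73 and 0a, with the 0d gone. The quotes around '\r' matter: unquoted, the shell drops the backslash and tr is asked to delete the letter "r" instead.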

Trackbacks

s9y testdrive on : whois: Invalid charset for response

As if MacOS didn't have enough charset problems, here's another one: $ /usr/bin/whois denic.de % Error: 55000000013 Invalid charset for response Although the problem has been reported to DENIC years ago, they still send out UTF-8 data if the ha

Comments


Robin on :

Thanks. This has helped me clean up a massive 4Gb file!

Jon on :

Thanks very much :-) I18n strikes again!

ryan on :

thank you for posting this, christian!
