Fuzzy uniq & colorized HTML diffs

The other day I came across a file full of these infamous "Alle Kinder..." jokes. But the file was rich in almost-duplicates:
$ grep Liter alle-kinder.txt 
Alle Kinder sind besoffen, nur nicht Dieter, der trinkt noch 'n Liter
Alle Kinder sind besoffen, nur nicht Dieter, der trinkt noch nen Liter
Alle Kinder sind besoffen. nur nicht Dieter, der trinkt noch 'n Liter.
So I could not just run the file through sort(1) and filter out the duplicates, because these were not exact duplicates. I needed something that would look for similar jokes in that file and filter them out, just like uniq(1) does for identical lines.
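
To make the idea a bit more concrete, here is a minimal sketch of a much cruder approach than real fuzzy matching: lines are compared by a normalized key (lowercased, with everything except a-z and 0-9 removed), and only the first line per key is kept. The function name fuzzy_uniq_sketch is made up for this illustration and has nothing to do with the script mentioned below.

```shell
# Crude stand-in for a fuzzy uniq(1): compare lines by a normalized key
# and print only the first line seen for each key.
# fuzzy_uniq_sketch is a hypothetical name used for illustration only.
fuzzy_uniq_sketch() {
    awk '{
        key = tolower($0)              # ignore case
        gsub(/[^a-z0-9]/, "", key)     # drop punctuation, spaces, everything else
        if (!(key in seen)) {          # first time this key shows up
            seen[key] = 1
            print                      # print the original, unmodified line
        }
    }'
}
```

Called as `fuzzy_uniq_sketch < alle-kinder.txt`, this would merge variants that differ only in punctuation or case. It would still treat "'n Liter" and "nen Liter" as different jokes, which is exactly why a real similarity measure is needed.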

Luckily the internet is here to help, as always, and I came across this fantastic script: a fuzzy version of uniq(1). After running the file through the script, only one of the three occurrences of the same joke is left:
$ alle-kinder.txt | grep Liter
Alle Kinder sind besoffen, nur nicht Dieter, der trinkt noch 'n Liter.
Great! Oh, but then it'd be interesting to see which entries got kicked by the fuzzy uniq script. Sure, diff(1) could do that. But for some reason I wanted diff's output in color. Hm, ColorDiff? But what if I wanted the output to be HTML too? Don't ask what gave me that idea but it's nice to know that other people are equally crazy and put up a bash script to convert diff output into colorized HTML. Yeah, you got that right:
$ diff -u alle-kinder.txt alle-kinder_fuzzy.txt | | tidy > diff.html
And out comes something like this :-)
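
For those who only want the gist of the diff-to-HTML idea, it can be sketched in a few lines of shell. This is not the script used above, just a rough approximation under my own assumptions: the function name diff2html_sketch and the colors are invented, and the input is expected to be unified diff output on stdin.

```shell
# Rough sketch: turn unified diff output (stdin) into colorized HTML (stdout).
# diff2html_sketch and the chosen colors are assumptions, not the real script.
diff2html_sketch() {
    printf '<pre>\n'
    # Escape HTML special characters first, then wrap lines by their prefix.
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' |
    awk '
        /^\+/ { printf "<span style=\"color:green\">%s</span>\n", $0; next }  # added (and +++ header)
        /^-/  { printf "<span style=\"color:red\">%s</span>\n", $0; next }    # removed (and --- header)
        /^@@/ { printf "<span style=\"color:purple\">%s</span>\n", $0; next } # hunk header
        { print }                                                            # context line
    '
    printf '</pre>\n'
}
```

Usage would look like `diff -u alle-kinder.txt alle-kinder_fuzzy.txt | diff2html_sketch > diff.html`. The output is only a `<pre>` fragment, so running it through tidy afterwards still makes sense.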


With April Fool's Day coming closer, it's time for yet another Upside-Down-Ternet howto - only this time with OpenWrt redirecting to an external Squid proxy. The setup in short:
  • Install Squid3, with the following settings in squid.conf:
      acl localnet src
      http_access allow localnet
      http_port 3128 intercept
      url_rewrite_program /usr/local/bin/
  • The /usr/local/bin/ does the actual work and turns the images upside down. There are a lot of other scripts to choose from :-)

  • Configure your local webserver, so that the URL from can be served. Also, one must take care that permissions are set correctly:
      mkdir -m2750 /var/www/ternet
      chown proxy:www-data /var/www/ternet
    This way, the Squid proxy running as user "proxy" can write to the directory, while the webserver, running as user "www-data", can read from it.

  • Since there's OpenWrt running on our gateway, we have all the iptables power we need to redirect traffic to our Squid proxy:
     iptables -t nat -A prerouting_rule \
         -i $IFACE ! -s $PROXY -p tcp --dport 80 -j DNAT --to $PROXY:$PROXY_PORT
     iptables -t nat -A postrouting_rule \
         -o $IFACE -s $SRC -d $PROXY -j SNAT --to $ROUTER
     iptables -A forwarding_rule \
         -i $IFACE -o $IFACE -s $SRC -d $PROXY -p tcp --dport $PROXY_PORT -j ACCEPT
    Note: We're using the internal OpenWrt chains here, instead of the predefined PREROUTING, POSTROUTING, FORWARD chains. This way our rules actually get inserted rather than appended to any existing rules.
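
The variables in the rules above are not defined anywhere in this post, so here is a small sketch with made-up values to show how they fit together. All addresses and the interface name are assumptions for a typical home LAN; only the port 3128 comes from the squid.conf settings. The function prints the rules instead of running them, so they can be checked before anything touches the firewall.

```shell
# Hypothetical example values; adjust to the actual network.
IFACE=br-lan            # LAN-side interface on the OpenWrt router (assumed)
SRC=192.168.1.0/24      # client network to be redirected (assumed)
PROXY=192.168.1.10      # host running Squid (assumed)
PROXY_PORT=3128         # matches http_port in squid.conf
ROUTER=192.168.1.1      # LAN address of the router (assumed)

# Print the three rules rather than executing them (dry run).
print_ternet_rules() {
    echo "iptables -t nat -A prerouting_rule -i $IFACE ! -s $PROXY -p tcp --dport 80 -j DNAT --to $PROXY:$PROXY_PORT"
    echo "iptables -t nat -A postrouting_rule -o $IFACE -s $SRC -d $PROXY -j SNAT --to $ROUTER"
    echo "iptables -A forwarding_rule -i $IFACE -o $IFACE -s $SRC -d $PROXY -p tcp --dport $PROXY_PORT -j ACCEPT"
}
```

Running `print_ternet_rules` on the router shows the fully expanded commands; piping the output to sh would apply them. The SNAT rule is there because clients and proxy sit in the same network: without it the proxy would answer the clients directly, and they would drop replies coming from an address they never talked to.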

mysqldump: Incorrect key file for table

Today, mysqldump complained about:
mysqldump: Couldn't execute 'SELECT /*!40001 SQL_NO_CACHE */ * FROM `COLUMNS`': 
        Incorrect key file for table '/var/run/mysqld/#sql_2935_0.MYI'; 
        try to repair it (126)
mysqldump: Got error: 126: Incorrect key file for table '/var/run/mysqld/#sql_2935_0.MYI'; 
        try to repair it when retrieving data from server
Well, the error message is pretty clear - but there's no table in /var/run/mysqld. In fact, no database should reside in this directory! The server's my.cnf has:
  datadir         = /var/lib/mysql
  tmpdir          = /var/run/mysqld
So, what was going on here? Of course, the internet was there to help :-) Because /var/run was mounted as tmpfs and only 10MB in size, the temporary table the server wrote to its tmpdir during the dump appears to have outgrown this space. Resizing /var/run helped:
$ mount -o remount,size=134217728 /var/run                # 128M
Watching /var/run during the next mysqldump run shows how much memory is needed for it to complete:
varrun                128M  2.2M  126M   2% /var/run
varrun                128M  3.8M  125M   3% /var/run
varrun                128M  4.3M  124M   4% /var/run
varrun                128M  5.9M  123M   5% /var/run
varrun                128M  7.6M  121M   6% /var/run
varrun                128M  8.6M  120M   7% /var/run
varrun                128M   11M  118M   8% /var/run
varrun                128M   12M  117M   9% /var/run
varrun                128M   13M  116M  10% /var/run
varrun                128M  1.3M  127M   2% /var/run
All databases combined are only ~750MB in size, but resizing /var/run a bit might have been long overdue anyway :-)
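
The two steps above (computing the size in bytes and watching the mount point) can be sketched as a pair of small shell helpers. The function names mib_to_bytes and watch_usage are made up for this example, and the 5 second interval in the usage note is just a guess at something sensible.

```shell
# Convert a size in MiB to bytes, e.g. for the size= mount option.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print the df line for a mount point a fixed number of times.
# Arguments: mount point, interval in seconds, number of samples.
watch_usage() {
    i=0
    while [ "$i" -lt "$3" ]; do
        df -hP "$1" | tail -n 1    # -P keeps each filesystem on a single line
        sleep "$2"
        i=$((i + 1))
    done
}
```

With these, the remount becomes `mount -o remount,size=$(mib_to_bytes 128) /var/run`, and `watch_usage /var/run 5 60` prints the usage every 5 seconds for 5 minutes while mysqldump is running in another terminal.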

openSUSE dependency madness

No, I'm not bored - just curious what the other camps are up to. This time: openSUSE 12.1. Of course, after a minimal installation a few things are missing. Manpages, for example. But what the hell is this:
   $ zypper install man
   The following NEW packages are going to be installed:
     cups-libs fontconfig ghostscript-fonts-other ghostscript-fonts-std 
     ghostscript-library groff groff-devx lcms libfreetype6 libgimpprint
     libjpeg62 liblcms1 libpng14-14 libtiff3 man 

   The following recommended packages were automatically selected:
     ghostscript-fonts-other ghostscript-library 

   15 new packages to install.
   Overall download size: 14.9 MiB. After the operation, additional 67.8 MiB will be used.
15 new packages, almost 70 MB for a bunch of text files. And why would I need cups-libs or libjpeg62? Luckily, zypper too can be taught not to install recommended packages:
   $ zypper install --no-recommends man
   The following NEW packages are going to be installed:
     groff man 

   2 new packages to install.
   Overall download size: 2.4 MiB. After the operation, additional 10.0 MiB will be used.
Much better :-) There's an installRecommends switch in /etc/zypp/zypper.conf, but this was not honored by zypper 1.6.18 :-\
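
Since the config switch did not work for me, a tiny shell wrapper is one way to avoid typing the option every time. This is only a sketch; the name zin is made up, and it simply forwards all arguments to zypper install.

```shell
# Hypothetical wrapper: always pass --no-recommends to "zypper install".
# All arguments (package names, further options) are forwarded unchanged.
zin() {
    zypper install --no-recommends "$@"
}
```

Dropped into ~/.bashrc, `zin man` would then behave like the second command above.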

Sadly, this does not help with all packages:
   $ zypper install --no-recommends nginx
   The following NEW packages are going to be installed:
     fontconfig gd libfreetype6 libGeoIP1 libjpeg62 libpng14-14 libxslt1 nginx-1.0 
     xorg-x11-libICE xorg-x11-libSM xorg-x11-libX11 xorg-x11-libXau 
     xorg-x11-libxcb xorg-x11-libXext xorg-x11-libXpm xorg-x11-libXt 

   16 new packages to install.
   Overall download size: 3.0 MiB. After the operation, additional 11.8 MiB will be used.
Really? openSUSE has bundled a webserver with Xorg libraries? Sigh... I can already smell the b0rkage when the next update is due and zypper is trying to untangle all the dependencies for every friggin' package ever installed because of insane dependencies like this.

Oh, and while we're at it: you might want to remove patterns-openSUSE-minimal_base-conflicts, otherwise lots of packages won't install.

No sound in MacOS 10.7?

Well, now that more and more fanboys have upgraded to 10.7, the forums are full of it: sometimes it happens that sound stops working in MacOS Lion.
As it turns out, restarting coreaudiod is all it takes to get it working again:
$ launchctl list | grep audio
7159    -

$ launchctl stop

$ sudo launchctl list | grep audio
7589    -
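
The changed number in the first column is the PID, which is how one can tell that the daemon really got restarted. A small sketch to pull that column out of the launchctl list output, so the before and after values can be compared; the function name job_pid is invented for this example.

```shell
# Read "launchctl list" output on stdin (columns: PID, status, label) and
# print the PID of every line matching the given pattern.
# job_pid is a hypothetical helper name.
job_pid() {
    awk -v pat="$1" '$0 ~ pat { print $1 }'
}
```

For example, `sudo launchctl list | job_pid audio` run before and after the restart should print two different PIDs.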