Skip to content

Ext4 on MacOS X

With the new Raspberry Pi 3 Model B at hand and Raspbian already running, I wanted to see if the AArch64 port of Arch Linux would run as well. As I didn't have a real computer available at that time, I tried to get the image on the microSD card on MacOS .

First, let's unmount (but not eject) the microSD card:
$ diskutil umountDisk disk2
Unmount of all volumes on disk2 was successful
Create two partitions on the device:
$ sudo fdisk -e /dev/rdisk2
fdisk: 1> erase
fdisk:*1> edit 1
Partition id ('0' to disable)  [0 - FF]: [0] (? for help) 0B
Do you wish to edit in CHS mode? [n] 
Partition offset [0 - 31116288]: [63] 
Partition size [1 - 31116225]: [31116225] 204800

fdisk:*1> edit 2
Partition id ('0' to disable)  [0 - FF]: [0] (? for help) 83
Do you wish to edit in CHS mode? [n] 
Partition offset [0 - 31116288]: [204863] 
Partition size [1 - 30911425]: [30911425] 

fdisk:*1> p
Disk: /dev/rdisk2       geometry: 1936/255/63 [31116288 sectors]
Offset: 0       Signature: 0xAA55
         Starting       Ending
 #: id  cyl  hd sec -  cyl  hd sec [     start -       size]
------------------------------------------------------------------------
 1: 0B    0   1   1 - 1023 254  63 [        63 -     204800] Win95 FAT-32
 2: 83 1023 254  63 - 1023 254  63 [    204863 -   30911425] Linux files*
 3: 00    0   0   0 -    0   0   0 [         0 -          0] unused      
 4: 00    0   0   0 -    0   0   0 [         0 -          0] unused      
fdisk:*1> write
Writing MBR at offset 0.
fdisk: 1> quit
Create a file system on each partition (we'll need e2fsprogs to create an ext4 file system):
$ sudo newfs_msdos -v boot /dev/rdisk2s1
$ sudo /opt/local/sbin/mkfs.ext4 /dev/rdisk2s2 
As MacOS is able to read FAT-32, we should be able to mount it right away:
$ diskutil mount disk2s1
Volume BOOT on disk2s1 mounted

$ df -h /Volumes/BOOT
Filesystem     Size   Used  Avail Capacity  Mounted on
/dev/disk2s1  100Mi  762Ki   99Mi     1%    /Volumes/BOOT
Mounting a ext4 file systems turned out to be more difficult and there are several solutions available.

ext2fuse

ext2fuse is said to provide ext2/ext3 support via FUSE, but it segfaults on our newly created ext4 file system:
$ sudo /opt/local/bin/ext2fuse /dev/disk2s2 /mnt/disk
/dev/disk2s2 is to be mounted at /mnt/disk
fuse-ext2fs: Filesystem has unsupported feature(s) while trying to open /dev/disk2s2
Segmentation fault: 11

$ mount | tail -1
/dev/disk2s2 on /mnt/disk (osxfuse, synchronous)

$ df -h /mnt/disk
Filesystem     Size   Used  Avail Capacity  Mounted on
/dev/disk2s2    0Bi    0Bi    0Bi   100%    /mnt/disk

$ touch /mnt/disk/foo
touch: /mnt/disk/foo: Device not configured
Maybe ext4 is just too new for ext2fuse, let's try with ext2 instead:
$ sudo /opt/local/sbin/mkfs.ext2 /dev/rdisk2s2
$ sudo /opt/local/bin/ext2fuse /dev/disk2s2 /mnt/disk
/dev/disk2s2 is to be mounted at /mnt/disk
fuse-ext2 initialized for device: /dev/disk2s2
block size is 4096
ext2fuse_dbg_msg: File not found by ext2_lookup while looking up "DCIM"
ext2fuse_dbg_msg: File not found by ext2_lookup while looking up "VSCAN"
ext2fuse_dbg_msg: File not found by ext2_lookup while looking up "DCIM"
ext2fuse_dbg_msg: File not found by ext2_lookup while looking up ".Spotlight-V100"
ext2fuse_dbg_msg: File not found by ext2_lookup while looking up ".metadata_never_index"
[...]
This command never completes but could be terminated with ^C. The same happens with an ext3 file system.

ext4fuse

ext4fuse aims for ext4 support via FUSE, let see how that goes:
$ sudo /opt/local/sbin/mkfs.ext4 /dev/rdisk2s2
$ sudo /opt/local/bin/ext4fuse /dev/disk2s2 /mnt/disk 
$ mount | tail -1
ext4fuse@osxfuse0 on /mnt/disk (osxfuse, synchronous)

$ df -h /mnt/disk
Filesystem          Size   Used  Avail Capacity  Mounted on
ext4fuse@osxfuse0    0Bi    0Bi    0Bi   100%    /mnt/disk

$ sudo touch /mnt/disk/foo
touch: /mnt/disk/foo: Function not implemented
So close! :-) But there's no write support for ext4fuse yet.

fuse-ext2

There's another option, called fuse-ext2 which appears to feature (experimental) write support. We'll need FUSE for macOS again and then build fuse-ext2 from scratch:
$ sudo port install e2fsprogs
$ git clone https://github.com/alperakcan/fuse-ext2.git fuse-ext2-git
$ cd $_
$ ./autogen.sh && LDFLAGS="-L/opt/local/lib" CFLAGS="-I/opt/local/include" \
    ./configure --prefix=/opt/fuse-ext2
$ make && sudo make install
So, let's try:
$ sudo /opt/fuse-ext2/bin/fuse-ext2 /dev/rdisk2s2 /mnt/disk -o rw+
Rats - a window pops up with:
FUSE-EXT2 could not mount /dev/disk2s2
at /mnt/disk/ because the following problem occurred:
But the error description is empty, and there's nothing in the syslog too. After some digging I decided to reboot and this time it worked:
$ sudo /opt/fuse-ext2/bin/fuse-ext2 /dev/rdisk2s2 /mnt/disk -o rw+
$ mount | tail -1
/dev/rdisk2s2 on /mnt/disk (osxfuse_ext2, local, synchronous)

$ df -h /mnt/disk/
Filesystem      Size   Used  Avail Capacity  Mounted on
/dev/rdisk2s2   15Gi  104Mi   14Gi     1%    /mnt/disk

$ sudo touch /mnt/disk/foo
$ ls -l /mnt/disk/foo
-rw-r--r--  1 root  wheel  0 Mar  5 14:29 /mnt/disk/foo
That should be enough for us to finally install the ArchLinux image on that microSD card:
$ tar -C /Volumes/BOOT/ -xzf ArchLinuxARM-rpi-3-latest.tar.gz boot
$ mv /Volumes/BOOT/{boot/*,} && rmdir /Volumes/BOOT/boot
And for the root file system:
$ sudo tar --exclude="./boot" -C /mnt/disk/ -xvzf ArchLinuxARM-rpi-3-latest.tar.gz 
x ./bin
x ./dev/: Line too long
tar: Error exit delayed from previous errors.
Apparently bsdtar has trouble when the --exclude switch is used, so let's try without and remove the superfluous /boot contents later:
$ sudo tar -C /mnt/disk/ -xzf ArchLinuxARM-rpi-3-latest.tar.gz
$ sudo rm -r /mnt/disk/boot/*
This takes quite a long while to complete, but completed eventually. Of course, all this could be avoided if would have used another operating system in the first place :-)

tr: Bad String

Trying to mangle some characters resulted in a weird error message:
$ echo hello | tr [:lower:] [:upper:]
Bad string
Huh? Before debugging any further, searching the interwebs returns quite a few results, of course, so let's look at our options then:

$ type tr
tr is /usr/bin/tr

$ find /usr -type f -perm -0500 -name tr -ls 2>/dev/null
32054   11 -rwxr-xr-x   1 root bin  9916 Jan 23  2005 /usr/ucb/tr
16674   19 -r-xr-xr-x   1 root bin 18540 Jan 23  2005 /usr/xpg6/bin/tr
  410   20 -r-xr-xr-x   1 root bin 19400 Jan 23  2005 /usr/bin/tr
75251   19 -r-xr-xr-x   1 root bin 18520 Jan 23  2005 /usr/xpg4/bin/tr
Besides our default from SUNWcsu, we have three other versions of tr(1) available. The UCB version tries do do...something:

$ echo hello | /usr/ucb/tr [:lower:] [:upper:]
heuup
Apparently it replaces each character (position) literally, but fails to recognize the bracket expressions. Since the UCB tools were removed in later versions anyway, let's skip that for now. The two X/Open versions seem to manage:

$ echo hello | /usr/xpg6/bin/tr [:lower:] [:upper:]
HELLO

$ echo hello | /usr/xpg4/bin/tr [:lower:] [:upper:]
HELLO
But why wouldn't it work with the SUNWcsu version? truss(1) reports a missing file, but this turns out to be a red herring:

$ echo hello | truss -elfda tr [[:lower:]] [[:upper:]]
Base time stamp:  1481011767.7308  [ Tue Dec  6 09:09:27 MET 2016 ]
26125/1:         0.0000 execve("/usr/bin/tr", 0xFFBFFC9C, 0xFFBFFCAC)  argc = 3
26125/1:         argv: tr [[:lower:]] [[:upper:]]
26125/1:         envp: LC_MONETARY=en_GB.ISO8859-15 TERM=xterm SHELL=/bin/bash
26125/1:          LC_NUMERIC=en_GB.ISO8859-15 LC_ALL=en_US.UTF-8
26125/1:          LC_MESSAGES=C LC_COLLATE=en_GB.ISO8859-15 LANG=en_US.UTF-8
26125/1:          LC_CTYPE=en_GB.ISO8859-1 LC_TIME=en_GB.ISO8859-15
[...]
26125/1:         0.0061 stat64("/usr/lib/locale/en_US.UTF-8/libc.so.1", 0xFFBFE8D0) Err#2 ENOENT
26125/1:         0.0063 open("/usr/lib/locale/en_US.UTF-8/LC_MESSAGES/SUNW_OST_OSCMD.mo", O_RDONLY) Err#2 ENOENT
26125/1:         0.0064 fstat64(2, 0xFFBFEA38)                          = 0
Bad string
26125/1:         0.0064 write(2, " B a d   s t r i n g\n", 11)          = 11
26125/1:         0.0065 _exit(1)
(Un)fortunately I had my share of weird experiences with character encodings and the like. And indeed, if we use a single-byte locale, /usr/bin/tr works just fine:

$ echo $LC_ALL
en_US.UTF-8

$ echo hello | LC_ALL=en_US tr [[:lower:]] [[:upper:]]
HELLO
Another workaround would be to use another expression, if possible:

$ echo hello | tr [a-z] [A-Z]
HELLO
In newer SunOS versions, /usr/bin/tr has been fixed and works as expected.

Encrypted network block device

While backing up with Crashplan works fine most of the time (and one trusts their zero-knowledge promise), sometimes new software updates, power outages or other unplanned interruptions cause Crashplan to fail and either stop backing up or discard the whole archive and start to backup from scratch, uploading the whole disk again :-\

So yeah, it mostly works but somehow I'd like to be a bit more in control of things. The easiest thing would be to order some disk space in the cloud and rsync all data off to a remote location - but of we need to encrypt it first. But how? There are a few solutions I've came across so far, I'm sure there are others, but let's look at them real short:

  • duplicity uses librsync to upload GnuPG encrypted parts to the remote destination. I've heard good (and bad) things about it, but the tought of splitting up data into small chunks and encrypting it, uploading thousands of small bits of random-looking data sounds cool and a bit frightening at the same time. Especially the restore scenario boggles my mind. I don't want to dismiss this entirely (and may even come back to it later on), but let's look for something saner for now.

  • Attic is a deduplicating backup program written in Python. I've haven't actually tried this one either, although it seems to support encryption and remote backup destinations, although the mentioning of FUSE mounts make me a bit uneasy.

  • Obnam supports encrypted remote backups, again via GnuPG. I gotta check this out if this really works as advertised.

  • Burp uses librsync and supports something called "client side file encryption" - but that turns off "delta differencing", which sounds like the whole purpose of using librsync in the first place is then gone.

  • Rclone supports encrypted backups, but only to some pre-defined storage providers and not to arbitrary SSH-accessible locations.

  • BorgBackup has the coolest name (after Obnam :-)) and supports deduplication, compression and authenticated encryption - almost too good to be true. This should really be my go-to-solution for my usecase and if my hand-stitched version isn't working out, I'll come back to this for sure.

With that, let's see if we can employ a Network Block Device to serve our needs.
As an example, let's install nbd-server on the remote location and set up a disk that we want to serve to our backup client later on:
$ sudo apt-get install nbd-server

$ cd /etc/nbd-server/
$ grep -rv ^\# .
./config:[generic]
./config:       user = nbd
./config:       group = nbd
./config:       listenaddr = localhost
./config:       allowlist = true
./config:       includedir = /etc/nbd-server/conf.d
./conf.d/local.conf:[testdisk]
./conf.d/local.conf:    exportname = /dev/loop1
./conf.d/local.conf:    flush = true
./conf.d/local.conf:    readonly = false
./conf.d/local.conf:    authfile = /etc/nbd-server/allow
./allow:127.0.0.1/32
We will of course serve a real disk later on, but for now a loop device will do:
$ dd if=/dev/zero bs=1M count=10240 | pv | sudo dd of=/var/tmp/test.img
$ sudo losetup -f /var/tmp/test.img
With that, our nbd-server can be started and should listen on localhost only - we'll use SSH port-forwarding later on to connect back to this machine:
$ ss -4lnp | grep nbd
tcp LISTEN  0 10 127.0.0.1:10809 *:* users:(("nbd-server",pid=9249,fd=3))
The client side needs a bit more work. An SSH tunnel of course, but also the nbd kernel module and the nbd-client program. However, I noticed that the nbd-client version that comes with Debian/8.0 contained an undocumented bug that made it impossible to gain write access to the export block device. And we do really want write access :-) Off to the source, then:
$ sudo apt-get install libglib2.0-dev
$ git clone https://github.com/NetworkBlockDevice/nbd.git nbd-git && cd nbd-git
While the repository appears to be maintained, the build system looks kinda archaic. And we don't want to install almost 200 MB in dependencies for the docbook-utils packages to provide /usr/bin/docbook2man to build man pages. So let's skip all that and build only the actual programs:
$ sed -r '/^make -C (man|systemd)/d' -i autogen.sh
$ sed    '/man\/nbd/d;/systemd\//d'  -i configure.ac

$ ./autogen.sh
$ ./configure --prefix=/opt/nbd --enable-syslog
$ make && sudo make install
The configuration file format changed (again) or be passed on the command line:
$ sudo modprobe nbd
$ sudo /opt/nbd/sbin/nbd-client -name testdisk localhost 10809 /dev/nbd0 -timeout 30 -persist
On the server side, this is noticed too:
nbd_server[9249]: Spawned a child process
nbd_server[9931]: virtstyle ipliteral
nbd_server[9931]: connect from 127.0.0.1, assigned file is /dev/loop1
nbd_server[9931]: Starting to serve
nbd_server[9931]: Size of exported file/device is 10737418240
We can now use /dev/nbd0 as if it were a local disk. We'll create a key, initialize dm-crypt and create a file system:
$ openssl rand 4096 | gpg --armor --symmetric --cipher-algo aes256 --digest-algo sha512 > testdisk-key.asc
$ gpg -d testdisk-key.asc | sudo cryptsetup luksFormat --cipher twofish-cbc-essiv:sha256 \
                  --hash sha256 --key-size 256 --iter-time=5000 /dev/nbd0
gpg: AES256 encrypted data
Enter passphrase: XXXXXXX
gpg: encrypted with 1 passphrase

$ gpg -d testdisk-key.asc | sudo cryptsetup open --type luks /dev/nbd0 testdisk
$ sudo file -Ls /dev/nbd0 /dev/mapper/testdisk
/dev/nbd0:            LUKS encrypted file, ver 1 [twofish, cbc-essiv:sha256, sha256] UUID: 30f41e4...]
/dev/mapper/testdisk: data

$ sudo cryptsetup status testdisk
/dev/mapper/testdisk is active.
  type:    LUKS1
  cipher:  twofish-cbc-essiv:sha256
  keysize: 256 bits
  device:  /dev/nbd0
  offset:  4096 sectors
  size:    20967424 sectors
  mode:    read/write

$ sudo mkfs.xfs -m crc=1,finobt=1 /dev/mapper/testdisk
$ sudo mount -t xfs /dev/mapper/testdisk /mnt/disk/
$ df -h /mnt/disk
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/testdisk   10G   33M   10G   1% /mnt/disk
Deactivate with:
$ sudo umount /mnt/disk 
$ sudo cryptsetup close testdisk
$ sudo pkill -f /opt/nbd/sbin/nbd-client
When mounted, the disk speed is limited of course by the client's upload speed and the CPU speed too (for SSH and dm-crypt). Let's play with this for a while and see how this works out with rsync workloads. Maybe I'll come back for BorgBackup after all :-)

Weird CDROM formats

So, I came across these files:
$ ls -goh
-rw-r--r-- 1 526M Sep 29 12:58 file.bin
-rw-r--r-- 1  478 Sep 29 12:50 file.cue
Does anyone remember cue sheets? Luckily, even today there are tools out there to make sense of these and convert them into something usable:
$ bchunk -v file.bin file.cue file.iso
Reading the CUE file:

Track  1: MODE1/2352    01 00:00:00 (startsect 0 ofs 0)
Track  2: AUDIO     01 22:46:13 (startsect 102463 ofs 240992976)
Track  3: AUDIO     01 25:25:74 (startsect 114449 ofs 269184048)
Track  4: AUDIO     01 28:01:35 (startsect 126110 ofs 296610720)
Track  5: AUDIO     01 31:14:31 (startsect 140581 ofs 330646512)
Track  6: AUDIO     01 34:51:35 (startsect 156860 ofs 368934720)
Track  7: AUDIO     01 37:51:22 (startsect 170347 ofs 400656144)
Track  8: AUDIO     01 41:22:03 (startsect 186153 ofs 437831856)
Track  9: AUDIO     01 44:18:34 (startsect 199384 ofs 468951168)
Track 10: AUDIO     01 46:38:03 (startsect 209853 ofs 493574256)
Track 11: AUDIO     01 49:12:05 (startsect 221405 ofs 520744560)

Writing tracks:

 1: file.iso01.iso
 mmc sectors 0->102462 (102463)
 mmc bytes 0->240992975 (240992976)
 sector data at 16, 2048 bytes per sector
 real data 209844224 bytes
 200/200  MB  [********************] 100 %

 2: file.iso02.cdr
 mmc sectors 102463->114448 (11986)
 mmc bytes 240992976->269184047 (28191072)
 sector data at 0, 2352 bytes per sector
 real data 28191072 bytes
  26/26   MB  [********************] 100 %
 3: file.iso03.cdr
[...]
In this case, we don't care for the audio part of the image, so we could discard all the .cdr files later on and just use the ISO image:
$ ls -goh file.*
-rw-r--r-- 1 526M Sep 29 12:58 file.bin
-rw-r--r-- 1  478 Sep 29 12:50 file.cue
-rw-r--r-- 1 201M Oct 31 16:01 file.iso01.iso

$ sudo mount -t iso9660 -o loop,ro file.iso01.iso /mnt/cdrom
$ ls /mnt/cdrom
AUTORUN.INF  Data  Install  readme.txt  Setup.exe  Splash
Oh, yeah :-)

Compression benchmarks

Some time has passed since the last compression benchmarks and new contenders entered the race, so let's do another round of benchmarks, shall we?

MacBook Pro 2009

This laptop ships with an Intel Core2 Duo P8700 processor, so these tests may take a while:
$ tar -cf test.tar /usr/share/ 
$ ls -goh test.tar
-rw-r--r--  1    384M Oct  6 08:00 test.tar

$ time for i in {1..10}; do ~/bin/compress-test.sh test.tar | tee results_${i}.out; done
[...]
real    2046m5.142s
user    222m1.302s
sys     3m30.933s
So, 10 rounds of compressing and decompressing this tarball took 34 hours to complete. The results break down to:
$ for o in 9c 1c dc; do
   for p in gzip pigz bzip2 pbzip2 xz lzma zstd pzstd brotli; do
      awk "/"$p"\/"$o"/ {sum+=\$3} END {print \"$p/$o\t\", sum/10}" results_*.out
   done | sort -nk2; echo
done
pzstd/9c         19.7
zstd/9c          53.4
brotli/9c       234.5
pigz/9c         746.4
pbzip2/9c       764.6
gzip/9c         775.2
lzma/9c        1180.2
bzip2/9c       1563.9
xz/9c          3825

pzstd/1c          2.4
brotli/1c         4.7
zstd/1c           6.1
pigz/1c           6.2
gzip/1c          10.4
pbzip2/1c       752
xz/1c           778.7
lzma/1c         779.5
bzip2/1c       1532.3

pzstd/dc          0.8
zstd/dc           1.8
gzip/dc           2.4
pigz/dc           2.4
brotli/dc         2.9
pbzip2/dc         9.1
lzma/dc          10.2
xz/dc            10.8
bzip2/dc        748

Thinkpad E431

This machine comes with an i7-3632QM CPU and our test tarball is somewhat bigger:
$ tar -cf test.tar /usr/share/locale/ /usr/share/games/quake3/
$ ls -goh test.tar
-rw------- 1 978M Oct  8 22:38 test.tar

$ time for i in {1..10}; do ~/bin/compress-test.sh test.tar | tee results_${i}.out; done
[...]
real	420m39.764s
user	529m13.192s
sys	3m46.148s
After 7 hours, the results are in:
$ for o in 9c 1c dc; do
    for p in gzip pigz bzip2 pbzip2 xz lzma zstd pzstd brotli; do
       awk "/"$p"\/"$o"/ {sum+=\$3} END {print \"$p/$o\t\", sum/10}" results_*.out
    done | sort -nk2; echo
done
pzstd/9c	 17.4
pigz/9c	         17.5
pbzip2/9c	 31.5
zstd/9c    	 70.4
gzip/9c    	 84.4
bzip2/9c	145.3
brotli/9c	260
xz/9c	        612.4
lzma/9c	        622.4

pzstd/1c 	  3.3
pigz/1c	          7.2
brotli/1c	  8
zstd/1c	         10.2
pbzip2/1c	 26
gzip/1c	         27.8
bzip2/1c	141.6
lzma/1c	        181.5
xz/1c	        185.2

pzstd/dc	  0.6
zstd/dc	          2.1
brotli/dc	  4.8
pigz/dc	          5
gzip/dc	          8
pbzip2/dc	  8.8
xz/dc	         36.5
lzma/dc	         40.2
bzip2/dc	 53.3

PowerBook G4

This (older) machine is still running 24/7, so let's see which compressor we should use in the future:
$ tar -cf test.tar /usr/share/doc/gcc-4.9-base/ /usr/share/perl5
$ ls -goh test.tar
-rw-r--r-- 1 41M Oct 15 02:53 test.tar

$ PROGRAMS="gzip bzip2 xz lzma brotli zstd" \
  ~/bin/compress-test.sh -n 10 -f test.tar | tee ~/r.log
$ ~/bin/compress-test.sh -r ~/r.log
### Fastest compressor:
### zstd/1c:      1.90 seconds / 63.300% smaller 
### brotli/1c:    2.20 seconds / 57.900% smaller 
### gzip/1c:      4.80 seconds / 58.800% smaller 
### zstd/9c:     11.30 seconds / 66.000% smaller 
### gzip/9c:     19.00 seconds / 62.500% smaller 
### bzip2/1c:    36.90 seconds / 63.800% smaller 
### lzma/1c:     37.80 seconds / 65.700% smaller 
### xz/1c:       40.20 seconds / 66.000% smaller 
### brotli/9c:   60.50 seconds / 66.800% smaller 
### bzip2/9c:    63.00 seconds / 66.000% smaller 
### xz/9c:      111.90 seconds / 68.000% smaller 
### lzma/9c:    115.90 seconds / 67.700% smaller 

### Smallest size:
### zstd/9c:     11.30 seconds / 66.000% smaller 
### zstd/1c:      1.90 seconds / 63.300% smaller 
### xz/9c:      111.90 seconds / 68.000% smaller 
### xz/1c:       40.20 seconds / 66.000% smaller 
### lzma/9c:    115.90 seconds / 67.700% smaller 
### lzma/1c:     37.80 seconds / 65.700% smaller 
### gzip/9c:     19.00 seconds / 62.500% smaller 
### gzip/1c:      4.80 seconds / 58.800% smaller 
### bzip2/9c:    63.00 seconds / 66.000% smaller 
### bzip2/1c:    36.90 seconds / 63.800% smaller 
### brotli/9c:   60.50 seconds / 66.800% smaller 
### brotli/1c:    2.20 seconds / 57.900% smaller 

### Fastest decompressor:
### zstd/dc:       .80 seconds
### brotli/dc:    1.20 seconds
### gzip/dc:      1.20 seconds
### xz/dc:        1.70 seconds
### lzma/dc:      3.20 seconds
### bzip2/dc:     7.20 seconds

Building NRPE for OpenWRT

In the last article we restored nrpe from backups to a running OpenWRT installation. After another power outage we have to do this again, but let's actually build nrpe this time and only restore its configuration from the backup.

The build process will happen in a VM running Debian/jessie(amd64), so missing utilities or header files will have to be installed via apt-get:
sudo apt-get autoconf binutils build-essential gawk gettext git libncurses5-dev libssl-dev libz-dev ncurses-term openssl sharutils subversion unzip
We'll check out the source and switch to the v15.05.1 branch, because we'll need to build for the release that's currently running on the router. Since OpenWrt switched to musl last year, we cannot build trunk as the running Chaos Calmer is still linked against uClibc.
git clone https://github.com/openwrt/openwrt.git openwrt-git
cd $_
git checkout -b local v15.05.1
Fetch an appropriate .config (again, we cannot use trunk just yet) and enter the configuration menu:
wget https://downloads.openwrt.org/chaos_calmer/15.05.1/ar71xx/generic/config.diff -O .config
make defconfig
make menuconfig
Here, we'll select our target profile and disable the SDK:
  • Target Profile => NETGEAR WNDR3700/WNDR3800/WNDRMAC
  • [_] Build the OpenWrt SDK (disabled)
Let's also disable all modular packages from the build and run the prerequisite check to verfiy that the configuration is still valid:
sed 's/=m$/=n/' -i.bak .config
make prereq
With that, we're ready to build and install the toolchain:
script -c "time make -j4 V=s tools/install && date && time make -j4 V=s toolchain/install" ~/build.log 
This will need some time (and diskspace) to complete. Once completed (check the build.log!), we can finally build our packages:
wget https://github.com/ckujau/openwrt/archive/master.zip -O ~/openwrt_master.zip
(cd ~/dev/ && unzip -l ~/openwrt_master.zip) && (cd ~/dev/openwrt-master/ && tar -cf - package) | tar -xvf -
make oldconfig
script -c "time make -j4 V=s package/nrpe/compile" -a ~/build.log
script -c "time make -j4 V=s package/monitoring-plugins/compile" -a ~/build.log
Note: this will build all dependencies as well:
$ grep -h DEP package/network/utils/{monitoring-plugins,nrpe}/Makefile 
  DEPENDS:=+libopenssl +libpthread
  DEPENDS:=+libopenssl +libwrap
When everything is built correctly, we should have two package files:
$ ls -hgotr bin/ar71xx/packages/base/
total 1.1M
-rw-r--r-- 1  35K Oct  2 13:10 libgcc_5.3.0-1_ar71xx.ipk
-rw-r--r-- 1 268K Oct  2 13:10 libc_1.1.15-1_ar71xx.ipk
-rw-r--r-- 1  857 Oct  2 13:10 libpthread_1.1.15-1_ar71xx.ipk
-rw-r--r-- 1  36K Oct  2 13:11 zlib_1.2.8-1_ar71xx.ipk
-rw-r--r-- 1 741K Oct  2 13:16 libopenssl_1.0.2j-1_ar71xx.ipk
-rw-r--r-- 1  24K Oct  2 13:17 nrpe_3.0.1-1_ar71xx.ipk
-rw-r--r-- 1 768K Oct  2 13:32 monitoring-plugins_2.2-1_ar71xx.ipk

$ file build_dir/target-mips*/*/src/nrpe
build_dir/target-mips_34kc_uClibc-0.9.33.2/nrpe-3.0.1/src/nrpe: ELF 32-bit MSB executable, MIPS, MIPS32 rel2 version 1, dynamically linked, interpreter /lib/ld-uClibc.so.0, not stripped
The installation should automatically install any dependencies, if needed:
router$ opkg install ./*.ipk
Installing monitoring-plugins (2.1.2-1) to root...
Installing nrpe (3.0.1-1) to root...

router$ /etc/init.d/nrpe enable
router$ /etc/init.d/nrpe start

router$ netstat -lnp | grep 5666
tcp 0 0 192.168.0.2:5666 0.0.0.0:* LISTEN 6771/nrpe
This was the easy part. The difficult part will be to get both packages upstream :-)

/bin/ls --wtf

So, I noticed this:
$ env -i /bin/bash                 # Clear the environment
$ touch foo bar\ baz               # Creates two files, "foo" 
                                   # and "bar baz"
$ ls -1
'bar baz'
foo
Why is ls(1) suddenly quoting filenames that contain spaces? After a bit of digging, this commit introduced this change into GNU/coreutils, but at least Debian is on the case and fixed it in their version:
$ ls
bar baz
foo

$ ls --quoting-style=shell
'bar baz'
foo

Mediawiki Upgrade

Upgrading Mediawiki through Git seemed like a cool idea and worked quite well for a long time. But since Mediawiki 1.25 the update process changed considerably and just wasn't fun any more. As updates are a rare occurence anyway, I decided to switch back to tarballs instead. Let's try this, for Mediawiki 1.27:

 curl https://www.mediawiki.org/keys/keys.txt | gpg --import
 wget https://releases.wikimedia.org/mediawiki/1.27/mediawiki-1.27.1.tar.gz{,.sig}
 gpg --verify mediawiki-1.27.1.tar.gz.sig
 
 export DOCROOT=/var/www/
 cd $DOCROOT/mediawiki
 tar --strip-components=1 -xzf ~/mediawiki-1.27.1.tar.gz
Perform the necessary (database) updates:
 cd $DOCROOT/mediawiki
 script -a -c "date; php maintenance/update.php --conf `pwd`/LocalSettings.php" ~/mwupdate.log 
While we're at it, re-generate the sitemap:
 cd $DOCROOT/mediawiki
 mkdir -p sitemap && chmod 0770 sitemap && sudo chgrp www-data sitemap
 sudo -u www-data MW_INSTALL_PATH=`pwd` php maintenance/generateSitemap.php \
     --conf `pwd`/LocalSettings.php --fspath `pwd`/sitemap --server https://www.example.net \
     --urlpath https://www.example.net/mediawiki/sitemap --skip-redirects
Remove/disable clutter:
 cd $DOCROOT/mediawiki
 rm -rf COPYING CREDITS FAQ HISTORY INSTALL README RELEASE-NOTES-1.27 UPGRADE
 chmod 0 docs maintenance tests
 sudo touch {cache,images}/index.html
Don't forget to upgrade the extensions as well:
 cd ../piwik-mediawiki-extension-git
 git checkout master && git pull && git clean -dfx
 git archive --prefix=piwik-mediawiki-extension/ --format=tar HEAD | tar -C $DOCROOT/mediawiki/extensions/ -xvf -
  
 cd ../MobileFrontend-git
 git checkout master && git pull && git clean -dfx
 git archive --prefix=MobileFrontend/ --format=tar origin/REL1_27  | tar -C $DOCROOT/mediawiki/extensions/ -xvf -
And with that, the new version should be online :-)

Installing NRPE in OpenWRT

With at least OpenWRT 15.05, the NRPE package appears to be unmaintained. We could should build the package manually, but before we do this, let's install an older version from our backups. For example:
$ ( cd ../backup/router/ && find . -name "*nrpe*" -o -name "check_*" | xargs tar -cf - ) | \
    ssh router "tar -C / -xvf -"
This should restore the NRPE binary, its configuration files and init scripts and all the check_* monitoring plugins. Did I mention that backups are important? :-)
With that, we're almost there:
 $ ldd /usr/sbin/nrpe
        libssl.so.1.0.0 => not found
        libcrypto.so.1.0.0 => not found
        libwrap.so.0 => not found
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x77a64000)
        libc.so.0 => /lib/libc.so.0 (0x779f7000)
        ld-uClibc.so.0 => /lib/ld-uClibc.so.0 (0x77a88000)
Let's install the dependencies:
opkg install libopenssl libwrap
Add the nagios user:
echo 'nagios:x:50:' >> /etc/group
echo 'nagios:x:50:50:nagios:/var/run/nagios:/bin/false' >> /etc/passwd
echo 'nagios::16874:0:99999:7:::' >> /etc/shadow
Configure nrpe:
 $ grep ^[a-z] /etc/nrpe.cfg
 pid_file=/var/run/nrpe.pid
 server_port=5666
 server_address=192.168.0.1
 nrpe_user=nagios
 nrpe_group=nagios
 allowed_hosts=192.168.0.10,192.168.0.11
 dont_blame_nrpe=0
 debug=0
 command_timeout=60
 connection_timeout=300
 
 command[check_dummy]=/usr/libexec/nagios/check_dummy 0
 command[check_dns]=/usr/libexec/nagios/check_dns -H test.example.net -s localhost -w 0.1 -c 0.5
 command[check_entropy]=/root/bin/check_entropy.sh -w 1024 -c 512
 command[check_http]=/usr/libexec/nagios/check_http -H localhost -w 0.1 -c 0.5
 command[check_load]=/usr/libexec/nagios/check_load -w 4,3,2 -c 5,4,3
 command[check_ntp_time]=/usr/libexec/nagios/check_ntp_time -H 0.openwrt.pool.ntp.org -w 0.5 -c 1.0
 command[check_ssh]=/usr/libexec/nagios/check_ssh -4 router
 command[check_softwareupdate_opkg]=/root/bin/check_softwareupdate.sh opkg
 command[check_users]=/usr/libexec/nagios/check_users -w 3 -c 5
Let's try to start it, and enable it if it works:
 $ /etc/init.d/nrpe start
 $ ps | grep nrp[e]
 5320 nagios    2908 S    /usr/sbin/nrpe -c /etc/nrpe.cfg -d
 
 $ /etc/init.d/nrpe enable
And that's about it. Of course: since we're using an outdated NRPE version, we won't receive any (security) updates - so this setup should only be used in a trusted environment, i.e. not over the internet.

gpgkeys: HTTP fetch error 60: SSL certificate problem: Invalid certificate chain

After installing GnuPG from Homebrew, gpg was unable to connect to one of its key servers:
$ gpg --refresh-keys
gpg: refreshing 47 keys from hkps://hkps.pool.sks-keyservers.net
gpgkeys: HTTP fetch error 60: SSL certificate problem: Invalid certificate chain
[...]
The trick was to install their root certificate and mark it "trusted":
$ wget https://sks-keyservers.net/sks-keyservers.netCA.pem
$ open sks-keyservers.netCA.pem
	=> Trust always
Now the operation was able to complete:
$ gpg --refresh-keys
[...]
gpg: Total number processed: 47
gpg:              unchanged: 19
gpg:           new user IDs: 5
gpg:            new subkeys: 4
gpg:         new signatures: 1698
gpg:     signatures cleaned: 2
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:  19  signed:  12  trust: 0-, 0q, 0n, 0m, 0f, 19u
gpg: depth: 1  valid:  12  signed:   4  trust: 12-, 0q, 0n, 0m, 0f, 0u
gpg: next trustdb check due at 2018-08-19

MacOS Gatekeeper: Verifying...

There's VLC installed on this Mac via Homebrew Cask and every time VLC starts up, the dreaded Verifying... progress bar comes up:
VLC verifying...
Now, this message of course is generated by MacOS Gatekeeper, trying to do its job. Eventually the verification completes and VLC is started - but the process repeats every time VLC starts! And it's only happening for VLC, it doesn't appear for other applications installed with Homebrew Cask.

Fortunately, there's an easy workaround to stop that behaviour - we need to remove the com.apple.quarantine extended attribute:
$ xattr -l /Applications/BrewBundle/VLC.app
com.apple.quarantine: 0002;5123a312;Safari;4CC444EB-4444-44A4-4C44-4B444FBC4444

$ sudo xattr -d com.apple.quarantine /Applications/BrewBundle/VLC.app
Now VLC can be started w/o the verification delay :-)

XFS: Corruption warning: Metadata has LSN ahead of current LSN

This just happened again on a different machine, right after running xfs_repair:
$ sudo xfs_repair /dev/mmcblk0
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 2
        - agno = 0
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

$ echo $?
0

$ sudo mount -t xfs /dev/mmcblk0 /mnt/disk
mount: wrong fs type, bad option, bad superblock on /dev/mmcblk0,

$ sudo dmesg -t | tail 
XFS (mmcblk0): Mounting V5 Filesystem
XFS (mmcblk0): Corruption warning: Metadata has LSN (20:50596) ahead of current LSN (1:2). Please unmount and run xfs_repair (>= v4.3) to resolve.
XFS (mmcblk0): log mount/recovery failed: error -22
XFS (mmcblk0): log mount failed
What happened here? Apparently, with the XFS v5 superblock the userspace tools (xfsprogs) also changed.

And so it happened that xfs_repair version 3.2.1 tried to check an XFS file system that had already enabled its v5 superblock format. But the version is too old to handle v5 superblocks and left the file system in an corrupt state.

Luckily it's easy to fix:
 > Kernel v4.4 and later detects an XFS log problem which is only fixed by
 > xfsprogs v4.3 or later. If you have encountered the inability to mount an
 > xfs filesystem, please update to this version of xfsprogs and run
 > xfs_repair against the filesystem.
And indeed:
$ /opt/xfsprogs/sbin/xfs_repair -V
xfs_repair version 4.5.0

$ sudo /opt/xfsprogs/sbin/xfs_repair /dev/mmcblk0
[...]
Phase 7 - verify and correct link counts...
Maximum metadata LSN (20:50596) is ahead of log (1:2).
Format log to cycle 23.
done

$ sudo mount -t xfs /dev/mmcblk0 /mnt/disk
$ mount | tail -1
/dev/mmcblk0 on /mnt/disk type xfs (rw,relatime,attr2,discard,inode64,noquota)
Phew! :-)

Character collation

So, recently I came across this funny behaviour on a SLES11sp4 machine:
sles11$ netstat -ni | awk '/^[a-z]/' 
Kernel Interface table
Iface   MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500   0     3562      0      0      0     1955      0      0      0 BMRU
lo    16436   0       20      0      0      0       20      0      0      0 LRU
Wait, what? Why is the (uppercase) string "Kernel" matched against the lowercase "[a-z]" search expression? The same command on a SLES12sp1 machine does the Right Thing:
sles12$ netstat -ni | awk '/^[a-z]/' 
eth0   1500   0      685      0      0      0      438      0      0      0 BMRU
lo    65536   0       12      0      0      0       12      0      0      0 LRU
Apparently, this is not an unknown problem and can indeed be fixed by providing another LC_COLLATE variable:
$ netstat -ni | LC_COLLATE=C awk '/^[a-z]/' 
eth0   1500   0     3711      0      0      0     2032      0      0      0 BMRU
lo    16436   0       20      0      0      0       20      0      0      0 LRU
While providing a different LC_COLLATE variable did help, this still smells like a bug in SLES11, as the configured locales were exactly the same:
sles11$ locale 
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

sles11$ locale -k LC_COLLATE
collate-nrules=4
collate-rulesets=""
collate-symb-hash-sizemb=2039
collate-codeset="UTF-8"

sles11$ locale | md5sum 
677d9b3dbdf9759c8b604f294accd102  -

sles12$ locale | md5sum 
677d9b3dbdf9759c8b604f294accd102  -
Interestingly enough, both installations differ greatly in the way they look up locale information:
sles11$ echo | strace -e open awk '/^[a-z]/' 
open("/etc/ld.so.cache", O_RDONLY)      = 3
open("/lib64/libdl.so.2", O_RDONLY)     = 3
open("/lib64/libm.so.6", O_RDONLY)      = 3
open("/lib64/libc.so.6", O_RDONLY)      = 3
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
open("/usr/lib64/gconv/gconv-modules.cache", O_RDONLY) = 3



sles12$ echo | strace -e open awk '/^[a-z]/' 2>&1 | grep -v ENOENT
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/locale/en_US.utf8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib64/gconv/gconv-modules.cache", O_RDONLY) = 3
open("/usr/lib/locale/en_US.utf8/LC_COLLATE", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/locale/en_US.utf8/LC_MESSAGES", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/locale/en_US.utf8/LC_MESSAGES/SYS_LC_MESSAGES", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/locale/en_US.utf8/LC_NUMERIC", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/locale/en_US.utf8/LC_TIME", O_RDONLY|O_CLOEXEC) = 3
open("/dev/null", O_RDWR)               = 3
+++ exited with 0 +++
Alas, no bug has been reported yet :-\

While this appears to be documented behaviour, it's still very confusing and may even violate the Principle of Least Surprise. FWIW, GNU/grep behaves as expected on both systems, no matter the collation:
$ echo Abc | egrep --color '[[:lower:]]'
Abc

PS: I forgot to mention how cool SUSE Studio is - this SLE12 test VM was up & running in minutes and accessible via SSH too and I didn't even have to fire up my local VirtualBox instance! :-)

umask & symbolic links on MacOS X

This just annoyed me again:
$ umask 0022
$ touch foo
$ umask 0066
$ ln -s foo bar

$ ls -lgo foo bar
-rw-r--r--  1   0 Mar  9 14:17 foo
lrwx--x--x  1   3 Mar  9 14:17 bar -> foo

$ sudo -u nobody cat foo bar
$ 
OK, this seems to work (the permissions are checked on the target, not the symlink), but not so with directories:
$ umask 0022
$ mkdir -p foo/file
$ umask 0066
$ ln -s foo bar

$ ls -ldgo foo bar
drwxr-xr-x  3   102 Mar  9 15:02 foo
lrwx--x--x  1     3 Mar  9 15:03 bar -> foo

$ sudo -u nobody ls -l bar
ls: bar: Permission denied
lrwx--x--x  1 admin  wheel  3 Mar  9 14:23 bar
Interestingly enough, it works if we append a slash to the symlink:
$ sudo -u nobody ls -lgo bar/
total 0
drwxr-xr-x  2  68 Mar  9 14:24 dir
This is annoying when a user has a more stringent umask for normal use, but temporarily elevates its privileges to install software, without adjusting the umask first. To clean up this mess afterwards, we can re-create the affected symbolic links:
$ umask 0022
$ find . -type l ! -perm -g+r | while read l; do
   target=$(readlink "$l") && rm -f "$l" && ln -svf "$target" "$l"
done
./bar -> foo

$ ls -ld foo bar
drwxr-xr-x  4 admin  wheel  136 Mar  9 14:37 foo
lrwxr-xr-x  1 admin  wheel    3 Mar  9 14:38 bar -> foo
Note: this has been seen in MacOS 10.10.5 on a Journaled HFS+ file system.