Fun with Debian DKMS

Running VirtualBox on Debian requires the virtualbox-dkms package. DKMS stands for Dynamic Kernel Module Support; it builds out-of-tree drivers against each installed kernel version instead of shipping multiple package versions of the same driver.

So, virtualbox-dkms was installed and all was good - until a change in the kernel sources broke the VirtualBox build and required a patch against the virtualbox-dkms sources. However, only recent kernel versions were affected; the patched virtualbox-dkms code would not run correctly with an older kernel.

On this box, two kernel versions are installed: linux-image-3.14-2-amd64 and linux-image-3.17.0-rc1+, compiled from vanilla sources. A "dpkg-reconfigure virtualbox-dkms" would rebuild virtualbox-dkms for both kernel versions, but for the reasons explained above we can't do that now.

Let's rebuild virtualbox-dkms only for the kernel that needs the patched version:
# rmmod vboxpci vboxnetadp vboxnetflt vboxdrv

# ls -lgo /var/lib/dkms/virtualbox/
total 4
drwxr-xr-x 5 4096 Aug 31 01:49 4.3.14
lrwxrwxrwx 1   26 Aug 31 01:49 kernel-3.14-2-amd64-x86_64 -> 4.3.14/3.14-2-amd64/x86_64
lrwxrwxrwx 1   25 Aug 31 01:37 kernel-3.17.0-rc1+-x86_64 -> 4.3.14/3.17.0-rc1+/x86_64

# dkms remove virtualbox/4.3.14 -k 3.17.0-rc1+/x86_64

# cd /usr/src/virtualbox-4.3.14
# patch -p0 < ~/virtualbox-alloc_netdev.diff
# dkms install virtualbox/4.3.14 -k 3.17.0-rc1+/x86_64
And that should be all there is to it :)

On SSH ciphers, MACs and key exchange algorithms

Inspired by a question on StackExchange about the taxonomy of ciphers, MACs and key exchange algorithms available in SSH, I wondered what the fastest combination of ciphers, MACs and KexAlgorithms that OpenSSH has to offer would be.

I've tested with OpenSSH 6.6 (released 2014-03-14) on a Debian/Jessie system (ThinkPad E431). Initially I ran these tests against an SSH server in a virtual machine, but realized that the server did not support newer cipher/MAC/KexAlgorithm combinations, so before running the actual benchmark I first tested which combinations actually work. Later on I ended up running the performance test against localhost, making that evaluation step obsolete. Still, I decided to keep it around so that one can perform the benchmark in real-world situations where the remote SSH server is not located on localhost :-)

This OpenSSH version supports 15 different ciphers, 18 MAC algorithms and 8 key exchange algorithms - that's 2160 combinations to test. The benchmark iterates over all of them and transfers a certain amount of data from the local /dev/zero to the remote /dev/null. Connecting to localhost was fast, so I opted to transfer 4 GB of data.
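The original script is not reproduced here, but the sweep can be sketched like this. The cipher/MAC/Kex lists below are shortened, hand-picked stand-ins; a real run would populate them from the output of `ssh -Q cipher`, `ssh -Q mac` and `ssh -Q kex`:

```shell
#!/bin/sh
# Sketch of the benchmark sweep - NOT the original script.
# Shortened hypothetical lists; real runs would use: ssh -Q cipher / mac / kex
CIPHERS="aes128-ctr aes256-ctr"
MACS="hmac-sha1 hmac-md5"
KEX="ecdh-sha2-nistp256"

for c in $CIPHERS; do
  for m in $MACS; do
    for k in $KEX; do
      echo "cipher: $c mac: $m kex: $k"
      # The actual transfer step would look something like:
      # dd if=/dev/zero bs=1M count=4096 2>/dev/null | \
      #   ssh -c "$c" -m "$m" -o KexAlgorithms="$k" localhost 'cat > /dev/null'
    done
  done
done
```

Timing each iteration (e.g. with time(1)) and logging the combination together with its duration produces a log file that can then be evaluated per cipher, MAC or Kex.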

Before we get into the details, let's see the top-5 combinations of the results:
cipher: aes192-ctr mac: kex: ecdh-sha2-nistp256 - 6 seconds
cipher: aes192-ctr mac: kex: diffie-hellman-group1-sha1 - 6 seconds
cipher: aes128-ctr mac: kex: ecdh-sha2-nistp384 - 6 seconds
cipher: aes192-ctr mac: kex: ecdh-sha2-nistp384 - 6 seconds
cipher: aes192-ctr mac: kex: diffie-hellman-group-exchange-sha1 - 6 seconds
The UMAC message authentication code has been introduced in OpenSSH 4.7 (released 2007-09-04) and is indeed the fastest MAC in this little contest. Looking at the results reveals that there is indeed some variation when it comes to different MAC or Kex choices. Iterating through all ciphers, we calculate the average run time of each combination:
$ for c in `awk '{print $4}' ssh-performance.log | sort | uniq`; do
     printf "cipher: $c  "
     grep -w $c ssh-performance.log | awk '{sum+=$(NF-1); n++} END {print sum/n}'
done | sort -nk3
cipher:  8.8125
cipher:  9.23611
cipher: aes128-ctr  15.6875
cipher: aes192-ctr  15.6944
cipher: aes256-ctr  16.1319
cipher: arcfour     20.2639
cipher: arcfour128  20.3403
cipher: arcfour256  20.5278
cipher: aes128-cbc  21.125
cipher: aes192-cbc  22.4583
cipher:  23.2361
cipher: aes256-cbc  23.9722
cipher: blowfish-cbc  55.6875
cipher: cast128-cbc  59.5139
cipher: 3des-cbc  200.854
So, the GCM ciphers (included in OpenSSH 6.2, released 2013-03-22) come out fastest across all combinations, while 3des-cbc is indeed the slowest cipher. While the major performance factor is still the choice of the cipher, both MAC and Kex play a role, too. As an example, let's look at aes192-ctr: depending on the MAC and Kex choices, the results range from 6 to 46 seconds:
cipher: aes192-ctr mac: kex: ecdh-sha2-nistp256 - 6 seconds
cipher: aes192-ctr mac: kex: ecdh-sha2-nistp256 - 46 seconds
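The per-cipher averaging above can also be done in a single awk pass. The sample log below is hypothetical, but follows the same field layout the loops assume ($4 = cipher name, $(NF-1) = seconds):

```shell
# Hypothetical three-line sample in the assumed log format
cat > sample.log <<'EOF'
to: localhost cipher: aes128-ctr mac: hmac-sha1 kex: ecdh-sha2-nistp256 - 6 seconds
to: localhost cipher: aes128-ctr mac: hmac-md5 kex: ecdh-sha2-nistp256 - 8 seconds
to: localhost cipher: 3des-cbc mac: hmac-sha1 kex: ecdh-sha2-nistp256 - 20 seconds
EOF

# One pass over the log: sum and count per cipher, then print the averages
awk '{ sum[$4] += $(NF-1); n[$4]++ }
     END { for (c in sum) printf "cipher: %s  %g\n", c, sum[c]/n[c] }' sample.log |
  sort -n -k3
# → cipher: aes128-ctr  7
#   cipher: 3des-cbc  20
```

This avoids re-reading the log once per cipher, which matters little here but keeps the evaluation to a single grep-free pipeline.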
Let's see how MAC and Kex choices rank up across all (15) different ciphers. That is, we calculate the average time for each MAC:
$ for m in `awk '{print $6}' ssh-performance.log | sort | uniq`; do
     printf "mac: $m  "
     grep -w $m ssh-performance.log | awk '{sum+=$(NF-1); n++} END {print sum/n}'
done | sort -nk3
mac:  28.45
mac:  28.8167
mac:  29.8583
mac:  30.1
mac: hmac-sha1-96  33.4417
mac:  33.5167
mac:  33.6333
mac: hmac-sha1  33.7104
mac: hmac-md5-96  33.7792
mac:  33.8167
mac: hmac-md5  33.825
mac:  34.2
mac:  38.2333
mac: hmac-sha2-512  38.2833
mac:  43.775
mac: hmac-ripemd160  43.7792
mac: hmac-sha2-256  44.3792
mac:  44.45
And again for the key exchange algorithms:
$ for k in `awk '{print $8}' ssh-performance.log | sort | uniq`; do
     printf "kex: $k  "
     grep -w $k ssh-performance.log | awk '{sum+=$(NF-1); n++} END {print sum/n}'
done | sort -nk3
kex: ecdh-sha2-nistp256  35.2926
kex: diffie-hellman-group14-sha1  35.3148
kex: diffie-hellman-group1-sha1  35.4296
kex: diffie-hellman-group-exchange-sha256  35.563
kex: ecdh-sha2-nistp521  35.563
kex: diffie-hellman-group-exchange-sha1  35.6926
kex: ecdh-sha2-nistp384  35.8333
kex:  35.8667
The differences for Kex are in the sub-second range here, so even the recently added Curve25519 option would not have much of a performance impact.

So, what do we make of all this? Another StackExchange question suggests that SSH in general holds up pretty well security-wise and even dismisses the known problems with CBC. Assuming all of that is true, what can we do to get the most performance when transferring big files over SSH? Let's look at the defaults again, from ssh_config(5) of OpenSSH 6.6:
Ciphers: aes128-ctr aes192-ctr aes256-ctr arcfour256 arcfour128 [...]
MACs: [...]
KexAlgorithms: ecdh-sha2-nistp256 ecdh-sha2-nistp384 ecdh-sha2-nistp521 [...]
So, according to the results of this little contest, a faster set of defaults could be configured for a recent version of OpenSSH. A few notes on version support:
  • The GCM ciphers have been implemented with OpenSSH 6.2 (released 2013-03-22).
  • The EtM (Encrypt-then-MAC) modes and 128-bit UMAC variants have only been supported since OpenSSH 6.2 (released 2013-03-22).
  • The KexAlgorithms option has been added with OpenSSH 5.7 (released 2011-01-24).
As always when it comes to benchmarks: other SSH implementations (e.g. HPN-SSH) or different setups will most certainly return different results. So please run your own tests before drawing any conclusions from these results.

Update (October 2014): OpenSSH 6.7 disables the CBC ciphers by default because of vulnerabilities (found in 2008) in the way SSH uses them.

Update (June 2016): new benchmark script and new results (averaged over 3 runs):
### Top-5 overall
cipher: aes128-ctr   mac:  kex: -  10 seconds avg.
cipher: aes128-ctr   mac:  kex: ecdh-sha2-nistp521 -  10 seconds avg.
cipher: aes128-ctr   mac:  kex: diffie-hellman-group-exchange-sha256 -  10 seconds avg.
cipher: aes128-ctr   mac:  kex: diffie-hellman-group14-sha1 -  10 seconds avg.
cipher: aes128-ctr   mac:  kex: ecdh-sha2-nistp256 -  9 seconds avg.

If we were to use the Secure Secure Shell recommendations and exclude some of the weaker choices, we would lose only a few seconds (and would basically switch to chacha20-poly1305):
### Top-6 overall
cipher: mac: kex: -  14 seconds avg.
cipher: mac: kex: diffie-hellman-group-exchange-sha256 -  13 seconds avg.
cipher: mac: kex: -  13 seconds avg.
cipher: mac: kex: -  13 seconds avg.
cipher: mac: kex: diffie-hellman-group-exchange-sha256 -  13 seconds avg.
cipher: mac: kex: -  12 seconds avg.

OS X Mavericks & NTP

Only recently I noticed that the system time on this machine running OS X v10.9 is off by almost a second:
$ /opt/local/libexec/nagios/check_ntp_time -H -w 0.5 -c 1.0
NTP WARNING: Offset -0.7461649179 secs|offset=-0.746165s;0.500000;1.000000;
Lengthy discussions and explanations describe the issue quite nicely. Let's look at the local clock drift first:
$ cat /var/db/ntp.drift
-47.901
So, the clock drift on this machine is -47.901 PPM, or 172.44 ms/h. But the NTP offset is even larger:
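As a quick sanity check on those numbers: 47.901 PPM means 47.901 microseconds of drift per second of wall time, so over an hour:

```shell
# 47.901 PPM = 47.901e-6 s of drift per 1 s of wall time;
# multiply by 3600 s and convert to milliseconds:
awk 'BEGIN { ppm = 47.901; printf "%.2f ms/h\n", ppm * 3600 / 1000 }'
# → 172.44 ms/h
```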
$ ntpq -c peers
     remote           refid      st t when poll reach   delay   offset  jitter
 akamai.ruselabs  2 u 1245   64    7  221.876  -740.67 164.885
                  2 u  37m   64    7  236.627  -731.73 145.745
 gw-kila-wl0.ipv  2 u  35m   64    7  133.689  -768.34 125.425

$ sntp -p no
2014 Aug 06 13:51:09.1000 -0.75187 +/- 0.024338 secs
That's -0.75 seconds off the correct time! Kinda weird for a high-precision machine like this. A workaround suggests disabling pacemaker and restarting ntpd:
$ ps -ef | grep pac[e]
0  106   1 0    18Jul14   ??  0:12.82 /usr/libexec/pacemaker -b -e 0.0001 -a 10

$ sudo launchctl unload -w /System/Library/LaunchDaemons/

$ sudo launchctl list | grep ntp
84577   -       org.ntp.ntpd
$ sudo launchctl stop org.ntp.ntpd
$ sudo launchctl list | grep ntp
74279   -       org.ntp.ntpd
With pacemaker disabled, our NTP offset is now within acceptable range:
$ sntp -p no
2014 Aug 06 13:54:05.461870 +0.046836 +/- 0.037628 secs

$ /opt/local/libexec/nagios/check_ntp_time -H -w 0.5 -c 1.0
NTP OK: Offset 0.01386797428 secs|offset=0.013868s;0.500000;1.000000;

User and system time per process

Long time no blog post - so let's change that :-)

Today I've come across a system with a pretty high load, but couldn't quite make out which processes were responsible for it. uptime and ps showed:
$ uptime
 20:16pm  up 1 day  2:40,  12 users,  load average: 132.81, 132.79, 133.43

$ ps -eo user,pid,rsz,vsz,pmem,pcpu,time,args --sort -pcpu
user0    18428   3124    58236   0.0 1.1 00:00:00 LLAWP /www/user0/eam/WebAgent.conf -A
user0     8387  14976   154220   0.0 4.2 01:07:44 LLAWP /www/user0/eam/WebAgent.conf -A
user0     4508  14828   152864   0.0 4.2 01:07:21 LLAWP /www/user0/eam/WebAgent.conf -A
user0     8045  15000   154220   0.0 4.2 01:07:33 LLAWP /www/user0/eam/WebAgent.conf -A
user0    23814  14892   152868   0.0 4.2 01:06:47 LLAWP /www/user0/eam/WebAgent.conf -A
user0    18384   3124    58236   0.0 0.8 00:00:00 LLAWP /www/user0/eam/WebAgent.conf -A
user0    17224  14932   152952   0.0 4.1 01:05:39 LLAWP /www/user0/eam/WebAgent.conf -A
So, while each of these processes is using some CPU time, that didn't quite explain the high load. top(1) showed:
Tasks: 1439 total,   9 running, 1430 sleeping,   0 stopped,   0 zombie
Cpu0  : 14.1%us, 49.5%sy,  0.0%ni, 34.3%id,  0.0%wa,  0.0%hi,  2.0%si,  0.0%st
Cpu1  : 19.0%us, 51.4%sy,  0.0%ni, 28.6%id,  1.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 15.2%us, 47.6%sy,  0.0%ni, 35.2%id,  0.0%wa,  0.0%hi,  1.9%si,  0.0%st
Cpu3  : 16.8%us, 45.8%sy,  0.0%ni, 37.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  : 17.4%us, 39.4%sy,  0.0%ni, 41.3%id,  0.9%wa,  0.0%hi,  0.9%si,  0.0%st
Cpu5  : 14.7%us, 39.4%sy,  0.0%ni, 14.7%id,  0.0%wa,  0.0%hi, 31.2%si,  0.0%st
Cpu6  : 14.5%us, 30.0%sy,  0.0%ni, 52.7%id,  0.0%wa,  0.0%hi,  2.7%si,  0.0%st
Cpu7  :  9.4%us, 34.9%sy,  0.0%ni, 55.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
So we have 1439 processes, but only 9 are running and 1430 are sleeping - and quite a lot of system time is being used. But there's no ps option to display user or system time. However, this being a Linux system, we still have the /proc filesystem. The /proc/PID/stat file of a 2.6.30 kernel contains 52 fields, among them:
   2: tcomm   filename of the executable
   3: state   state (R=running, S=sleeping, D=uninterruptible wait, Z=zombie, T=traced/stopped)
  14: utime   user mode jiffies
  15: stime   kernel mode jiffies
So let's try this:
$ for PID in `ps -eo pid=`; do
     test -r /proc/$PID/stat && \
     awk '{print "user: " $14 " kernel: " $15 " state: " $3 " process: " $2}' /proc/$PID/stat
done > foo
$ sort -t: -nk3 foo | tail -5
user: 71232 kernel: 335302 state: D process: (LLAWP)
user: 71105 kernel: 342684 state: D process: (LLAWP)
user: 71290 kernel: 346119 state: D process: (LLAWP)
user: 71009 kernel: 347570 state: D process: (LLAWP)
user: 71278 kernel: 348388 state: D process: (LLAWP)
This is only the top-5, but LLAWP was using the most system time on this box while really doing nothing - the processes were in uninterruptible sleep ("D"). Restarting those processes helped and the system load returned to normal :-)
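For the record, the per-PID loop can be collapsed into a single awk invocation over /proc (Linux only), which also sidesteps the race with processes exiting mid-loop. Keep in mind that utime/stime are counted in clock ticks (see `getconf CLK_TCK`), not seconds, and that process names containing spaces will shift the fields - good enough for a quick look:

```shell
# Sum user+kernel ticks per process, straight from /proc (Linux).
# Field 2 is the executable name, fields 14/15 are utime/stime in clock ticks.
awk '{ printf "total: %d user: %d kernel: %d process: %s\n", $14 + $15, $14, $15, $2 }' \
    /proc/[0-9]*/stat 2>/dev/null | sort -n -k2 | tail -5
```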