
gzip vs. pigz vs. bzip2 vs. pbzip2

Shortly after the last benchmark, I came across pigz (parallel gzip) and a bigger (real-world) task to complete:
$ time gzip -c file.tar > file.tar.gz
real     41m52.636s
user     33m58.392s
sys       2m26.903s

$ time pigz -c file.tar > file.tar.pigz
real     18m34.894s
user     54m07.784s
sys       3m47.910s

$ time bzip2 -c file.tar > file.tar.bz2
real    838m47.771s
user    830m48.621s
sys       2m18.429s

$ time pbzip2 -c file.tar > file.tar.pbz2
real     58m06.466s
user   1748m17.785s
sys       4m49.537s

$ ls -lhgo
-rw-r--r--   1  15G Jun 24 02:03 file.tar
-rw-r--r--   1 598M Jun 24 22:10 file.tar.gz
-rw-r--r--   1 600M Jun 24 21:02 file.tar.pigz
-rw-r--r--   1 304M Jun 25 12:44 file.tar.bz2
-rw-r--r--   1 306M Jun 25 13:42 file.tar.pbz2
Hardware: Sun SPARC Enterprise T5120, 1.2GHz 8-Core SPARC V9, 4GB RAM
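To put the wall-clock numbers in relation to each other, here's a small sketch that converts the "real" values printed by time into seconds and divides them. The helper names to_seconds and speedup are made up for this example; the figures fed in are the ones from the runs above.

```shell
#!/bin/sh
# Convert a time(1)-style value such as "41m52.636s" into seconds.
# (to_seconds and speedup are hypothetical helper names.)
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times faster the second run was than the first.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1fx\n", a / b }'
}

speedup 41m52.636s 18m34.894s     # gzip  vs. pigz   (real)
speedup 838m47.771s 58m06.466s    # bzip2 vs. pbzip2 (real)
```

With the numbers above that works out to roughly 2.3x for pigz and roughly 14x for pbzip2, at almost identical file sizes.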

gzip vs. bzip2 vs. pbzip2 vs. xz vs. lzma

Yes, another benchmark. And yes, YMMV - big time:
$ ls -lh file.tar
-rw-r--r-- 1 bob bob 80M May 27 12:36 file.tar

$ compress-test.sh file.tar
### gzip/9c:    6 seconds / 48.700% smaller
### bzip2/9c:   44 seconds / 50.300% smaller
### pbzip2/9c:  22 seconds / 50.200% smaller
### xz/9c:      79 seconds / 53.400% smaller
### lzma/9c:    120 seconds / 53.200% smaller
### gzip/1c:    4 seconds / 46.200% smaller
### bzip2/1c:   30 seconds / 48.100% smaller
### pbzip2/1c:  15 seconds / 48.000% smaller
### xz/1c:      35 seconds / 51.100% smaller
### lzma/1c:    18 seconds / 49.700% smaller
### gzip/dc:    0 seconds
### bzip2/dc:   7 seconds
### pbzip2/dc:  4 seconds
### xz/dc:      4 seconds
### lzma/dc:    5 seconds
Versions used:
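The compress-test.sh script itself is not shown here. As a rough idea of what such a script could do, the following sketch times one compressor at one level and prints a line in the same shape as the output above. The function names percent_smaller and bench_one are assumptions made for this example, not the original script.

```shell
#!/bin/sh
# Percentage by which a compressed file is smaller than the original.
# Arguments: original size and compressed size, both in bytes.
percent_smaller() {
    awk -v o="$1" -v c="$2" 'BEGIN { printf "%.3f\n", (1 - c / o) * 100 }'
}

# Time one compressor at one level (hypothetical helper).
# Usage: bench_one gzip 9 file.tar
bench_one() {
    tool=$1 level=$2 file=$3
    out=$(mktemp) || return 1
    start=$(date +%s)
    "$tool" "-$level" -c "$file" > "$out"
    end=$(date +%s)
    orig=$(wc -c < "$file")
    comp=$(wc -c < "$out")
    rm -f "$out"
    echo "### $tool/${level}c: $((end - start)) seconds / $(percent_smaller "$orig" "$comp")% smaller"
}
```

Something like "bench_one gzip 9 file.tar" would then print one result line. Since date +%s only has one-second resolution, short runs show up as 0 seconds, just like the gzip decompression above.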

ext2 vs. ext3 vs. ext4

I always wondered if those ext* mount options did anything performance-wise. Turns out they do, kind of:
FS   mount option    avg over 3 runs
------------------------------------
ext2 option: orlov        - 56.3333 sec
ext2 option: oldalloc     - 66.3333 sec
ext2 option: atime        - 62.6667 sec
ext2 option: noatime      - 57.3333 sec
ext2 option: data=journal   ---
ext2 option: data=ordered   ---
ext2 option: data=writeback ---
ext2 option: acl          - 59 sec
ext2 option: noacl        - 57.6667 sec
ext2 option: user_xattr   - 59 sec
ext2 option: nouser_xattr - 59 sec

ext3 option: orlov        - 61.3333 sec
ext3 option: oldalloc     - 62.3333 sec
ext3 option: atime        - 62.3333 sec
ext3 option: noatime      - 60.6667 sec
ext3 option: data=journal   - 114 sec
ext3 option: data=ordered   - 62.6667 sec
ext3 option: data=writeback - 61.6667 sec
ext3 option: acl          - 62.6667 sec
ext3 option: noacl        - 61.6667 sec
ext3 option: user_xattr   - 64.3333 sec
ext3 option: nouser_xattr - 60.6667 sec

ext4 option: orlov        - 49.6667 sec
ext4 option: oldalloc     - 52.6667 sec
ext4 option: atime        - 49.6667 sec
ext4 option: noatime      - 50 sec
ext4 option: data=journal   - 101.333 sec
ext4 option: data=ordered   - 49.3333 sec
ext4 option: data=writeback - 51 sec
ext4 option: acl          - 48.6667 sec
ext4 option: noacl        - 51.6667 sec
ext4 option: user_xattr   - 49.6667 sec
ext4 option: nouser_xattr - 50.6667 sec
This was done by a script extracting a ~800MB tarball onto a freshly created ext* filesystem, 3 times in a row.
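The script is not included, so here is only a sketch of how one such measurement could look: create the filesystem, mount it with a single option, time the extraction, and average the runs. DEV, MNT and TARBALL are placeholders, and average and run_once are hypothetical helper names. Note that run_once needs root and wipes whatever is on $DEV.

```shell
#!/bin/sh
# Average a list of run times. awk's default number format prints
# values like 56.3333, the same shape as in the table above.
average() {
    echo "$@" | awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i; print s / NF }'
}

# Hypothetical benchmark step: mkfs, mount with one option, extract, unmount.
# Needs root and destroys all data on $DEV.
run_once() {
    fs=$1 opt=$2
    "mkfs.$fs" -q "$DEV" || return 1
    mount -t "$fs" -o "$opt" "$DEV" "$MNT" || return 1
    start=$(date +%s)
    tar -xf "$TARBALL" -C "$MNT" && sync
    end=$(date +%s)
    umount "$MNT"
    echo $((end - start))
}

# Example (not executed here):
#   average "$(run_once ext4 noatime)" "$(run_once ext4 noatime)" "$(run_once ext4 noatime)"
```

Recreating the filesystem before every run keeps the runs comparable, and the sync makes sure the time includes getting the data to disk rather than just into the page cache.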

space vs. time

# time pbzip2 -c wordlist.txt > wordlist.txt.bz2
real    41m53.295s
user    67m17.972s
sys     5m38.981s

# time 7z a -m0=lzma -mx=9 -mfb=64 -md=32m -ms=on wordlist.txt.7z wordlist.txt
real    525m35.861s
user    446m31.866s
sys     32m20.861s

# ls -lhgo
total 31G
-rw------- 1  25G 2008-12-16 00:55 wordlist.txt
-rw------- 1 776M 2008-12-17 01:09 wordlist.txt.7z
-rw------- 1 5.0G 2008-12-16 08:46 wordlist.txt.bz2
....'nuff said.

and the winner is...

Ah, benchmarks - what else would we spend our CPU cycles on anyway? Quite a long time ago I was surprised to see that awk was so much slower than grep. This was a long time ago and I don't remember all the details, but there was sort involved too, and it was GNU/grep vs. Solaris/awk, IIRC. Anyway, here's what I did just now:
# ls -lhgo du.all; wc -l du.all 
 -rw-r--r--    1     2.2M Jan  7 17:26 du.all
          23773 du.all

# time sort -n du.all | grep -v /home > /dev/null 
real	0m8.939s
user	0m8.920s
sys	0m0.010s

# time grep -v /home du.all | sort -n > /dev/null 
real	0m25.694s
user	0m25.670s
sys	0m0.010s

# time awk '!/\/home/' du.all | sort -n > /dev/null 
real	0m0.622s
user	0m0.620s
sys	0m0.010s
Yes, the sort(1) is not even relevant here, it's really grep(1) taking so long. There's a --mmap switch to grep, promising better performance and sometimes coredumps; neither happened. This was done with GNU sort-4.5.3, GNU Awk 3.1.1, GNU grep 2.5.1. Oh, yeah - these may have been "current" versions back in ~2002 :)
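One thing worth ruling out with timings like these is the locale: older GNU grep releases were known to be considerably slower in UTF-8 locales than in the C locale. Whether that played a role here is purely an assumption, as the locale of the test box is not recorded. The sketch below runs both filters under LC_ALL=C and checks that they produce the same result; compare_filters is a made-up helper name.

```shell
#!/bin/sh
# Run both filters under the C locale and check that they agree.
# compare_filters is a hypothetical helper; pass it the file to filter.
compare_filters() {
    file=$1
    a=$(LC_ALL=C grep -v /home "$file" | LC_ALL=C sort -n)
    b=$(LC_ALL=C awk '!/\/home/' "$file" | LC_ALL=C sort -n)
    [ "$a" = "$b" ] && echo same || echo different
}

# To time one of them, wrap the pipeline as in the runs above, e.g.:
#   time sh -c 'LC_ALL=C grep -v /home du.all | LC_ALL=C sort -n > /dev/null'
```

If the grep timing drops a lot under LC_ALL=C, the locale was the culprit. If not, the cause lies elsewhere.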

my bonnie++ lies over the ocean

Got myself new disk space today: 2x1TB Samsung HD103UJ, enclosed in a Taurus RAID case. First tests in Leopard say 35MB/s over FireWire400, more testing ahead. I was surprised to see that the bonnie++ filesystem benchmark was not available via Fink. Compiling from source failed with:
g++ -O2  -DNDEBUG -Wall -W -Wshadow -Wpointer-arith -Wwrite-strings -pedantic 
        -ffor-scope  -c getc_putc.cpp
getc_putc.cpp: In function ‘int main(int, char**)’:
getc_putc.cpp:174: error: no matching function for call to 
                          ‘min(long unsigned int, unsigned int)’
make: *** [getc_putc.o] Error 1
There is a patch for some 64-bit platforms, but the ifdef does not catch MacOS X, I guess. So, here's a slightly edited version of this very patch:
--- getc_putc.cpp.ORIG	2008-07-12 22:43:34.000000000 +0200
+++ getc_putc.cpp	2008-07-12 22:53:11.000000000 +0200
@@ -17,6 +17,10 @@
 #include "duration.h"
 #include "getc_putc.h"
 
+/* Work around for: line 168, no matching function for call to... */
+#include <sys/param.h>
+#define min MIN
+
 static void usage()
 {
   fprintf(stderr, "usage:\n"
Btw, first benchmarks of GNU/Linux 2.6.24 with the internal 120GB drive are here.

testing xfs again with slightly more interesting results ;)

Since my last benchmark with XFS was kinda stupid (testing 512 MB of data on a box with 1GB RAM), I tested again, this time with 4GB of data.
  • "Sequential output" and "random delete" seem to be higher with an external logdev set (here: l_logdevhda5_size67108864)
  • In other places the external log seems to slow down operations. Well, the logdev (hda5) *is* slower than the devices the tests were run on, but I somehow thought the journal would fit into RAM anyway. Hm, OTOH a journal written to RAM makes no sense, does it?
  • adjusting "-b size" from 4096 to 512 during mkfs(8) does not seem to change much, except for "sequential output" (+1MB/s) and "sequential delete" (twice as many deletes/s)
  • adjusting the "-l size" to 4MB decreased 'random deletes' (with 64MB it's twice as fast)
The mount options do not seem to do much, but I really need to learn gnuplot(1) to generate nice graphs out of all these fine numbers....
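For reference, the combinations mentioned above map to mkfs.xfs and mount options roughly as follows. This is a dry-run sketch that only prints the commands. xfs_cmds is a hypothetical helper, /dev/hdb1 and /mnt/bench are placeholder names, and only hda5 as the log device comes from the tests above.

```shell
#!/bin/sh
# Print (not run) the mkfs.xfs and mount commands for one test combination.
# Usage: xfs_cmds DATA_DEV BLOCK_SIZE LOG_SIZE [LOG_DEV]
xfs_cmds() {
    dev=$1 bsize=$2 lsize=$3 logdev=${4:-}
    if [ -n "$logdev" ]; then
        # External log: the log device has to be given again at mount time.
        echo "mkfs.xfs -f -b size=$bsize -l logdev=$logdev,size=$lsize $dev"
        echo "mount -t xfs -o logdev=$logdev $dev /mnt/bench"
    else
        echo "mkfs.xfs -f -b size=$bsize -l size=$lsize $dev"
        echo "mount -t xfs $dev /mnt/bench"
    fi
}

xfs_cmds /dev/hdb1 4096 67108864 /dev/hda5   # external 64MB log
xfs_cmds /dev/hdb1 512 4m                    # 512-byte blocks, internal 4MB log
```

Printing instead of executing makes it easy to eyeball the whole test matrix before letting it loose on a real device. Newer xfsprogs versions may refuse some of these values (such as 512-byte blocks) with their default settings.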

testing -mm, playing around with xfs

I'm tracking -mm too and finally got around to benchmarking it. Out of curiosity about the numerous options for mkfs(8) and mount(8) I did a few benchmarks. The results however are a bit boring and I for one have no reason to tweak these options for a desktop machine. OTOH, the bonnie++ options could be altered again to test each combination more thoroughly.

benchmarks, 2.6.19-rc6-git11 @ raid1

Yet another test run: this time the 2xHD400LD were combined as a RAID-1 and the benchmarks were done while X11 was running and so was the Folding@home client. As I'd like to use dm-crypt instead of loop-aes, I've skipped a few "uninteresting" ciphers. Here are the results.

benchmarks galore

I'm thinking about giving up my old benchmark frontpage and posting benchmark results right here in this weblog. Easier for me to maintain and much healthier for your eyes ;-) Let me start with all the old stuff first, my next post will deal with more current data. The filesystems tested are the usual suspects: ext2, ext3, jfs, reiser, reiser4 (only a few results available) and xfs, for different kernels and with different benchmark programs like tiobench, Bonnie++, ioZone and even generic system tools like cp, rm, tar, dd. loop-aes was used to test encrypted volumes as well. I've used several benchmark scripts, starting with tio-bench.sh, which has been superseded by bench.sh. And here's yet another one, though I've forgotten when I was using it. Ugly as hell, they served my needs, and all they did was execute these commands:
bonnie++ -f -s 2048 -r 0 -n 10:10240:10
tiobench.pl --size 2048 --numruns 1
iozone -s 2048 -f /mnt/bench/iozone.img
tar -xjf linux.tar.bz2 -C /mnt/bench/tarball
find /mnt/bench/tarball
cp -a /mnt/bench/tarball /mnt/bench/tarball_copy
rm -rf /mnt/bench/tarball
dd if=/dev/zero of=/mnt/bench/test.img bs=1M count=2048
dd if=/mnt/bench/test.img of=/dev/null bs=1M count=2048
The mount options used:
-t ext2fs -o noatime
-t ext3fs -o noatime,data=ordered
-t jfs -o noatime,integrity
-t reiserfs -o noatime,notail
-t reiser4 -o noatime
-t xfs -o noatime,notail
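The scripts themselves are not reproduced here. A minimal sketch of the timing part, assuming nothing more than a wrapper that runs one command and reports the elapsed seconds, could look like this. The name timed is made up for this example and is not taken from bench.sh.

```shell
#!/bin/sh
# Run a command and report how long it took, in whole seconds.
# timed is a hypothetical helper, not taken from bench.sh.
# Usage: timed LABEL COMMAND [ARGS...]
timed() {
    label=$1; shift
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "$label: $((end - start)) sec (exit $status)"
    return "$status"
}

# Example:
#   timed untar tar -xjf linux.tar.bz2 -C /mnt/bench/tarball
#   timed find  find /mnt/bench/tarball
```

Printing the exit status along with the time helps to spot runs that were only fast because the command failed early.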
I've tested different machines, very irregularly and only a few results are available for each box: Enjoy!