
rTorrent: Hash check on download completion found bad chunks, consider using "safe_sync"

rTorrent would not complete a download and kept printing the following:
* [OPEN]  506.3 /  785.3 MB Rate: 0.0 / 0.0 KB Uploaded: 248.8 MB  [T R: 0.49]
* Inactive: Hash check on download completion found bad chunks, consider using "safe_sync".
Initiating a check of the torrent's hash (^R) succeeded and then rTorrent tried to download the remaining part of the file - only to fail again, printing the same message :-\

Setting safe_sync (since renamed to pieces.sync.always_safe) did not help. There's a longish, old ticket about this that got closed as "invalid". While this might have been the Right Thing™ to do (see the LKML discussion related to that issue), it contained another hint: decrease max_open_files (since renamed to network.max_open_files) to a lower value, say 64. Needless to say, this didn't help either, so maybe there's something else going on here.

strace might be able to shed some light on this, so let's give it a try. After several hours (and a good night's sleep) a 2 GB strace(1) logfile was waiting to be analyzed. I only needed the part of the logfile up to where the error message first occurred, and from there upwards I'd search for system calls returning -1, as they denote some kind of error. And lo and behold, there it was:
    mmap2(NULL, 292864, PROT_READ, MAP_SHARED, 13, 0x31100) = -1 ENOMEM (Cannot allocate memory)
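Searching a 2 GB logfile by hand is tedious, so here is a minimal sketch of how the failing calls could be filtered out with a script. It assumes plain strace output (optionally with a PID prefix, but without timestamps), and the list of ignored error codes is my own guess at what is harmless noise for a network client, not something taken from the original analysis.

```python
import re
import sys

# Matches strace lines such as:
#   mmap2(NULL, 292864, ...) = -1 ENOMEM (Cannot allocate memory)
# An optional leading PID (as written by "strace -f") is skipped.
FAILED_CALL = re.compile(
    r"^(?:\d+\s+)?(?P<name>\w+)\(.*\)\s+=\s+-1\s+(?P<errno>E[A-Z0-9]+)"
)

# Assumption: these errors are routine for non-blocking I/O and not interesting.
IGNORED = ("EAGAIN", "EINTR", "EINPROGRESS")


def failed_syscalls(lines, ignore=IGNORED):
    """Yield (line_number, syscall, errno) for every call that returned -1."""
    for number, line in enumerate(lines, start=1):
        match = FAILED_CALL.match(line.strip())
        if match and match.group("errno") not in ignore:
            yield number, match.group("name"), match.group("errno")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python failed_calls.py <strace logfile>
    with open(sys.argv[1], errors="replace") as log:
        for number, name, errno in failed_syscalls(log):
            print(f"{number}: {name} failed with {errno}")
```

The file is read line by line, so even a multi-gigabyte log does not have to fit into memory.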
Before we find out why it failed, let's see what we tried to map here. mmap2() is supposed to "map files or devices into memory":
    void *mmap2(void *addr, size_t length, int prot, int flags, int fd, off_t pgoffset);
In our case, length is 292864 (bytes) and pgoffset is 0x31100. However, this offset is given in "pagesize units", not in bytes. So, what is our page size?
$ uname -rm && getconf PAGE_SIZE
3.9.0-rc4 ppc
4096
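Let's calculate where the region rTorrent was trying to mmap2() ends (the first expression is entered in hexadecimal, i.e. with ibase=16, the second in decimal):
$ bc
1000 * 31100                <== PAGE_SIZE * 0x31100
823132160
823132160 + 292864          <== add length
823425024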
So, the mapping would end at byte 823425024, which is 785.3 MB and matches the total size rTorrent reported above. We have 1.2 GB RAM on this machine and some swap space too. Not too much, but this box mmap()'ed larger files than this before, so why would mmap2() fail with ENOMEM here?
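The arithmetic above can be double-checked without juggling number bases in bc. This is just a sketch: the constant and function names are made up for illustration, and the page size of 0x1000 (4096 bytes) is the value used in the calculation above.

```python
PAGE_SIZE = 0x1000   # 4096 bytes, as used in the bc calculation
PGOFFSET = 0x31100   # last mmap2() argument as shown by strace
LENGTH = 292864      # second mmap2() argument, in bytes


def mapping_range(pgoffset, length, page_size=PAGE_SIZE):
    """Return the (start, end) byte offsets of the mapped region in the file."""
    start = pgoffset * page_size
    return start, start + length


start, end = mapping_range(PGOFFSET, LENGTH)
print(f"mapping starts at byte {start}")
print(f"mapping ends at byte   {end}")
```

The start offset comes out as 823132160 and the end as 823425024, the same numbers as in the bc session.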

Maybe it was the "reduce max_open_files" hint that tipped me off, but I now remembered playing around with ulimit(3) a while ago. So maybe these ulimits were too tight?
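One way to check this is to look at the address space limit (RLIMIT_AS, which is what ulimit -v sets) that a process inherits from the shell. The following is a small sketch using Python's resource module; the helper names are hypothetical, and it reports the limits of the Python process itself, so it should be started from the same shell that launches rTorrent.

```python
import resource


def describe(limit):
    """Format an rlimit value in MiB, or 'unlimited' for RLIM_INFINITY."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit / 2**20:.1f} MiB"


def address_space_limits():
    """Return the (soft, hard) RLIMIT_AS values of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_AS)


soft, hard = address_space_limits()
print(f"address space limit: soft={describe(soft)}, hard={describe(hard)}")
```

If the soft limit is not "unlimited" and smaller than what the process needs to map in total, mmap() calls can fail with ENOMEM even though plenty of RAM is free.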

And they were! Setting ulimit -v ("per process address space") to a larger value made the ENOMEM go away and rTorrent was able to complete the download:
$ ls -lgo
-rw-r----- 1 823425024 Apr  1 11:38
...with exactly the size we calculated from the mmap2() call. Btw, we could've checked the file size before rTorrent completed the download, because it's a sparse file anyway.
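To see that a partially downloaded file really is sparse, its apparent size can be compared with the space actually allocated on disk. A minimal sketch, assuming a Linux system where st_blocks is counted in 512-byte units; the function name is made up for illustration.

```python
import os
import sys


def sparse_info(path):
    """Return (apparent_size, allocated_bytes) for the file at path."""
    st = os.stat(path)
    allocated = st.st_blocks * 512  # assumption: 512-byte units, as on Linux
    return st.st_size, allocated


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python sparse_info.py <file>
    size, allocated = sparse_info(sys.argv[1])
    print(f"apparent size: {size} bytes")
    print(f"allocated:     {allocated} bytes")
    if allocated < size:
        print("file is (at least partly) sparse")
```

For an unfinished download the apparent size already shows the final file size, while the allocated bytes only cover the chunks written so far.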

Update: while raising the ulimit(3) certainly resolved the ENOMEM issue, the torrent would still not complete successfully. Turns out it was a kernel bug after all, but it was resolved rather quickly.