26/06/2005 @21:19:25 ^23:20:38
Do I conform sufficiently to the ideals of nonconformity?
The eternal worry of goths, punks, ravers, hipsters and all scenesters of all retarded subcultures everywhere
ARSE, INCORPORATED
I'd like to take a break from the usual litany of obsession with a 12-year-old game to tell you how I have found myself confounded by a subtle bug in rsync. In January I bought a few bits of hardware, including a large hard disc for my second attempt at making live backups of my files. (The first attempt, in May 2003, similarly never happened, and the disc itself ended up as caco's primary disc a few months later.)
I'd found the page on snapshot backups with rsync and decided to use it, but I put off actually doing anything for a long time because I knew I'd have to write quite a complicated script to get it to work how I wanted. Since my mind never formed a sufficiently complete plan on how to do this, it never got started. But then somehow or other I discovered rsnapshot. It seemed to be the script I hadn't written, and a lot more besides!
OKAY, GET ON WITH IT, WHAT'S THE PROBLEM
After rotating previous backups away, rsnapshot first uses cp -al to make a copy of the most recent backup using hard links, so the copy takes up almost no extra disc space. It then runs rsync commands against the current backup so that only the changed files are updated. In essence they look like this:
rsync -a --delete --relative /home/ backup_point
rsync -a --delete --relative /etc/ backup_point
...
where backup_point is an rsnapshot-specific thing which will look something like /backup/hourly.0/localhost/. Note the use of both --delete and --relative and the fact that the source directory has a trailing slash. rsnapshot is very insistent on trailing slashes for directories.
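As a rough sketch of the rotate-then-link step described above (this is not rsnapshot's actual code; the function name rotate_and_link, the "keep" count and the hourly.N directories under a single root are made up for illustration, and GNU cp is assumed for the -l option):

```shell
#!/bin/sh
# Illustrative only: rotate_and_link is a hypothetical helper,
# not part of rsnapshot. Assumes GNU cp, which provides -l.
rotate_and_link() {
    rl_root=$1    # directory holding hourly.0, hourly.1, ...
    rl_keep=$2    # number of snapshots to keep (must be at least 2)

    [ "$rl_keep" -ge 2 ] || return 1

    # Drop the oldest snapshot, then shift the others up by one.
    rm -rf "$rl_root/hourly.$(($rl_keep - 1))"
    rl_i=$(($rl_keep - 2))
    while [ "$rl_i" -ge 1 ]; do
        if [ -d "$rl_root/hourly.$rl_i" ]; then
            mv "$rl_root/hourly.$rl_i" "$rl_root/hourly.$(($rl_i + 1))"
        fi
        rl_i=$(($rl_i - 1))
    done

    # Hard-link copy of the newest snapshot: new directory entries
    # pointing at the same inodes, so no file data is duplicated.
    if [ -d "$rl_root/hourly.0" ]; then
        cp -al "$rl_root/hourly.0" "$rl_root/hourly.1"
    fi
}
```

After something like rotate_and_link /backup 4, the rsync commands above would then be run with hourly.0 as the destination, and hourly.1 keeps the previous state.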
However, there is a bug in rsync 2.6.4, fixed in 2.6.5, which causes it to behave badly in exactly the circumstances in which rsnapshot calls it. In particular, it never seems to delete anything, so the amount of disc used by the backups just grows and grows as temporary files that should expire from the backups never actually do (e.g. if you rename a file, the backups will forever contain two copies of it, under the old and the new name). It's not fatal, but as you might expect I find this incredibly annoying and wasteful. The bug, or at least what I think is the bug, is in flist.c, and was fixed in revision 1.292 (diff).
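If you want to check whether a machine is running the affected release, something along these lines might do. It is only a sketch: the helper names are made up, it assumes the first line of rsync --version looks like "rsync  version 2.6.4  protocol version 29", and it flags 2.6.4 only, since that is the one version I know to be affected.

```shell
#!/bin/sh
# Hypothetical helpers, not part of rsync or rsnapshot.

# Pull the version number out of a banner line, assuming the format
# "rsync  version X.Y.Z  protocol version N".
version_from_banner() {
    printf '%s\n' "$1" |
        awk '{ for (i = 1; i < NF; i++) if ($i == "version") { print $(i + 1); exit } }'
}

# Only 2.6.4 is treated as affected; other versions are not judged.
is_affected() {
    [ "$1" = "2.6.4" ]
}

# Example use against the installed binary (commented out so that
# the sketch itself has no side effects):
# v=$(version_from_banner "$(rsync --version | head -n 1)")
# if is_affected "$v"; then
#     echo "rsync $v: --delete with --relative may not delete anything"
# fi
```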
Of course the final irritation is that rsync 2.6.5 wasn't released in time to make it into Debian Sarge, which was released earlier this month. Now, I'm not about to carry on tracking the testing distribution (Etch, or whatever it is called), since the most likely time for it to break really horribly is the first six months to a year after a new stable release. And I don't think this is critical enough for them to patch the version in stable, since it's not a security bug. (Although when I was testing things I found that, with the trailing slash, it can also delete things from the destination that it is not supposed to, so it could be a higher priority bug than normal.)
There are at least three solutions/workarounds, but I don't like any of them.
- Patch rsnapshot to drop the trailing slash. This could introduce other bugs, and I'd lose any security updates from Debian.
- Patch rsync, or just manually upgrade it to the 2.6.5 package from Etch. This would also kill automatic tracking of security updates, and with rsync being a network program you need to be more careful.
- Carry on tracking testing instead of stable. That means losing all security updates and having all sorts of problems when it breaks (e.g. just watch what happens when they do the C++ ABI transition).
Meanwhile, my backups directory continues to grow unchecked, bloated by things that should naturally expire from it...