Back up your SVN repositories to Mosso CloudFile

First published at Saturday, 11 April 2009

Warning: This blog post is more than 14 years old – read and use with care.


There is almost no data I fear losing more than the SVN repositories I host for my own projects and the projects of some others. Mosso CloudFile is a fairly cheap service that lets you put your data easily into "the cloud". To cite their own description of the service:

Cloud Files is reliable, scalable and affordable web-based storage for backing up and archiving all your static content.


So I quickly calculated the costs and migrated. Sadly, there are only libraries for accessing the service from PHP and some other languages, but not from bash - even though it should be quite simple to do with curl straight from the command line.

So I quickly hacked together a short script which does that for me. "That" means in some more detail:

  1. Schedule each repository for nightly backup, if a commit occurred during the day.

  2. Package the repository and upload it into the cloud.

  3. Ensure it arrived safely in the cloud, reschedule for backup otherwise.


The scheduling itself is quite easy. There is a very small script which does that for me:

#!/bin/sh
# POST-COMMIT HOOK
#
# Schedules the current repository for a backup cycle during night, because of
# an update.

DATABASE="/usr/local/svn/backup/scheduled"
REPOS="$1"

echo "$REPOS" >> "$DATABASE"

Just use that as a post-commit hook - you may need to modify the "DATABASE" location, though. The file will contain a list of all repositories scheduled for backup. If you already have a post-commit hook installed, just add this to the existing hook:

/usr/local/svn/bin/scheduleBackup "$@"

You might need to modify the path, of course, to point to the script shown above.
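The schedule file itself is nothing more than a flat list of repository paths, one line per commit. Multiple commits to the same repository therefore appear multiple times, but the nightly job collapses them with `sort | uniq`, so each repository is only backed up once. A small sketch of this behaviour, with hypothetical repository paths:

```shell
#!/bin/sh
# Sketch of the schedule database: every commit appends a line, the
# nightly job de-duplicates. The repository paths are hypothetical.
DATABASE=`mktemp`

# Three commits during the day, two of them to the same repository
echo "/usr/local/svn/projects/foo" >> "${DATABASE}"
echo "/usr/local/svn/projects/bar" >> "${DATABASE}"
echo "/usr/local/svn/projects/foo" >> "${DATABASE}"

# What the backup job actually iterates over: only two entries remain
REPOS=`cat "${DATABASE}" | sort | uniq`
echo "${REPOS}"

rm "${DATABASE}"
```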

Perform the backup

The second script is installed as a cron job on my server; I let it run once a night, which seems sufficient to me:

0 2 * * * /usr/local/svn/bin/backupRepositories

It only echoes something on STDERR on failure, so there is no need to silence it in any way.

The script:

#!/bin/sh

# Mosso API keys
MOSSO_USER="user"
MOSSO_KEY="key"
MOSSO_CONTAINER="svn_backup"

# Locations and paths
DATABASE="/usr/local/svn/backup/scheduled"
STORAGE="/usr/local/svn/backup"

# Read scheduled repositories
REPOS=`cat "${DATABASE}" | sort | uniq`
if [ -z "${REPOS}" ]; then
    exit 0
fi

# Clear scheduled backups, everything will be rescheduled on failure
echo -n > "${DATABASE}"

# Request mosso auth token
AUTH_RESPONSE=`curl -s --include -H "X-Auth-User: ${MOSSO_USER}" -H "X-Auth-Key: ${MOSSO_KEY}" ""`
AUTH_KEY=`echo "${AUTH_RESPONSE}" | tr '\r' '\n' | grep "X-Auth-Token" | awk '{print $2}'`
AUTH_URL=`echo "${AUTH_RESPONSE}" | tr '\r' '\n' | grep "X-Storage-Url" | awk '{print $2}'`

# Create archives of all scheduled repositories
for REPO in $REPOS
do
    REPONAME=`basename "${REPO}"`
    SVN_HOTBACKUP_NUM_BACKUPS="1" svn-hot-backup --archive-type=bz2 "${REPO}" "${STORAGE}" > /dev/null

    # Normalize repository name, to not waste backup space
    ARCHIVE="${STORAGE}/${REPONAME}.tar.bz2"
    mv `ls "${STORAGE}"/"${REPONAME}"-*.tar.bz2` "${ARCHIVE}"

    # Gather additional repository information
    REVISION=`svnlook youngest "${REPO}"`
    DATE=`date`
    LAST_CHANGE=`svnlook date "${REPO}"`

    # Upload
    MD5=`md5sum "${ARCHIVE}" | awk '{print $1}'`
    FILENAME=`basename "${ARCHIVE}"`
    RETURN=`curl -s -X PUT \
        -H "ETag: ${MD5}" \
        -H "X-Auth-Token: ${AUTH_KEY}" \
        -H "X-Object-Meta-Date: ${DATE}" \
        -H "X-Object-Meta-Revision: ${REVISION}" \
        -H "X-Object-Meta-LastChange: ${LAST_CHANGE}" \
        -H "Expect:" \
        -H "Content-Type: application/x-bzip" \
        --data-binary "@${ARCHIVE}" "${AUTH_URL}/${MOSSO_CONTAINER}/${FILENAME}"`

    if [ -n "${RETURN}" ]; then
        echo "Upload failed: (${ARCHIVE}, Revision: ${REVISION}): ${RETURN}" >&2

        # Reschedule backup
        echo "${REPO}" >> "${DATABASE}"
    fi

    # Remove repository local archive
    rm "${ARCHIVE}"
done

You should, of course, modify the variables in the first lines of the file according to your needs. You also need curl built with SSL support, and the `svn-hot-backup` script, which on Gentoo at least ships with the Subversion distribution package.
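The "ensure it arrived safely" step works through the ETag header: the script sends the archive's local MD5 sum along with the PUT, Cloud Files recomputes the hash over the bytes it received, and the request fails on a mismatch - which is why any non-empty curl response triggers a reschedule. The checksum part in isolation, run against a stand-in file:

```shell
#!/bin/sh
# Compute the MD5 checksum that is sent as the ETag header. The
# server recomputes it over the received bytes; a mismatch fails
# the PUT. The archive here is just a stand-in temp file.
ARCHIVE=`mktemp`
printf 'dummy archive contents' > "${ARCHIVE}"

MD5=`md5sum "${ARCHIVE}" | awk '{print $1}'`
echo "${MD5}"

rm "${ARCHIVE}"
```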

Even though the containers on CloudFile are "private" by default, I would not upload any sensitive data unencrypted. You might want to pipe your repository archives through GPG or similar. My repositories all contain open source stuff anyway, so I don't care about that.
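For sensitive repositories, a symmetric GPG pass before the upload would be enough. A minimal sketch, assuming gpg 2.x is installed; the passphrase is inlined purely for demonstration - in a real script you would read it via `--passphrase-file` from a file only the backup user can access:

```shell
#!/bin/sh
# Sketch: encrypt the archive before handing it to curl. The
# passphrase is inlined for demonstration only; use
# --passphrase-file with a restricted file in practice.
ARCHIVE=`mktemp`
printf 'archive payload' > "${ARCHIVE}"

gpg --batch --yes --symmetric --cipher-algo AES256 \
    --pinentry-mode loopback --passphrase "demo-passphrase" \
    --output "${ARCHIVE}.gpg" "${ARCHIVE}"

# The .gpg file is what would be uploaded instead of the plain archive
SIZE=`wc -c < "${ARCHIVE}.gpg"`
echo "encrypted: ${SIZE} bytes"

rm "${ARCHIVE}" "${ARCHIVE}.gpg"
```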

Have fun using it, or playing with it. Any feedback is welcome. Even though the script is probably too trivial for any copyright to apply, I hereby declare it public domain - in case you need to care about licenses.


Jakob Westhoff at Saturday, 11.4. 2009

Thanks for this neat little script.

As you know I am already using it to back up my SVN repositories as well. It works like a charm.

greetings Jakob

Franziskus Domig at Sunday, 12.4. 2009

This is the reason (or at least one) why SVN/CVS/Random-Centralized-SCM-Tool sucks.

Do it (really) right. Use GIT. ;)

kore at Sunday, 12.4. 2009

@Franziskus Domig: Actually you should also back up the master branch of any decentralized SCM tool. You might not lose that much in case of hardware failure, but you still wouldn't want to reconstruct the possibly missing parts.

And DVCS vs. VCS is not only about the tool (and you know, they all have issues), but also about the project management. I do use git-svn for example, but still - a central source tree, to which multiple persons have partial access, fits some projects best.

Frank at Thursday, 29.4. 2010

Do you have an idea how long the X-Auth-Token stays valid? Is it indefinite, or only for a period of time?

Kore at Thursday, 29.4. 2010

@Frank: No - for long-running scripts one should handle the token's expiry gracefully, I guess.

Frank at Thursday, 29.4. 2010

@Kore - any idea how to determine the length? :) Tried to ask the Mosso support, but they never seem to be sure of their answers ...
