Back up your SVN repositories to Mosso CloudFile - Kore Nordmann

Back up your SVN repositories to Mosso CloudFile

There is hardly any data I fear losing more than the SVN repositories I host for my own projects and the projects of some others. Mosso CloudFile is a fairly cheap service that lets you put your data into "the cloud" easily. To quote them on this service:

Cloud Files is reliable, scalable and affordable web-based storage for backing up and archiving all your static content.

Mosso

So I quickly calculated the costs for this, and then migrated. Sadly, there are only libraries for accessing the service from PHP and some other languages, but not for bash, even though it should be quite simple to do with curl straight from the command line.

So I quickly hacked together a short script which does that for me. In some more detail, "that" means:

  1. Schedule each repository for nightly backup, if a commit occurred during the day.

  2. Package the repository and upload it into the cloud.

  3. Ensure it arrived safely in the cloud, reschedule for backup otherwise.

Scheduling

The scheduling itself is quite easy. There is a very small script which does that for me:

#!/bin/sh
# POST-COMMIT HOOK
#
# Schedules the current repository for a backup cycle during night, because of
# an update.

DATABASE="/usr/local/svn/backup/scheduled"
REPOS="$1"

echo "$REPOS" >> "$DATABASE"

Just use that as a post-commit hook - you may need to modify the "DATABASE" location though. That file will contain a list of all repositories scheduled for backup. If you already have a post-commit hook installed, just add this to the existing hook:

/usr/local/svn/bin/scheduleBackup "$@"

You might need to modify the path, of course, to point to the script shown above.
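Because the hook appends one line per commit, the schedule file can list the same repository several times after a busy day. The backup script collapses those duplicates with `sort | uniq`. A minimal sketch of that behaviour, using a temporary file and made-up repository paths (both are assumptions for illustration only):

```shell
#!/bin/sh
# Sketch: simulate three commits to two (made-up) repositories.
DATABASE=$(mktemp)

schedule() {
    # Same append the post-commit hook performs.
    echo "$1" >> "$DATABASE"
}

schedule "/var/svn/project-a"
schedule "/var/svn/project-b"
schedule "/var/svn/project-a"

# The nightly job sees each repository only once.
SCHEDULED=$(sort "$DATABASE" | uniq)
echo "$SCHEDULED"

rm -f "$DATABASE"
```

So a repository with many commits during the day is still packaged and uploaded only once per night.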

Perform the backup

The second script is installed as a cron job on my server. I let it run once a night, which seems sufficient to me, like this:

0 2 * * * /usr/local/svn/bin/backupRepositories

It only echoes something on STDERR on failure, so there is no need to silence it in any way.

The script:

#!/bin/sh

# Mosso API keys
MOSSO_USER="user"
MOSSO_KEY="key"
MOSSO_CONTAINER="svn_backup"

# Locations and paths
DATABASE="/usr/local/svn/backup/scheduled"
STORAGE="/usr/local/svn/backup"

# Read scheduled repositories
REPOS=`cat "${DATABASE}" | sort | uniq`
if [ -z "${REPOS}" ]; then
    exit 0
fi

# Clear scheduled backups, everything will be rescheduled on failure
# (":" instead of "echo -n", which is not portable across /bin/sh variants)
: > "${DATABASE}"

# Request mosso auth token
AUTH_RESPONSE=`curl -s --include -H "X-Auth-User: ${MOSSO_USER}" -H "X-Auth-Key: ${MOSSO_KEY}" "https://api.mosso.com/auth"`
AUTH_KEY=`echo "${AUTH_RESPONSE}" | tr '\r' '\n' | grep "X-Auth-Token" | awk '{print $2}'`
AUTH_URL=`echo "${AUTH_RESPONSE}" | tr '\r' '\n' | grep "X-Storage-Url" | awk '{print $2}'`

# Create archives of all scheduled repositories
for REPO in $REPOS
do
    REPONAME=`basename "${REPO}"`

    # Keep only one local backup per repository
    SVN_HOTBACKUP_NUM_BACKUPS="1" \
        svn-hot-backup --archive-type=bz2 "${REPO}" "${STORAGE}" > /dev/null

    # Normalize repository name, to not waste backup space
    ARCHIVE="${STORAGE}/${REPONAME}.tar.bz2"
    mv `ls "${STORAGE}"/"${REPONAME}"-*.tar.bz2` "${ARCHIVE}"

    # Gather additional repository information
    REVISION=`svnlook youngest "${REPO}"`
    DATE=`date`
    LAST_CHANGE=`svnlook date "${REPO}"`

    # Upload; the ETag header carries the MD5 sum of the archive
    MD5=`md5sum "${ARCHIVE}" | awk '{print $1}'`
    FILENAME=`basename "${ARCHIVE}"`
    RETURN=`curl -s -X PUT \
        -H "ETag: ${MD5}" \
        -H "X-Auth-Token: ${AUTH_KEY}" \
        -H "X-Object-Meta-Date: ${DATE}" \
        -H "X-Object-Meta-Revision: ${REVISION}" \
        -H "X-Object-Meta-LastChange: ${LAST_CHANGE}" \
        -H "Expect:" \
        -H "Content-Type: application/x-bzip" \
        --data-binary "@${ARCHIVE}" "${AUTH_URL}/${MOSSO_CONTAINER}/${FILENAME}"`

    # Any response body is treated as a failure
    if [ -n "${RETURN}" ]; then
        echo "Upload failed: ($ARCHIVE, Revision: ${REVISION}): ${RETURN}" >&2

        # Reschedule backup
        echo "${REPO}" >> "${DATABASE}"
    fi

    # Remove repository local archive
    rm "${ARCHIVE}"
done
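The auth step in the backup script works by requesting the response headers with `--include` and picking the token and storage URL out of them. The following sketch shows that parsing in isolation. The sample response and the `get_header` helper are invented for illustration and are not real service output; unlike the script, the helper deletes carriage returns instead of turning them into newlines.

```shell
#!/bin/sh
# Sketch: extract header values from a raw HTTP response.
# SAMPLE_RESPONSE is made up for this example.
get_header() {
    # $1 = raw response, $2 = header name; prints the header's value
    printf '%s\n' "$1" | tr -d '\r' | grep -i "^$2:" | awk '{print $2}'
}

SAMPLE_RESPONSE=$(printf 'HTTP/1.1 204 No Content\r\nX-Storage-Url: https://storage.example.invalid/v1/account\r\nX-Auth-Token: 0123abcd\r\n\r\n')

AUTH_KEY=$(get_header "$SAMPLE_RESPONSE" "X-Auth-Token")
AUTH_URL=$(get_header "$SAMPLE_RESPONSE" "X-Storage-Url")

# Without a token and storage URL every upload would fail, so stop early.
if [ -z "${AUTH_KEY}" ] || [ -z "${AUTH_URL}" ]; then
    echo "Authentication failed, no token or storage URL received" >&2
    exit 1
fi

echo "token: ${AUTH_KEY}"
echo "url:   ${AUTH_URL}"
```

An early check like this is optional. As written, the backup script would simply report every upload as failed and reschedule the repositories.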

You should of course modify the variables at the top of the backup script according to your needs. You also need to have curl installed with SSL support and the `svn-hot-backup` script, which is at least part of the distribution package on Gentoo.
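If you are unsure whether everything is in place, a quick check along these lines might help before the first cron run. It is only a sketch: `find_missing` is a hypothetical helper, and the tool list is simply what the backup script calls.

```shell
#!/bin/sh
# Sketch: report missing prerequisites of the backup script.
find_missing() {
    # Prints every name from the argument list that is not on the PATH.
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

MISSING=$(find_missing curl svn-hot-backup svnlook md5sum)
if [ -n "$MISSING" ]; then
    echo "Missing tools:" $MISSING >&2
fi

# curl lists the protocols it was built with; https has to be among them.
if command -v curl >/dev/null 2>&1; then
    curl --version | grep -q '^Protocols:.* https' \
        || echo "curl was built without HTTPS support" >&2
fi
```

Like the backup script itself, it stays silent when everything is fine and only writes to STDERR otherwise.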

Even though the containers on CloudFile are "private" by default, I would not upload any sensitive data unencrypted. You might want to pipe your repository archives through GPG or similar. My repositories all contain open source stuff anyway, so I don't care about that.
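I do not use this myself, but an encryption step could look roughly like the following sketch. `RECIPIENT` is a placeholder and `encrypt_archive` is a hypothetical helper; it assumes the recipient's public key is already in the keyring of the user running the cron job.

```shell
#!/bin/sh
# Sketch: encrypt an archive with GPG before uploading it.
# RECIPIENT is a placeholder, replace it with a key from your keyring.
RECIPIENT="backup@example.invalid"

encrypt_archive() {
    # $1 = path to the archive; writes "$1.gpg" and prints that path.
    gpg --batch --yes --encrypt --recipient "${RECIPIENT}" \
        --output "$1.gpg" "$1" && echo "$1.gpg"
}

# Possible use inside the backup loop, after the archive has been created:
#   ENCRYPTED=$(encrypt_archive "${ARCHIVE}")
# then upload "${ENCRYPTED}" instead of "${ARCHIVE}".
```

With this change the MD5 sum, the uploaded file name and the Content-Type header would have to refer to the encrypted file, and both local files need to be removed afterwards.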

Have fun using it, or playing with it. Any feedback is welcome. Even though the script is probably too trivial for any copyright to apply, I hereby state that it is public domain, in case you need to care about licenses.
