Back up your SVN repositories to Mosso CloudFile
First published on Saturday, 11 April 2009
Warning: This blog post is more than 16 years old; read and use with care.
There is hardly any data I fear losing more than the SVN repositories I host for my own projects and for the projects of some others. Mosso CloudFile is a fairly cheap service that lets you put your data into "the cloud" easily. To quote them on this service:
Cloud Files is reliable, scalable and affordable web-based storage for backing up and archiving all your static content.
So I quickly calculated the costs for this and then migrated. Sadly, there are only libraries for accessing the service from PHP and some other languages, but not from bash, even though it should be quite simple to do with curl straight from the command line.
So I quickly hacked together a short script which does that for me. In more detail, "that" means:
Schedule each repository for nightly backup, if a commit occurred during the day.
Package the repository and upload it into the cloud.
Ensure it arrived safely in the cloud, reschedule for backup otherwise.
Scheduling
The scheduling itself is quite easy. There is a very small script which does that for me:
#!/bin/sh
# POST-COMMIT HOOK
#
# Schedules the current repository for a backup cycle during night, because of
# an update.
DATABASE="/usr/local/svn/backup/scheduled"
REPOS="$1"
echo "$REPOS" >> "$DATABASE"
Just use that as a post-commit hook; you may need to modify the "DATABASE" location, though. The file it points to holds a list of all repositories already scheduled for backup. If you already have a post-commit hook installed, just add this to the existing hook:
/usr/local/svn/bin/scheduleBackup "$@"
You might need to modify the path, of course, to point to the script shown above.
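Because the hook appends a line on every commit, a busy repository is listed many times until the nightly run collapses the duplicates. The following is a minimal sketch of a variant that only appends a repository once. It assumes a `grep` that supports `-F`, `-x` and `-q` (all POSIX); the function name `schedule_backup` is made up for this illustration and is not part of the original hook.

```shell
#!/bin/sh
# Sketch: schedule a repository only if it is not already listed.
# The DATABASE default matches the hook above; schedule_backup is a
# hypothetical name chosen for this example.
DATABASE="${DATABASE:-/usr/local/svn/backup/scheduled}"

schedule_backup() {
    repos="$1"
    # -F: fixed string, -x: match the whole line, -q: no output.
    # A missing database file counts as "not scheduled yet".
    if ! grep -Fxq -- "$repos" "$DATABASE" 2>/dev/null; then
        echo "$repos" >> "$DATABASE"
    fi
}

# When used as a hook, Subversion passes the repository path as $1.
if [ -n "$1" ]; then
    schedule_backup "$1"
fi
```

Two commits arriving at the same moment could still both append the same path, so the deduplication in the backup script remains useful.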
Perform the backup
The second script is installed as a cron job on my server. I let it run once a night, which seems sufficient to me:
0 2 * * * /usr/local/svn/bin/backupRepositories
It only echoes something on STDERR on failure, so there is no need to silence it in any way.
The script:
#!/bin/sh
# Mosso API keys
MOSSO_USER="user"
MOSSO_KEY="key"
MOSSO_CONTAINER="svn_backup"
# Locations and paths
DATABASE="/usr/local/svn/backup/scheduled"
STORAGE="/usr/local/svn/backup"
# Read scheduled repositories
REPOS=`sort -u "${DATABASE}"`
if [ -z "${REPOS}" ]; then
exit 0
fi
# Clear scheduled backups, everything will be rescheduled on failure
# (": >" truncates the file; "echo -n" is not portable across /bin/sh variants)
: > "${DATABASE}"
# Request mosso auth token
AUTH_RESPONSE=`curl -s --include -H "X-Auth-User: ${MOSSO_USER}" -H "X-Auth-Key: ${MOSSO_KEY}" "https://api.mosso.com/auth"`
AUTH_KEY=`echo "${AUTH_RESPONSE}" | tr '\r' '\n' | grep "X-Auth-Token" | awk '{print $2}'`
AUTH_URL=`echo "${AUTH_RESPONSE}" | tr '\r' '\n' | grep "X-Storage-Url" | awk '{print $2}'`
# Create archives of all scheduled repositories
for REPO in $REPOS
do
REPONAME=`basename "${REPO}"`
SVN_HOTBACKUP_NUM_BACKUPS="1" svn-hot-backup --archive-type=bz2 "${REPO}" "${STORAGE}" > /dev/null
# Normalize repository name, to not waste backup space
ARCHIVE="${STORAGE}/${REPONAME}.tar.bz2"
mv `ls "${STORAGE}"/"${REPONAME}"-*.tar.bz2` "${ARCHIVE}"
# Gather additional repository information
REVISION=`svnlook youngest "${REPO}"`
DATE=`date`
LAST_CHANGE=`svnlook date "${REPO}"`
# Upload
MD5=`md5sum "${ARCHIVE}" | awk '{print $1}'`
FILENAME=`basename "${ARCHIVE}"`
RETURN=`curl -s -X PUT \
-H "ETag: ${MD5}" \
-H "X-Auth-Token: ${AUTH_KEY}" \
-H "X-Object-Meta-Date: ${DATE}" \
-H "X-Object-Meta-Revision: ${REVISION}" \
-H "X-Object-Meta-LastChange: ${LAST_CHANGE}" \
-H "Expect:" \
-H "Content-Type: application/x-bzip" \
--data-binary "@${ARCHIVE}" "${AUTH_URL}/${MOSSO_CONTAINER}/${FILENAME}"`
if [ -n "${RETURN}" ]; then
echo "Upload failed: ($ARCHIVE, Revision: ${REVISION}): ${RETURN}" >&2
# Reschedule backup
echo "${REPO}" >> "${DATABASE}"
fi
# Remove repository local archive
rm "${ARCHIVE}"
done
You should of course modify the variables in the first lines of the file according to your needs. You also need curl with SSL support and the `svn-hot-backup` script, which is at least part of the distribution package on Gentoo.
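The script above treats any non-empty response body as a failure. A stricter check could look at the HTTP status code instead. This is only a sketch: it assumes the service answers a successful PUT with `201`, which you should verify against the API documentation, and the helper names `upload_ok` and `put_archive` are invented for this example. The curl options `-o /dev/null` and `-w '%{http_code}'` discard the body and print only the status code.

```shell
#!/bin/sh
# Sketch: decide upload success from the HTTP status code, not the body.

# Hypothetical helper: succeeds only for the status assumed to mean "stored".
upload_ok() {
    case "$1" in
        201) return 0 ;;   # assumption: 201 signals a stored object
        *)   return 1 ;;
    esac
}

# Hypothetical helper: uploads file $1 to URL $2 using auth token $3
# and prints nothing but the HTTP status code.
put_archive() {
    curl -s -o /dev/null -w '%{http_code}' -X PUT \
        -H "X-Auth-Token: $3" \
        -H "ETag: $(md5sum "$1" | awk '{print $1}')" \
        -H "Expect:" \
        --data-binary "@$1" "$2"
}
```

Inside the loop this could be used as `STATUS=$(put_archive "${ARCHIVE}" "${AUTH_URL}/${MOSSO_CONTAINER}/${FILENAME}" "${AUTH_KEY}")`, followed by `upload_ok "${STATUS}" || echo "${REPO}" >> "${DATABASE}"` to reschedule the repository.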
Even though the containers on CloudFile are "private" by default, I would not upload any sensitive data unencrypted. You might want to pipe your repository archives through GPG or similar. My repositories all contain open source stuff anyway, so I don't care about that.
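If you do want encryption, a minimal sketch could look like the following. It assumes GnuPG is installed and that a public key for the recipient is already in the keyring and trusted; `backup@example.org` is a placeholder, and `encrypt_archive` is a hypothetical helper name that does not appear in the script above.

```shell
#!/bin/sh
# Sketch: encrypt an archive for one recipient before uploading it.
# GPG_RECIPIENT is a placeholder; its public key must already be imported.
GPG_RECIPIENT="${GPG_RECIPIENT:-backup@example.org}"
GPG_BIN="${GPG_BIN:-gpg}"

# Hypothetical helper: writes "$1.gpg" and prints its path on success.
encrypt_archive() {
    # --batch/--yes: never prompt, overwrite an existing output file.
    "$GPG_BIN" --batch --yes \
        --recipient "$GPG_RECIPIENT" \
        --output "$1.gpg" \
        --encrypt "$1" || return 1
    echo "$1.gpg"
}
```

The upload step would then send the `.gpg` file instead of the plain archive, and the MD5 sum for the `ETag` header would have to be computed from the encrypted file.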
Have fun using it, or playing with it. Any feedback is welcome. Even though the script is probably too trivial for any copyright to apply, I hereby state that it is public domain, in case you need to care about licenses.
Comments
Jakob Westhoff at Saturday, 11.4. 2009
Thanks for this neat little script.
As you know I am already using it to backup my SVN repositories as well. It works like a charm.
greetings Jakob
Franziskus Domig at Sunday, 12.4. 2009
This is the reason (or at least one) why SVN/CVS/Random-Centralized-SCM-Tool sucks.
Do it (really) right. Use GIT. ;)
kore at Sunday, 12.4. 2009
@Franziskus Domig: Actually you should also back up the master branch of any decentralized SCM-Tool. You might not lose that much in case of hardware failure, but you still wouldn't want to reconstruct the possibly missing parts.
And DVCS vs. VCS is not only about the tool (and you know, they all have issues), but also about the project management. I do use git-svn for example, but still - a central source tree, to which multiple persons have partial access, fits some projects best.
Frank at Thursday, 29.4. 2010
Do you have an idea how long the X-Auth-Token stays valid? Is it indefinite, or only for a period of time?
Kore at Thursday, 29.4. 2010
@Frank: No, for long running scripts one should handle the de-validation gracefully, I guess.
Frank at Thursday, 29.4. 2010
@Kore - any idea how to determine the length ? :) Tried to ask the mosso support, but they never seem to be sure of their answers ....