Making the Most of Commit Hooks with Subversion

If you're already using Subversion for version control, extend it with commit hooks to make it a more integrated part of your development workflow.

You may well have already encountered Subversion: a centralised version control system which arose from CVS but which loses many of the bugs and awkwardnesses in CVS. If you’re not familiar with the idea, version control systems allow you to keep multiple drafts or versions of files, in a space-efficient way.

They’re just about essential if you have a project with more than one person working on it, and very useful even for solo projects. As well as allowing you to revert changes if you decide that you preferred an earlier version of something, they also act as an ad-hoc backup system.

In this article, I’m going to assume that you already use or have used Subversion, and am going to talk about improving the way in which you use it — by making use of commit hooks.

What are commit hooks?

So, what then are these commit hooks and why might you use them? Well, the process of using Subversion (or most other similar version control systems) once it’s set up goes roughly like this:

  1. User checks out a local version of the project (whatever that project may be — code, text, binary files, anything else you can keep as a file).
  2. User makes their own local changes.
  3. User commits their changes back to the repository.

It’s at stage three that commit hooks come in. A commit hook is a script that is triggered by a repository event. The hooks I’m going to discuss here are triggered by commit events; you can also have hooks that are triggered by revision property changes, which I’ll touch on briefly at the end. There are three commit hooks, each attached to a particular stage in the commit process:

  1. start-commit: this is run before the transaction begins.
  2. pre-commit: this is run at the end of the transaction, but before the changes are actually committed.
  3. post-commit: this is run after the transaction has been committed.

To set up a hook, you need only put the script in the hooks directory of your Subversion repository (e.g. /home/username/svn/project/hooks). This directory should already exist if your repository was created in the usual way, and will contain example (template) files ending in .tmpl. You can read these to get more information on how the hook scripts work.

You can use any script or program you like, in any language you like, as a commit hook. Simply place it in the hooks directory and name it start-commit, pre-commit, or post-commit as required. It must be executable, and if it is a script, make sure that the shebang (#!) line is present and points to the correct location for whatever language you are using.

Normal practice is for the hook script to call another program or script which does the bulk of the work. This makes things neater and more maintainable, but of course you can arrange things differently if you prefer.
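
As a rough sketch of the installation step described above, the copy-and-make-executable routine could be wrapped in a small shell function. The name install_hook and its argument order are my own invention, not part of Subversion; the only things it relies on are the hooks directory and the three hook names given earlier.

```shell
#!/bin/sh
# install_hook REPOS HOOK_NAME SOURCE
# Copies SOURCE into REPOS/hooks under one of the three commit hook names
# and marks it executable. (install_hook is a hypothetical helper name.)
install_hook() {
    repos="$1"
    name="$2"
    src="$3"

    # Only the three commit hook names discussed here are accepted.
    case "$name" in
        start-commit|pre-commit|post-commit) ;;
        *) echo "unknown hook name: $name" >&2; return 1 ;;
    esac

    # The hooks directory should already exist in a normal repository.
    [ -d "$repos/hooks" ] || { echo "no hooks directory in $repos" >&2; return 1; }

    cp "$src" "$repos/hooks/$name" && chmod +x "$repos/hooks/$name"
}
```

You would then call it as, say, install_hook /home/username/svn/project post-commit ./my-post-commit.sh, where the last argument is whatever script you have written.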

Invoking Commit Hooks

The hooks are invoked with ordered arguments. The first one in each case is $REPOS (the path to the repository), and then the second argument is different in each case:

  1. start-commit: $USER (the user attempting to commit)
  2. pre-commit: $TXN-NAME (the name of the transaction about to be committed. This is by default generated from the number of the current revision.)
  3. post-commit: $REV (the number of the revision just committed).

Whilst these are the default arguments and the standard names for them, of course the script won’t know the argument names until you set them! They are passed in just as ordered arguments (so $1 and $2 in bash, for example). It’s good practice to set the appropriate named variables at the start of the script, and certainly before you call any other script, to avoid confusing yourself unnecessarily. So for example with a post-commit script in sh:

#!/bin/sh

REPOS="$1"
REV="$2"

SCRIPT="/home/username/svn/repository/hooks/script.pl"

"$SCRIPT" "$REPOS" "$REV" thirdarg || exit 1

You can of course call several other scripts or programs in turn from your hook script. Remember that you want to exit the hook script with a non-zero return code if any of your other scripts fail. For start-commit and pre-commit, it’s the return code of the hook script that controls whether or not the commit continues; by the time post-commit runs, the commit has already happened.
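
To illustrate calling several helpers in turn and stopping at the first failure, here is a minimal sketch. The function name run_helpers is hypothetical, and it assumes every helper takes the same two arguments as the hook itself.

```shell
#!/bin/sh
# run_helpers REPOS REV_OR_TXN HELPER...
# Runs each helper in order, passing it the hook's two arguments, and
# stops at the first one that fails. (run_helpers is a hypothetical name.)
run_helpers() {
    repos="$1"
    arg="$2"
    shift 2

    for helper in "$@"; do
        "$helper" "$repos" "$arg" || return 1
    done
    return 0
}

# In a real hook the last line would be something like:
#   run_helpers "$1" "$2" /full/path/to/check-one /full/path/to/check-two || exit 1
```

Because the loop returns as soon as one helper fails, later helpers never run against a commit that has already been rejected.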

Environment and Output

You may have noticed, looking at that script above, that I used a variable for the script name. This is because hook scripts don’t inherit environment variables, so you can’t assume that $PATH will be correct. In general, it won’t be. You therefore need either to specify everything in full (as I’ve done there), or to set up any environment variables you may need at the start of the script.
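
As a minimal sketch of the second approach, a hook could begin by giving itself a small, known environment. The function name setup_hook_env and the particular directories and locale are assumptions of mine; adjust them to match where your tools actually live.

```shell
#!/bin/sh
# setup_hook_env: give the hook a known, minimal environment.
# The PATH directories and the locale are assumptions; change them
# to suit your own system. (setup_hook_env is a hypothetical name.)
setup_hook_env() {
    PATH="/usr/local/bin:/usr/bin:/bin"
    LANG="C"
    export PATH LANG
}
```

Call setup_hook_env before anything else in the hook, and any programs in those directories can then be run by name rather than by full path.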

You may want to get output from your script, particularly if you’re debugging. However, unfortunately if you have a print statement (or similar) in the hook script or in a script called from it, it won’t be printed to the screen. Output is redirected by default.

To get error output from your secondary script to the screen if the secondary script is exiting non-zero, you can pass the output with the return value. In Perl, you do this using the die function:

die "Something went wrong!\n";

More generally, however, your best bet is to write output to a log file. Bear in mind that you need to specify the full path to the log file, because your hooks may not necessarily (depending on your setup) be run from the hooks directory. It saves a lot of time in searching for the log to just give the full path in the first place! Try something like this:

#!/bin/sh

REPOS="$1"
REV="$2"
DIR="$REPOS/hooks"
SCRIPT="$DIR/script.pl"
LOG="$DIR/log.txt"

"$SCRIPT" "$REPOS" "$REV" thirdarg \
>> "$LOG" 2>&1 || exit 1

Languages

The examples in this article are all in Perl or shell script, because those are my own preferences. Lots of other people seem to use Python, and any other scripting language you have a particular fondness for will do just fine.

Indeed, any language at all will work, as long as you have an appropriate executable with the right name in the right place. For most tasks a scripting language will make more sense, though, at least for the initial hook script. You can of course call programs in other languages from the hook: for example, your test suite may be in whichever language your software is written in.

Sample Commit Hook Scripts

The scripts that tend to first come to mind, and which are most useful to implement initially, are post-commit scripts.

One obvious use for a commit hook, and one that most environments will have in place, is to send an email when a commit occurs. This will be a post-commit hook, using the arguments that Subversion hands to the hook on commit.

Here’s a very basic sample mailer script, which gets the file changes from svnlook and sends an email. svnlook is likely to be useful quite often in your hook scripts — it’s intended to be both human-parseable and machine-parseable, and is a tool provided as part of Subversion to enable you to examine changes in and commits to the repository.

#!/usr/bin/perl -w

use strict;
use Net::SMTP;

sub sendmail();

my $svnlook     = '/usr/bin/svnlook';
my $email       = 'user@example.com';
my $smtp_server = 'mail.example.com';

my $repos = $ARGV[0];
my $rev   = $ARGV[1];

my @svnlooklines = `$svnlook changed $repos -r $rev`;

# Each line from svnlook already ends in a newline, so join with an empty separator.
my $svndata = join('', @svnlooklines);

sendmail();

exit 0;

sub sendmail() {
	my $s = Net::SMTP->new($smtp_server);
	$s->mail($email);
	$s->to($email);
	$s->data("Subject: Subversion commit",
	  "\n", "\n", $svndata, "\n");
	$s->quit;
}

Since this is a post-commit script, you don’t need to worry about the return code — the commit has already happened so can’t be cancelled. (Although you can of course revert it manually, if you review the email and realise that someone has done something wrong.)

A slight improvement on this script would be to send email only in certain circumstances — for example, if a bug ID features in the commit message, if a change happens in a particular branch, if more than a certain number of files are changed at the same time, or even if one particular individual makes a commit! svnlook should enable you to get at any of this information.
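
Here is one way the decision could be sketched in shell, kept separate from the mailing itself. The function name should_notify, the file-count threshold, and the choice of "RT: 1234" as the bug ID format are all assumptions made for illustration; the function expects the output of svnlook changed (one changed path per line) on standard input.

```shell
#!/bin/sh
# should_notify MAX_FILES LOG_MESSAGE
# Reads "svnlook changed" output on standard input, one changed path per line.
# Succeeds (returns 0) if mail should be sent: either the log message
# mentions a ticket in the assumed form "RT: <number>", or more than
# MAX_FILES paths were changed. (should_notify is a hypothetical name.)
should_notify() {
    max_files="$1"
    log_message="$2"

    # Count the changed paths: awk reports the number of input lines.
    changed=$(awk 'END { print NR }')

    if printf '%s\n' "$log_message" | grep -Eq 'RT: [0-9]+'; then
        return 0
    fi
    [ "$changed" -gt "$max_files" ]
}

# In a post-commit hook this might be used as:
#   LOG=$(/usr/bin/svnlook log -r "$REV" "$REPOS")
#   /usr/bin/svnlook changed -r "$REV" "$REPOS" | should_notify 20 "$LOG" \
#       && /full/path/to/mailer.pl "$REPOS" "$REV"
```

Keeping the test in its own function means the mailer script itself doesn’t need to change when your idea of what counts as interesting does.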

Another useful post-commit hook is one that runs a backup of the repository after each commit. You can simply back up to another local disk, or run rsync to a non-local server, using passphraseless ssh. A suitable post-commit script using non-local rsync might look like this:

#!/bin/sh

REPOS="$1"
REV="$2"

RSYNCDIR="user@backup.example.com:/backup/svn"
SSHKEY="${REPOS}/.ssh/rsync"

rsync -avuz --delete -e "ssh -i ${SSHKEY}" \
	"${REPOS}" "${RSYNCDIR}"

Here the passphraseless ssh key would need to be created as .ssh/rsync in the repository directory. This is a great way of making sure that your backups happen more reliably — automating backups is always, always a good idea, and the more often you back up the better. It’s a good idea to also have a nightly backup, run independently of the commits, as well — perhaps to another disk or to another server.

You may want to run a test suite after a commit, and send an email if the test suite fails. Of course, this would never happen, because everyone runs the test suite before committing — right? Glad to hear it. Just for peace of mind, though… your post-commit script could look like this:

#!/bin/sh

REPOS="$1"
REV="$2"
TEMP="/local/testsuite/temp"
TEST="$REPOS/hooks/runtest.pl"
LOG="/local/testsuite/log.txt"

# svn checkout takes a URL, so refer to the repository via file://
/usr/bin/svn checkout -r "$REV" "file://$REPOS" "$TEMP"
"$TEST" "$TEMP" >> "$LOG" 2>&1

exit 0

Here your test suite should exit with an error if there is a problem — although, again, as this is running post-commit, the exit value of the hook script won’t actually make any difference to the repository. You might want to add something in either the test suite or in this script to email an admin if the test comes up with an error.

You can also use something like this if you use a configuration management system like Puppet, to run a parser over your configuration as soon as it has been committed.

Of course, this doesn’t prevent broken changes from being committed. Ideally, you would be able to run the test suite on the transaction ahead of time, but this would require a test repository before the commit to the main one, and would also lock up the main repository while the test ran.

So in practice that isn’t a great option for any multi-user system, and running tests after commit instead, and issuing a warning to the relevant people in the event of an error, is a better bet. Finding out (e.g. via svnlook) who made the last commit and including that information in the email is a good way of putting some pressure on people to make sure they run proper tests before committing!
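
A sketch of the warning itself might look like the following. The helper name failure_report, the wording of the message, and the use of a mail command with an admin@example.com address are all assumptions; svnlook author is the standard way to find out who made a given commit.

```shell
#!/bin/sh
# failure_report AUTHOR REV LOGFILE
# Prints a short plain-text report naming the committer and quoting the
# last 20 lines of the test log. (failure_report is a hypothetical name.)
failure_report() {
    author="$1"
    rev="$2"
    logfile="$3"

    printf 'Test suite failed at revision %s\n' "$rev"
    printf 'Committed by: %s\n\n' "$author"
    printf 'Last lines of %s:\n' "$logfile"
    tail -n 20 "$logfile"
}

# In the post-commit hook, once the test run has exited non-zero:
#   AUTHOR=$(/usr/bin/svnlook author -r "$REV" "$REPOS")
#   failure_report "$AUTHOR" "$REV" "$LOG" \
#       | /usr/bin/mail -s "Tests failed" admin@example.com
```

The report is built separately from the sending so that you can swap the mail command for whatever your site actually uses.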

Post-commit scripts are not, of course, the only useful ones.

Start-commit Scripts

If a start-commit hook script exits non-zero (failure), the commit is stopped before a transaction is even created. So you could use a start-commit hook to check whether a user has appropriate permissions. For example, you might run a repository which anyone can check out and build, but to which commits are restricted.

#!/usr/bin/perl -w

use strict;

my $repos = $ARGV[0];
my $user  = $ARGV[1];

my @auth_users = ("jkemp", "other", "dsmith");

my $found = grep { $_ eq $user } @auth_users;

if ($found == 0) {
	die "Error: user $user not authorized.\n";
}
else {
	exit 0;
}
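
If you would rather not edit the hook every time the list of committers changes, the same check can be sketched in shell against a file with one username per line. The function name is_authorised and the file name committers.txt are assumptions of mine, not Subversion conventions.

```shell
#!/bin/sh
# is_authorised USER FILE
# Succeeds if USER is non-empty and appears on a line of its own in FILE.
# -x matches whole lines only, -F treats the name as a fixed string, and
# "--" stops a name beginning with "-" from being read as an option.
# (is_authorised is a hypothetical name.)
is_authorised() {
    [ -n "$1" ] && grep -qxF -- "$1" "$2"
}

# In a start-commit hook ($1 = repository path, $2 = user):
#   is_authorised "$2" "$1/hooks/committers.txt" || {
#       echo "Error: user $2 not authorized." >&2
#       exit 1
#   }
```

Matching whole lines matters here: without -x, a user called "jk" would be let in on the strength of "jkemp".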

Pre-commit Scripts

As with start-commit, if pre-commit returns non-zero, the commit is aborted. But in this case, the commit has already started in the sense that a transaction has been generated. The repository will therefore be locked while this script is run — unlike with the start-commit script, where the transaction has yet to be generated. Bearing this in mind, you probably don’t want your pre-commit script to take too long, as no one else will be able to make a commit while it’s running.

The sample pre-commit hook given as a template in the svn repository directory uses svnlook to examine the log message (in that case, to check that it exists). You could also parse it to check that it conforms to a particular standard, for example that it includes an RT (Request Tracker, a bug-tracking system) ticket number, as in this script:

#!/usr/bin/perl -w

use strict;

my $repos   = $ARGV[0];
my $txn     = $ARGV[1];
my $svnlook = '/usr/bin/svnlook';
my $require = 'RT: \d+';

my $log = `$svnlook log -t $txn $repos`;

if ($log =~ /$require/) {
	exit 0;
}
else {
	die "RT ticket number not found " .
	    "in log message.  Commit aborted.\n";
}

If you have multiple desktop environments (Windows, Mac, Linux) in an office, one problem that can arise is that linebreaks vary between OSes. Another possibility for a pre-commit script is to sanitise or homogenise linebreaks to an accepted standard. Similarly, to avoid tabbing standards wars (or at least, to relocate them…), you can write a script that standardises tabbing. Beware that this may not be popular, though!
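
As a cautious starting point, the sketch below only detects Windows-style line endings so that the hook can reject the commit; it does not rewrite anything, which is a deliberately simpler technique than the sanitising described above. The function name has_crlf is hypothetical, and the usage comment assumes you already have the path of a changed file in $path.

```shell
#!/bin/sh
# has_crlf: reads a file on standard input and succeeds if any line ends
# in a carriage return, i.e. has a Windows-style CRLF ending.
# (has_crlf is a hypothetical name.)
has_crlf() {
    cr=$(printf '\r')
    # The pattern is a literal carriage return anchored at end of line.
    grep -q "${cr}\$"
}

# In a pre-commit hook, for one path in the transaction:
#   if /usr/bin/svnlook cat -t "$TXN" "$REPOS" "$path" | has_crlf; then
#       echo "$path has CRLF line endings" >&2
#       exit 1
#   fi
```

Rejecting the commit with a clear message leaves the fix in the committer’s hands, which may also go down better than silently changing their files.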

Other Scripts

Those are not, of course, the only options. Anything you can write a script to do, you can create as a Subversion commit hook. There are a few sample hooks available in the Subversion repository at http://tinyurl.com/2yxvud and http://tinyurl.com/5dknwb — these include a Python script which checks that log messages end with a single newline, a script to check for filename clashes (case-insensitive), a joke script which blocks all commits, and a script to check syntax. There are also access control scripts available which provide finer-grained control than the example script given here.

You can also find scripts elsewhere online. For example, if you use the project management software Trac, it is possible to implement time-tracking solutions (see http://tinyurl.com/r4gww) and to tie these in to Subversion commit hooks. The relevant post-commit script is provided at that page — it enables developers to add specific time used/remaining notes, which will then be parsed and recorded in Trac.

As well as the commit hooks, there are six other types of hook available: pre-revprop-change, post-revprop-change, pre-lock, post-lock, pre-unlock, and post-unlock. The first two occur before and after a revision property is added, modified, or deleted. The examples in the default hooks directory cancel changes that aren’t simply log changes (pre-revprop-change) and send an email (post-revprop-change).

The lock and unlock hooks are triggered before and after an exclusive lock is created or destroyed. These could check that the person trying to undo the lock is its owner, or again send emails. They’re arguably less useful for most users than the various commit hooks, though.
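
The log-changes-only rule can be sketched as a small test on two of the hook’s arguments. The pre-revprop-change hook is called with the repository path, revision, user, property name, and an action code, in that order; the function name allow_revprop_change is my own.

```shell
#!/bin/sh
# allow_revprop_change PROPNAME ACTION
# Succeeds only when an existing log message (svn:log) is being modified.
# ACTION is A (added), M (modified) or D (deleted).
# (allow_revprop_change is a hypothetical name.)
allow_revprop_change() {
    [ "$1" = "svn:log" ] && [ "$2" = "M" ]
}

# pre-revprop-change is called with REPOS REV USER PROPNAME ACTION, so:
#   allow_revprop_change "$4" "$5" || {
#       echo "Only log messages may be changed." >&2
#       exit 1
#   }
```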

Testing

Remember that it’s a good idea to test your commit hook scripts before implementing them, which can be difficult if you don’t want to make “fake” commits repeatedly. Obviously, you can run them by hand, but they will then inherit your own shell’s environment, which they won’t have when Subversion runs them.

One option is to create a temporary/fake repository and use that to develop your script. Alternatively, you can create a switch to your script which takes a revision number, and use that to test your script on old revisions before you send it live.
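
When running a hook by hand, one way to get closer to the conditions Subversion will impose is to start it with an empty environment using env -i. This is only an approximation of how the server invokes hooks, and the function name run_like_svn and the paths in the example are hypothetical.

```shell
#!/bin/sh
# run_like_svn HOOK ARGS...
# Runs HOOK with an empty environment, roughly as Subversion would.
# env -i clears every inherited variable before starting the command.
# (run_like_svn is a hypothetical name.)
run_like_svn() {
    env -i "$@"
}

# Example: try a post-commit hook against revision 42 of a scratch repository.
#   run_like_svn /path/to/scratch/hooks/post-commit /path/to/scratch 42
```

If the hook works this way but fails when run normally, or the reverse, a missing $PATH or other environment variable is the first thing to suspect.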

Conclusion

Subversion is a fantastic and very useful version control system right out of the box. With hook scripts, you can extend it with ease, just as far as your programming skills, imagination, and repository needs will permit.
