Authenticated Remote Updates

Suppose my friend Fred has a Web site that has grown too big for him to handle by himself. So he gets his buddy Barney to create some of the HTML and draw up a few of the images. How can Barney edit the files on Fred's hard drive, especially if Barney is on the wrong side of some corporate firewall? Well, Fred could create a CGI script to upload the files into the right place. However, then the script runs as the Web user and not as Fred. This would require Fred to mess with wide-open permissions (or setuid wrappers) and either https authentications or (worse) repeatedly sending the update password over the wire during Basic Authentication handshaking.

So Fred takes a different route. He’s running procmail, so he can set up an action based on a specific mail header to cause the content of the message to be dropped into the right place. But what if the content is an image? And what if someone else finds out about that header and fakes a message from Barney?

Well, sticking with the mail route, all we really need is a way to verify that Barney is really the sender and that an arbitrary binary file can get through. It would also be nice if it were encrypted so that no one in the middle can see the secret stuff.

It’s nice that the RSA public-key encryption patent recently expired, because I can now recommend (without fear of criminal prosecution) tools like the GNU Privacy Guard (http://www.gnupg.org) as a critical component of the solution to this problem. And nicely enough, there’s a Perl wrapper in the CPAN for GPG. If you’re not familiar with GPG, I suggest visiting the Web site before reading further.

To make this work, Fred and Barney invent GPG keys and exchange their public keys. They don’t have to use their standard keys if they don’t want to (and in fact shouldn’t, for reasons described later). I used fred@localhost and barney@localhost since these two keys will not be used in any public server. To avoid using the standard keys, they both used GPG’s --homedir option.

Fred installs a mail handler (described later), and Barney goes to work creating content. He edits his files underneath his “Web image” directory, then invokes the program given in Listing One, naming the files that he’s changed. It’s okay to name more files than have actually changed, unless there is concern about the number of bytes transmitted.

The configuration in lines 7 to 12 defines what happens next. Line 7 defines the top-level directory of his “Web image” directory. This permits Barney the freedom to invoke the encryptor from any directory using relative or absolute names. Next comes the GPG “home directory” in line 8. If Barney is using his standard GPG key database, then this can be the directory of .gnupg in his home directory. Otherwise, Barney can construct a special GPG home directory specifically for this program. Since the passphrase is included within the program, it’s probably good to have a distinct private key and passphrase. This directory will have Barney’s private key ring and the public key ring containing both Barney’s and Fred’s public keys. Fred will have a similar, but complementary, directory on his machine.

Speaking of passphrase, that’s in line 9. This is needed because Barney’s private key is being used to sign the message, verifying that it was Barney that published the message, and the private key is locked up with a passphrase.

Line 10 defines the GPG user that signs the message — in this case, Barney. Similarly, line 11 defines the GPG user for which the message is encrypted — Fred. By keeping the GPG databases small for this program, these names can be kept simple. Finally, line 12 gives the e-mail address to which the encrypted packet is finally mailed.

Lines 16 and 17 set up the modules using autouse so that they’re pulled in when the corresponding subroutine is called.

Lines 19 and 20 remind me that we’re setting up a three-process pipeline. The first stage generates the Storable object and writes it to STDOUT. The second stage reads that object from its STDIN and encrypts it using GPG, writing that to STDOUT. Finally, the third stage reads its STDIN and e-mails it to the required address. I’d normally have done this in one process, but even with the wrapper around GPG, it still really wants to read from STDIN and write to STDOUT, so it was simpler to fork a few times to fix that.

Lines 23 to 38 define the first stage. Line 23 forks, connecting the child’s STDOUT to the parent’s STDIN. The open returns the child’s process ID (a true value) in the parent and 0 in the child, so the body of the unless runs only in the child. (The child’s STDIN is unchanged and can be used to read from the original standard input.)
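
The forking open is compact enough to be worth trying in isolation. What follows is a minimal sketch, not part of either listing: it uses a lexical filehandle instead of STDIN (an assumption, so the real standard input is left alone), and the helper name read_from_child is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: fork a child whose STDOUT feeds a handle in the parent.
sub read_from_child {
    my ($message) = @_;

    # open with "-|" forks. It returns the child's PID in the parent,
    # 0 in the child, and undef if the fork failed.
    my $pid = open(my $from_child, "-|");
    die "Can't fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: anything printed to STDOUT goes down the pipe to the parent.
        print $message;
        exit 0;
    }

    # Parent: read everything the child wrote; close also reaps the child.
    local $/;    # slurp mode
    my $got = <$from_child>;
    close $from_child;
    return $got;
}

my $line = read_from_child("hello from the first stage\n");
print $line;
```

Listing One does the same thing with STDIN itself, which is what lets each later stage simply read its standard input.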

Lines 26 to 34 construct the “payload.” We’ll make up a hash whose keys are the filenames relative to the $SRC_ROOT. That way, Barney doesn’t need to tell Fred where he keeps his files. Lines 29 and 30 handle that.

Lines 32 and 33 grab the content of the file, note its modification time, and create a hashref value corresponding to that payload key.
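
To see the shape of the payload without the rest of the pipeline, here is a small sketch of the same loop as a standalone subroutine. The name build_payload is hypothetical, and the lexical filehandle and binmode call are my additions for the sketch rather than what Listing One does.

```perl
use strict;
use warnings;
use File::Spec::Functions qw(abs2rel rel2abs);

# Hypothetical helper: build the payload hash for the named files,
# keyed by each file's path relative to $src_root.
sub build_payload {
    my ($src_root, @files) = @_;
    my %payload;
    for my $file (@files) {
        my $abs = rel2abs($file);
        my $rel = abs2rel($abs, $src_root);
        die "$abs is not below $src_root" if $rel =~ m/\A\.\.(\/|\z)/s;

        open my $fh, "<", $abs or die "cannot open $abs: $!";
        binmode $fh;    # images must survive byte for byte
        my $content = do { local $/; <$fh> };
        $payload{$rel} = {
            content => defined $content ? $content : "",
            mtime   => (stat $fh)[9],
        };
        close $fh;
    }
    return \%payload;
}
```

Calling build_payload('/home/barney/webfiles', @ARGV) would give a hashref with keys such as images/logo.gif, each pointing at a content and mtime pair.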

Finally, line 36 takes the payload, encodes it with Storable, and sends that to STDOUT. This is the end of the task, so that process exits in line 37.
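
If you haven’t used Storable’s filehandle interface before, this sketch shows the round trip on its own. An anonymous temporary file stands in for the pipe between stages (an assumption made so the example runs in one process), and round_trip is a hypothetical helper name.

```perl
use strict;
use warnings;
use Storable qw(store_fd retrieve_fd);

# Hypothetical helper: push a data structure through a filehandle and back.
sub round_trip {
    my ($data) = @_;

    # An anonymous temporary file stands in for the pipe between stages.
    open my $fh, "+>", undef or die "cannot create temp file: $!";
    binmode $fh;

    store_fd($data, $fh) or die "store_fd failed";
    seek $fh, 0, 0 or die "cannot rewind: $!";
    return retrieve_fd($fh);
}

my $payload = { "index.html" => { content => "<html></html>\n", mtime => 968457600 } };
my $copy    = round_trip($payload);
print "$_ => $copy->{$_}{mtime}\n" for keys %$copy;
```

The nested hash comes back intact, which is all the later stages need.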

The second stage detaches itself similarly in line 41. At this point, its STDIN is the first stage’s standard output. We pull in the GnuPG wrapper in line 43 and trigger the encryption using all the needed parameters specified in lines 44 through 48. The armor parameter here ensures that the output is nice ASCII text, suitable for e-mailing (even if the source material was a binary, like an image).

This stage is now a pipeline, reading the payload from standard input, writing the encrypted text message to standard output, and exiting in line 49.

Okay, I admit it. When I first sketched out the program, I was envisioning the use of Net::SMTP or one of the other dozen mailing modules. Then I said, “whatever,” and decided the programs were long enough, so I punted and just invoked /bin/mail. Leverage — remember “leverage?”

So Barney invokes this program, and an encrypted chunk of data is now scurrying along toward Fred’s domain. Let’s say Fred routes all of his domain mail through procmail. To extract the mail, he just adds a few simple lines to his .procmailrc to route all mail for web-update into the standard input of the decryptor module, as follows:

:0 W
* To: web-update@webhaus.com
| /home/fred/lib/web-update

:0 e

Here, we’re telling procmail to take all appropriately addressed e-mail and feed it into the program presented in Listing Two. If the web-update program fails for some reason, procmail will kindly drop the original mail into the “bad mail” drop for later analysis or reprocessing.

Lines 7 to 20 control the actions of this Web-updating program. Remembering that this program acts with Fred’s privileges based on incoming e-mail, we want to be very careful about what can happen. In particular, the only files that can possibly be changed will need to be below $DEST_ROOT in line 7, so any damage will be limited to there.

Line 8 defines a logfile for all actions, probably something to be watched over time (perhaps summarized by another Perl program). Line 9 defines Fred’s Web-publishing GPG “home directory.” Again, this could be Fred’s normal GPG directory, but then the passphrase for his everyday key would be exposed by living within the program in line 10.

Lines 11 to 20 define the roles and authorizations for those roles. A given GPG signature (obtained by looking at the output of GPG’s --list-signatures option) represents an individual. The mapping in %ROLES gives that person a series of “roles” that they play. Here I’m letting Barney be the “HTML updater” and “GIF updater.” Other users would have overlapping or distinct roles.

Those roles are then mapped into permissions using %AUTHS. For each role, a series of anonymous subroutines will be executed against a five-item list:

  • The basename of the file being uploaded
  • The pathname relative to $DEST_ROOT
  • The absolute pathname
  • The absolute directory of the destination file
  • An info hash (described later)

So each user maps into one or more roles, and each role maps into one or more authorizations for that role. For each file uploaded by a given user, if any of the authorization subroutines return true, then that file is permitted to be uploaded. I could have made this more complicated, and it may not be sufficient for the general case, but it’s a good start.
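
The two-level lookup can be sketched as a single subroutine. Everything here is illustrative: the fingerprint is made up, the helper name authorized is hypothetical, and I’ve left the -f test out of the rules so the example doesn’t depend on files existing.

```perl
use strict;
use warnings;

# Hypothetical fingerprint and rules, for illustration only.
my %ROLES = ("0123456789ABCDEF0123456789ABCDEF01234567" => ['html', 'gif']);
my %AUTHS = (
    html => [sub { $_[0] !~ /^\./ and $_[0] =~ /\.html$/ }],
    gif  => [sub { $_[0] !~ /^\./ and $_[0] =~ /\.gif$/ }],
);

# True if any authorization sub for any of the signer's roles accepts the file.
sub authorized {
    my ($fingerprint, $basename, $rel, $abs, $dirname, $info) = @_;
    my $roles = $ROLES{$fingerprint} or return 0;
    for my $role (@$roles) {
        for my $check (@{ $AUTHS{$role} || [] }) {
            return 1 if $check->($basename, $rel, $abs, $dirname, $info);
        }
    }
    return 0;
}
```

Because any one passing subroutine is enough, adding a role can only widen what a user may upload. Narrowing means tightening the subroutines themselves.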

Lines 29 to 39 wrap all the warn and die messages so that they carry a nice process ID number and timestamp.
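
Here is the stamping routine on its own, as a small sketch you can run to see what the log lines look like. The name stamp is hypothetical (the listing calls it __stamp), and I’ve escaped the @ in the format string purely as a precaution.

```perl
use strict;
use warnings;

# Prefix every line of a message with "[pid] [day@hh:mm:ss] ".
sub stamp {
    my $message = shift;
    my @now = localtime;
    my $stamp = sprintf "[%d] [%02d\@%02d:%02d:%02d] ", $$, @now[3, 2, 1, 0];
    $message =~ s/^/$stamp/gm;    # /m makes ^ match at the start of each line
    return $message;
}

print stamp("index.html: new version installed\nimages/logo.gif: not authorized, skipping\n");
```

Note that the timestamp carries only the day of the month, so a long-lived log would need the month and year added before the lines could be sorted reliably.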

Lines 44 and 45 remind me that there are only two processes in this pipeline, but they’re a little more complicated. The first stage will read the original standard input (the text of the incoming e-mail) and then deliver the original Storable-encoded object to the standard input of the second stage.

The second stage must act on that object only if the decryption is signed by the right key. However, that’s known only to the first stage! So before forking, we create a separate pipe that the first and second stage can use to communicate. This is done in line 48. It’s important that the first stage close the read side and the second stage close the write side, or we’ll never get an EOF at the right time.
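
The secondary channel is just a pipe created before the fork, with each side closing the end it doesn’t use. This sketch shows the pattern by itself. It uses an explicit fork and lexical handles rather than the listing’s FROM and TO (my choice for the example), and verdict_from_child is a hypothetical name.

```perl
use strict;
use warnings;
use Storable qw(store_fd retrieve_fd);

# Hypothetical helper: fork a child that reports a verdict hash to the
# parent over a dedicated pipe, separate from STDIN and STDOUT.
sub verdict_from_child {
    my ($verdict) = @_;

    pipe(my $from, my $to) or die "cannot pipe: $!";

    my $pid = fork;
    die "Can't fork: $!" unless defined $pid;

    if ($pid == 0) {
        close $from;                 # the child only writes
        store_fd($verdict, $to);
        close $to;
        exit 0;
    }

    close $to;                       # the parent only reads; a stray write end would block EOF
    die "child sent nothing" if eof($from);
    my $h = retrieve_fd($from);
    close $from;
    waitpid $pid, 0;
    return $h;
}
```

An empty hash coming back plays the same role as in Listing Two: the child ran, but it has nothing good to report.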

Lines 51 to 70 set up the first stage. The decryption takes place in lines 57 and 58. If the original message is successfully decrypted and signed, $h is a hashref with parameters for the signature. If so, we send it down the secondary channel as a Storable-encoded object. If not, we send an empty hash down that channel.

The processor beginning in line 72 has all the tough logic. This is where we really get the work done and have to be very conservative and distrust anything that is even slightly fishy. Lines 75 and 76 grab the hashref that came from the decryption process. This is not the payload but the authentication information about the payload, which will be the same value as $h in the other process if all went well. We then fetch the payload in line 82.

Line 84 is a reminder that one of the parameters unused (so far) is the “sigid,” which is unique for every signed document. We can record this sigid and reject any other message with the same sigid as having been “already processed.” This prevents a “replay attack,” where an intermediate bad guy resends an intercepted e-mail over and over again, hoping to retrigger the same action. As you’ll see in a moment, for this particular use a replay attack would be useless except to reinstate deleted files (if those weren’t being checked in the authorization).
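
The listing leaves that check as a TODO, so here is one way it might be filled in. This is only a sketch under my own assumptions: sigids are remembered one per line in a plain text file, the file is locked while it’s examined, and the helper name first_time_seen is hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: return true the first time a sigid is seen and
# false on every later call, remembering sigids in a plain text file.
sub first_time_seen {
    my ($seen_file, $sigid) = @_;

    open my $fh, "+>>", $seen_file or die "cannot open $seen_file: $!";
    flock($fh, LOCK_EX) or die "cannot lock $seen_file: $!";

    seek $fh, 0, 0;
    while (my $line = <$fh>) {
        chomp $line;
        return 0 if $line eq $sigid;    # already processed: likely a replay
    }

    seek $fh, 0, 2;                     # back to the end before writing
    print {$fh} "$sigid\n";
    close $fh or die "cannot write $seen_file: $!";
    return 1;
}
```

In the processor, this could be used as a guard right after validation, dying if first_time_seen returns false for $h->{sigid}. A file that grows forever is fine for a couple of buddies, but a busier site would want to expire old entries.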

Line 85 logs the GPG user who signed this message. Note that this has to be a GPG user that Fred has in his GPG public key listing.

Lines 87 to 94 extract all the possible coderef subroutines from which this particular GPG user receives authorization. We first fetch all the roles for the given fingerprint. If there are no roles there’s no point in going further. We then fetch all the subroutines for those roles and stash them away in @auths for use with each file.

Line 96 gets an “attic prefix,” used to save the previous version when a file is later updated.

Lines 98 to 140 are executed for each file in the payload. First, the full path is computed in line 100. The relative path is then recomputed in line 101 and verified in line 102 to prevent any funny business — remember, trust no one!
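
The path check is worth a closer look, because File::Spec on Unix does not collapse something like images/../.. into its real target. The sketch below is a stricter variant than the listing: it rejects a “..” component anywhere in the path, not just at the front. That stricter rule and the helper name safe_relative are my own assumptions, not what Listing Two does.

```perl
use strict;
use warnings;
use File::Spec::Functions qw(abs2rel rel2abs splitdir);

# Hypothetical helper: return the cleaned-up relative path if $rel stays
# below $root, or undef if it tries to climb out.
sub safe_relative {
    my ($rel, $root) = @_;
    my $abs   = rel2abs($rel, $root);
    my $clean = abs2rel($abs, $root);

    # Listing Two rejects only a leading "..". Rejecting ".." anywhere is
    # a stricter rule, and an assumption of this sketch.
    return if grep { $_ eq '..' } splitdir($clean);
    return $clean;
}

for my $name ("images/logo.gif", "../etc/passwd", "images/../../etc/passwd") {
    my $ok = safe_relative($name, "/home/httpd/htdocs/");
    print "$name: ", (defined $ok ? "accepted as $ok" : "rejected"), "\n";
}
```

Since only authenticated users get this far, the looser check may well be good enough, but “trust no one” cuts both ways.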

Lines 104 to 110 determine if the authenticated user is authorized for this file by running all the authorization subroutines. Note the fifth subroutine parameter is a hashref containing information about the entry, including the modification time and the contents.

If we’re authorized, it’s time to get cracking. Line 111 makes the directory containing the entry, including any parent directories if needed. Line 113 gives default permissions for the file unless the file already exists. Hmmm…maybe we can give Barney some control over this value in another version of the program.

Lines 114 to 128 deal with the previous version of the existing file. If the file exists, and it’s not newer (lines 117 to 120), we need to stash the current version in the “attic.” Line 122 defines this attic as a subdirectory within the same directory as the file, named .attic. The current file is linked into this attic with a name that depends on the time of day and process ID number.
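
The attic step boils down to a hard link, which costs no extra disk space and keeps the old contents alive after the new file is renamed over the original name. A minimal sketch, with the hypothetical helper name save_to_attic:

```perl
use strict;
use warnings;
use File::Spec::Functions qw(catfile);
use File::Basename qw(fileparse);
use File::Path qw(mkpath);

# Hypothetical helper: hard-link the current version of $abs into a
# .attic directory beside it and return the attic file's name.
sub save_to_attic {
    my ($abs) = @_;
    my ($basename, $dirname) = fileparse($abs);    # $dirname ends in a slash

    my $attic = catfile($dirname, ".attic");
    mkpath([$attic], 0, 0755);
    -d $attic or die "Missing $attic";

    my $atticfile = catfile($attic, time . ".$$.$basename");
    link $abs, $atticfile or die "Cannot ln $abs $atticfile: $!";
    return $atticfile;
}
```

Hard links only work within one filesystem, which is fine here because the attic lives right next to the file it protects.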

Lines 129 to 138 create a new file near the destination file and give it the right permissions and modification time. If all goes well, the file is renamed into place in line 137. This is all performed with the possibility of concurrent file accesses in mind (such as the live data of a Web server). Line 139 logs the successful update.
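
The write-then-rename dance looks like this as a standalone sketch. I’m assuming the temporary file sits beside the destination (so the rename never crosses a filesystem), and install_file is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: write $content to a temporary file beside $abs,
# set its mode and mtime, then rename it into place.
sub install_file {
    my ($abs, $content, $perms, $mtime) = @_;
    my $tmp = "$abs.$$";    # same directory, so rename stays on one filesystem

    open my $fh, ">", $tmp or die "Cannot create $tmp: $!";
    binmode $fh;
    print {$fh} $content or die "Cannot write $tmp: $!";
    close $fh or die "Cannot close $tmp: $!";

    chmod $perms, $tmp or warn "cannot chmod $tmp: $!\n";
    utime $mtime, $mtime, $tmp or warn "cannot set mtime on $tmp: $!\n";
    rename $tmp, $abs or die "Cannot mv $tmp $abs: $!";
    return 1;
}
```

A Web server reading the file at the moment of the update sees either the complete old version or the complete new one, never a half-written file.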

There you have it, a mechanism to deliver encrypted, authenticated Perl objects by e-mail and a system built to provide controlled remote publishing, all in under 200 lines of Perl (hacked out in a few hours between rounds of Ridge Racer V). It may not be CVS, but it’s a good base from which to do some very cool things. Until next time, enjoy!

Listing One: Sending GPG-Encrypted Packages

1   #!/usr/bin/perl -w
2   use strict;
3   $|++;
5   ## begin config
7   my $SRC_ROOT = '/home/barney/webfiles';
8   my $GPG_HOME = '/home/barney/.publish-gpg';
9   my $GPG_PASSPHRASE = 'barneyphrase';
10  my $LOCALUSER = 'barney';
11  my $REMOTEUSER = 'fred';
12  my $EMAILTO = 'web-update@webhaus.com';
14  ## end config
16  use autouse 'Storable' => qw(store_fd retrieve_fd);
17  use autouse 'File::Spec::Functions' => qw(abs2rel rel2abs);
19  ## set up pipeline
20  ## generate object | encrypt | mailer
22  ## first stage: generate object
23  unless (my $pid = open STDIN, "-|") {
24    die "Can't fork: $!" unless defined $pid;
26    my %payload;
27    for (@ARGV) {
28      local *F;
29      my $abs = rel2abs($_);
30      my $rel = abs2rel($abs, $SRC_ROOT);
31      die "$abs is not below $SRC_ROOT" if $rel =~ m/\A\.\.(\/|\z)/s;
32      open F, "<$abs" or die "cannot open $abs: $!";
33      $payload{$rel} = {content => join("", <F>), mtime => (stat F)[9]};
34    }
36    store_fd \%payload, \*STDOUT;
37    exit 0;
38  }
40  ## second stage: encrypt
41  unless (my $pid = open STDIN, "-|") {
42    die "Can't fork: $!" unless defined $pid;
43    require GnuPG;
44    GnuPG->new(homedir => $GPG_HOME)->
45      encrypt(passphrase => $GPG_PASSPHRASE,
46              recipient => $REMOTEUSER,
47              'local-user' => $LOCALUSER,
48              armor => 1, sign => 1);
49    exit 0;
50  }
52  ## third stage: send mail
53  exec "/bin/mail", $EMAILTO; # punt :)
54  die "cannot exec /bin/mail: $!";

Listing Two: Retrieving and Decrypting GPG Packages

1   #!/usr/bin/perl -w
2   use strict;
3   $|++;
5   ## begin config
7   my $DEST_ROOT = "/home/httpd/htdocs/";
8   my $LOGFILE = "/home/fred/lib/web-update.log";
9   my $GPG_HOME = "/home/fred/.publish-gpg";
10  my $GPG_PASSPHRASE = 'fredphrase';
11  my %ROLES = (
12    ## barney:
13    "80989563762BC0677D96542EFAA3AAF8282564B7" => ['html','gif'],
14  );
15  my %AUTHS = (
16    'images' => [sub { $_[1] =~ /^images\/.*\.(gif|jpe?g)$/ }],
17    'editor' => [sub { $_[0] !~ /^\./ }],
18    'html' => [sub { -f $_[2] and $_[0] !~ /^\./ and $_[0] =~ /\.html$/ }],
19    'gif' => [sub { -f $_[2] and $_[0] !~ /^\./ and $_[0] =~ /\.gif$/ }],
20  );
22  ## end config
24  use autouse 'Storable' => qw(store_fd retrieve_fd);
25  use autouse 'File::Spec::Functions' => qw(abs2rel rel2abs catfile);
26  use autouse 'File::Basename' => qw(fileparse);
27  use autouse 'File::Path' => qw(mkpath);
29  sub __stamp {
30    my $message = shift;
31    my(@now) = localtime;
32    my $stamp = sprintf "[%d] [%02d@%02d:%02d:%02d] ",
33      $$, @now[3,2,1,0];
34    $message =~ s/^/$stamp/gm;
35    $message;
36  }
38  $SIG{__WARN__} = sub { warn __stamp(shift) };
39  $SIG{__DIE__} = sub { die __stamp(shift) };
41  open STDOUT, ">>$LOGFILE" or die "Cannot append to $LOGFILE: $!";
42  open STDERR, ">&STDOUT";
44  ## set up pipeline
45  ## <message decrypt | processor (with secondary channel)
47  ## establish secondary channel...
48  pipe FROM, TO or die "cannot pipe: $!";
50  ## first stage: decrypt
51  unless (my $pid = open STDIN, "-|") {
52    die "Can't fork: $!" unless defined $pid;
54    close FROM;
56    require GnuPG;
57    my $h = GnuPG->new(homedir => $GPG_HOME)->
58      decrypt(passphrase => $GPG_PASSPHRASE);
60    if ($h and ref $h) {
61      store_fd $h, \*TO;
62    } else {
63      store_fd {}, \*TO;
64      warn $h ? "not signed\n" : "cannot decrypt\n";
65    }
67    close TO;
69    exit 0;
70  }
72  ## second stage: processor
73  close TO;
75  die "BAD PARENT RESPONSE, aborting" if eof(FROM);
76  my $h = retrieve_fd \*FROM;
77  close FROM;
79  die "failed validation" unless keys %$h;
81  ## we've got validation, so fetch the payload
82  my $payload = retrieve_fd \*STDIN;
84  ## TODO: record $h->{sigid} and reject duplicate as a replay attack
85  warn "processing an update from $h->{user} ...\n";
87  my @auths = do {
88    my $roles = $ROLES{$h->{fingerprint}}
89      or die "No roles for $h->{fingerprint}";
90    map {
91      my $auths = $AUTHS{$_};
92      $auths ? @$auths : ();
93    } @$roles; # list of coderefs
94  };
96  my $prefix = time . ".$$.";
98  while (my($rel, $info) = each(%$payload)) {
99    local *F;
100   my $abs = rel2abs($rel, $DEST_ROOT);
101   $rel = abs2rel($abs, $DEST_ROOT); # should be same as original $rel
102   die "$abs is not below $DEST_ROOT" if $rel =~ m/\A\.\.(\/|\z)/s;
103   my ($basename, $dirname) = fileparse($abs); # dirname ends in slash
104   do {
105     my $ok = 0;
106     for (@auths) {
107       last if $ok = $_->($basename, $rel, $abs, $dirname, $info);
108     }
109     $ok;
110   } or warn("$rel: not authorized, skipping\n"), next;
111   mkpath([$dirname], 0, 0755);
112   -d $dirname or die "Missing $dirname";
113   my $perms = 0644; # default unless previous
114   if (-e $abs) {
115     my $mtime = (stat _)[9];
116     $perms = (stat _)[2] & 0777; # previous perms
117     if ((my $age = $mtime - $info->{mtime}) >= 0) {
118       warn "$rel: skipping older file ($age seconds)\n";
119       next;
120     }
122     my $attic = catfile($dirname, ".attic");
123     mkpath([$attic], 0, 0755);
124     -d $attic or die "Missing $attic";
125     my $atticfile = catfile($attic, "$prefix$basename");
126     link $abs, $atticfile or die "Cannot ln $abs $atticfile: $!";
127     warn "$rel: previous saved in .attic/$prefix$basename\n";
128   }
129   {
130     my $tmp = "$abs.$$"; # beside the destination, so rename stays on one filesystem
131     open F, ">$tmp" or die "Cannot create $tmp: $!";
132     print F $info->{content};
133     close F;
134     chmod $perms, $tmp or warn "cannot chmod($perms,$tmp): $!";
135     utime $info->{mtime}, $info->{mtime}, $tmp or
136       warn "cannot set mtime on $tmp: $!\n";
137     rename $tmp, $abs or die "Cannot mv $tmp $abs: $!";
138   }
139   warn "$rel: new version installed\n";
140 }

Randal L. Schwartz is the chief Perl guru at Stonehenge Consulting and co-author of Learning Perl and Programming Perl. He can be reached at merlyn@stonehenge.com. Code listings for this column can be found at: http://www.stonehenge.com/merlyn/LinuxMag/.
