Launching Processes

Perl has many ways of launching and managing different programs. This is a Good Thing, because Perl's ability to launch and manage programs -- or child processes -- is one of the reasons it makes such a great "duct-tape of the Internet." The easiest way to launch a child process is with system:

 system "date";

The child process here is the date command. Anything that can be invoked from a shell command prompt can be used in this string. The child process inherits Perl’s standard input, output, and error output, so the output of this date command will show up wherever Perl’s STDOUT was going.

The command can be arbitrarily complex, including everything that /bin/sh (or its equivalent) can handle:

 system "for i in *; do echo == \$i ==; cat \$i; done";

Here, we’re dumping out the contents of a directory, one file at a time. The $i variables are backslashed because otherwise Perl would expand them to their current Perl values, and we want the shell to see its own $i instead. A simpler solution is to use single quotes instead of double quotes. Single quotes, unlike double quotes, prevent the interpolation of Perl variables.

 system 'for i in *; do echo == $i ==; cat $i; done';

Or, you can just set the value of Perl’s $i to ‘$i’, but that’s pretty twisted, and will probably drive the maintenance programmer who inherits your code crazy.
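
To see the quoting difference without launching anything, here is a minimal sketch that only builds the command strings. The variable names and the value PERL_VALUE are made up for illustration.

```perl
use strict;
use warnings;

my $i = "PERL_VALUE";    # hypothetical Perl variable sharing the shell variable's name

my $escaped   = "echo == \$i ==";    # backslash keeps the dollar sign literal
my $single    = 'echo == $i ==';     # single quotes never interpolate
my $unescaped = "echo == $i ==";     # Perl substitutes its own $i here

# Show what the shell would actually receive in each case.
print "$_\n" for $escaped, $single, $unescaped;
```

The first two strings are identical, so the shell sees its own $i in both; the third has already been rewritten by Perl before any shell gets involved.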

This might look better spaced over multiple lines, so we can use a here-document to fix it:

 system <<'END';
for i in *
do
    echo == $i ==
    cat $i
done
END

Yeah, that cleans it up a bit.

If the single argument contains no shell metacharacters, Perl avoids the shell: it splits the string into words and finds the program directly. You may wish to adjust $ENV{PATH} before calling system so that the program is found in the right place. Anything more complicated forces a shell, though.

That shell can get in the way at times. Imagine invoking grep on a few files based on a string in a scalar variable:

 system "grep $look_for brief1 brief2 brief3";

Now if $look_for is a nice easy string like Monica, no big deal. But if it’s complicated like White House, we now have a problem, because that’ll interpolate like this:

 system "grep White House brief1 brief2 brief3";

which is looking for White in the other four names, including a file named House. That’s broken. Badly. So, perhaps we can fix it by including some quotes:

 system "grep '$look_for' brief1 brief2 brief3";

This works for White House, but fails on Don’t lie!. And if we change the shell single quotes to double quotes, that will just mess up when $look_for contains double quotes!

Luckily, we can avoid the shell entirely, using the multiple argument version of system:

 system "grep", $look_for, "brief1", "brief2", "brief3";

When system is given more than one argument, the first one must be a program found along the PATH. The others are handed, uninterpreted by any shell, directly to the program. If it were another Perl script, the elements of @ARGV in the called program would match the remaining elements of this list, one for one.

Because we no longer call a shell, things like I/O redirection no longer work. There are tradeoffs to this method, but it comes in handy. It’s also more secure: there is no chance for a nefarious user to sneak in a newline or semicolon. Some popular CGI scripts didn’t get this right, and ended up triggering a CERT notification as a security hole.
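
As a rough illustration of the list form, the sketch below searches a scratch file for the awkward strings mentioned above. The temporary file, its contents, and the -q flag (ask grep to report only through its exit status) are assumptions added for the demonstration; the exit value is read with the shift explained further on.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file standing in for brief1 (contents are made up).
my ($fh, $brief) = tempfile(UNLINK => 1);
print {$fh} "The White House said nothing.\n", "Don't lie!\n";
close $fh or die "close failed: $!";

my @exit_codes;
for my $look_for ("White House", "Don't lie!", "no such text") {
    # List form: no shell is involved, so grep receives $look_for
    # as a single argument, spaces and quotes included.
    my $rc = system "grep", "-q", "--", $look_for, $brief;
    push @exit_codes, $rc >> 8;    # grep exits 0 on a match, 1 on no match
}
print "@exit_codes\n";
```

Neither the embedded space nor the apostrophe needs any quoting here, because nothing ever parses the string as shell syntax.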

While the child process is executing, Perl is stopped. So if a command takes 35 seconds to run, Perl is stopped for 35 seconds. You can fork a child process in the background by adding the ampersand, just as you would in the shell:

 system "long_running_command and the parameters &";

Be aware that you’ll have no easy way to interact with this command, or even know its PID to kill it.
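
If you do need the PID, one possible alternative is to fork and exec yourself rather than lean on the shell's ampersand. This is only a sketch: the stand-in command (a Perl one-liner that sleeps) and the immediate kill are invented for the demonstration.

```perl
use strict;
use warnings;
use POSIX ();

my $pid = fork;
die "cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: become the long-running command (a sleeping stand-in here).
    # If exec fails, leave immediately without running any parent cleanup.
    exec($^X, "-e", "sleep 60") or POSIX::_exit(127);
}

# Parent: keeps running and knows exactly which process it started.
kill "TERM", $pid;                 # stop the background command
my $reaped = waitpid $pid, 0;      # collect it so no zombie is left behind
print "started $pid, reaped $reaped\n";
```

The tradeoff is more code than a trailing ampersand, in exchange for a handle on the child.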

The return value of the system operator is the value from the wait (or waitpid) system call. That is, if the child process exited with a zero value (everything went OK), so too will the return value from system be zero. A non-zero value is shifted left 8 bits (or multiplied by 256, if you prefer). If a signal killed the process, that’s bitwise-or’ed into the number, and a 128 is added if there’s a core file cluttering up the directory now.

If you don’t grab the result from system, the same number is available in the special $? variable. That is, until another process is waited for, because $? records only the most recently waited-for process status. So, to get the specs on the most recent exit, it’s something like:

 $status = ($? >> 8);
 $core_dumped = ($? & 128) > 0;
 $signal = ($? & 127);
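
Here is a small sketch that exercises those three expressions on real children. The helper name decode_status and the two Perl one-liners used as child commands are made up for the example.

```perl
use strict;
use warnings;

# Split a raw wait status (as found in $?) into its three parts.
sub decode_status {
    my ($raw) = @_;
    my %parts = (
        code   => $raw >> 8,               # exit value of the child
        signal => $raw & 127,              # signal that killed it, if any
        core   => ($raw & 128) ? 1 : 0,    # true if a core file was dumped
    );
    return \%parts;
}

# A child that exits normally with the value 3.
system $^X, "-e", "exit 3";
my $exited = decode_status($?);

# A child that kills itself with SIGTERM.
system $^X, "-e", 'kill "TERM", $$; sleep 5';
my $killed = decode_status($?);

printf "exit code %d, signal %d\n", $exited->{code}, $exited->{signal};
printf "exit code %d, signal %d\n", $killed->{code}, $killed->{signal};
```

Note that system returns -1 when the command cannot be started at all, a case this sketch does not handle.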

Because the “zero if everything is OK” is backwards from most of the rest of Perl, you shouldn’t use or die directly. Instead, the easiest fix is to invert the output of system with a logical “not” operation:

 !system "some_maybe_failing_command"
     or die "we broke it";


The exec operator works much like system. However, instead of creating a child process to run the selected command, the Perl process becomes the selected command. Think of this as a goto instead of a subroutine call:

 exec "date";

Once this date command begins executing, there’s no Perl to come back to. The only reason to put Perl code after an exec is to explain that date was not found along the command path:

 exec "date";
 die "date not found in $ENV{PATH}";

In fact, if you turn on compile-time warnings and have anything but a die after exec, you’ll get notified.

One use of exec is to use Perl to set up the operating environment for a long-running command:

 $ENV{DATABASE} = "MyDataBase";
 $ENV{PATH} = "/usr/bin:/bin:/opt/DataBase";
 chdir "/usr/lib/my.data" or die "Cannot chdir: $!";
 exec "data_mangler";
 die "data_mangler not found";

Replacing exec with system here would have still invoked data_mangler, but then we’d have a mostly useless Perl program sitting around just waiting for data_mangler to exit.

The processes started with system and exec can be interactive, since they’ve inherited Perl’s input and output. And, to aid in the interaction with complicated programs like vi, Perl ignores SIGINT during the system invocation, so that hitting control-C doesn’t abort Perl early.


Sometimes, you’ll be invoking commands to capture their output value as a string in the program. You can do this with backquotes, which act like backquotes in the shell:

 $now = `date`;

Here, the standard output of date is a 30-ish character string followed by a newline. Everything sent to standard output is captured as a string value, returned by the backquotes, and here saved into $now. If the value contains multiple lines, we may want to split it on newline to get each line. But it’s probably easier to use backquotes in a list context, which does this for us:

 @logins = `who`;

Here, @logins will have one element for each line of who’s output. We can parse that in a loop like this:

 for (`who`) {
     chomp;
     # assumed parse: user name, terminal, then the rest of the line
     ($user, $tty, $when_where) = split ' ', $_, 3;
     $logins{$user}{$tty} = $when_where;
 }

Each iteration through the loop gathers a different user’s login, shoving it into a two-level hash keyed by user name and terminal. Then, we can dump it out ordered by user:

 for $user (sort keys %logins) {
     for $tty (sort keys %{$logins{$user}}) {
         print "$user is on $tty from $logins{$user}{$tty}\n";
     }
 }
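
Because who prints something different on every machine, the two-level hash is easier to inspect with canned input. The sketch below uses made-up sample lines in place of real who output and assumes the fields are user name, terminal, and then everything else.

```perl
use strict;
use warnings;

# Made-up sample lines standing in for the output of who.
my @who_lines = (
    "merlyn   pts/0   2024-01-05 09:12 (10.0.0.7)\n",
    "merlyn   pts/3   2024-01-05 10:40 (10.0.0.7)\n",
    "root     tty1    2024-01-04 22:01\n",
);

# Build the two-level hash: user name, then terminal.
my %logins;
for (@who_lines) {
    chomp;
    my ($user, $tty, $when_where) = split ' ', $_, 3;
    $logins{$user}{$tty} = $when_where;
}

# Walk it in sorted order, collecting one report line per login.
my @report;
for my $user (sort keys %logins) {
    for my $tty (sort keys %{ $logins{$user} }) {
        push @report, "$user is on $tty from $logins{$user}{$tty}";
    }
}
print "$_\n" for @report;
```

A user with several logins ends up with several keys under the same first-level entry, which is the point of keying by terminal as well as by name.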

Standard input for a backquoted command is inherited from Perl’s standard input, making it possible to have an external command suck down all of STDIN returning a modified version:

 @sorted_input = `sort`;

Here, the sort command (not the built-in Perl operator) is reading all of standard input, sorting it, and then returning that to Perl as a list of lines, one element per line of output.

The backquoted command is double-quote interpolated, meaning that we can use escapes like \n and \t, but also include Perl variables to build parts of the command:

 $checksum = `sum $file`;

However, my earlier warnings about the single-argument system operator apply here as well. What if $file has embedded whitespace or other shell-significant characters?

One solution is to use yet-another way of invoking a child process: the process-as-filehandle. Let’s start with the easy form of that first, and return to this whitespace problem after getting the basics down. If the second argument to open ends in a vertical bar (pipe symbol), Perl treats that as a command to launch rather than a filename:

 open DATE, "date|";

At this point, a date command is launched, with its standard output connected to the DATE filehandle open for reading. The rest of the program doesn’t know, doesn’t care, and would have to work pretty hard to figure out that this is not a file but just another program. So, we’ll read from the output using the normal filehandle operations:

 $now = <DATE>;

The process is running parallel with Perl, with all the coordination provided for standard pipe read/writes. So if the date command sent its output before Perl was ready, it would just wait there, and if Perl read before date was ready to write, the Perl process would simply block until output was available, consuming no CPU.
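
A slightly fuller sketch of the same idea adds error checking and collects the exit status. The echo commands are stand-ins chosen so the output is predictable, and the lexical filehandle $cmd is a stylistic substitution for the bareword handle used above.

```perl
use strict;
use warnings;

# The trailing vertical bar makes Perl launch the text before it as a command.
open(my $cmd, "echo one; echo two |") or die "cannot launch command: $!";

my @lines = <$cmd>;    # list context: one element per line of output
chomp @lines;

# Closing a command filehandle waits for the child and sets $?.
close $cmd;
my $status = $? >> 8;

print "read ", scalar(@lines), " lines (@lines), exit status $status\n";
```

Reading to end-of-file and then closing is what lets Perl wait for the command, so the status is only meaningful after the close.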

The Fork and Pipe

So how does this solve our earlier whitespace problem? Well, there’s a special kind of command opening:

 my $pid = open CHILD, "-|";

which is a combination of a pipe-open and a fork. You may recall that a fork splits the current process into two processes: a parent process and a child process. Initially, both are running identical code, but we distinguish them by the return value from the fork call. The parent gets back the child’s process ID number (PID), and the child gets a zero value.

This fork-and-pipe opening operates similarly: the Perl process forks, and the parent and child see differing results from the open just like a fork. However, the child’s STDOUT is attached to the parent’s CHILD filehandle automatically, meaning that the child can act like the date command above, sending data to its standard output, and we can read from that in the parent process. So, to finish out the date example, we could do it all within Perl like so:

 if ($pid) { # I'm the parent
     $now = <CHILD>; # read child
 } else { # I'm the child
     print scalar localtime, "\n";
     exit 0;
 }

And now $now is set to the output of the child process. We can also exec in the child, like so:

 if ($pid) {
     $checksum = <CHILD>;
 } else {
     exec "sum", $myfile;
     die "sum not found: $!";
 }

And now we have used the multiple-argument form of exec, so there are never any shell-character worries!
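
To check that claim against a genuinely awkward filename, here is a sketch that reads one file two ways: with the forking open described above, and with the list form of a pipe open, which reasonably recent Perls (5.8 and later, on systems with a real fork) accept as a shorthand for the same thing. The scratch directory, the filename, and the use of cat in place of sum (so that the output is predictable) are all assumptions made for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use POSIX ();

# A scratch file whose name contains a space and an apostrophe.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/White House's brief.txt";
open(my $out, ">", $file) or die "cannot write $file: $!";
print {$out} "nothing to report\n";
close $out or die "close failed: $!";

# Form 1: forking open, then a multiple-argument exec in the child.
my $pid = open(my $child, "-|");
die "cannot fork: $!" unless defined $pid;
if ($pid == 0) {
    # Child: its STDOUT is already connected to the parent's $child handle.
    exec("cat", $file) or POSIX::_exit(127);
}
my $via_fork = <$child>;
close $child;

# Form 2: list-form pipe open, which forks and execs without a shell.
open(my $pipe, "-|", "cat", $file) or die "cannot run cat: $!";
my $via_list = <$pipe>;
close $pipe;

print "fork form: $via_fork", "list form: $via_list";
```

In both forms the filename travels to the command as a single argument, so no quoting or escaping is needed.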

Perl also offers arbitrary invocations of fork, waitpid, pipe and file-descriptor shuffling, so you can access the full range of underlying Unix system calls. Until next time, have fun launching processes!

Randal L. Schwartz is the chief Perl guru at Stonehenge Consulting and co-authored Learning Perl and Programming Perl. He can be reached at merlyn@stonehenge.com.
