One of the nice features of Linux (and other Unix-like operating systems) is its ability to chain together a number of small utility programs so that they act like one larger program. I'm referring, of course, to the "pipe" feature that is supported by most popular shells and is denoted by the | character. This feature allows data to flow from the standard output of one application directly into the standard input of the next application in the chain, much as if you placed a pipe from the end of the first application to the beginning of the second. For example, the four utilities cat, grep, sort, and less can all be chained together like this:
cat foo.txt bar.txt | grep baz | sort | less
This chain of commands will find all the lines that contain the word baz in the two files foo.txt and bar.txt, sort those lines in alphabetical order, and display the output one page at a time.
Many of you have probably used the pipe feature of your shell before, but for this type of interprocess communication, this is just the tip of the iceberg. Pipes and “named” pipes (which we will discuss later in this column) allow processes to communicate with each other (even if they were not originally designed to communicate with other processes). In this article, we will look at various functions that your processes can use to communicate with each other in this way.
Through the Pipe
Let’s start with a quick definition — the term FIFO describes the way a pipe transfers data and is an acronym for “First In, First Out.” Imagine pushing different colored marbles through a physical pipe. If you push a red, then a blue, then a yellow marble through, the marbles will come out red, then blue, and finally yellow at the other end. The same goes for data that you “push” through a pipe. For those of you who may not be familiar with different types of data structures, a queue is another example of a FIFO structure, and a stack, which is a LIFO (Last In, First Out) structure, is just the opposite (if you pushed the red marble in first, it would come out last).
Now that we’ve gotten some terminology out of the way, we can move on to the functions you’ll use to make your applications work with pipes. The first, and simplest, pipe function is “pipe” (what a surprise!). Its prototype, as defined in unistd.h, is:
int pipe(int filedes[2]);
This creates two file descriptors: filedes[0], which is to be read from, and filedes[1], which is to be written to. (A file descriptor is an integer that is used to reference the actual data structure containing information about a file.) The pipe() function, like most of the others mentioned in this article, returns -1 on failure and sets the variable errno to a value indicating why the call failed. On success, pipe() returns 0. Check the man pages for more information on the values returned in errno.
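As a minimal sketch of that error-checking convention, a small wrapper around pipe() might look like the following. The name make_pipe is a hypothetical helper invented for this illustration; it is not part of any library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create a pipe and report any failure.
   Returns 0 on success, -1 on failure with errno describing the cause. */
int make_pipe(int fds[2])
{
    if (pipe(fds) == -1) {
        int saved = errno;   /* fprintf() may overwrite errno */

        /* strerror() turns the errno code into a readable message. */
        fprintf(stderr, "pipe failed: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

After a successful call, fds[0] holds the read end and fds[1] holds the write end.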
The two file descriptors can be thought of as the two ends of the pipe. Going back to our marble analogy, you push marbles into filedes[1] and they come out of filedes[0]. Those of you who are familiar with the open() / read() / write() system calls are probably used to file descriptors as they relate to files and directories in your filesystem. However, the two file descriptors given by pipe() do not represent anything external to your program. They are simply an abstraction used as a means to communicate data. The beauty of this abstraction is that it allows you to use the returned file descriptors just as you would use file descriptors that represent files. You simply “write” data into filedes[1] and “read” data from filedes[0] as if filedes[1] and filedes[0] represented files open for writing only and reading only, respectively. Let’s look at a simple example of all this. (The following is adapted from the info page on libc):
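Before bringing a second process into the picture, the marble analogy can be tried out inside a single process. The sketch below is only an illustration: marble_demo is a hypothetical name, and the one-letter “marbles” are an assumption made for the example.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: push three one-byte "marbles" into a pipe and
   read them back. On success, out holds them in arrival order plus a
   terminating NUL and the function returns 0; on failure it returns -1. */
int marble_demo(char out[4])
{
    int fds[2];
    ssize_t n;

    if (pipe(fds) == -1)
        return -1;

    /* The write end is fds[1]: red, then blue, then yellow. */
    if (write(fds[1], "R", 1) != 1 ||
        write(fds[1], "B", 1) != 1 ||
        write(fds[1], "Y", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);

    /* The read end is fds[0]: bytes come out in the order they went in. */
    n = read(fds[0], out, 3);
    close(fds[0]);
    if (n != 3)
        return -1;
    out[3] = '\0';
    return 0;
}
```

Calling marble_demo() with a four-byte buffer leaves the string "RBY" in it, which is the first-in, first-out order described above.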
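#include <stdio.h>
#include <unistd.h>

int main (void)
{
  int mypipe[2];   /* mypipe[0]: read end, mypipe[1]: write end */
  char buffer[100];
  pid_t pid;
  /* Create the pipe. */
  if (pipe (mypipe)) {
    fprintf (stderr, "Pipe failed.\n");
    return 1;
  }
  /* Create the child process. */
  pid = fork ();
  if (pid == (pid_t) 0) {
    /* This is the child process. */
    read (mypipe[0], buffer, 100);
    printf ("%s\n", buffer);
  } else if (pid < (pid_t) 0) {
    /* The fork failed. */
    fprintf (stderr, "Fork failed.\n");
  } else {
    /* This is the parent process. The 13 bytes include the trailing NUL. */
    write (mypipe[1], "hello world!", 13);
  }
  return 0;
}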
As mentioned earlier, pipes are primarily used to communicate between two processes. In this example, we use the function fork() to create a new process and then we read from the read side of the pipe in one process (the child process). In the original process (the parent process), we write to the write side of the pipe. When you fork, each resulting process (the original process and the new, child process) gets a copy of all the state of the original process, including any open file descriptors. Therefore, if you create a pipe and then fork, you can use this pipe to communicate data between the two processes.
This example demonstrates quite nicely how you can use the fork function to create new processes and then communicate data between the original process and the one that has been newly created.
When reading from a pipe, you must be careful to ensure that your program behaves in a predictable fashion. If you attempt to read more data than has been written to the pipe, you will receive only the amount that has been written (and read() will return an integer indicating the number of bytes read). However, if the pipe is empty when you attempt to read from it, your program will block until there is data available to be read or the write end of the pipe is closed. When the write end is closed, a call to read() on the read end of the pipe will return 0 after all of the data has been read. Of course, if you close either the read or write end before trying to read from or write to it, the call will fail.
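The end-of-file behavior described above can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: count_bytes_through_pipe is a hypothetical name, and the message is assumed to be shorter than 255 bytes so that the byte count fits in the child’s exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the parent writes msg into a pipe and closes the
   write end; the child reads until read() returns 0 (end of file) and
   exits with the byte count as its status.
   Returns the number of bytes the child read, or -1 on error.
   Assumes len is under 255 so the count fits in an exit status. */
int count_bytes_through_pipe(const char *msg, size_t len)
{
    int fds[2];
    int status;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: close the unused write end first. If it stayed open,
           read() would never report end of file, because the child
           itself would still count as a writer. */
        char buf[16];
        ssize_t n;
        int total = 0;

        close(fds[1]);
        while ((n = read(fds[0], buf, sizeof buf)) > 0)
            total += (int) n;   /* a read may return fewer bytes than asked */
        close(fds[0]);
        _exit(n < 0 ? 255 : total);
    }

    /* Parent: close the unused read end, write, then close the write
       end so that the child's read() eventually returns 0. */
    close(fds[0]);
    if (write(fds[1], msg, len) != (ssize_t) len) {
        close(fds[1]);
        waitpid(pid, &status, 0);
        return -1;
    }
    close(fds[1]);

    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 255 ? -1 : WEXITSTATUS(status);
}
```

The deliberately small 16-byte buffer forces the child to call read() several times for longer messages, so the loop exercises both the partial-read case and the final return of 0.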
A Pipe With No Name
While all of this is very powerful, there are times when just having “anonymous” pipes is not sufficient for your programming needs. If two processes expect to communicate through a pipe, but don’t have the luxury of being forked from the same process that created it, using the aforementioned pipe() function will not do the trick. Fortunately, there is an alternative: fifo files. A fifo file is a special file in your file system that acts just like a pipe. Process A can open the file for reading, and process B can open the file for writing. Anything process B writes to this file can be read in FIFO order by process A. Using a couple of Linux utilities, let’s look at an example of fifo files in action. First, create the new fifo file with the command:
% mkfifo foo
Now if you look at the file foo with the ls -lF command, you should see a p as the first character of the permissions field and a | character at the end of the name. These are the markings ls uses to indicate that this is a fifo file. We’d like to now have one application open foo for reading and another open it for writing. For this demonstration, we’ll make use of the cat utility and your shell’s ability to redirect output. Pop open a couple of terminal windows; in one of them type:
% cat foo
You’ll notice that it just seems to hang there. Remember, when trying to read from a pipe, the function call will block until something is written to the pipe. This works the same way for fifo files. So in a different terminal window, type:
% cat > foo
This tells cat to use standard in as input, and the > tells the shell to write the output to the file foo instead of to standard out. In effect, we are causing cat to write anything that we write on the standard input to the file foo. Now you can type things at this terminal, and they should print out on the other terminal! Everything you type is being piped through the fifo file foo.
Once you have a fifo file on your filesystem, you can use it between any two processes to communicate. To create a fifo file from your program, simply call mkfifo(). Its prototype looks like the following:
int mkfifo(const char *pathname, mode_t mode);
You may recognize this structure, as it is identical to the prototype for creat(), used to create regular files in your file system:
int creat(const char *pathname, mode_t mode);
To use the mkfifo() function, you need to include the header files sys/types.h and sys/stat.h. Once the file is created, you can open it for reading or writing just as you would a normal file on your filesystem. This can be especially useful for programs that normally read from/write to a file but are not expecting that file to be a pipe. This way, you can use fifo files to feed data to practically any application that reads from normal files.
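Here is a rough sketch of how a program might create a fifo file and pass a message through it. The function name fifo_round_trip, the 0600 permission mode, and the choice to fork a child as the writer are all assumptions made for this illustration; in practice the reader and writer would usually be unrelated programs that simply agree on the path.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a fifo file at path, have a child process
   write msg into it, and read the message back in the parent.
   Returns the number of bytes read into buf, or -1 on error.
   The fifo file is removed again before returning. */
ssize_t fifo_round_trip(const char *path, const char *msg, size_t len,
                        char *buf, size_t bufsize)
{
    pid_t pid;
    int fd;
    ssize_t n;

    /* 0600: readable and writable by the owner only. */
    if (mkfifo(path, 0600) == -1)
        return -1;

    pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }

    if (pid == 0) {
        /* Child: this open() blocks until the parent opens the fifo
           for reading. */
        int wfd = open(path, O_WRONLY);

        if (wfd == -1)
            _exit(1);
        if (write(wfd, msg, len) != (ssize_t) len)
            _exit(1);
        close(wfd);
        _exit(0);
    }

    /* Parent: this open() blocks until the child opens the fifo
       for writing. */
    fd = open(path, O_RDONLY);
    if (fd == -1) {
        waitpid(pid, NULL, 0);
        unlink(path);
        return -1;
    }
    n = read(fd, buf, bufsize);
    close(fd);
    waitpid(pid, NULL, 0);
    unlink(path);   /* remove the fifo file from the filesystem */
    return n;
}
```

Apart from the call to mkfifo(), the code uses the same open(), read(), and write() calls it would use on a regular file, which is the point made above. The one difference worth noting is that opening a fifo file blocks until the other side has been opened too.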
Keeping Things Simple
I hope this article has given you a good sense of how you can use pipes and fifo files as a means of facilitating interprocess communication. The pipe feature, which allows processes to communicate with each other even if those processes were not originally designed to be used together, is an incredibly powerful tool.
Of course, there are many more uses for pipes and named pipes than we have space to touch upon in this article. For example, in a client/server application, if you know that both the client and the server will be running on the same machine, then you may wish to use a pipe rather than a socket to communicate. This would simplify your program and allow you to avoid transmitting data over the network. However, this is about all we have time to cover right now. So have fun with pipes, and until next month, happy hacking!