grepmail

With all of the fancy, graphical email applications available for Linux, newcomers are often surprised to learn that many long-time Linux users still use old, text-based email management tools. These old-timers thoroughly embrace the ancient Unix philosophy of using several small, discrete command-line tools rather than a single monolithic application. This month, we look at grepmail, one of the most indispensable command-line email utilities.

http://grepmail.sourceforge.net

A Virtual Needle in a Digital Haystack

At a glance, grepmail seems simple. As its name implies, grepmail is a grep-like program designed to search email. So how fancy can it be? After all, even your run-of-the-mill grep can search a standard mailbox file for whatever text you’re looking for. Well, it’s not quite that simple.

Certainly, grep is effective when each file represents a single piece of content. For example, the command…


$ grep -l needle haystack1 haystack2 haystack3

… lists which haystack files (if any) contain the string needle. However, mbox, the standard mailbox file format on Linux, is a single large file that stores multiple mail messages one after another. It’s quite common to store thousands of messages (thousands of pieces of content) in a single mbox file. While the mbox format makes email software easy to build, it renders grep somewhat useless. Want to find the message that contains needle? If you use grep, you can find out whether needle is in the mailbox, but that’s about it. What you really want is the email message itself, the one that contains needle.
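To see the limitation for yourself, here is a small sketch that builds a throwaway three-message mbox file and runs plain grep against it. The addresses, subjects, and message text are made up for illustration; only the grep options are real.

```shell
# Build a tiny three-message mbox file (hypothetical sample data).
mbox=$(mktemp)
printf '%s\n' \
  'From alice@example.com Mon Jul  1 09:00:00 2002' \
  'Subject: Lunch' \
  '' \
  'Noon works for me.' \
  '' \
  'From bob@example.com Tue Jul  2 10:30:00 2002' \
  'Subject: Lost and found' \
  '' \
  'I found your needle in the haystack.' \
  '' \
  'From carol@example.com Wed Jul  3 08:15:00 2002' \
  'Subject: Minutes' \
  '' \
  'Notes from the meeting are attached.' \
  '' > "$mbox"

# Every message begins with a "From " separator line, so this counts messages.
message_count=$(grep -c '^From ' "$mbox")

# grep -l reports only the name of the file that matched...
matching_files=$(grep -l needle "$mbox")

# ...and plain grep prints the matching line, cut loose from its message.
matching_lines=$(grep needle "$mbox")

echo "messages: $message_count"
echo "grep -l:  $matching_files"
echo "grep:     $matching_lines"
```

grep can confirm that needle is in there somewhere, and can even show you the line, but the sender, the subject, and the rest of the message are left behind.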

grepmail to the rescue! Standard grep is line-oriented, but grepmail is message-oriented. Let’s see an example.


$ grepmail -b needle mailbox

Here, if any messages in mailbox contain needle in the body (-b), grepmail spits them out in mbox format. That means you can derive a new mailbox from an existing one.
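If you’re curious what message-oriented matching looks like, the idea can be roughed out in a few lines of awk. This is only an illustrative sketch, not how grepmail is implemented: it treats every line beginning with "From " as the start of a new message, searches headers and body alike rather than the body only, and runs against made-up sample data.

```shell
# Hypothetical two-message mbox file for the demonstration.
mbox=$(mktemp)
printf '%s\n' \
  'From alice@example.com Mon Jul  1 09:00:00 2002' \
  'Subject: Lunch' \
  '' \
  'Noon works for me.' \
  '' \
  'From bob@example.com Tue Jul  2 10:30:00 2002' \
  'Subject: Lost and found' \
  '' \
  'I found your needle in the haystack.' \
  '' > "$mbox"

# Buffer lines one message at a time. When the next "From " separator
# (or the end of the file) arrives, print the buffered message only if
# one of its lines matched the pattern.
matches=$(awk -v pat='needle' '
  /^From / { if (hit) printf "%s", msg; msg = ""; hit = 0 }
  { msg = msg $0 "\n"; if ($0 ~ pat) hit = 1 }
  END { if (hit) printf "%s", msg }
' "$mbox")

printf '%s\n' "$matches"
```

The output is the whole matching message, separator line and headers included, which is why it can be redirected straight into a new mailbox file.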

Perhaps you’d like to process a few hundred megabytes of archived messages to break them down by sender. To generate a mailbox containing all the messages from your favorite spammer, you might run:


$ grepmail -h '^From: badguy@aol.com' mailbox > badguy.mail

The -h flag constrains the search to the message headers.








Figure One: gtkgrepmail

Feature-packed

So grepmail is a grep-alike that knows how to handle mailboxes. But its features go far beyond those of any standard grep.


  • grepmail is written in Perl, so its patterns are Perl regular expressions. Patterns too complex for standard grep tools are a piece of cake with grepmail.

  • It’s common to archive old mail in compressed files. grepmail can search compressed files.

  • Perhaps you know that a particular message arrived before a given date or time. Simply supply a date argument to grepmail like “before July 1, 2002” and it will restrict the search to email messages received before that date.
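The three features above might look like this in practice. Treat it as a sketch under a few assumptions: the sample messages are invented, the -i (ignore case) and -d (date specification) flags come from the grepmail manual page rather than from the examples earlier in this article, and date parsing may depend on which Perl date modules are installed. The commands are skipped if grepmail isn’t on your PATH.

```shell
# Hypothetical sample data: a two-message mbox plus a gzip-compressed copy.
work=$(mktemp -d)
printf '%s\n' \
  'From alice@example.com Mon Jun 10 09:00:00 2002' \
  'From: alice@example.com' \
  'Date: Mon, 10 Jun 2002 09:00:00 +0000' \
  'Subject: Invoice 42' \
  '' \
  'There is a needle in here.' \
  '' \
  'From bob@example.com Mon Aug  5 09:00:00 2002' \
  'From: bob@example.com' \
  'Date: Mon, 5 Aug 2002 09:00:00 +0000' \
  'Subject: Lunch' \
  '' \
  'Another needle, sent later.' \
  '' > "$work/mailbox"
gzip -c "$work/mailbox" > "$work/archive.gz"

if command -v grepmail >/dev/null 2>&1; then
  # Perl regular expression: word boundaries and alternation, headers only.
  grepmail -i -h '^Subject:.*\b(invoice|receipt)\b' "$work/mailbox"

  # Compressed archive: pass the .gz file as if it were a plain mailbox.
  grepmail -b needle "$work/archive.gz"

  # Date restriction: only messages from before July 1, 2002.
  grepmail -d 'before July 1, 2002' -b needle "$work/mailbox"
else
  echo 'grepmail is not installed; skipping the examples' >&2
fi
```

If your version behaves differently, grepmail --help lists the options it actually supports.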

Those are just the highlights, of course. grepmail is to email what Perl is to text processing: a Swiss Army chainsaw. It’s a tool that’s both powerful and amazingly flexible.

grepmail Integration

To bridge the gap between old-school, command-line power users and many of today’s GUI-oriented new users, look no further than Jeremy Malcolm’s gtkgrepmail (http://www.terminus.net.au/services/gtkgrepmail.html). gtkgrepmail, pictured in Figure One, simplifies interaction with grepmail, organizing the options into three tabs: Search for, Search in, and Search options.

In addition to gtkgrepmail, several open source hackers have integrated grepmail into text-based email applications. On the grepmail home page you’ll find links to useful scripts that make it easy to use grepmail from within mutt, pine, VM, and Gnus.



Do you have an idea for a project we should feature? Drop a note to potm@linux-mag.com and let us know.
