
Ten Things You Didn’t Know Apache (2.2) Could Do

Apache 2.2 has some great hidden treasures in it that a lot of folks are simply unaware of.

httpd -M

Apache loads modules in two different ways. You can compile them into the server binary when you first install Apache, or you can load them dynamically at startup time using the LoadModule directive. Almost every Apache installation has some of each kind. Until recently, if you wanted to know what modules you had loaded, you had to look in two different places. You’d run httpd -l to get a list of the compiled-in type:

$ httpd -l
Compiled in modules:
core.c
prefork.c
http_core.c
mod_so.c

Then you’d have to go look in your server configuration file and see which modules had LoadModule directives. This is actually harder than it sounds, because a lot of third-party distributions of Apache put each LoadModule directive in a separate file, with names like php.load and mod_perl.conf and so on.

In another minor change with a big impact, Apache 2.2 adds the -M flag, which allows you to list all of the modules that are loaded, both static and shared:

$ httpd -M
Password:
Loaded Modules:
core_module (static)
mpm_prefork_module (static)
http_module (static)
so_module (static)
authn_file_module (shared)
...
php5_module (shared)
pony_module (shared)

Each module indicates whether it is static or shared, and now you know for certain what modules were successfully loaded and which ones you forgot.

And, yes, that’s mod_pony. Seriously.
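If you want to act on that listing in a script, the static and shared labels are easy to split apart. The following is a minimal Python sketch, not something that ships with Apache: it parses a pasted copy of the sample output above rather than running httpd itself, and the name `split_modules` is made up for illustration.

```python
import re

# Sample text copied from the httpd -M listing above (trimmed).
SAMPLE_OUTPUT = """\
Loaded Modules:
core_module (static)
mpm_prefork_module (static)
http_module (static)
so_module (static)
authn_file_module (shared)
php5_module (shared)
pony_module (shared)
"""

# One module per line: a name, then "(static)" or "(shared)".
MODULE_LINE = re.compile(r"^\s*(\w+)\s+\((static|shared)\)\s*$")


def split_modules(text):
    """Return {'static': [...], 'shared': [...]} from httpd -M style text."""
    modules = {"static": [], "shared": []}
    for line in text.splitlines():
        match = MODULE_LINE.match(line)
        if match:  # skips the "Loaded Modules:" header and any other line
            name, kind = match.groups()
            modules[kind].append(name)
    return modules


if __name__ == "__main__":
    result = split_modules(SAMPLE_OUTPUT)
    print("static:", ", ".join(result["static"]))
    print("shared:", ", ".join(result["shared"]))
```

Lines that do not look like a module entry are simply ignored, so stray prompts or headers in the captured text do no harm.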

httxt2dbm

If you’re like the rest of us, you have, over the years, accumulated lengthy lists of RewriteRule and Redirect directives to map old URLs to the new ones. These stack up, and, over time, can cause a great deal of confusion about where your content actually lives, not to mention a big performance hit when all the rules have to be processed every time a request is made to your server.

One way to consolidate these redirects is with RewriteMap, a directive in mod_rewrite that allows you to define an external map of rewrite rules. This may be as simple as a text file that lists the mappings, or as complicated as an external script or program, or a database query, that determines the rules.

So, for example, suppose you have a bunch of old URLs that you want to redirect to new ones (a very typical case), or perhaps just friendly, easier-to-remember URLs that you want to redirect to the actual ugly back-end ones. You might then have a RewriteMap file like this, called dogs.txt:

/collie /dogs.php?id=875
/doberman /dogs.php?id=12
/dachshund /dogs.php?id=99
/siamese /cats.php?id=84

Then, you would use this file in a RewriteMap:

RewriteMap dogmap txt:/path/to/file/dogs.txt

And use the RewriteMap in a RewriteRule:

RewriteRule ^/dogs(/.*) ${dogmap:$1}

The trouble is that this is a plain text file, and, as such, unindexed and therefore slow. Every time you request a URI, mod_rewrite looks through this list, one item at a time, until it finds the one that it needs. And the more items you add to the list, the longer each lookup takes.
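To see why an unindexed list gets slower as it grows, here is a rough Python sketch of the two lookup strategies. It only illustrates scanning versus keyed lookup and makes no claim about how mod_rewrite is implemented internally; `scan_lookup` and `build_index` are hypothetical names.

```python
# A few of the mappings from dogs.txt, as the lines of a text map.
MAP_LINES = [
    "/collie /dogs.php?id=875",
    "/doberman /dogs.php?id=12",
    "/siamese /cats.php?id=84",
]


def scan_lookup(lines, key):
    """Walk the list one entry at a time until the key turns up.

    Returns (target, number_of_lines_examined).
    """
    for steps, line in enumerate(lines, start=1):
        source, target = line.split(None, 1)
        if source == key:
            return target, steps
    return None, len(lines)


def build_index(lines):
    """Read every line once and store it in a dict keyed by the source path."""
    return dict(line.split(None, 1) for line in lines)


if __name__ == "__main__":
    target, steps = scan_lookup(MAP_LINES, "/siamese")
    print("scan:", target, "after", steps, "lines")

    index = build_index(MAP_LINES)
    print("index:", index.get("/siamese"))
```

The scan has to examine every line ahead of the one it wants, so entries near the bottom of a long file cost the most. The keyed lookup pays the cost of reading the file once and then goes straight to the entry.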

For years, the documentation suggested that you could convert the text file to a dbm, and offered a Perl script for doing so. Unfortunately, the script didn’t work particularly well, and, if you could get it to work, there was always the problem of picking the right type of dbm for your particular operating system.

With the 2.2 version, a utility ships with the server and is installed alongside the other binaries. It not only converts your text file into a dbm, but also selects the same dbm library that your installation of Apache was built with, thus ensuring compatibility.

This utility, called httxt2dbm, is used as follows:

httxt2dbm -- Program to Create DBM Files for use by RewriteMap
Usage: httxt2dbm [-v] [-f format] -i SOURCE_TXT -o OUTPUT_DBM

Options:
 -v    More verbose output

 -i    Source Text File. If '-', use stdin.

 -o    Output DBM.

 -f    DBM Format.  If not specified, will use the APR Default.
           GDBM for GDBM files (unavailable)
           SDBM for SDBM files (available)
           DB   for berkeley DB files (unavailable)
           NDBM for NDBM files (unavailable)
           default for the default DBM type

For most of us, the -f option is not particularly useful. Of course, we want it to use the APR default – that is, whatever Apache was built with. If you actually know what the differences are between the various dbm formats, perhaps you have reasons for using a different one, and can do that if you really want to.

$ httxt2dbm -i dogs.txt -o dogs.map

Now, you can modify your RewriteMap directive to use this new file:

RewriteMap dogmap dbm:/path/to/file/dogs.map

Lookups are now performed against the dbm, and so are much faster.

PCRE Zero-Width Assertions

I said I wasn’t going to leave the best to last, but this last one is very cool, and answers one of the most frequently asked questions, although often the folks asking the question wouldn’t think to ask for this particular solution.

The question that tends to get asked often goes something like, “How can I redirect everything except for a particular directory?” For example: I want to redirect requests for anything on this server over to that other server, except for requests for the images directory.

Now, Apache offers a RedirectMatch directive that allows you to use regular expressions to specify a class of URIs that you want to redirect. Unfortunately, it does not have a negation operator, so you can’t easily say “everything that doesn’t match images.”

At least until now.

One of the changes with the 2.2 version of the server is that RedirectMatch and all of the other *Match directives now use the Perl Compatible Regular Expression library (PCRE) and so have the full power of the regular expressions that you know and love from your favorite programming language.

One of the cooler of these features is zero-width assertions. Now, I’m not going to go into all the details of what these are. That’s covered very nicely in the tutorial at http://www.regular-expressions.info/lookaround.html. Instead, I’ll give you a specific way that they can be used in Apache to answer this frequently asked, seldom answered question.

RedirectMatch ^/(?!images/)(.*) http://dynamic.myhost.com/$1

This RedirectMatch redirects all URLs to http://dynamic.myhost.com/, unless the URL starts with /images/. This regular expression syntax is called a negative lookahead. It asserts that the text immediately following the current position does not match a given pattern, and it does so without consuming any characters.
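You can try the pattern outside the server before committing it to your configuration. The sketch below uses Python's `re` module, which is not PCRE but accepts the same `(?!...)` syntax for this pattern. The `redirect_target` helper is made up for illustration and only mimics the match-and-substitute step, not the redirect itself.

```python
import re

# The same pattern and target as in the RedirectMatch line.
PATTERN = re.compile(r"^/(?!images/)(.*)")
TARGET_PREFIX = "http://dynamic.myhost.com/"


def redirect_target(path):
    """Return the URL a request path would be sent to, or None if no match."""
    match = PATTERN.match(path)
    if match is None:
        # No match: the path starts with /images/ (or has no leading slash).
        return None
    return TARGET_PREFIX + match.group(1)


if __name__ == "__main__":
    for path in ("/about.html", "/images/logo.png", "/imagesets/x"):
        print(path, "->", redirect_target(path))
```

Note that the trailing slash inside the lookahead matters: a path such as /imagesets/x is still redirected, because only the exact prefix /images/ is excluded.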

It makes me happy when something that I’ve always answered with “you can’t do that” becomes possible, and even easy.

Summary

Apache 2.2 has some great hidden treasures in it that a lot of folks are simply unaware of. 2.4 has even more of them. I can hardly wait.
