Spinning a Lightweight Web

Using the thttpd Web Server.

The World Wide Web is an integral part of the Internet. In fact, in many people’s eyes, the Web is the Internet. Although this view is incorrect (it ignores email, FTP, and many other protocols), it does reflect the importance and visibility of the Web to average users.

Just what is the Web, though, and how is it implemented? At its core, the Web is a series of server computers that run server programs implementing the Hypertext Transfer Protocol (HTTP) and its secure variant, HTTPS. According to an ongoing Netcraft survey available at http://news.netcraft.com/archives/web_server_survey.html, the most popular Web server software by far is Apache (http://httpd.apache.org/). In January 2006 (the latest figures available as the magazine goes to press), Apache managed almost 70% of the Web sites surveyed. The next most popular server package, Microsoft’s IIS, managed just under 21% of the Web’s sites. Apache is available for Linux (in fact, all major Linux distributions ship with it), so if you want to run a Web site on the market-leading package, you can do so.

Despite its hugely dominant position, though, Apache isn’t the only choice in web servers, nor is it necessarily the best choice. Apache is a large program with lots of features, which makes it very flexible. These characteristics also mean that Apache consumes a lot of resources and, at least in theory, make it more susceptible to bugs than smaller packages. (In practice, of course, other factors can affect “bugginess”; Apache isn’t necessarily any more buggy than a slimmer web server package.) If you’re running a Web site on a computer with minimal hardware or if you don’t need all of Apache’s features and would rather avoid the possibility of encountering bugs related to features you don’t use, you might want to look into alternatives.

Web Server Features

Just why is Apache such a big program? Apache supports several features that add to its size, including the ability to manage dynamic content (CGI scripts), SSL encryption, output filtering, proxy load balancing, and so on. Many of these features are important for large Web sites but are unimportant for small ones. Even much of the content served from large sites doesn’t rely on these features; such sites might combine Apache with a smaller server (perhaps running on another computer) to help reduce the load on the main Apache system.

Ideally, a lightweight web server should provide fairly basic features to avoid the size and complexity of Apache. Small Web servers sometimes have specialized features, as described shortly, but some are simple, basic, and general-purpose Web servers that simply lack the range of features provided by Apache.

Lightweight Web Server Options

So what options are there, aside from Apache? Given that Apache and IIS together hold over 90% of the Web server market, you might think that the pickings for alternatives are slim. This isn’t true, though. Quite a few options exist for Linux alone, including:

*Anti-Web HTTPD. This server, headquartered at http://www.hcsw.org/awhttpd/, is a simple web server that nonetheless supports CGI. In its most basic mode of operation, it requires no configuration file.

*Athana. This server is unusual because it’s written entirely in Python. You can learn more at http://www.hcsw.org/awhttpd.

*EHS. The Embedded HTTP Server (EHS) isn’t a standalone program; rather, it’s a C++ class that enables you to add a complete Web server to your own C++ programs. The main EHS web page is http://www.hcsw.org/awhttpd/.

*Gatling. This web server, found at http://www.fefe.de/gatling/, is designed for speed and takes advantage of platform-specific application programming interfaces (APIs) to achieve its goal.

*Screws. This server is designed to optimize extensibility rather than speed. It’s built as a small core that calls external programs to process requests. You can learn more at http://www.nopcode.org/blog/screws.html.

*thttpd. The tiny/turbo/throttling HTTP server (thttpd) is a small server that’s suitable for handling static content and CGI scripts. It’s headquartered at http://www.acme.com/software/thttpd/.

Clearly, some of these web servers are rather specialized; I’ve mentioned them to give you some idea of the wide range of web servers that are available. This list also only scratches the surface. Check http://www.linuxlinks.com/Software/Internet/WebServers/ or do a Web search on “Linux Web servers” to find more.

Because it’s moderately popular, small, and suitable for general-purpose use, let’s look at thttpd in more detail. thttpd is capable of handling a basic web site, or even one that delivers CGI content. It can’t handle SSL encryption, though. If you need SSL support, you’ll need to either use another server or run two servers, one for SSL and one for non-SSL connections. Because of its small size and efficiency, though, thttpd is excellent for use on weak computers or on more powerful systems that must handle a large number of simple requests.

Obtaining and Installing thttpd

Many Linux distributions include thttpd as part of their standard packages, but thttpd isn’t likely to be installed by default. Check your distribution’s CD-ROMs or use a network-enabled package installer, such as apt-get, yum, or emerge, to install the thttpd package for your distribution. For instance, you might type:

# apt-get install thttpd

(Of course, the details of the command you use will vary from one distribution to another.)
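As a rough sketch of how the command differs between distributions, the following shell function prints an install command for whichever of the three package tools named above is present. The package name thttpd is an assumption; your distribution may use a different name or may not package the server at all.

```shell
# Print a suggested install command based on the package tool that is available.
# The package name "thttpd" is assumed; check your distribution's package list.
suggest_install_cmd() {
    if command -v apt-get >/dev/null 2>&1; then
        echo "apt-get install thttpd"
    elif command -v yum >/dev/null 2>&1; then
        echo "yum install thttpd"
    elif command -v emerge >/dev/null 2>&1; then
        echo "emerge thttpd"
    else
        echo "no supported package tool found; build thttpd from source instead"
    fi
}

suggest_install_cmd
```

Run the printed command as root (or via sudo) to perform the actual installation.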

If you can’t find thttpd on your distribution’s list of supported packages, you can download it from the main thttpd site at http://www.acme.com/software/thttpd/. The download link is entitled “Fetch version 2.25b,” although the version number may change by the time you read this. This link points to a source code tarball. Unpack it with tar and perform the typical set of commands to compile and install the software:

$ tar xvfz ~/thttpd-2.25b.tar.gz
$ cd thttpd-2.25b/
$ ./configure
$ make
$ sudo make install

The README file provides important information, so you may want to read it before you configure or build the software. Also, the installation step assumes the existence of a www group. Create the group before you type sudo make install, if it doesn’t already exist on your system.
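One way to check for the www group before installing is sketched below. It reads /etc/group directly, so it assumes your groups are stored locally rather than in a network directory, and it only prints the groupadd command rather than running it.

```shell
# Return success if the named group is listed in /etc/group.
group_exists() {
    grep -q "^$1:" /etc/group 2>/dev/null
}

# Report whether the www group needs to be created before "make install".
if group_exists www; then
    echo "group www already exists"
else
    echo "group www is missing; create it with: sudo groupadd www"
fi
```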

Configuring thttpd

The thttpd server can be configured via command-line options when it starts; however, it also supports an optional configuration file. Your distribution might set up thttpd to use /etc/thttpd.conf, /etc/thttpd/thttpd.conf, or some other file by default. Look for such files or use your package management tools to find a configuration file. If you can’t find a configuration file but would like to use one, create it yourself and tell thttpd to use it when you configure the server to launch.
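A quick way to look for a configuration file is sketched below. It checks only the two locations mentioned above; the comments name package queries you might try as well, assuming the package is called thttpd.

```shell
# Print the first configuration file found in the locations named in the text.
# Package queries such as "dpkg -L thttpd" (Debian) or "rpm -ql thttpd"
# (RPM-based systems) list every file a package installed, which can
# reveal a configuration file stored somewhere else.
find_thttpd_conf() {
    for f in /etc/thttpd.conf /etc/thttpd/thttpd.conf; do
        if [ -f "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

find_thttpd_conf || echo "no configuration file found in the usual places"
```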

A typical thttpd configuration file looks something like the one shown in Listing One.

Listing One: A Simple thttpd configuration file

dir=/exports/httpd/html/
logfile=/var/log/thttpd.log
pidfile=/var/run/thttpd.pid

The options in the configuration file correspond to command-line switches, as summarized in Table One.

Table One: thttpd options

Command-Line Option | Configuration File Option | Effect
-C filename | N/A | Specifies the filename of the configuration file.
-p portnum | port portnum | Specifies the port number on which the server listens. The default is 80.
-d directory | dir directory | Identifies a directory to use as a base for other file and directory references, similar to chdir in a shell. This is not the same as a chroot jail, though.
-r | chroot | Tells the server to lock itself into a chroot jail built around the directory specified by -d / dir.
-nor | nochroot | Disables the chroot option, if it was enabled as a compile-time default (which is normally not the case).
-dd directory | data_dir directory | Similar to -d / dir, but specifies a directory within the chroot jail when -r / chroot is used.
-nos | nosymlinkcheck | Ordinarily, thttpd checks symbolic links to be sure they point to files within the original directory tree. This option disables this check, saving CPU time at the expense of reduced security. For both speed and security, use the -r / chroot option.
-v | vhost | Enables simple virtual hosting support.
-nov | novhost | Disables simple virtual hosting support, if the compile-time default enables it.
-g | globalpasswd | Protects all files in the document tree using a single password set in .htpasswd at the top of the tree.
-nog | noglobalpasswd | Reverses the effect of the -g / globalpasswd option if it’s set as the default by compile-time options.
-u user | user user | Specifies the user as which the server should run when started as root. The default is nobody.
-c pattern | cgipat pattern | Specifies a wildcard pattern for locating CGI programs. If a user requests a CGI script whose name doesn’t match this pattern, thttpd won’t run it.
-t filename | throttles filename | Identifies a file containing throttling information. This feature lets you limit the data transfer rate for specific clients or sets of clients. Consult the thttpd man page for more details.
-h hostname | host hostname | The name of the host to use, for a multi-hosting configuration.
-l filename | logfile filename | The name of a file to use for logging. If unspecified, thttpd logs via syslog.
-i filename | pidfile filename | The name of a file to which thttpd writes its process ID (PID) number.
-T setname | charset setname | The name of the character set to use for MIME text file types. The default value is iso-8859-1.
-P header | p3p header | The P3P privacy header to include with responses. See http://www.w3.org/P3P/ for details.
-M seconds | max_age seconds | The number of seconds clients should retain a cache of the page. The default, no value, is acceptable for most sites.
-V | N/A | Displays the thttpd version number and exits.
-D | N/A | Runs thttpd as a normal (non-daemon) process. Useful for debugging or if you run the server in a wrapper script that restarts the server if it exits.
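To illustrate how several of these options might combine, the sketch below writes a sample configuration to a scratch file so you can review it before copying it into place. The port, CGI pattern, and paths are illustrative assumptions, not recommendations, and the scratch location comes from mktemp.

```shell
# Write a sample thttpd configuration to a scratch file for review.
# All values below are illustrative; adjust them for your own site.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
# Listen on an unprivileged port instead of the default of 80
port=8080
# Base directory for the site
dir=/exports/httpd/html/
# Lock the server into a chroot jail built around dir
chroot
# Run as this user after starting as root
user=nobody
# Only run CGI programs whose names match this pattern
cgipat=/cgi-bin/*
logfile=/var/log/thttpd.log
pidfile=/var/run/thttpd.pid
EOF
echo "sample configuration written to $CONF"
```

Options that take no value, such as chroot, appear on a line by themselves.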

When using command-line switches, the switch and its value (if one is required) are separated by a space. When setting options in a configuration file, though, use an equal sign (=) between the option name and its value, as shown in Listing One.
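As a sketch of this correspondence, the three settings from Listing One can be expressed as command-line switches using the -d, -l, and -i options from Table One. The example only prints the resulting command; the variable name THTTPD_ARGS is made up for illustration.

```shell
# The options from Listing One, expressed as command-line switches.
THTTPD_ARGS="-d /exports/httpd/html/ -l /var/log/thttpd.log -i /var/run/thttpd.pid"

# Print the full command; drop the "echo" to start the server (as root).
echo thttpd $THTTPD_ARGS
```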

Running thttpd

Ordinarily, thttpd runs as a standalone process, not from a super server (such as inetd or xinetd). If you installed thttpd from a package for your distribution, chances are the package included a SysV startup script, such as /etc/init.d/thttpd. This script might or might not run the next time you restart the computer. You can use tools such as chkconfig, rc-update, ntsysv, sysv-rc-conf, system-config-services, or YaST to ensure that the thttpd startup script runs when you boot your computer. Which tool you should use and how you should use it depends on your distribution, so consult distribution-specific documentation.
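For two of the command-line tools mentioned above, enabling the script might look like the commands printed by the following sketch. It assumes the startup script is named thttpd and that Gentoo’s default runlevel is the one you want; check your distribution’s documentation before relying on either.

```shell
# Suggest a command for enabling the thttpd startup script at boot,
# based on which tool is installed. The service name "thttpd" is assumed.
suggest_enable_cmd() {
    if command -v chkconfig >/dev/null 2>&1; then
        echo "chkconfig thttpd on"
    elif command -v rc-update >/dev/null 2>&1; then
        echo "rc-update add thttpd default"
    else
        echo "no known tool found; consult your distribution's documentation"
    fi
}

suggest_enable_cmd
```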

You should check your thttpd SysV startup script to learn what options are passed to the server. In some cases, these may be hidden in an environment variable, such as THTTPD_OPTS, which may be set in another file, such as /etc/conf.d/thttpd. (These examples are taken from a Gentoo system.) You may need to change these options if you want to customize your configuration or use a thttpd configuration file.
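A simple way to look for these options is to search the relevant files, as in the sketch below. The two paths are the ones named above; on other distributions the files may live elsewhere, and the search pattern is only a guess at what the relevant lines contain.

```shell
# Print lines that appear to set or use server options in the startup files.
# The paths are the examples from the text; other systems may differ.
show_thttpd_opts() {
    for f in /etc/init.d/thttpd /etc/conf.d/thttpd; do
        if [ -r "$f" ]; then
            echo "== $f"
            # A file with no matching lines is not an error here.
            grep -n -i -E 'opts|thttpd ' "$f" || true
        fi
    done
    return 0
}

show_thttpd_opts
```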

If you want to launch thttpd without rebooting your computer, you can run the startup script manually by typing its name at a root command prompt or via su or sudo.

If you installed thttpd by compiling the source code yourself, you’ll need to find some other way to start the program. The simplest way is usually to add a reference to thttpd in a local startup script, such as SuSE’s /etc/init.d/boot.local or Fedora’s /etc/rc.d/rc.local. Add a line, probably at the end of the file, that calls thttpd, including any options you want to pass to the server, as in:

/usr/local/bin/thttpd -C /etc/thttpd.conf

This example launches thttpd and passes it the /etc/thttpd.conf file for additional configuration options.

Once thttpd is up and running, you can test its operation by typing http://localhost in a browser running on the local computer, or by pointing a remote computer’s web browser at the server computer. You may need to attend to local or remote firewalls, though; many Linux distributions now ship with default configurations that block access to port 80. If your thttpd server is listening on a port other than 80, you’ll need to adjust your firewall rules and include the port number in the URL, as in http://localhost:8080. (Note that some distributions’ thttpd configurations bind the server to port 8080 by default.) If you access the thttpd server immediately after installing and running it, chances are you’ll see either a directory listing or a sample home page, depending on whether or not the package you installed included a sample home page.
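If you prefer to test from a command prompt, something like the following sketch can stand in for the browser. It assumes the curl utility is installed and simply tries ports 80 and 8080 on the local computer in turn.

```shell
# Return success if an HTTP server answers on the given host and port.
# Requires curl; any HTTP response, even an error page, counts as an answer.
check_server() {
    curl --silent --head --max-time 5 "http://$1:$2/" >/dev/null
}

if check_server localhost 80; then
    echo "a server is answering on port 80"
elif check_server localhost 8080; then
    echo "a server is answering on port 8080"
else
    echo "no answer on port 80 or 8080; check the server and your firewall"
fi
```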

Check your configuration’s -d / dir option to learn where to place your web site’s files. Ordinarily, you’ll want a master file called index.html, which contains your site’s main page, as well as files and directories referred to by index.html. Add your own Web pages to this directory and refresh the view in your Web browser; you should now see your Web site.
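A minimal test page can be created along the following lines. The DOCROOT variable is hypothetical; set it to the directory given by your -d / dir option. If it is unset, the sketch falls back to a scratch directory so that nothing on a live site is overwritten.

```shell
# DOCROOT should match your -d / dir setting; a scratch directory is
# used when it is unset so the example is safe to run as-is.
DOCROOT="${DOCROOT:-$(mktemp -d)}"

# Create a minimal main page for the site.
cat > "$DOCROOT/index.html" <<'EOF'
<html>
<head><title>Test page</title></head>
<body><p>thttpd is serving this page.</p></body>
</html>
EOF

# Make the page readable by everyone, including the user the server runs as.
chmod 644 "$DOCROOT/index.html"
echo "created $DOCROOT/index.html"
```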

Of course, the task of maintaining a web server and the web site it manages is an ongoing one. You should check the thttpd log files on a regular basis, monitor CPU, memory, and network bandwidth use, and be vigilant for signs of intrusions.
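As one example of routine log checking, the sketch below counts responses by HTTP status code. It assumes the log is in the Common Log Format or a close variant, where the status code is the ninth whitespace-separated field, and that you log to a file such as the one in Listing One rather than via syslog.

```shell
# Count responses per HTTP status code in a log file.
# Assumes the status code is field 9, as in the Common Log Format,
# which holds when the requested URL contains no spaces.
count_status() {
    awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' "$1" | sort
}

# Example use, with the log file path from Listing One:
#   count_status /var/log/thttpd.log
```

A sudden jump in 404 or 403 responses can be an early sign of broken links or of someone probing the server.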

Roderick W. Smith is the author or co-author of over a dozen books, including Advanced Linux Networking and Linux Power Tools. He can be reached at rodsmith@rodsbooks.com.
