Throttling Your Web Server

The Web server for www.stonehenge.com is a nicely configured Linux box (of course) located at a nice co-location facility and maintained by my ISP. I share the box with a dozen other e-commerce clients, and that keeps me and everyone else on our toes about overloading the server, because we all have to share. I bought a digital camera some large number of months ago and started putting nearly every picture I took up on the site. I've got a nice mod_perl picture handler to show the thumbnails, provide the navigation, and even generate half-size images on the fly using PerlMagick.

However, as I put more and more pictures online, I started to notice some pretty creepy CPU loads. Worse than that, my ISP neighbors were also starting to complain. After investigation, I determined that I was getting hit by not-so-nice “spiders”: Web programs that, given a few starting points, recursively (and rapidly) fetch the contents of many pages. I believe most of these to be people on fast data connections (like my current cable modem that brings the equivalent of two T-1s into my house for $40 per month, yes!) innocently asking their Web browser to download a whole area.

So, rather than pull my pictures offline, I decided to implement a throttler. I didn’t care as much about transfer bandwidth as I did CPU, so I chose to track recent CPU activity for each visitor. Of course, HTTP has no concept of a “session,” so I took a very easy shortcut: tracking by IP address. Yes, I know I’ve ranted in discussion forums a lot about how an IP address is not a user. But, for the purpose of throttling it seemed the most expedient choice.

With my throttler in place, no IP address is allowed to suck more than seven percent of my CPU over a period of 15 seconds. Once the CPU threshold is reached, any additional request is met with a 503 error (Service Unavailable), which, according to RFC 2616 (the HTTP/1.1 specification), also allows me to give a “Retry-After” value of 15 seconds to advise the program that this is a temporary condition.
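To make those numbers concrete: seven percent of a 15-second window is 1.05 CPU seconds, or 105 in units of hundredths of a second. Here is a minimal sketch of that arithmetic; the sub name cpu_budget_hundredths is invented for illustration and is not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: CPU budget for one window, in hundredths of a second.
# $window seconds at $percent percent is $window * $percent / 100 seconds,
# which is the same as $window * $percent hundredths of a second.
sub cpu_budget_hundredths {
    my ($window, $percent) = @_;
    return $window * $percent;
}

my $budget = cpu_budget_hundredths(15, 7);
printf "%d hundredths (%.2f CPU seconds)\n", $budget, $budget / 100;
```

Working in hundredths keeps everything in integers, so no decimal conversion is needed when the values are stored or compared.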

The throttler consists of two related mod_perl handlers: an “access” handler to note whether an IP address is currently permitted, and a “log” handler to track the CPU used. Also, there’s an external program triggered by cron to clean up the status files needed by the handlers. So, let’s take a look at the handlers in Listing One.

Listing One: The Throttler – Part I

1 package Stonehenge::Throttle;
2 use strict;

4 ## usage: PerlAccessHandler Stonehenge::Throttle

6 my $HISTORYDIR = "/home/merlyn/lib/Apache/Throttle";

8 my $WINDOW = 15; # seconds of interest
9 my $DECLINE_CPU_PERCENT = 7; # CPU percent in window before we 503 error

11 use vars qw($VERSION);
12 $VERSION = (qw$Revision: 2.7 $ )[-1];

14 use Apache::Constants qw(OK DECLINED);
15 use Apache::File;
16 use Apache::Log;

18 sub handler {
19   ## use Stonehenge::Reload; goto &handler if Stonehenge::Reload->reload_me;

21   my $r = shift; # closure var
22   return DECLINED unless $r->is_initial_req;
23   my $log = $r->server->log; # closure var

25   my $host = $r->get_remote_host; # closure var
26   return DECLINED if $host =~ /\.(holdit|
27   return DECLINED if $host =~ /\.metronomicon\.com$/; # poor purl
28   $host = "googlebot.com" if $host =~

30   my $historyfile = "$HISTORYDIR/$host-times"; # closure var
31   my $blockfile = "$HISTORYDIR/$host-blocked"; # closure var
32   my @delta_times = times; # closure var
33   my $fh = Apache::File->new; # closure var

35   $r->push_handlers
36     (PerlLogHandler =>
37      sub {

39        ## record this CPU usage
40        @delta_times = map { $_ - shift @delta_times } times;
41        my $cpu_hundred = 0;
42        $cpu_hundred += $_ for @delta_times;
43        $cpu_hundred = int 100*($cpu_hundred + 0.005);
44        ## $log->notice("throttle: $host got $cpu_hundred/100 in this slot"); # DEBUG
45        open $fh, ">>$historyfile" or return DECLINED;
46        my $time = time;
47        syswrite $fh, pack "LL", $time, $cpu_hundred;
48        close $fh;

50        my $startwindow = $time - $WINDOW;

52        if (my @stat = stat($blockfile)) {
53          if ($stat[9] > $startwindow) {
54            ## $log->notice("throttle: $blockfile is already blocking");
55            return OK; # nothing further to see... move along
56          } else {
57            ## $log->notice("throttle: $blockfile is old, ignoring");
58          }
59        }

61        # figure out if we should be
62        my $totalcpu = 0; # scaled by 100

64        open $fh, $historyfile or return DECLINED;
65        while ((read $fh, my $buf, 8) > 0) {
66          my ($time, $cpu) = unpack "LL", $buf;
67          next if $time < $startwindow;
68          $totalcpu += $cpu;
69        }
70        close $fh;

72        if ($totalcpu < $WINDOW * $DECLINE_CPU_PERCENT) {
73          ## $log->notice("throttle: $host got $totalcpu/100 CPU in $WINDOW secs"); # DEBUG
74          unlink $blockfile;
75          return OK;
76        }

78        ## about to be nasty... let's see how bad it is:
79        open $fh, "/proc/loadavg";
80        chomp(my $loadavg = <$fh>);
81        close $fh;

83        my $useragent = $r->header_in('User-Agent') || "unknown";

85        $log->notice("throttle: $host got $totalcpu/100 CPU in $WINDOW secs, enabling block [loadavg $loadavg, agent $useragent]");
86        open $fh, ">$blockfile";
87        close $fh;

89        return OK;
90      });

92   ## back in the access handler:

94   if (my @stat = stat($blockfile)) {
95     if ($stat[9] > time - $WINDOW) {
96       $log->warn("throttle access: $blockfile is blocking");
97       $r->header_out("Retry-After", $WINDOW);
98       return 503; # Service Unavailable
99     } else {
100       ## $log->notice("throttle access: $blockfile is old, ignoring"); # DEBUG
101       return DECLINED;
102     }
103   }

105   return DECLINED;
106 }
107 1;

Line 1 puts the module into Stonehenge::Throttle. I use Stonehenge as a private prefix for all my local mod_perl goodies in order to keep them separate from any CPAN-installed modules. Because mod_perl shares the namespace across all modules, it’s very important to have a workable naming allocation to keep things from colliding.

Line 2 selects the critically important compiler restrictions. Designing code for mod_perl handlers requires careful attention to details. The use strict restrictions are a good start for this.

Line 4 reminds me that this module needs to be installed as a PerlAccessHandler. I have it selected at the top-level configuration file of my site, but I could have put the access handler inside a Directory, a Files restriction or even a .htaccess file in a subdirectory.

Lines 6 through 9 define some configuration constants. Line 6 is a directory that must be writable by the Web user id (in my case, nobody). This directory will hold the historical information about CPU usage. Line 8 defines the seconds in which we compute CPU history. If we make this too large, the throttle will be slow to react. If we make it too small, there will be a knee-jerk reaction. I’ve tweaked this number up and down from time to time, but the current number is 15 (as shown here). Line 9 defines how much CPU a particular IP address is allowed to consume (in percent) over the period of time given by $WINDOW. I found the seven percent solution to be appropriate.

Lines 11 and 12 define a version string which can be queried using the mod_perl maintenance tools. The string comes from an RCS keyword, so I just check the file out and in and get the right version number automatically.

Lines 14 through 16 pull in some standard constants and modules from the mod_perl interface.

Line 18 begins the handler called on each requested transfer. Line 19 uses my Stonehenge::Reload module to automatically reload this module whenever it changes. Since I’m pretty happy with the stability of this module, I’ve commented the line out. (Stonehenge::Reload hasn’t been published, even though I’ve now referred to it in a few of my other published works. Perhaps someday soon I should talk about it.)

Line 21 fetches the incoming request. This will be an Apache request object, as defined by the mod_perl interface. Line 22 ignores any requests that are not generated by an external query. This keeps internal lookups (such as getting the MIME type for a directory index) from accidentally triggering the throttle. Line 23 grabs a log object for later use.

Lines 25 to 28 get the hostname of the remote host and perform some slight massaging. If the hostname is my ISP, it means I’m performing some request directly, and I sure don’t want to be throttling myself. Also, I decided that all Google fetches should be charged to the same host even though they appear to be coming from different hosts. Yes, I throttle even Google if it gets too sucky on my pages.

Lines 30 through 33 set up a few variables that will be needed for both this handler and the “log” handler that will be set up later. We’ll note the filename of the CPU history file, the flag file indicating the host is currently blocked, and the current CPU usage for both this process and its children.

Lines 35 through 90 “push” a log handler. This technique allows one handler phase to create a handler for another phase “on the fly.” More importantly, it allows me to share the values of some of the variables with the later phase.

Line 40 subtracts the earlier output of the times operator (saved in line 32) from its current output, leaving the CPU consumed by this request. Lines 41 to 43 compute the sum total of CPU used and round it off to the nearest hundredth of a second. Line 44 posts a notice in the error log (which I used for debugging but have commented out now).
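The same bookkeeping can be tried outside mod_perl. What follows is a minimal sketch using only core Perl; the helper name cpu_hundredths is invented for illustration, and it takes two snapshots explicitly rather than sharing a closure variable the way the listing does.

```perl
use strict;
use warnings;

# Hypothetical helper: given two snapshots of times() (user, system,
# child user, child system), return the CPU consumed between them in
# hundredths of a second, rounded the same way the listing rounds.
sub cpu_hundredths {
    my ($before, $after) = @_;
    my $seconds = 0;
    $seconds += $after->[$_] - $before->[$_] for 0 .. 3;
    return int 100 * ($seconds + 0.005);
}

my @before = times;
my $x = 0;
$x += sqrt $_ for 1 .. 200_000;    # burn a little CPU
my @after = times;
print "used ", cpu_hundredths(\@before, \@after),
    " hundredths of a CPU second\n";
```

Adding 0.005 before scaling by 100 and truncating is what turns int into round-to-nearest.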

Lines 45 to 48 add this CPU usage as an eight-byte value to the end of a history file. The first four bytes define the timestamp second at which the observation is being taken. The last four bytes are the CPU seconds in units of hundredths of a second. This format makes it very easy to go back to a value (no decimal conversion), and an append will always be atomic. There’s no need to flock the file!
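Here is a sketch of that record format, runnable on its own against a temporary file. The sub names append_record and read_records are invented for illustration; the pack template "LL" is the one the listing uses.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Append one (timestamp, CPU hundredths) observation as 8 packed bytes.
sub append_record {
    my ($file, $time, $cpu_hundred) = @_;
    open my $fh, '>>', $file or return;
    binmode $fh;
    syswrite $fh, pack('LL', $time, $cpu_hundred);
    close $fh;
    return 1;
}

# Read every observation back as [timestamp, CPU hundredths] pairs.
sub read_records {
    my ($file) = @_;
    open my $fh, '<', $file or return;
    binmode $fh;
    my @records;
    while ((read $fh, my $buf, 8) == 8) {
        push @records, [ unpack 'LL', $buf ];
    }
    close $fh;
    return @records;
}

my (undef, $file) = tempfile(UNLINK => 1);
append_record($file, time, 12);
append_record($file, time, 3);
printf "%d hundredths at %d\n", $_->[1], $_->[0] for read_records($file);
```

Because every record is exactly eight bytes, the reader never has to parse anything: it just takes the file eight bytes at a time.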

The rest of the log handler determines whether future requests should be blocked or not. First, line 50 defines the beginning of the window of interest. If there’s already a current blockfile, lines 52 through 59 note that and exit the log handler. So, we don’t even have to think very hard.

Lines 62 to 70 walk through the history file, grabbing each eight-byte string as a separate entry and converting it back to the timestamp and CPU used. For all the entries that occur within the window, we’ll figure a total CPU.

Lines 72 to 76 determine if the CPU is below the throttling percentage, and if so, remove any blockfile that may be present, thus letting future transactions proceed unthrottled.
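The window sum and the threshold test can be sketched as two small subs. This assumes the budget is the window length times the allowed percentage, which comes out in the same hundredths-of-a-second units as the stored values; the sub names and sample records are invented for illustration.

```perl
use strict;
use warnings;

# Sum the CPU (in hundredths of a second) of records inside the window.
# Each record is [timestamp, cpu_hundredths].
sub total_cpu_in_window {
    my ($records, $now, $window) = @_;
    my $start = $now - $window;
    my $total = 0;
    for my $rec (@$records) {
        next if $rec->[0] < $start;    # too old to count
        $total += $rec->[1];
    }
    return $total;
}

# True when the total has reached $window * $percent hundredths.
sub over_budget {
    my ($total, $window, $percent) = @_;
    return $total >= $window * $percent ? 1 : 0;
}

my @records = ([ 99, 60 ], [ 110, 30 ], [ 114, 20 ]);
my $total = total_cpu_in_window(\@records, 115, 15);
print "total $total, over budget: ", over_budget($total, 15, 7), "\n";
```

In this sample the record stamped 99 falls just outside the window that starts at 100, so only 50 hundredths are counted against a budget of 105.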

However, if we make it to line 78, we’ve got an IP address out there that has exceeded our threshold. Lines 79 to 81 grab the load average for logging purposes only. Line 83 likewise grabs the user agent for the log (I’ve used this to determine if I should categorically deny bad user agents based on name rather than action). Line 86? This line 86s them from the establishment by creating an empty blockfile. The presence or absence of the blockfile is all that matters to the access handler.

So, that’s it for the log handler. Back in the access handler starting in line 94, we look for the blockfile that the log handler manages. If it’s there, and new enough, we’re blocking. Line 97 adds a clue for the client that we do indeed want them to come back (just not right away). Line 98 triggers the 503 error and aborts any further access within this transfer.
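The “there, and new enough” test boils down to comparing the blockfile’s modification time with the start of the window. Here is a minimal sketch outside mod_perl; the sub name is_blocked and the sample filename are invented for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper mirroring the access-phase test: a block is in
# force only if the flag file exists and was modified within the window.
sub is_blocked {
    my ($blockfile, $window, $now) = @_;
    my @stat = stat($blockfile) or return 0;    # no file, no block
    return $stat[9] > $now - $window ? 1 : 0;   # element 9 is the mtime
}

my $dir  = tempdir(CLEANUP => 1);
my $flag = "$dir/192.0.2.1-blocked";
open my $fh, '>', $flag or die "Cannot create $flag: $!";
close $fh;

my $now = time;
utime $now - 5, $now - 5, $flag;      # block set five seconds ago
print "fresh flag: ", is_blocked($flag, 15, $now), "\n";
utime $now - 60, $now - 60, $flag;    # block set a minute ago
print "stale flag: ", is_blocked($flag, 15, $now), "\n";
```

A stale flag file is simply ignored rather than deleted here, which matches the handler leaving cleanup to a separate job.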

That’s the mod_perl side of things. However, we now have these CPU history files being created in $HISTORYDIR, and there’s nothing in either handler to clean them up. I can’t add anything there, because the only time the file should be removed is when there’s nothing happening, but the only time I’m in a handler is when something is happening! So, there’s a little program invoked from cron on a regular basis, using a crontab entry similar to:

3-59/10 * * * * /home/merlyn/lib/Apache/throttle-cleaner

This invokes the program in Listing Two every 10 minutes on minutes that end in 3 (3, 13, 23, etc.). I always try to invoke my cron stuff on unlikely minutes to avoid crowding with all those losers that use multiples of 5 or 15. Bleh!
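To see which minutes a field like 3-59/10 selects, here is a small sketch that expands a START-END/STEP minute field. The sub name expand_minutes is invented for illustration, and it handles only this one form, not full crontab syntax.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a cron minute field written as START-END/STEP.
sub expand_minutes {
    my ($field) = @_;
    my ($start, $end, $step) = $field =~ m{^(\d+)-(\d+)/(\d+)$}
        or die "Unsupported field: $field";
    my @minutes;
    for (my $m = $start; $m <= $end; $m += $step) {
        push @minutes, $m;
    }
    return @minutes;
}

print join(",", expand_minutes("3-59/10")), "\n";    # 3,13,23,33,43,53
```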

Listing Two: Cleaning the HISTORYDIR

1 #!/usr/bin/perl -w
2 use strict;

4 # $Id: throttle-cleaner,v 1.1 1999/10/28 19:44:09 merlyn Exp $

6 my $DIR = "/home/merlyn/lib/Apache/Throttle";
7 my $SECS = 360; # more than Stonehenge::Throttle $WINDOW

9 chdir $DIR or die "Cannot chdir $DIR: $!";
10 opendir DOT, "." or die "Cannot opendir .: $!";
11 my $when = time - $SECS;
12 while (my $name = readdir DOT) {
13   next unless -f $name;
14   next if (stat($name))[8] > $when;
15   ## warn "unlinking $name\n";
16   unlink $name;
17 }

Because this is a standalone program, we’ve got the “sh-bang” line with warnings turned on in line 1. Line 2 is the normal compiler restrictions.

Line 6 defines the same directory as the $Stonehenge::Throttle::HISTORYDIR, so if I change one, I need to change the other. It won’t help to delete files that aren’t in the same place. Line 7 similarly needs to be at least twice as large as the throttling window.

Lines 9 through 17 skip through the directory looking for any file that has not been accessed in at least $SECS. For blocking files, this means that we’ve not seen a transaction since the blocking started (good, they went away permanently). For history files, it means that we’ve not seen a transaction recently. In either case, the information is no longer of use. So we can destroy the file (in line 16).
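The selection rule can be exercised safely in a scratch directory before pointing it at real data. This sketch only reports what it would remove; the sub name stale_files and the sample filenames are invented for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return the plain files in $dir whose last access time is more than
# $secs seconds before $now (element 8 of stat is the atime).
sub stale_files {
    my ($dir, $secs, $now) = @_;
    opendir my $dh, $dir or die "Cannot opendir $dir: $!";
    my @stale;
    while (defined(my $name = readdir $dh)) {
        my $path = "$dir/$name";
        next unless -f $path;
        next if (stat($path))[8] > $now - $secs;
        push @stale, $name;
    }
    closedir $dh;
    @stale = sort @stale;
    return @stale;
}

my $dir = tempdir(CLEANUP => 1);
for my $name ("192.0.2.1-times", "192.0.2.2-times") {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}
my $now = time;
utime $now - 600, $now - 600, "$dir/192.0.2.1-times";    # idle 10 minutes
print "would unlink: $_\n" for stale_files($dir, 360, $now);
```

Only the file whose access time was pushed back ten minutes is reported; the freshly created one is left alone.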

There you have it: a mechanism to keep people from making your ISP-neighbors mad at you. As a testimony to its value, I recently got “slashdotted” by having my picture archive for “YAPC 19100” mentioned on http://www.slashdot.org. My hits per hour went to 20 times their normal pace for about 36 hours after the mention, yet the load average never got above 1 or 2 during the entire ordeal. So, I’ve now survived a slashdot attack.

Another success story comes from one of my clients (a very large on-line toy and games e-tailer). They told me that they had seen an earlier version of my throttler mentioned on the mod_perl mailing list and had put it in place (with some modifications) during the past Christmas buying rush. Amazingly enough, it caught many attempts by people to download their entire online catalog for offline browsing (something that would be useless and prohibitively expensive). Without the throttle, they might have literally lost millions of dollars. They did, in fact, buy me dinner for that. Thank you. I’m interested to hear how this kind of code saved your bacon, so if you adapt it, please let me know.

Randal L. Schwartz is the chief Perl guru at Stonehenge Consulting and co-author of Learning Perl and Programming Perl. He can be reached at merlyn@stonehenge.com.