Benchmarking with Apache Bench

Last month, we looked at some of the issues that affect PHP performance and explored PHP caches and optimizers, two kinds of add-ons that can provide a substantial performance boost to your PHP web applications. Rather than dig into any of those products (they all have sufficient documentation and good support communities), let’s focus on a related issue: performance testing. Or, said another way, once you’ve installed a performance-boosting add-on or made a configuration change, how can you determine if it’s helping or hurting?


Benchmarking and stress testing is a complex topic — especially on large, multi-tier web applications. Completing a valid, comprehensive benchmark of a complex application can take weeks of planning and development and is well beyond the scope of this article. So, let’s cover some basic ideas and introduce some of the tools you’ll need to get started.

When It’s Already Bad

It’s easiest to notice (but quite difficult to properly measure) improvements in performance when your system is already stressed. When your site is slower than you’d like, or when you’re serving fewer requests per second than you’d like, or when visitors complain of timeouts, you know you’ve got a problem. After installing an add-on or making a configuration change, you really shouldn’t have much trouble noticing if things improve. If you’re in doubt, simply go back to the original symptom and ask a few questions: Are you serving more requests per second than before? Are fewer users complaining? Does the CPU have more idle time?

Of course, this all assumes that you were monitoring performance before you began messing around with things. Often you don’t think to measure performance beforehand because you just know that the simple changes you’re making will help. Or perhaps the server had been running fine for a year and you never paid much attention to it until it was too late. That’s usually when Murphy appears to remind you that he’s been watching all along.

By characterizing performance in advance of changes, you’ll not only be able to tell how various tweaks affect server performance, you’ll also have a good idea of what the breaking point is for your configuration. Knowing that, you stand a much better chance of avoiding a bad situation.

Watching Linux

To get a feel for what’s happening on your server, it’s best to spend some time understanding what its normal system load is. In particular, you should watch the three main choke points in any server: CPU, memory, and disk I/O.

You can get a good feel for the state of the system using a number of tools and monitoring software. For the sake of simplicity, we’ll focus on one of the oldest and most common tools: vmstat. Using vmstat, you get an easy-to-read summary of critical performance information on your system, refreshed every N seconds, where you specify N. For example, vmstat 5 refreshes the display every five seconds.

Figure One shows one minute’s worth of vmstat data on a relatively idle (3 requests per second) web server. (In the next section, we’ll see how the light test load was generated.)




Figure One: The output of vmstat

$ vmstat 5
   procs        memory                  swap    io           system cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 1  0  0  34992   9932 237540 112352   0   0     0    45  123    55   0   0 100
 0  0  0  34992   9908 237540 112364   0   0     0    44  322   123   4   1  95
 0  0  0  34992   9904 237540 112368   0   0     0    63  339   119   3   1  96
 0  0  0  34992   9824 237552 112436   0   0     9   100  356   138   4   2  94
 0  0  0  34992   9808 237552 112448   0   0     0    48  330   120   3   1  96
 0  0  0  34992   9800 237552 112452   0   0     0    32  324   116   3   1  96
 0  0  0  34992   9780 237552 112472   0   0     1    31  323   117   4   1  95
 2  0  1  34992   9780 237552 112480   0   0     0    51  318   116   4   2  94
 0  0  0  34992   9772 237552 112488   0   0     0    47  321   121   3   1  96
 0  0  0  34992   9764 237552 112496   0   0     0    58  329   113   2   1  97
 0  0  0  34992   9756 237552 112504   0   0     0    37  327   125   3   0  97
 

Briefly, the output of vmstat is grouped into six sections: procs, memory, swap, io, system, and cpu. Let’s focus on procs, swap, and cpu.

The procs section tells you how many processes are running (r), blocked (b), or swapped, but ready to run (w). As the number of incoming requests increases, you’ll likely see the number of running processes increase. However, if you’re serving static content (rather than using PHP or mod_perl), you might find that requests are handled so efficiently that the numbers don’t really change.

The swap section tells you how many kilobytes of memory per second are swapped in (si) or swapped out (so) from disk during each interval. Ideally, you want to see zeroes across the board. If you begin to see swapping, performance is likely to degrade quickly. One solution is to lower the MaxClients setting in your Apache configuration — remember, each Apache process uses a few megabytes of memory (possibly much more in the case of mod_perl), so watch your memory usage carefully.
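The memory arithmetic behind a MaxClients setting can be sketched roughly as follows. The function name and the sample figures (512 MB of RAM, 128 MB held back for everything else, 8 MB or 25 MB per child) are assumptions chosen for illustration, not measurements from the server in Figure One.

```python
def estimate_max_clients(total_mb, reserved_mb, per_child_mb):
    """Estimate how many Apache children fit in RAM without swapping.

    total_mb:     physical RAM in megabytes
    reserved_mb:  memory held back for the OS, database, and other daemons
    per_child_mb: resident size of one Apache child (check with top or ps)
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    available = total_mb - reserved_mb
    # Round down: a fraction of a child process cannot run.
    return max(available // per_child_mb, 0)


# Hypothetical figures, for illustration only.
print(estimate_max_clients(total_mb=512, reserved_mb=128, per_child_mb=8))   # 48
print(estimate_max_clients(total_mb=512, reserved_mb=128, per_child_mb=25))  # 15
```

Treat the result as a starting point and confirm it by watching si and so under load; shared memory between children means the real limit may be somewhat higher.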

The cpu section breaks down CPU usage into user (us), system (sy), and idle (id) time. Typically, you’ll find that the percentage of user time increases as the server becomes busy. Eventually you’ll hit the point where there is zero idle time left. If you see a significant amount of time being spent on system (or kernel) tasks, you may have other problems.
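If you capture vmstat output to a file, a few lines of Python can summarize the columns discussed above. This is a minimal sketch that assumes the 16-column layout shown in Figure One (other vmstat versions use different columns); the sample rows are copied from that figure, and the function names are made up for illustration. The first data row of real vmstat output reports averages since boot, so it is left out here.

```python
SAMPLE = """\
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0  34992   9908 237540 112364   0   0     0    44  322   123   4   1  95
 0  0  0  34992   9904 237540 112368   0   0     0    63  339   119   3   1  96
 0  0  0  34992   9824 237552 112436   0   0     9   100  356   138   4   2  94
"""


def parse_vmstat(text):
    """Turn vmstat output (column header plus data rows) into a list of dicts."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(int, row))) for row in rows]


def summarize(samples):
    """Return the average idle CPU percentage and whether any swapping occurred."""
    avg_idle = sum(s["id"] for s in samples) / len(samples)
    swapping = any(s["si"] > 0 or s["so"] > 0 for s in samples)
    return avg_idle, swapping


avg_idle, swapping = summarize(parse_vmstat(SAMPLE))
print(f"average idle: {avg_idle:.1f}%  swapping: {swapping}")
```

Running the same summary before and after a configuration change gives you two numbers to compare instead of an impression.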

There’s not enough space here to adequately describe all of vmstat’s output, so spend a few minutes reading its man page.

It’s often useful to verify where CPU time is really being spent by running top in another window. top provides a much clearer, and sometimes surprising picture of which processes are hogging the CPU. It’s not uncommon to find that MySQL is using 85% of your CPU, while Apache is using a mere 5%. That’s a clear sign that your database is the bottleneck and is probably in need of tuning.

Watching Apache

It’s also helpful to know how many requests and kilobytes per second your web server is pumping out. Apache’s server-status handler provides a top-like view of your web server. Make sure that you’ve enabled the server-status handler in Apache using something like this in your httpd.conf file:

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
    Allow from 1.2.3.4
</Location>








Figure Two: The output of Apache’s server-status handler

Then you can visit http://www.yoursite.com/server-status to get a feel for what Apache is doing. As seen in Figure Two, server-status provides uptime and traffic statistics, as well as details about each child process, any requests being served, open process slots, and so on.
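The status handler can also produce a plain-text report meant for scripts if you append ?auto to the URL. The sketch below parses that "Key: value" format and derives requests and kilobytes per second. The sample text is invented for illustration; the exact fields depend on your Apache version (older releases report BusyServers and IdleServers, newer ones BusyWorkers and IdleWorkers), and the traffic totals only appear when ExtendedStatus is on.

```python
# Hypothetical output of /server-status?auto, for illustration only.
SAMPLE = """\
Total Accesses: 1200
Total kBytes: 4800
Uptime: 400
BusyServers: 2
IdleServers: 8
"""


def parse_status(text):
    """Parse 'Key: value' lines into a dict, converting numbers where possible."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # skip anything that is not a key/value line
        try:
            stats[key] = float(value)
        except ValueError:
            stats[key] = value
    return stats


stats = parse_status(SAMPLE)
req_per_sec = stats["Total Accesses"] / stats["Uptime"]
kb_per_sec = stats["Total kBytes"] / stats["Uptime"]
print(f"{req_per_sec:.1f} requests/sec, {kb_per_sec:.1f} kB/sec")
```

To use it against a live server, fetch the report from an address allowed by the Location block (for example with urllib.request.urlopen) and pass the decoded text to parse_status.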

Control the Environment

To perform any meaningful testing, you need to control the environment, including the server and its configuration, as well as the load-generating clients and (if possible) even the network that connects them. Ideally, you have a test server configured to match one of your production machines as closely as possible. During testing, you should ensure that nobody else is using the test server and that the only processes running on it are the same as you’d find running in production.

For example, if your test machine has MySQL and sendmail on it, but your production machines do not, shut them off while you’re testing.

It’s also important that you keep a record of the configuration (hardware and software) along with any changes you make along the way. It may seem like overkill, but after a couple of weeks pass, it’ll be pretty hard to remember what that one little tweak before “test #3” was. By keeping all of your important configuration files in a revision control system such as CVS (preferably on another host), you’ll have a record of what changed when.

To the Bench

One of the easiest ways to characterize performance is to throw the switch and perform some simple benchmarking. Apache itself comes with a command-line tool called ab (Apache Bench) that does a good job of measuring web server performance. Unfortunately, most Apache users have never seen nor heard of it.

At ab’s simplest, you supply a URL to test, the number of concurrent users that you’d like to simulate, and the total number of requests to be made. For example, the command

$ ab -n 200 -c 5 http://www.yoursite.com/index.php

asks Apache Bench to simulate five users accessing your site a total of 200 times. It will run for a bit and then, upon completion, produce some statistics about the test.

First, you’ll see some information about the host and URL you tested:

Server Software:        Apache/1.3.26
Server Hostname:        www.yoursite.com
Server Port:            80

Document Path:          /index.php
Document Length:        4110 bytes

That’s followed by some aggregate statistics that provide a quick idea of how well the test went. See Figure Three.




Figure Three: Apache Bench statistics

Concurrency Level:      5
Time taken for tests:   11.862 seconds
Complete requests:      200
Failed requests:        0
Broken pipe errors:     0
Total transferred:      886600 bytes
HTML transferred:       822000 bytes
Requests per second:    16.86 [#/sec] (mean)
Time per request:       296.55 [ms] (mean)
Time per request:       59.31 [ms] (mean, across all concurrent requests)
Transfer rate:          74.74 [Kbytes/sec] received

Pay particular attention to the number of failed requests. You can easily overwhelm a server and look right past the fact that a significant number of your requests failed, instead focusing on the requests per second, time per request, transfer rate, and so on.
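The derived numbers in Figure Three follow from a handful of raw fields, and recomputing them is a useful sanity check when you script your tests. The sketch below pulls those fields out of saved ab output and checks for failures first. The helper name is an assumption, and the divisor of 1,000 for the transfer rate is inferred from the figure (it reproduces 74.74), not taken from ab's documentation.

```python
import re

# Raw fields copied from Figure Three.
AB_OUTPUT = """\
Concurrency Level:      5
Time taken for tests:   11.862 seconds
Complete requests:      200
Failed requests:        0
Total transferred:      886600 bytes
"""


def field(name, text):
    """Return the first number that follows 'name:' at the start of a line."""
    match = re.search(rf"^{re.escape(name)}:\s+([\d.]+)", text, re.MULTILINE)
    if match is None:
        raise KeyError(name)
    return float(match.group(1))


concurrency = field("Concurrency Level", AB_OUTPUT)
seconds = field("Time taken for tests", AB_OUTPUT)
complete = field("Complete requests", AB_OUTPUT)
failed = field("Failed requests", AB_OUTPUT)
transferred = field("Total transferred", AB_OUTPUT)

if failed > 0:
    print(f"WARNING: {failed:.0f} requests failed; the numbers below are suspect")

requests_per_sec = complete / seconds
ms_per_request = concurrency * seconds / complete * 1000   # as seen by one simulated user
ms_across_all = seconds / complete * 1000                  # averaged over all users
kbytes_per_sec = transferred / seconds / 1000

print(f"{requests_per_sec:.2f} req/sec")          # 16.86
print(f"{ms_per_request:.2f} ms per request")     # 296.55
print(f"{ms_across_all:.2f} ms across all")       # 59.31
print(f"{kbytes_per_sec:.2f} Kbytes/sec")         # 74.74
```

Note that the two "time per request" values differ by exactly the concurrency level: each simulated user waits about 297 ms, while the server as a whole completes a request every 59 ms.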

The final two pieces of output, shown in Figure Four, summarize the data in tabular form. The first provides millisecond timing statistics that should give you an idea of how long connections spent in various states. The second gives you an idea of what the worst case times were like.




Figure Four: Apache Bench performance metrics


Connection Times (ms)
              min  mean [+/-sd] median   max
Connect:       96    97     1.0     97   108
Processing:   198   199     1.0    199   206
Waiting:      197   198     0.9    198   206
Total:        294   296     1.3    296   306


Percentage of the requests served within a certain time (ms)

 50%    296
 66%    296
 75%    296
 80%    297
 90%    298
 95%    298
 98%    300
 99%    302
100%    306 (last request)
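A table like the second one in Figure Four can be built from a list of per-request times. The sketch below uses the nearest-rank method, which is one common definition of a percentile; whether ab uses exactly this method is an assumption. The ten sample times are invented, chosen only to fall in the same range as the figure.

```python
def percentile(sorted_times, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it. Expects the list to be sorted already."""
    if not sorted_times:
        raise ValueError("no samples")
    n = len(sorted_times)
    # Ceiling of pct * n / 100, done in integers to avoid float rounding.
    rank = max((pct * n + 99) // 100, 1)
    return sorted_times[rank - 1]


# Hypothetical total times in milliseconds, for illustration only.
times_ms = sorted([296, 294, 306, 296, 295, 298, 296, 300, 297, 296])

for pct in (50, 90, 100):
    print(f"{pct:3d}%  {percentile(times_ms, pct)}")
```

The gap between the median and the top few percentiles is the part to watch: a server can post a healthy average while a small share of visitors wait far longer.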

Check the command-line help for Apache Bench — it can handle custom headers, HTTP POST (for uploads), cookies, WWW and proxy authentication, HTTP KeepAlives, and more. There’s enough flexibility to perform basic testing of a variety of web applications. The downside is that you need to test one URL at a time.
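When you repeat tests often, it helps to assemble the ab command line in a script so every run uses exactly the same options. The sketch below is one way to do that. The function name, the sample header, and the sample cookie are hypothetical; the flags themselves (-n requests, -c concurrency, -k KeepAlive, -H header, -C cookie, -p POST file, -T content type) come from ab's command-line help, which is worth checking on your own version.

```python
import shlex


def build_ab_command(url, requests, concurrency, keepalive=False,
                     headers=(), cookies=(), post_file=None,
                     content_type="application/x-www-form-urlencoded"):
    """Return an ab command as an argument list suitable for subprocess.run."""
    cmd = ["ab", "-n", str(requests), "-c", str(concurrency)]
    if keepalive:
        cmd.append("-k")                 # reuse connections (HTTP KeepAlive)
    for header in headers:
        cmd += ["-H", header]            # extra request header, "Name: value"
    for cookie in cookies:
        cmd += ["-C", cookie]            # cookie as "name=value"
    if post_file is not None:
        cmd += ["-p", post_file, "-T", content_type]  # POST the file's contents
    cmd.append(url)                      # the URL always comes last
    return cmd


cmd = build_ab_command(
    "http://www.yoursite.com/index.php", requests=200, concurrency=5,
    keepalive=True, headers=["Accept-Language: en"], cookies=["session=abc123"],
)
print(shlex.join(cmd))  # shell-quoted form, handy for logging what was run
```

Logging the exact command next to each set of results fits the record-keeping advice above: you can always tell which options produced which numbers.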

What’s Missing?

As noted earlier, this is a relatively simplistic view of web server and application benchmarking. In a more comprehensive and detailed test, you’d likely want a tool that can:


  • Test a variety of URLs and randomize access to them.

  • Impersonate multiple users using pre-built or dynamically created cookies.

  • Spoof various user-agents.

  • Be distributed among several clients to generate more load than a single machine can.
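To make the first three items concrete, here is a rough sketch of how a test plan with weighted random URLs, per-user cookies, and varied user-agents might be generated. It only builds the plan and sends nothing. Every path, weight, cookie name, and user-agent string is hypothetical, and real tools handle this in their own ways.

```python
import random

# Hypothetical paths with relative weights: the home page is hit most often.
URLS = {"/index.php": 6, "/search.php": 3, "/checkout.php": 1}

# Hypothetical user-agent strings to rotate through.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
]


def build_plan(n_requests, n_users, seed=None):
    """Return a list of request descriptions (path, cookie, user_agent)."""
    rng = random.Random(seed)  # a fixed seed makes the plan repeatable
    # Each simulated user keeps one cookie and one user-agent for the whole run.
    users = [
        {"cookie": f"session=user{i}", "user_agent": rng.choice(USER_AGENTS)}
        for i in range(n_users)
    ]
    paths = rng.choices(list(URLS), weights=list(URLS.values()), k=n_requests)
    return [{"path": path, **rng.choice(users)} for path in paths]


plan = build_plan(n_requests=10, n_users=3, seed=42)
for request in plan[:3]:
    print(request)
```

Seeding the generator matters for benchmarking: two runs with the same seed request the same pages in the same order, so differences in the results come from the server and not from the plan.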

There are many commercial and open source web server testing solutions available. They’re all considerably more complex than Apache Bench, but they can provide you with insights that go far beyond what we’ve looked at here.



Jeremy Zawodny uses Open Source tools at Yahoo! by day and is writing a MySQL book for O’Reilly & Associates by night. When not coding or writing, you can find Jeremy piloting gliders. Reach him at Jeremy@Zawodny.com.
