The Health of Your Site, Part 1

If you run an "always on" e-commerce site (perhaps using some of the high-availability tricks described in this issue), you must ensure that search forms really operate and that the pages pointed to have reasonable content. Validation is vital for dynamic web sites, especially those that generate an "everything's OK" 200 status when the content of the page contains a Java traceback from a database connection. To truly have high availability, you have to watch the associated programs and databases -- not just that the links on your pages all go somewhere reasonable.

The trick is to regularly run a Perl program that connects to your web server and performs requests as if it were a visitor’s browser. While validation can be performed rather directly with the LWP package (or even with low-level socket programming), I save time and energy by using the rapidly evolving WWW::Mechanize package found in the CPAN. With WWW::Mechanize, I get a virtual user agent that steers like a browser.

The next step is to figure out if the right responses are being returned from the web server. For that, I prefer the Test::More module that’s installed with modern Perl versions (and can be fetched from the CPAN for older Perl versions).

The Test::More module includes a number of tests that ultimately display a series of ok and not ok messages on stdout. These messages are normally interpreted by Test::Harness to give an overall “thumbs up” or “thumbs down” to a test (as when you are installing a module or Perl itself), but the individual messages and program flow are also directly useful.

The WWW::Mechanize Age

For example, let’s pretend we’re in charge of search.cpan.org, and are responsible for its health. What should be tested at regular intervals to ensure that the site and its features are running satisfactorily? A quick first test would be to make sure that the top-level page can be fetched. Let’s do that with WWW::Mechanize:


use WWW::Mechanize;
my $a = WWW::Mechanize->new;
$a->get("http://search.cpan.org/");

The virtual browser is now “looking at” the top level page. But is it really? Let’s check with some Test::More routines:


use Test::More 'no_plan';
ok($a->success, "fetched /");

The ok() routine evaluates the boolean returned by the success method. If the value is true, we get the output:


ok 1 - fetched /
1..1

The first line says that the first test passed, and includes our comment for clear identification. The second line says that our tests were numbered from 1 to 1. (The exact format of the lines is dictated by Test::Harness; see that module’s man page for specifics.) If the fetch had been unsuccessful, we’d get something like:


not ok 1 - fetched /
# Failed test (./healthcheck at line 6)
1..1
# Looks like you failed 1 tests of 1.

The hash-marked lines (beginning with #) are Test::Harness comments. Only the not ok and 1..1 lines are significant to the harness.

But this didn’t tell us why we failed. If we want to know how the result differs, we can use is() rather than ok(). For example, since we expect the status to be 200, we can check for that directly:


is($a->status, 200, "fetched /");

Now when the page fetch fails (perhaps due to a 404 error), we get a more detailed message:


not ok 1 - fetched /
# Failed test (./healthcheck at line 6)
# got: '404'
# expected: '200'
1..1
# Looks like you failed 1 tests of 1.

Of course, a 404 error on the root page is probably a clue that nothing else is going to work either.

We should probably make sure that we ended up with a WWW::Mechanize object on that new() call as well. That’s easy with the isa_ok() routine provided by Test::More:


isa_ok(my $a = WWW::Mechanize->new,
    "WWW::Mechanize");

And now we get:


ok 1 - The object isa WWW::Mechanize
ok 2 - fetched /
1..2

Since we now have two tests, the final display shows that our tests are numbered 1 through 2.

The default timeout for the user-agent used by LWP is 180 seconds. If part of being “healthy” is that our website responds much faster than that, we can verify the response time by changing the timeout on our virtual browser, with $a->timeout(10). We might also set our user-agent string to something more recognizable for the access logs, or maybe to ensure that our tests aren’t included in the official statistics:


$a->agent("search.cpan.org-healthcheck/0.01");
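
Both settings can also be handed to the constructor. The following is only a sketch: the 10-second timeout is the example value mentioned above, and the autocheck => 0 setting is my own addition rather than part of the original script. In newer releases of WWW::Mechanize, autocheck is turned on by default, which makes get() die on a failed request instead of returning, so the ok() and is() checks would never get the chance to report the failure.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Configure the virtual browser in one place.
my $a = WWW::Mechanize->new(
    agent     => 'search.cpan.org-healthcheck/0.01',  # shows up in access logs
    timeout   => 10,   # seconds; stricter than LWP's default of 180
    autocheck => 0,    # assumption: let our own tests report failures
);

# Read the settings back to confirm they took effect.
printf "agent=%s timeout=%d\n", $a->agent, $a->timeout;
```

With autocheck off, a failed fetch leaves $a->success false and $a->status set to the error code, which is what the tests in this column rely on.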

If we get a good page fetch, we probably want to make sure it has the right content, and isn’t some other error page sent with a 200 status. A quick check might be to verify the title of the page with Test::More’s like() routine:


like($a->title, qr/The CPAN Search Site/,
    "/ title matches");

The first argument is the target string. The second argument is typically specified using a regular expression literal object, although you can use a text string that starts and ends with a slash for compatibility with older versions of Perl that don’t have qr//. If the target string matches, the test was successful. If it fails, both the target string and the regular expression are displayed, along with a failure for the test.
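
To see the two spellings of the second argument side by side, here is a small sketch that runs without a network connection. The $title value is a made-up stand-in for whatever $a->title would return from the live site.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical page title, standing in for $a->title.
my $title = 'The CPAN Search Site - search.cpan.org';

# Preferred form: a compiled regular expression object.
like($title, qr/The CPAN Search Site/, 'title matches (qr// form)');

# Compatibility form: a plain string that starts and ends with a slash.
like($title, '/The CPAN Search Site/', 'title matches (string form)');
```

Both calls run the same match; the qr// form simply lets Perl check the pattern at compile time.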

Obviously, this test will fail if the title is changed, so the site and corresponding tests must be revised at the same time. If your web site is managed in a change control system, you should update, validate, and deploy this health check in the same manner as any other component of your web site.

Are We Reaching?

Let’s see if the links on the front page are working correctly. We can do that with follow_link(). We’ll look for the link that says FAQ and see if it gets us to the FAQ:


ok($a->follow_link(text => 'FAQ'),
    "follow FAQ link");

The follow_link() method finds a link that has FAQ as the entire text. We could also find a link based on the URL, or a regular expression match of either the text or the URL. If multiple links match a particular requirement, we can also pick links based on their ordinal position.
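
Those other ways of choosing a link can be sketched as follows. To keep the example runnable without touching a real server, it writes a hypothetical three-page site into a temporary directory and browses it through file: URLs; the page names and contents are invented for illustration, and autocheck => 0 is my assumption so that a missing link returns false rather than dying.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use URI::file;
use WWW::Mechanize;
use Test::More tests => 4;

# Hypothetical home page with two links, plus the pages they point to.
my $dir  = tempdir(CLEANUP => 1);
my %page = (
    'index.html' => '<html><head><title>Home</title></head><body>'
                  . '<a href="faq.html">FAQ</a> '
                  . '<a href="feedback.html">Feedback</a></body></html>',
    'faq.html' => '<html><head><title>FAQ</title></head>'
                . '<body>Frequently Asked Questions</body></html>',
    'feedback.html' => '<html><head><title>Feedback</title></head>'
                     . '<body>Feedback</body></html>',
);
for my $name (keys %page) {
    open my $fh, '>', "$dir/$name" or die "cannot write $name: $!";
    print {$fh} $page{$name};
    close $fh or die "cannot close $name: $!";
}

my $a = WWW::Mechanize->new(autocheck => 0);
$a->get(URI::file->new_abs("$dir/index.html")->as_string);

# Match part of the link text, ignoring case.
ok($a->follow_link(text_regex => qr/faq/i), 'follow by text_regex');
$a->back;

# Match the link's URL instead of its text.
ok($a->follow_link(url_regex => qr/feedback/), 'follow by url_regex');
$a->back;

# Pick the second link on the page by ordinal position.
ok($a->follow_link(n => 2), 'follow second link');
like($a->uri->as_string, qr/feedback\.html$/, 'second link is feedback');
```

The n parameter can also be combined with the other criteria, for example to take the second of several links that share the same text.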

If the link isn’t found, we get false, which fails the test. But if the link is found, we still need to find out if the page could be fetched:


is($a->status, 200, "fetched FAQ page");

And yet, this still might be a 200 “error” page instead, so we should ensure that the content is as expected. This time we’ll use like() against the page content:


like($a->content, qr/Frequently Asked Questions/,
    "FAQ content matches");

Once we’re satisfied that the link works, we want to go back to the beginning page for some other tests. While we could simply get() the page again, let’s just push the virtual “Back” button. $a->back does the trick.

So far, our output looks like:


ok 1 - The object isa WWW::Mechanize
ok 2 - fetched /
ok 3 - / title matches
ok 4 - follow FAQ link
ok 5 - fetched FAQ page
ok 6 - FAQ content matches
1..6

Not bad. The web site is up and at least two pages have reasonable HTML. What if the FAQ link can’t be found? We’ll end up with an erroneous error and an erroneous success:


ok 1 - The object isa WWW::Mechanize
ok 2 - fetched /
ok 3 - / title matches
not ok 4 - follow FAQ link
# Failed test (./healthcheck at line 10)
ok 5 - fetched FAQ page
not ok 6 - FAQ content matches
# Failed test (./healthcheck at line 12)
# <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML
… lots of text here …
# </html>
# doesn't match '(?-xism:Frequently Asked Questions)'
1..6
# Looks like you failed 2 tests of 6.

Test 4 is correctly reporting that we couldn’t find the FAQ link — but test 5 succeeds! The problem is that we’re testing the successful fetch of the previous page, so it’s a false positive. And test 6 is really irrelevant, because we’re checking the home page for the FAQ content, which doesn’t make sense, yielding a false negative.

What we need to do is skip tests 5 and 6 if test 4 fails, and also skip test 6 if test 5 fails. We can do this with Test::More’s SKIP mechanism:


SKIP: {
    ok($a->follow_link(text => 'FAQ'), "follow FAQ link")
        or skip "missing FAQ link", 2;
    SKIP: {
        is($a->status, 200, "fetched FAQ page")
            or skip "bad FAQ fetch", 1;
        like($a->content, qr/Frequently Asked Questions/,
            "FAQ content matches");
        $a->back;
    }
}

The skip mechanism uses a block labeled with SKIP to delimit the tests to be skipped. Since the ok() function returns a boolean success, we can note a failed test, and execute skip to skip the remaining tests and exit the SKIP block. The first parameter to skip is the reason for skipping, while the second parameter is the number of tests to skip. We need to ensure the accuracy of that number because we don’t want later tests to be renumbered if we skip some of these tests.
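
The effect of the count on numbering is easier to see in isolation. This sketch needs no network: the $link_found flag is a hypothetical stand-in for the result of follow_link(), and the ok(1, ...) calls are placeholders for the real checks.

```perl
use strict;
use warnings;
use Test::More tests => 4;

my $link_found = 0;    # pretend follow_link() returned false

ok(1, 'front page checks done');            # test 1

SKIP: {
    # Leave the block, recording two skipped tests in place of the
    # two real ones below.
    skip 'missing FAQ link', 2 unless $link_found;

    ok(1, 'fetched FAQ page');              # test 2 when it runs
    ok(1, 'FAQ content matches');           # test 3 when it runs
}

ok(1, 'search form checks start here');     # test 4 either way
```

Because exactly two tests are recorded as skipped, the last test is number 4 whether or not the block runs. If the count were wrong, every later test would shift and the total would no longer match the declared plan.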

If the FAQ link can’t be found, we get output that looks like:


not ok 4 - follow FAQ link
# Failed test (./healthcheck at line 10)
ok 5 # skip missing FAQ link
ok 6 # skip missing FAQ link

While the skipped tests appear to be “ok,” they’ve been annotated with a comment. This comment is recognized by Test::Harness so that it can say “2 tests skipped.”

If the inner is() fails, we will again skip, but only the one content test. If we maintain this code to add more tests, we’ll need to update all of the skip numbers properly. Note that the back button is pressed only when we’ve gone forward as well.

Now let’s try filling out a form, by searching for a particular author. Let’s start by selecting the first (and only) form on the page:


ok($a->form_number(1), "select query form");

Next, we’ll look for Andy Lester’s CPAN handle. (Andy is the current maintainer of both Test::Harness and WWW::Mechanize.) To do this, we need to know the form’s field names, which we can get with a “view source” on the web page:


$a->set_fields(query => "PETDANCE", mode => 'author');

When that’s done, we can submit the form using $a->submit. Now, we should be looking at Andy’s detailed CPAN page. First, let’s make sure it fetched OK:


is($a->status, 200, "query returned good for 'author'");

We can then see if Andy’s name is mentioned somewhere on the page. This verifies that the CGI response is working, the search engine is working, and it’s returning sensible data:


like($a->content, qr/Andy Lester/, "found Andy Lester");

And of course, we’ll want to skip back when we’re done, ready for another test from the home page. Just use $a->back again. But if we can’t find the form, or we can’t fetch the page, we’re executing too many tests and too much other code again, so we’ll want to wrap this stuff up inside some nested skips as well:


SKIP: {
    ok($a->form_number(1), "select query form")
        or skip "cannot select query form", 2;
    $a->set_fields(query => "PETDANCE", mode => 'author');
    $a->submit();
    SKIP: {
        is($a->status, 200, "query returned good for 'author'")
            or skip "missing author page", 1;
        like($a->content, qr/Andy Lester/, "found Andy Lester");
        $a->back;
    }
}

Once again, we skip over any tests that give us false positives or false negatives.

So, in under three dozen lines of code, I now know that search.cpan.org is up and running, able to execute searches for authors, returning reasonable data, and generating reasonable pages with links to the FAQ. Next month, I’ll explore this subject further, including how to notify someone only when something is breaking. Until then, enjoy!



Randal L. Schwartz is the chief Perl guru at Stonehenge Consulting and can be reached at merlyn@stonehenge.com.
