Simple Online Quiz Technique — Part II

In last month's column, I described a program that rips through screenit.com's database of movie reviews and extracts the "profanity" paragraphs, which detail how nearly 1,000 recent movies have used words that some might find offensive. This month, I'll look at a quiz engine that picks a movie from the database at random, presents the profanity paragraph, and requests a multiple-choice response to test your knowledge of which movie that paragraph is describing.

This quiz engine differs from many I’ve seen in that it’s nearly cheat-proof. To be a good quiz engine, it’s got to randomize the questions and order of answers while providing no clues to the correct answer in the HTML. Further, we shouldn’t be able to back up and select a different answer after we’ve selected a wrong one or alter any hidden variables to redefine our score.

One way to prevent cheating is to keep the quiz state on the server, associated with a session ID, and send only that session ID down to the browser, either as a hidden field or as part of any URL used in a response. This prevents tampering with the score and with the state of answering questions. These same techniques apply to good “shopping cart” design as well, so this isn’t as insignificant an application as it might appear. The session ID should be random, using a relatively strong cryptographic selection process to prevent “hijacking” some other user’s session.

Although I’ve written the quiz program as a CGI application, the same techniques would also apply to an Apache mod_perl handler. The program is presented in Listing One.

Lines 1 through 3 start nearly every CGI program I write: they enable taint checking, warnings, and compiler restrictions, and disable buffering on STDOUT.

Lines 6 and 7 define the only configuration constants I needed. The $DATA_DB corresponds to a dbmopen’able path, where I placed the database created by last month’s program. $COUNT defines how many movie choices will be selected for each question. A value of 1 would be a lame quiz (just select the only choice) and 5 or more would be formidable. I felt that 3 was a nice, challenging number.

Lines 10 and 11 bring in the CGI.pm module and its companion CGI::Carp, both included in the standard Perl distribution. Note that I’ve selected fatalsToBrowser here, which is a security hole if left in production programs — so don’t do this. I’m just experimenting and wanted the errors to show up in my browser rather than having to hunt them down in the server logs.

Lines 15 to 25 handle the server-side database. I’m using the very fine File::Cache module (which can be found in the CPAN) that permits time-limited and size-bounded data to be stored (typically somewhere within /tmp/) on the server, keyed by an item of my choice. In this case, I’ve decided to have the session information expire in an hour, so as long as someone is answering a new question every hour, the data stays alive.

Lines 22 to 25 handle an occasional “purge” of the database. Every four hours, some lucky dog comes along when the special purge key (of my choosing) has expired. This triggers a cache purging that takes a few extra seconds; a new purge key entry is then inserted to prevent this from happening for another four hours. If you wanted to, you could tell File::Cache to do this on every access, but this seems excessive.
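The purge trick is easier to see in isolation. Here's a minimal sketch, assuming a tiny in-memory cache where every call is handed the current time explicitly. TinyCache and maybe_purge are hypothetical names of mine, not part of File::Cache:

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for an expiring cache (not the File::Cache API).
# Each method takes the current time as its first argument.
package TinyCache;

sub new {
    my ($class, $default_ttl) = @_;
    return bless { ttl => $default_ttl, items => {} }, $class;
}

sub set {
    my ($self, $now, $key, $value, $ttl) = @_;
    $ttl = $self->{ttl} unless defined $ttl;
    $self->{items}{$key} = { value => $value, expires => $now + $ttl };
    return;
}

sub get {
    my ($self, $now, $key) = @_;
    my $item = $self->{items}{$key};
    return unless $item and $item->{expires} > $now;
    return $item->{value};
}

# Remove every expired entry; returns how many were removed.
sub purge {
    my ($self, $now) = @_;
    my @dead = grep { $self->{items}{$_}{expires} <= $now }
               keys %{ $self->{items} };
    delete @{ $self->{items} }{@dead};
    return scalar @dead;
}

package main;

# Purge only when the sentinel key has itself expired, then re-arm it
# for another four hours. Returns 1 if a purge ran, 0 otherwise.
sub maybe_purge {
    my ($cache, $now) = @_;
    return 0 if $cache->get($now, '_purge_');
    $cache->purge($now);
    $cache->set($now, '_purge_', 1, 4 * 3600);
    return 1;
}
```

The sentinel is just another cache entry with a longer lifetime than the session data, so the cost of the cleanup lands on one request every four hours rather than on every hit.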

The File::Cache module is found in the CPAN (at places like http://www.cpan.org), but it is being phased out for the more general Cache::Cache module, which has entered alpha testing. I presume File::Cache will remain around for a while, but you might look at Cache::Cache instead if you are writing your own code.

Line 29 opens the database. Yes, I still use dbmopen (mostly for simplicity) even though it claims to be deprecated.

Lines 31 to 37 define the “session” data being saved and loaded for every hit of a given quiz taker. @this_keys is the list of titles presented for this question. @unused_keys are the remaining movies to choose from (preventing duplicates). $winner is an index into @this_keys defining the winner. $answered and $correct maintain the score.

Line 39 prints the CGI header, the HTML header, and a nice h1 to start off the page.

Lines 42 to 52 pull up any existing session information. The session tag is a 32-character hex string (we’ll see later how this is generated), which is included as part of the GET request query parameters. If it’s valid and recent enough, the session variables get loaded from the cache. If anything fails here, $session remains undefined so that we start a new session below (presuming a new quiz taker).
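As a rough sketch of that validate-then-load step, assuming a plain hash stands in for the cache (load_session and the sample tag are illustrative, not taken from the listing):

```perl
use strict;
use warnings;

# Return the saved state for a session tag, or nothing if the tag is
# malformed or unknown. $store is a hash ref standing in for the cache.
sub load_session {
    my ($store, $tag) = @_;
    return unless defined $tag;
    return unless $tag =~ /\A[0-9a-f]{32}\z/;   # exactly 32 lowercase hex digits
    return $store->{$tag};
}

my %store = (
    '0123456789abcdef0123456789abcdef' => { answered => 2, correct => 1 },
);
my $state = load_session(\%store, '0123456789abcdef0123456789abcdef');
print "answered so far: $state->{answered}\n" if $state;
```

Checking the shape of the tag before using it as a lookup key also keeps odd input, such as path fragments, away from whatever storage sits behind the cache.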

If we are currently in a valid session, the code in lines 55 to 80 handles the incoming previous quiz guess. The value of $answer is expected to be the answer number followed by a dash and the selected answer. Most of the time, this answer number should be the same as our session data of $answered; if so, lines 61 to 73 determine if it’s the proper answer and display the appropriate text. The text includes a reference to screenit.com’s Web page for further verification.

The movie name is obtained via the database, looking from the first line of the value up to the newline. However, if the incoming answer number is not the same as our session data’s answer number, it means the quiz taker has either answered the question already or is otherwise trying to cheat. In this case, the message in line 75 reminds them to stay on track.
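The scoring rules can be sketched on their own, away from the CGI plumbing. This is a simplified take under my own assumptions: the state lives in a hash ref, and score_answer and its return strings are hypothetical names, not something from the listing:

```perl
use strict;
use warnings;

# $state is a hash ref with keys answered, correct, winner, and
# this_keys (an array ref). $answer is the "question-choice" string
# from the link. Returns 'correct', 'wrong', 'replay', or 'invalid'.
sub score_answer {
    my ($state, $answer) = @_;

    my ($question, $choice) = $answer =~ /\A(\d+)-(\d+)\z/
        or return 'invalid';
    return 'invalid' if $choice > $#{ $state->{this_keys} };

    # An old question number means a repeated or backed-up answer.
    return 'replay' if $question != $state->{answered};

    $state->{answered}++;
    my $right = $choice == $state->{winner};
    $state->{correct}++ if $right;
    $state->{this_keys} = [];    # forces a new question to be picked
    return $right ? 'correct' : 'wrong';
}
```

Because the question number in the link must match the server's own count, hitting the back button and clicking a different choice changes nothing in the score.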

If the session number is not valid, it’s time to start a new session, handled by lines 82 to 86. We’re using the MD5 module from the CPAN and computing a session ID in the same way that the Apache::Session module (which can also be found in the CPAN) computes it. I’m not sure how secure this is, but if it’s good enough for Apache::Session, it’s good enough for me to steal for this program.
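For a feel of what such a generator looks like, here's a minimal sketch. It uses md5_hex from Digest::MD5 rather than the older MD5 module the listing loads, and the particular mix of inputs is my assumption for illustration, not a copy of what the listing or Apache::Session does:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build a 32-character hex tag from a few hard-to-predict values.
# Assumption: this mix of inputs is illustrative only.
sub new_session_id {
    my $fresh = {};                         # its stringified address varies
    my $seed  = join '', time(), "$fresh", rand(), $$;
    return md5_hex(md5_hex($seed));         # hashed twice
}

my $id = new_session_id();
print "new session tag: $id\n";
```

Keep in mind that rand() is not a cryptographic random source, so for anything more valuable than a quiz score a stronger source of randomness would be prudent.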

Lines 89 to 92 select a winner, if there’s no active question, by drawing from the list of keys that have gone unused so far from the database.
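The draw-without-replacement idea can be sketched as a small function. The name pick_question and the early exit when the list runs dry are my additions, not behavior from the listing:

```perl
use strict;
use warnings;

# Move up to $count random entries from @$unused into a fresh choice
# list, then pick one of those choices as the winner.
# Returns (\@choices, $winner_index), or an empty list if nothing is left.
sub pick_question {
    my ($unused, $count) = @_;
    my @choices;
    for (1 .. $count) {
        last unless @$unused;                       # ran out of movies
        my $at = int rand @$unused;                 # random position
        push @choices, splice(@$unused, $at, 1);    # remove so it can't repeat
    }
    return unless @choices;
    my $winner = int rand @choices;
    return (\@choices, $winner);
}

my @unused = ('a' .. 'j');
my ($choices, $winner) = pick_question(\@unused, 3);
print "choices: @$choices, winner index: $winner\n";
```

Since splice removes each pick from the unused list, no movie appears twice within a session, either as a right answer or as a decoy.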

Line 94 displays the header for the question part of the Web page.

Lines 96 through 105 save the current session data into the cache by creating an anonymous hash and then storing it using the default expiration (one hour, specified earlier).

Line 107 saves us a bit of time by caching the values for the list of keys for the current question choices out of the database.

Lines 109 to 111 give credit to screenit.com (in the form of an outbound link) and define the context of the question.

Lines 114 to 115 present the profanity paragraph from the database (via the local cache) in a bordered table with a single cell. The value consists of the title, a newline, and the paragraph of profanity data (with embedded newlines) ripped from the screenit.com pages. We don’t want to give away the title, so we print everything after the first newline.
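Splitting a stored value into its title and its paragraph can be sketched like this. The sample record is made up, and split_record is a hypothetical helper rather than anything in the listing, which uses regular expression matches instead:

```perl
use strict;
use warnings;

# A stored value is "title\nparagraph...", where the paragraph may
# itself contain newlines. Split at the first newline only.
sub split_record {
    my ($value) = @_;
    my ($title, $paragraph) = split /\n/, $value, 2;
    return ($title, $paragraph);
}

# Made-up sample record, for illustration only.
my $record = "Some Movie (1999)\nFirst line of the paragraph.\nSecond line.";
my ($title, $paragraph) = split_record($record);
print "kept hidden: $title\n";
print "shown to the quiz taker:\n$paragraph\n";
```

The limit of 2 on split is what keeps the embedded newlines in the paragraph intact.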

Lines 118 to 122 display the answer choices as an ordered list. Each list item is a link back to this same CGI program (as captured in line 118) with the session information and the answer identifier included as query parameters. The link text is the title of the movie, given as the first line of the cached data value.
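To make it plain what travels in each link, here's a small sketch that builds the link targets without CGI.pm. The function name, the sample URL, and the sample titles are assumptions for illustration:

```perl
use strict;
use warnings;

# Build one [href, link text] pair per title. Each href carries only the
# session tag, the current question number, and the choice index.
sub choice_links {
    my ($url, $session, $answered, @titles) = @_;
    return map {
        [ "$url?session=$session&answer=$answered-$_", $titles[$_] ]
    } 0 .. $#titles;
}

# Sample values, made up for the demonstration.
my @links = choice_links('/cgi/quiz', 'f' x 32, 4, 'Alpha', 'Beta', 'Gamma');
print "$_->[1]: $_->[0]\n" for @links;
```

Nothing in these links says which choice is right; that stays in the server-side session data.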

Lines 124 and 125 are commented out; while testing, I was too lazy to think of the right answer, so I made the program tell me.

Lines 127 to 136 append some boilerplate disclaimers to the bottom of the quiz question. Line 137 closes off the HTML, and we’re done.

To recap, when the program is first invoked, there’s no session data, so we first create a session ID (line 83); then pull up a list of all movies in the database (line 84); then generate the first list of candidates (line 90); and finally select the winner (line 91). All of this is saved into server-side storage (line 104).

The user is then presented with the profanity paragraph for that winner (line 114) and a list of choices (line 119) from which to choose. These links lead back to another invocation of the same program, which then scores the choice (line 65) and repeats the process, picking a new set of choices and updating the server-side storage.

It’s easy once you’ve seen it, but I wrangled with this for a few hours, thinking through all the failure and cheater paths. I think I’ve whacked out something that does the job effectively.

Of course, this program can be extended in multiple ways. For example, a scoreboard could be maintained. There’s also the little mess of what happens when all the movies are used up from one pass. Should the quiz be over? Also, I’m using up movies from the list as alternate choices, but maybe the incorrect choices should just be drawn at random from the master list and not from the unused list. There are lots of possibilities; it’s all a matter of programming, as they say.

Hope you’ve had fun working with this quiz generator. Until next time, enjoy!

Listing One: Quiz Engine — Part I

1 #!/usr/bin/perl -Tw
2 use strict;
3 $|++;

5 ## config
6 my $DATA_DB = "/home/merlyn/Web/profanity_quiz";
7 my $COUNT = 3;
8 ## end config
10 use CGI qw(:all);
11 use CGI::Carp qw(fatalsToBrowser);
13 ## set up the cache
15 use File::Cache;
16 my $cache = File::Cache->new({namespace => 'profanityquiz',
17   username => 'nobody',
18   filemode => 0666,
19   expires_in => 3600, # one hour
20 });
22 unless ($cache->get("_purge_")) { # cleanup?
23   $cache->purge;
24   $cache->set("_purge_", 1, 3600 * 4); # purge every four hours
25 }
27 ## connect to the database
29 dbmopen my %DATA, $DATA_DB, 0666 or die "Cannot open data: $!";
31 ## session info
32 my @unused_keys;
33 my @this_keys;
34 my $winner;
35 my $answered;
36 my $correct;
37 ## end session info
39 print header, start_html("Guess the profanity"), h1("Guess the profanity");
41 ## first, pull up existing session data:
42 my $session = param('session');
43 if (defined $session and $session =~ /^[0-9a-f]{32}$/
44     and my $data = $cache->get($session)) {
45   @unused_keys = @{$data->{unused_keys}};
46   @this_keys = @{$data->{this_keys}};
47   $winner = $data->{winner};
48   $answered = $data->{answered};
49   $correct = $data->{correct};
50 } else {
51   undef $session; # no good, so ignore
52 }
54 ## now handle form response if within a valid session:
55 if ($session) {
56   if (defined(my $answer = param('answer'))) {
57     if (my ($guess_answered, $guess_guessed) = $answer =~ /(\d+)-(\d+)/) {
58       if (0 <= $guess_guessed and $guess_guessed <= $#this_keys) {
59         print h2("Scoring");
60         if ($guess_answered == $answered) {
61           $answered += 1;
62           print "You guessed ",
63             a({-href => "http://$this_keys[$guess_guessed]"},
64               $DATA{$this_keys[$guess_guessed]} =~ /(.*)/), ", ";
65           if ($guess_guessed == $winner) {
66             $correct += 1;
67             print "which is correct!";
68           } else {
69             print "which is wrong. The correct answer is ",
70               a({-href => "http://$this_keys[$winner]"},
71                 $DATA{$this_keys[$winner]} =~ /(.*)/), ".";
72           }
73           @this_keys = ();
74         } else {
75           print "You've already answered this! Stop trying to cheat!";
76         }
77         print p("Your total score so far is $correct out of $answered.");
78       }
79     }
80   }
81 } else { # start a new session:
82   require MD5;
83   param('session', $session = MD5->hexhash
84   @unused_keys = keys %DATA;
85   $answered = $correct = 0;
86   @this_keys = ();
87 }
89 unless (@this_keys) { # pick a new question
90   push @this_keys, splice @unused_keys, rand @unused_keys, 1 for 1..$COUNT;
91   $winner = int rand @this_keys;
92 }
94 print h2("Show us how smart you are...");
96 ## save session data for next hit:
97 {
98   my $data = {};
99   @{$data->{unused_keys}} = @unused_keys;
100   @{$data->{this_keys}} = @this_keys;
101   $data->{winner} = $winner;
102   $data->{answered} = $answered;
103   $data->{correct} = $correct;
104   $cache->set($session, $data);
105 }
107 my @this_values = @DATA{@this_keys}; # cache %DATA we need
109 print
110   "Which one of these movies had this profanity information at ",
111   a({-href => 'http://www.screenit.com/'}, "screenit.com"), "?";
113 ## pull up the profanity paragraph, boxed for easy reading:
114 print table({-border => 1, -cellspacing => 0, -cellpadding => 5},
115   Tr(td($this_values[$winner] =~ /\n(.*)/s)));
117 ## show the choices, with links back to us including session tag:
118 my $url = url;
119 print ol(map {
120   my ($title) = $this_values[$_] =~ /(.*)/;
121   li(a({-href => "$url?session=$session&answer=$answered-$_"}, $title));
122 } 0..$#this_values);
124 ## (for debugging, because I was lazy... :-)
125 ## print "\n(Hint: the answer is ", $this_values[$winner] =~ /(.*)/, ")\n";
127 print h2("Disclaimer");
129 print "All decisions of our judges are final. ",
130   "Even if two movies have the same answer. ";
132 print "And ",
133   a({-href => 'http://www.screenit.com/'}, "screenit.com"),
134   " had nothing to do with this program. It's all ",
135   a({-href => "/merlyn/"}, "my"), " fault. ";
137 print end_html;

Randal L. Schwartz is the chief Perl guru at Stonehenge Consulting and co-author of Learning Perl and Programming Perl. He can be reached at merlyn@stonehenge.com. Code listings for this column can be found at: http://www.stonehenge.com/merlyn/LinuxMag/.
