Web application security is often an afterthought. You start with the best of intentions, building a quick prototype which allows your users to get a feel for how the application might work. But the next thing you know, they’re using it regularly and you’ve invested quite a bit of time and effort in the former prototype.
Not long after that, you realize you need to decide how to limit access to all or part of the application and do it without a major development effort. Luckily, PHP makes it easy to put an initial layer of security on your application. In this month’s column, we’ll look at what’s involved in simple user authentication so that you can keep out those users who shouldn’t be allowed in.
Before diving into PHP, it’s worth noting that you don’t need to add any code to your application. Apache knows how to validate usernames and passwords against a simple password file. In much the same way as we could enable PHP on a per-directory basis (last month), by adding a few lines to a .htaccess file, you can set up basic security for your application. In fact, you can even lump users into groups and allow (or disallow) them by their group names.
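As a rough illustration of the Apache-only route, a .htaccess file along the following lines would protect a directory. The file paths and the group name staff are placeholders invented for this sketch, and the password and group files would need to be created separately (for example, with Apache’s htpasswd utility).

```apache
# Hypothetical .htaccess; adjust the paths and group name to your setup
AuthType Basic
AuthName "My Site"
AuthUserFile /path/to/passwords
AuthGroupFile /path/to/groups
# Allow only members of the (made-up) group "staff"
Require group staff
```

Replacing the last line with Require valid-user would admit anyone listed in the password file, regardless of group.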
The most basic form of Web-based user authentication is known as Basic Authentication. This is an amazingly simple and very old protocol (well, in Web years, anyway) which doesn’t rely on any of the fancy encryption, cookies, checksums, or much of anything else that currently exists.
When your browser requests a secure document, the server looks to see if the request contains a username and password to authenticate you. If not, it responds with an HTTP 401 response code. Your browser then displays a dialog box asking for a username and password.
Under the hood, it works like this. You ask your browser to fetch a Web page such as http://jeremy.zawodny.com/lamp/page.php. In doing so, it connects to jeremy.zawodny.com on TCP port 80 and issues a request that looks something like the following:
GET /lamp/page.php HTTP/1.1
But since your browser doesn’t inherently know that the Web page is password protected, it doesn’t send along any authentication information. So the server simply sends back a request for authentication with headers like the following:
HTTP/1.1 401 Authorization Required
Date: Mon, 05 Nov 2001 02:11:11 GMT
Server: Apache/1.3.19 (Unix) PHP/4.0.4pl1 mod_perl/1.25
WWW-Authenticate: Basic realm="My Site"
The two interesting items in the response are the 401 response code and the WWW-Authenticate header. As you might guess, the 401 code tells your browser it is not allowed to fetch the URL that it requested. The WWW-Authenticate header tells the browser it should use the Basic method for authentication in the realm named My Site. The realm is a string that your browser should display in or near the title of the username/password dialog box it will display.
After you type in a username and password and tell your browser to try again (by clicking the OK button), it will submit the same request along with an Authorization header that contains an encoded (not encrypted) combination of the username and password you typed. If the username you typed was “Aladdin” and the password was “open sesame,” the header would look like:
Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
The string that comes after the word “Basic” is simply the string “Aladdin:open sesame” (the username and password separated by a colon, with no space after it) that has been encoded in Base-64. This is the same encoding method used by MIME (Multipurpose Internet Mail Extensions) to send binary data, such as images, in a more friendly fashion. The server decodes the Base-64 string and checks the result against the username and password required by the controlling .htaccess file. If the check is successful, the server sends the page to the browser.
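To make the encoding step concrete, here is a minimal sketch of both directions in PHP. The function names basic_auth_header() and parse_basic_auth() are invented for this example, and the type declarations assume PHP 7.1 or later; only base64_encode(), base64_decode(), and the other built-ins are standard PHP.

```php
<?php
// Build the value of an Authorization header the way a browser would.
function basic_auth_header(string $user, string $password): string
{
    return 'Basic ' . base64_encode($user . ':' . $password);
}

// Reverse the process, as the server does. Returns [user, password],
// or null if the value is not a well-formed Basic credential.
function parse_basic_auth(string $headerValue): ?array
{
    if (strncasecmp($headerValue, 'Basic ', 6) !== 0) {
        return null;
    }
    $decoded = base64_decode(substr($headerValue, 6), true);
    if ($decoded === false || strpos($decoded, ':') === false) {
        return null;
    }
    // Split on the first colon only: the password may contain colons.
    return explode(':', $decoded, 2);
}

// Prints: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
echo basic_auth_header('Aladdin', 'open sesame'), "\n";
```

Because decoding needs no key or secret, anyone who can see the request can recover the password, which is why the header counts as encoded rather than encrypted.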
Basic Authentication really is basic. It’s not a very secure protocol for exchanging such information, because it sends the credentials using a well-known encoding that anyone can reverse. However, it is easy to understand and implement, it’s supported by all browsers, and it doesn’t require cookies or other advanced tricks.
Coding It Up…
Having seen what’s involved behind the scenes, you’re probably wondering how to translate this into PHP code. The ideal way of implementing an authentication system involves making as few changes as possible to existing code, so we’ll do just that.
Listing One shows a very simple PHP page. It uses PHP’s require() function to import some logic from a separate include file, userauth.inc. The only other unusual thing about the page is the $PHP_AUTH_USER variable that is accessed when generating the greeting. Clearly, all the magic is happening in that include file.
Listing One: page.php
1 <? require("userauth.inc") ?>
3 <head><title>Auth Test Page</title></head>
6 <h1>Welcome, <? echo $PHP_AUTH_USER ?></h1>
8 <p>You've been authenticated.</p>
Listing Two reveals the magic of userauth.inc. Looking at it in detail, we see that starting at line 5, it sets up two variables that contain information that users could see — the realm and the error message.
Listing Two: userauth.inc
3 # setup a few handy variables…
5 $realm = "My Site";
6 $err_msg = "
8 <head><title>Invalid Username/Password Entered</title></head>
10 <h1>Invalid Username/Password Entered</h1>
12 <p>You must enter your username and password. If you do not have a valid
13 username and password, you probably shouldn't be here.</p>
19 # function to handle rejecting users
21 function auth_reject()
23 global $err_msg, $realm;
24 Header("HTTP/1.0 401 Unauthorized");
25 Header("WWW-Authenticate: Basic realm=\"$realm\"");
26 echo $err_msg;
30 # check to see if username is set
32 if (!isset($PHP_AUTH_USER))
37 # check to see if password is set
39 if (!isset($PHP_AUTH_PW))
44 # check username
46 if ($PHP_AUTH_USER != 'admin')
51 # check password
53 if ($PHP_AUTH_PW != 'FooBar2!')
58 # If we make it this far, the user is good to go…
Then at line 21, the auth_reject() function is defined. It’s called any time a user fails to authenticate properly. By using PHP’s Header() function to add HTTP headers to the response, it produces the 401 response code as well as the WWW-Authenticate header with the name of the realm. Finally, it calls exit() to end the script.
The rest of the code performs a series of checks on the variables $PHP_AUTH_USER and $PHP_AUTH_PW. PHP automatically populates them with the decoded username and password (if any) submitted along with the request. So, our script simply checks first to see that each of the variables is set (has a value at all) and calls auth_reject() if either check fails.
From there, it checks the actual values of both variables. As you can see, the only way to be properly authenticated by the logic in userauth.inc is to use the username “admin” and the password “FooBar2!”, because they are hard-coded in the script.
If everything checks out, the code in userauth.inc finishes, and PHP continues on with page.php to finish handling the request. Since the user has been authenticated, the $PHP_AUTH_USER variable is available to page.php to use for greeting the user.
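The listings rely on PHP creating $PHP_AUTH_USER and $PHP_AUTH_PW as global variables, which depends on the register_globals setting. That setting was removed in PHP 5.4, and the same values are available as $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW']. The following is one possible sketch of the same checks for a current PHP version (7.1 or later, because of the type declarations). The helper is_authenticated() is a name made up for this example, and this auth_reject() takes its realm and message as arguments rather than using globals.

```php
<?php
// Hypothetical helper: true only when both credentials are present
// in the request and match the expected pair.
function is_authenticated(array $server, string $user, string $password): bool
{
    if (!isset($server['PHP_AUTH_USER'], $server['PHP_AUTH_PW'])) {
        return false;
    }
    return $server['PHP_AUTH_USER'] === $user
        && $server['PHP_AUTH_PW'] === $password;
}

// Same job as auth_reject() in Listing Two: send the challenge and stop.
function auth_reject(string $realm, string $errMsg): void
{
    header('HTTP/1.0 401 Unauthorized');
    header('WWW-Authenticate: Basic realm="' . $realm . '"');
    echo $errMsg;
    exit;
}

// In userauth.inc, this would run on every request:
// if (!is_authenticated($_SERVER, 'admin', 'FooBar2!')) {
//     auth_reject('My Site', $err_msg);
// }
```

Passing $_SERVER in as an argument keeps the check free of side effects, so it can be exercised with a plain array when testing.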
But there are a few problems with this approach. The most important is that it really isn’t all that secure. All someone needs to do to gain access to your application is read the userauth.inc file. If there are other people who have accounts on the server, make sure they cannot read this file!
However, even that’s not enough. What’s to stop someone from requesting http://jeremy.zawodny.com/lamp/userauth.inc in their own browser? Absolutely nothing. Because of this, it’s best to keep all “include files” outside of Apache’s document root so nobody will ever be able to access them via HTTP. This means changing the require() call to use a fully qualified path rather than just a filename.
If you really like having all the files for your application in one place, there is another option. Add a rule like this:
<Files ~ "\.inc$">
Order allow,deny
Deny from all
</Files>
to Apache’s httpd.conf file. It will prevent Apache from serving any files ending in .inc, no matter where they might be. It will then be safe to leave userauth.inc right where it is.
Even if you do a good job of guarding the file, there’s always the chance that someone will be able to nab a copy of it somehow. And since the username and password are sitting in clear text, they’ll have a pretty easy time figuring them out.
A better approach is to use a hashing algorithm such as MD5. Rather than store the username and password in the file, you store the hashed version of each. PHP has a built-in md5() function which accepts a string and returns the MD5 hash of the string. So rather than using the plaintext admin, you’d use 21232f297a57a5a743894a0e4a801fc3, which is the MD5 hash of “admin.” Specifically, change Line 46 to read:
if (md5($PHP_AUTH_USER) != '21232f297a57a5a743894a0e4a801fc3')
You’d make a similar change in Line 53 to protect the password. Anyone who found a copy of userauth.inc would then have to write a script that iterates over millions of different strings, hoping that one of them hashes to the correct value. While that’s not an impossible task to accomplish, it is likely to deter all but the most determined folks.
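Pulling the two hashed checks together might look something like the sketch below. The function name hashes_match() is invented for illustration, and hash_equals() (available since PHP 5.6) is used here in place of the != comparison so that the comparison takes constant time; that substitution is a choice made for this example, not part of the listing.

```php
<?php
// Hypothetical helper: compare submitted credentials against stored MD5
// hashes instead of plaintext values.
function hashes_match(string $user, string $password, string $userHash, string $pwHash): bool
{
    // hash_equals() compares two strings in constant time.
    return hash_equals($userHash, md5($user))
        && hash_equals($pwHash, md5($password));
}

// The stored hashes are computed once, away from the script, e.g.:
//   php -r 'echo md5("admin"), "\n";'
// which prints 21232f297a57a5a743894a0e4a801fc3
```

On current PHP versions, password_hash() and password_verify() are the purpose-built functions for storing and checking passwords; the sketch sticks with md5() only to match the approach described above.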
Room to Grow
The code also has a poor assumption in it. It believes that there is only one valid username and password combination. In reality, it’s rarely acceptable for users to share the same authentication credentials. And if you need to support more than a dozen or so users, hand-coding each username and password (or their MD5 hashes) into the script will quickly become tedious and hard to maintain. Your life will be much easier if you enlist the help of some external storage: a flat text file, dbm file, MySQL database, etc.
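As one possible shape for the flat-text-file option, the sketch below reads lines of the form username:md5hash into an array and checks a submitted password against it. The file format, the function names, and the file path in the usage comment are all assumptions made for this example.

```php
<?php
// Parse "username:md5hash" lines into an array keyed by username.
// The line format is an assumption made for this sketch.
function parse_user_lines(array $lines): array
{
    $users = [];
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue; // skip blank lines and comments
        }
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $users[$parts[0]] = $parts[1];
        }
    }
    return $users;
}

// True if the user exists and the password hashes to the stored value.
function user_is_valid(array $users, string $user, string $password): bool
{
    return isset($users[$user]) && hash_equals($users[$user], md5($password));
}

// Usage, with a hypothetical file kept outside the document root:
// $users = parse_user_lines(file('/path/to/users.txt'));
// if (!user_is_valid($users, $user, $password)) { /* reject the request */ }
```

Adding or removing a user then means editing one line of the text file rather than touching the script.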
Taking things a step further, you could bypass the need for users to have Yet Another Password and authenticate them against something they already use, such as their e-mail retrieval credentials. PHP can easily talk to POP and IMAP servers, so that wouldn’t be too hard to rig up. Similarly, if you happen to have an LDAP-enabled database handy, you could use PHP’s LDAP connectivity. (See this month’s Guru Guidance column for more on LDAP.)
Clearly authentication is a big topic. There are a variety of things to consider if you are going to roll your own authentication system — even if it’s a really simple one. The good news is that you don’t have to build your own system from scratch. There are freely available PHP libraries that include rather sophisticated authentication, access control, and session management routines. In next month’s column, we’ll take a look at how to use one of them.
Jeremy Zawodny uses open source tools in Yahoo! Finance by day and is writing a MySQL book by night. You can reach him at jeremy@zawodny.com.