
Scripting the Web with PHP

PHP is one of the most popular open source scripting languages for the Web. It's fast, effective, and very easy to learn. See for yourself.

There is nothing more frustrating than trying to solve a problem with a set of ill-fitting tools. In fact, sometimes the best solution is just to design a better tool. This was the situation back in 1994 when PHP was born. Originally, PHP was little more than a library of C code built to ease the tedium of writing hundreds of CGI programs. A simple parser would scan through an HTML file looking for specific tags and replace those tags with the output of a function called in the C library.

Initially, the parser simply called the C function of the same name as the parsed tag, but it soon started to grow and evolve. For instance, there were many situations where it was necessary to execute a certain tag only if the previous tag executed without error. This meant having to store the exit status of the previous tag somewhere. That “somewhere” didn’t exist, so variables were added to the language. But that wasn’t enough. In order to make the decision, PHP needed basic conditional logic (if…else statements). It wasn’t long before the combination of variables and conditionals grew into while loops and switch/case statements, and eventually this supposedly simple tag parser grew into a full macro language.

As the language syntax grew, so did the backend functionality that the macro language gave people access to. In terms of data storage, it initially only had support for simple dbm files, but mSQL and Sybase support was soon added. As more and more people downloaded PHP, they started adding functionality to it, and patches came rolling in. In true open source fashion, everyone added support according to his own needs.

When it was first created, PHP stood for “Personal Home Page” Tools. However, PHP has grown well beyond the personal home page realm, and the original meaning no longer applies. The name “PHP” now stands on its own; if pressed, we say that it stands for “PHP: Hypertext Preprocessor.”

Having been around in various forms since 1994, PHP is not a new language, and it certainly isn’t very original. It borrows ideas and concepts from many existing languages. What is unique about PHP is that it seems to appeal to a large number of Web developers. According to the Netcraft surveys (http://www.netcraft.com), nearly 20 percent of all domains on the Internet use PHP. The main reason for PHP’s rapid acceptance and growth is that it is extremely easy to learn. It also helps that thousands of ISPs around the world make PHP available to their customers. PHP runs on all major operating systems today, including all varieties of Unix and Windows.

So while its widespread availability and cross-platform nature have been key to PHP’s success, perhaps its greatest strength is that any Web developer can pick it up and begin building things with it in no time at all. After reading this article, anyone with basic exposure to HTML and a programming language such as C, Perl, C++, Python, or Java should be able to write advanced dynamic Web apps in PHP within a couple of hours.

Installing PHP

While PHP is normally installed and used as an Apache module, it is not tied to Apache. PHP can also be used with a variety of Web servers, including AOLServer, Roxen, and thttpd. However, because Apache is by far the most popular, we’ll look at the setup process for it.

All modern Linux distributions include a PHP package that installs a DSO (Dynamic Shared Object — a fancy name for an Apache module) called libphp4.so. The DSO must be enabled in Apache’s httpd.conf file using the LoadModule directive:


LoadModule php4_module lib/apache/libphp4.so

Additionally, you must tell Apache that files ending with .php should be processed by PHP rather than served as-is:


AddType application/x-httpd-php .php

If you selected PHP when you installed your Linux distribution, odds are that these steps have already been performed for you.

There are other ways to run PHP; it can be built as a static Apache module. To do this, you will need both the Apache and PHP source tarballs. Simply follow the instructions in PHP’s INSTALL file. PHP can also be installed as a simple command-line parser like Perl, Python, and other scripting languages. That is, you can grab the PHP source tarball and run the following:


./configure && make install

You will end up with a /usr/local/bin/php binary. You can use the resulting binary to test your php scripts from the command line:


php script.php

Or, you can even write hash-bangpath scripts that have nothing to do with the Web:


#!/usr/local/bin/php -q
<? echo "Hello World\n"; ?>

If you build PHP from source, you’ll probably want to provide some additional configure flags to activate some of PHP’s many modules (the real glue). Have a look at


./configure --help

to see what is available and you’ll find an overwhelming number of configure options. (See the How Much Glue? sidebar for more on this topic.)




How Much Glue?

PHP probably holds the world record for the greatest number of autoconf flags in a single software package. At last count, it had 188 different configure switches. Freeping Creaturism? Not really — PHP is glue. It glues many different third-party libraries to the Web server, and it lets you control how many are glued in. Most PHP extensions can be built as standalone shared libraries that are loaded at runtime. Just add =shared after an extension switch, such as:


./configure --with-ftp=shared

You will end up with a modules/ftp.so file that can be dynamically loaded using dl() directly from any PHP script. Or, you can add extension=ftp.so to php.ini, and the functions in that extension will be available to every script.
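Because extensions can be compiled in, loaded from php.ini, or left out entirely, a script may want to check before calling an extension's functions. The following is only a sketch: have_extension() is a hypothetical helper name, and 'ftp' is used simply because it is the extension from the example above.

```php
<?php
// Hypothetical helper: reports whether the named extension is usable,
// either because it is compiled in or because it was loaded at startup.
function have_extension($name) {
    return extension_loaded($name);
}

// 'ftp' is only an example name; substitute the extension you built.
if (have_extension('ftp')) {
    echo "ftp functions are available\n";
} else {
    echo "ftp is not loaded; consider adding extension=ftp.so to php.ini\n";
}
```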

PHP’s Tags

As you may have noticed in the bangpath “Hello World” example script, a PHP tag starts with <? and ends with ?>. This is known as the short tag style. There are actually four different tag styles you can use (see Table One).




Table One: PHP Tags






Tag Example                            Description
<? … ?>                                Short tag style
<?php … ?>                             Long tag style, which avoids XML Processing Instruction (PI) tag conflicts
<% … %>                                ASP style (must be enabled in your php.ini file)
<script language="php"> … </script>    JavaScript style

Many developers tend to use the short tag style, but if you are writing PHP scripts you plan on distributing, it is a good idea to use the long style <?php … ?> because it cannot be disabled in php.ini. Since it is possible to turn off the short tag style, scripts using it will (obviously) not work on a server where it has been disabled.

So a PHP script is normally a mixture of HTML and PHP tags. Listing One shows a basic PHP script.




Listing One: A Basic PHP Script


<html>
<head>
<title>A PHP Example</title>
</head>
<body>
<h1>A PHP Example</h1>
You are using
<? echo $HTTP_USER_AGENT ?>
to view this page.
</body>
</html>








Figure One: PHP Example

As you can see, it looks like normal HTML until the point where something dynamic is needed. The $HTTP_USER_AGENT variable is a special variable set by Apache that contains the browser identification string (user agent) sent by the remote browser. Figure One shows what this looks like in a browser.

To see all of PHP’s special variables and other information about your PHP installation, create a file which contains the phpinfo() function call and load it up in your browser. You can change <? echo $HTTP_USER_AGENT ?> in Listing One to <? phpinfo() ?>.

Variables

One thing that makes PHP so easy to learn is the way it handles variables. HTML form variables are automatically turned into PHP variables. When this HTML form is submitted:


<form action="script.php" method="POST">
<input type="text" name="var" value="Hello World">
<input type="submit">
</form>

script.php is run by Apache and PHP. Inside script.php the variable $var is automatically set to “Hello World” (or whatever you happen to type in the input box). Variables that come from posted forms are known as POST variables.

The same applies for variables set in the URL (known as GET variables). For example: http://localhost/test.php?var=Hello+World.

In addition to GET and POST, PHP’s automatic variables can come from the environment, cookies, and the Web server itself. Together, all these sources are known as the EGPCS (EnvGetPostCookieServer) variables. This also happens to be the default order in which PHP will import the variables. They are imported from the environment first, GET-method variables second, POST third, etc. If a variable exists in both the GET and the POST data, the POST data variable would take effect since it was set last.

This order can be changed by setting the variables_order directive in the php.ini file, and you can also omit some sources. For example:

variables_order="EPCS"

would stop any GET-method variables from being imported into PHP variables. You can turn off all automatic variables by setting variables_order to an empty string, or you can set register_globals to Off.
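The effect of the import order can be modeled in a few lines. This is a rough sketch of the precedence rule only, not PHP's internal implementation; import_in_order() and the sample data are hypothetical names invented for the illustration.

```php
<?php
// Hypothetical model of variable import; this is not PHP's internal code.
// $order is a variables_order string such as "EGPCS"; $sources maps each
// letter to an array of name => value pairs from that source.
function import_in_order($order, $sources) {
    $vars = array();
    for ($i = 0; $i < strlen($order); $i++) {
        $letter = $order[$i];
        if (isset($sources[$letter])) {
            // Later sources overwrite earlier ones with the same name.
            $vars = array_merge($vars, $sources[$letter]);
        }
    }
    return $vars;
}

// Made-up request data: "var" arrives by both GET and POST.
$sources = array(
    'G' => array('var' => 'from GET', 'page' => '2'),
    'P' => array('var' => 'from POST'),
);

$default = import_in_order('EGPCS', $sources); // POST wins for "var"
$no_get  = import_in_order('EPCS', $sources);  // GET data is never imported
```

With the default order, $default['var'] holds the POST value because POST is imported after GET. With "EPCS", the GET-only variable "page" never appears at all.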

Even if you tell PHP not to automatically import these variables, you can still access them through several arrays: $HTTP_ENV_VARS, $HTTP_GET_VARS, $HTTP_POST_VARS, $HTTP_COOKIE_VARS, and $HTTP_SERVER_VARS. These are associative arrays (hashes, in Perl lingo) with the variable name as the index (or key) and the variable value as the value.

So, to get the current path environment variable, you’d simply write:


<? echo "The path is ", $HTTP_ENV_VARS["PATH"] ?>

String Manipulation

PHP provides you with a variety of features for working with strings. There are two flavors of regular expressions (POSIX 1003.2 and Perl-compatible) as well as more basic functions like strlen(), is_numeric(), and so on.

Suppose that you needed to write a function that would check if the IP address entered in a form field is valid. It is possible to do this in a relatively complex regular expression, but it’s also quite easy (and possibly more readable) using PHP’s other string-related functions. Listing Two shows one solution to the problem.




Listing Two: IP Address Validators



function check_ip($ip) {
    $a = explode('.', $ip);
    if(count($a) != 4) return false;
    foreach($a as $val) {
        if(!is_numeric($val)) return false;
        if((int)$val > 255 || (int)$val < 0) return false;
        if(strspn($val, '0123456789') != strlen($val)) return false;
    }
    return true;
}

The check_ip() function will first explode the string into an array of parts based on the ‘.’ delimiter. (PHP’s explode() is similar to split() in Perl.) It will then check to make sure that there are four parts. Next, it will loop through each value to make sure it is numeric, between 0 and 255 inclusive, and that it contains only the characters 0-9. The last check is necessary because it’s possible to have non-digits in a number. For example, is_numeric(2e2) returns true because 2e2 is an exponential notation for 200. But 2e2 is not valid in an IP address.
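For comparison, here is roughly what the regular-expression approach mentioned earlier might look like, using the Perl-compatible flavor. It is a sketch rather than a drop-in replacement: check_ip_regex() is a hypothetical name, and unlike check_ip() this pattern also rejects octets written with leading zeros, such as 01.

```php
<?php
// Alternative sketch using a Perl-compatible regular expression.
// Each octet must be 0-255 written without leading zeros.
function check_ip_regex($ip) {
    $octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])';
    // The D modifier stops $ from matching before a trailing newline.
    $pattern = '/^' . $octet . '(?:\.' . $octet . '){3}$/D';
    return preg_match($pattern, $ip) === 1;
}

var_dump(check_ip_regex('192.168.0.1'));
var_dump(check_ip_regex('1.2e2.3.4'));
```

Whether this is more readable than the explode() version is a matter of taste; the octet pattern has to spell out the 0-255 range digit by digit, which the string-function version gets from a simple integer comparison.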

Database Access

One of the first things people tend to connect PHP to is a relational database. PHP has native support for all major databases, and even if your favorite database is not supported directly by PHP, there is a pretty good chance that you can use an ODBC driver along with PHP’s ODBC support and still be able to talk to it.

As a simple example, let’s walk through the steps that are necessary to create a database-driven guestbook using MySQL. (If you need help getting or configuring MySQL, check out our March 2001 article “Hey! Leggo MySQL!: Installing, Configuring, and Using MySQL,” which can be found online at http://www.linux-mag.com/2001-03/mysql_01.html.)

1. Create the Database


mysql> CREATE DATABASE mydb;

2. Create the Comments Table


CREATE TABLE comments (
    id int(8) DEFAULT '0' NOT NULL auto_increment,
    comment text,
    ts datetime,
    PRIMARY KEY (id)
);

3. Write the HTML and PHP Code

Save the code in Listing Three as guestbook.php and point your browser at it. After leaving a comment, you will see something resembling Figure Two.




Listing Three: A PHP Guestbook



<html>
<head>
<title>My Guestbook</title>
</head>
<body>
<h1>Welcome to my Guestbook</h1>
<h2>Please write me a little note below</h2>

<form action="<? echo $PHP_SELF ?>" method="POST">
<textarea cols=40 rows=5 name="note" wrap=virtual></textarea>
<input type="submit" value=" Send it ">
<br>
</form>
<?
mysql_connect("localhost", "dbuser", "dbpass");
// Make sure your Web server user id has
// access to the mydb database
if(isset($note)) {
    $ts = date("Y-m-d H:i:s");
    mysql_db_query("mydb", "insert into comments values (0, '$note', '$ts')");
}
?>

<h2>The entries so far:</h2>
<?
$result = mysql_db_query("mydb", "select * from comments");
while($row = mysql_fetch_row($result)) {
    echo $row[0] . " " . $row[1] . " " . $row[2] . "<br>\n";
}
?>

</body>
</html>








Figure Two: The PHP Guestbook

As you can see, PHP makes writing the guestbook application very easy.
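One refinement worth considering: Listing Three echoes each stored note back exactly as it was typed, so a note containing HTML tags would be interpreted by the browser. A minimal sketch of one way to guard against that follows. The format_entry() helper and the sample row are hypothetical; the row layout (id, comment, ts) is assumed to match what mysql_fetch_row() returns for the comments table.

```php
<?php
// Hypothetical helper: formats one row from the comments table for display.
// $row is assumed to be array(id, comment, ts).
function format_entry($row) {
    // htmlspecialchars() turns <, >, & and " into HTML entities so a
    // visitor's note is shown as text instead of being parsed as markup.
    return $row[0] . " " . htmlspecialchars($row[1]) . " " . $row[2] . "<br>\n";
}

// Made-up sample row for illustration.
echo format_entry(array(1, 'I <b>love</b> PHP & MySQL', '2001-04-01 12:00:00'));
```

In the guestbook, the echo line inside the while loop could call such a helper instead of concatenating the raw columns.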

More Than Just HTML

Many people don’t realize that PHP can be used to generate content other than HTML. Using either the GD or the Imlib2 extensions, PHP can be used to generate images on the fly. For example, to generate a simple blue rectangle with the word PHP in it, you could use the script which appears in Listing Four.




Listing Four: Image Generation with image.php


<?
Header("Content-type: image/png");
$im = ImageCreate(200,80);
$blue = ImageColorAllocate($im,0x5B,0x69,0xA6);
$white = ImageColorAllocate($im,255,255,255);
$black = ImageColorAllocate($im,0,0,0);
ImageTTFText($im, 60, 0, 10, 55, $black, "arial.ttf", $text);
ImageTTFText($im, 60, 0, 6, 52, $white, "arial.ttf", $text);
ImagePNG($im);
?>

In the HTML page where you would like the image to appear, simply reference the script like this:


<img src="image.php?text=PHP">

Note that we are using a TrueType font here. You can grab any TTF font from a Windows box or download them from a number of sources on the Net.

In addition to images, you can also use PHP to generate content in Adobe Acrobat (PDF), Shockwave Flash (SWF), and other formats. See the PHP manual for complete details.

Session Management

Because of HTTP’s stateless nature, session management has become one of the classic problems that every Web developer must solve in any reasonably complex application. The stereotypical example is a shopping cart application, where you want to keep track of a user’s shopping cart contents as they click through your site.

Luckily, PHP has built-in session support, so you’ll never have to deal with many of the tricky implementation issues. It works by creating a session ID whenever the session_start() function is called. By default, this session ID will be sent as a cookie named PHPSESSID. It is then possible to add data to the session simply by calling session_register('variable_name') for each variable that you wish to add to the session.

So, if on the initial page you do this:


<?
session_start();
session_register('my_var');
$my_var = 'Hello World';
?>

On subsequent pages you simply need this:


<?
session_start();
echo $my_var;
?>

A number of session characteristics, including using URL mangling instead of cookies, can be set in the php.ini configuration file. The session.auto_start option lets you write session scripts that don’t even need to call session_start() at the beginning. This can be quite handy if all your pages are intended to be used with sessions.

Error Handling

Nobody’s perfect. Odds are that you’ll want to know when your PHP code breaks. PHP uses an internal bitmask that defines what kinds of errors and warnings will be generated by PHP automatically. Table Two lists PHP’s error numbers, types, and descriptions.




Table Two: PHP Error Types

Number  Type               Description
1       E_ERROR            Fatal run-time errors
2       E_WARNING          Run-time warnings
4       E_PARSE            Compile-time parse errors
8       E_NOTICE           Run-time notices
16      E_CORE_ERROR       Fatal startup errors
32      E_CORE_WARNING     Startup warnings
64      E_COMPILE_ERROR    Fatal compile-time errors
128     E_COMPILE_WARNING  Compile-time warnings
256     E_USER_ERROR       User-generated errors
512     E_USER_WARNING     User-generated warnings
1024    E_USER_NOTICE      User-generated notices
2047    E_ALL              All of the above

The default error reporting setting, which can be changed in your php.ini file, is E_ALL&~E_NOTICE, which means everything except notices. (PHP uses bitwise operators like those found in C.)
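The bitmask arithmetic can be worked through by hand. The sketch below copies the numeric values from Table Two into constants with a hypothetical MY_ prefix, so that the result does not depend on the values PHP's own E_* constants happen to have in whatever version you run.

```php
<?php
// Values copied from Table Two; the MY_ prefix is hypothetical and keeps
// these from clashing with PHP's built-in E_* constants, whose values
// may differ between PHP versions.
define('MY_E_WARNING', 2);
define('MY_E_NOTICE', 8);
define('MY_E_ALL', 2047);

// Clear the notice bit from the full mask: 2047 with bit 8 removed.
$level = MY_E_ALL & ~MY_E_NOTICE;

// Test individual bits with a bitwise AND.
$shows_notices  = ($level & MY_E_NOTICE) != 0;
$shows_warnings = ($level & MY_E_WARNING) != 0;

echo $level, "\n";
```

Here $level works out to 2039 (2047 minus 8); the notice bit is off while every other bit, including the warning bit, stays on.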

You can change the error reporting level within a script by using the error_reporting() function. For example:


$old_level = error_reporting(E_ERROR);

error_reporting($old_level);

This would drop the error level to only show errors and then restore the error-level back to its old setting afterwards. A shortcut for turning off all errors for a specific function call is to place a @ in front of the call:


@readfile($filename);

You can also turn off the display of errors and redirect all errors either to syslog or a text file. For example, the following three lines in your php.ini file will redirect all errors to /tmp/errors.log:


display_errors = Off
log_errors = On
error_log = /tmp/errors.log

Of course, the PHP manual contains more information about error trapping, reporting, and logging.

References

Like other modern scripting languages, PHP implements references (rather than pointers). References are best thought of as aliases in the symbol table; they point to other symbols. They do not point to memory addresses.

For example:


$a = "ABC";

is a simple assignment that creates an entry in the symbol table named $a that refers to the string value “ABC”. If you wanted to create a second entry in the symbol table that refers to this exact same string value, you would use the following assignment:


$b = & $a;

Without the & in this assignment, PHP would have made a copy of the “ABC” string value instead of creating a reference. At this point $a and $b are completely equivalent symbol table entries for the “ABC” string value; the reference count on this string value is now 2 because there are two symbol table entries that reference it. PHP will free the memory associated with the string value once the reference count reaches 0.
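The difference between an alias and a copy is easiest to see by changing the value afterwards. A small sketch, where $c is an extra variable introduced only for the comparison:

```php
<?php
$a = "ABC";
$b = &$a;     // $b is a second name for the same value
$c = $a;      // $c is an independent copy

$b = "XYZ";   // changes the value seen through both $a and $b

echo $a, " ", $b, " ", $c, "\n";

unset($b);    // removes only the name $b; $a still holds "XYZ"
```

After the assignment through $b, both $a and $b read "XYZ" while the copy in $c is still "ABC". Unsetting $b removes one symbol table entry without touching the value that $a refers to.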

Let’s look at how you can pass arguments to functions by reference:


function increment( & $arg ) {
$arg++;
}

$a = 1;
increment($a);

When $a is passed to the increment() function, a reference to the passed argument data named $arg is created in the function’s symbol table. This can save string-copying overhead if you pass large chunks of data to functions. Functions may also return references to data:


function & foo() {
    $str = "Some big string";
    return $str;
}

$a = & foo();

In this case, the function foo() is defined as returning a reference (the first &), and the function call indicates that the returned reference should be used (the second &) rather than copying the data into a new variable.
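Returning to the increment() example, the effect of the & in the parameter list shows up clearly when it is set beside a by-value version. In this sketch, increment_copy() is a hypothetical counterpart added only for contrast.

```php
<?php
// Hypothetical by-value counterpart: works on a private copy of the argument.
function increment_copy($arg) {
    $arg++;
}

// By-reference version from the article: $arg aliases the caller's variable.
function increment(&$arg) {
    $arg++;
}

$a = 1;

increment_copy($a);   // the caller's $a is untouched
$after_copy = $a;

increment($a);        // the caller's $a is modified in place
$after_ref = $a;

echo $after_copy, " ", $after_ref, "\n";
```

Only the call to increment() changes the caller's variable, which is why $after_copy is still 1 and $after_ref is 2.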

Try It…

The best way to learn PHP is to try using it. There are dozens of books available, and the online documentation is available in 12 languages. There are also a number of very active mailing lists to help out beginners and advanced users alike. The Resources sidebar contains a list of the essential sites.

Download a package for your distribution or download the source tarball and install it yourself. Even if you are an experienced developer with another language, you’ll probably find that there are times when PHP can get the job done faster than most other tools.




RESOURCES

http://php.net The main PHP Web site

http://php.net/books.php List of books about PHP and related topics

http://php.net/manual PHP’s online manual, searchable/available in many languages

http://php.net/support.php Information about PHP mailing lists and newsgroups

http://phpbuilder.com A community site for PHP programmers to share



Rasmus Lerdorf invented PHP and is a member of the Apache core team. He can be reached at rasmus@php.net.
