Python: You SHOULD Be Using It

Whatever kind of program you have to write, you can probably write it in Python. What makes Python so powerful?

by Alex Martelli

Python has been around for a dozen years and is going strong: two production releases a year, a vibrant community, lively Net presence, yearly conferences, tracks on Python at Open Source and Web Development venues, books, articles, the works. Why is Python so popular? The reasons are simplicity, regularity, and the talent of Guido van Rossum, Python’s inventor and Benevolent Dictator For Life. Hundreds of people contribute to Python, but Guido has the final say; his hand at the helm makes Python a well-architected whole, not a soup of “features.” (For more on Guido, see our interview with him in the December 2001 issue, available online at http://www.linux-mag/2001-12/vanRossum_01.html.) There are no “convenient” shortcuts, quirks, or special cases: just power through simplicity, clean syntax, and generality.

Python is simple inside, too; its highly modular, structured internals and clean, well-documented API make it easy to port, extend, embed in applications, and interface with existing libraries. Jython, the 100 percent pure Java implementation of Python, lets you deploy Python wherever you can use Java, with full access to Java’s class libraries.

Python can also be found embedded in applications such as the cooledit editor and the Blender 3D modeler and is at the heart of the Zope Web application server. And Python has lots of extensions to let you handle diverse tasks, including numerical applications, image processing, distributed computing, multimedia, and games.

Whether you’re just beginning to learn how to program, doing object-oriented programming, writing scripts, prototyping large programs, or even developing them entirely, Python is a strong candidate. Python provides power through simplicity, with full-featured core libraries, easy interfacing, and backwards compatibility, too.

Let’s take a closer look at Python and see just how it provides all this power.

Installation and Configuration…If You Need It

Python is probably already on your system. Try running python -V at a shell prompt; if Python’s in your PATH, this tells you which version it is.

If you don’t have Python, or have an older release (such as 1.5.2), you can either install it with whatever tool your distribution uses or download the source package for the latest stable release (currently 2.2) from ftp://ftp.python.org/pub/python/2.2/Python-2.2.tgz.

You can then build and install Python by running the following commands:

$ tar xzf Python-2.2.tgz
$ cd Python-2.2
$ ./configure
$ make
$ sudo make install

That’s it; you now have the latest and greatest Python release. For your convenience, make sure /usr/local/bin is in your PATH. If you have problems, send e-mail to help@python.org, giving all the details (copy and paste error messages, don’t summarize them). Many volunteers monitor that “help line,” so you should be able to get up and running quickly.

Interactive Python

An easy way to try Python is via the interactive interpreter:

$ python
Python 2.2 (#1, Nov 30 2001, 15:08:22)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

>>> is the prompt the interactive interpreter uses to ask for any statement or expression:


>>> 2+2
4
>>> 1.23**4.56
2.5702023016193025

The ** operator performs exponentiation, so we have just computed 1.23 to the 4.56th power. Python is a nice advanced calculator, but you will notice it’s missing trigonometric and other functions. No problem, they’re in the library (in the math module); we just need to import it:


>>> import math
>>> math.log(1.23**math.sin(0.45))
0.090044028754846059

The command import math binds the name math to the math module object, making all names bound in the module available as attributes of math, as shown by the calls to math.log and math.sin. If you don’t like the name math, you can bind the module object to a different name:


>>> import math as Foo
>>> Foo.log(1.23**Foo.sin(0.45))
0.090044028754846059

Alternatively, the command from math import * puts all the names from the math module into your current namespace, making them available directly:


>>> from math import *
>>> log(1.23**sin(0.45))
0.090044028754846059

The command from modulename import * is handy for interactive use or tiny scripts but much less readable in “real” programs; the reader might wonder where the names log and sin came from.
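A middle ground, sketched here for illustration, is to import only the names you need, so that a reader can see at a glance where log and sin come from. The snippet assumes a current Python 3 interpreter (the session above uses 2.2, where print is a statement rather than a function); the import itself is spelled the same way in both.

```python
# Import just the two names we use; their origin is visible at the top of the file.
from math import log, sin

value = log(1.23 ** sin(0.45))
print(value)
```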

A lot of Python’s functionality lives in the library, neatly organized into modules and packages (hierarchical collections of modules). The current standard library has 165 top-level modules and packages and a total of almost 300 modules (including platform-specific ones); we’ll see some of them in later examples.








Figure One: IDLE’s main window.

Python’s interactive interpreter is a text-mode program. If you prefer to use a GUI program for your development, you can use IDLE, the Python Integrated DeveLopment Environment, which is included with Python. IDLE is built with Tkinter, one of many Python packages that you can also use to develop your own GUI applications. You need to have Tcl/Tk (8.1 or better) installed on your system before you can build or use Tkinter. If you need to, you can download Tcl/Tk from http://tcl.activestate.com/.

IDLE’s “shell window” (see Figure One) resembles the interactive interpreter but adds features such as colorization and call tips. Menus and shortcuts let you open editor windows, write and edit Python scripts, run the debugger, view the stack, and so on. Many other Python IDEs, both free and commercial, are also available. Many old-timers, however, prefer the interactive interpreter and a good text editor.

Scripts

Python scripts are just text files (often with the extension “.py”, but it’s not required). Listing One contains a script that evaluates an arbitrary expression passed as an argument to it. Sample runs of the script follow the listing.




Listing One: exp


1 #!/usr/local/bin/python
2
3 from math import *
4 import sys
5
6 expression = sys.argv[1]
7 print expression,'=',eval(expression)


$ ./exp 2+2
2+2 = 4
$ ./exp 1.23**4.56
1.23**4.56 = 2.57020230162

We use the normal Unix hash-bang (#!) method to invoke Python. Line 3 imports all the names from the math module. Line 4 imports the sys module, which gives us access to important aspects of a program’s environment. The expression sys.argv[1] is the first argument passed to the program; since we want to evaluate it as an expression, we bind it to the name expression in line 6. You could say “assign it to the variable” rather than “bind it to the name,” but “bind” and “name” better convey the connotations that correspond to Python’s semantics; this is covered in more detail later.

The print statement in line 7 emits the expression, an equals sign, and the result obtained by evaluating the expression with the built-in function eval. The print statement automatically inserts spaces between the items it emits, and a newline at the end.
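The core of Listing One can be sketched as a small function. This version assumes a current Python 3 interpreter rather than 2.2, and the name evaluate is hypothetical. Instead of from math import *, it hands eval an explicit namespace copied from the math module, which keeps the set of extra names in one visible place.

```python
import math

def evaluate(expression):
    """Evaluate a string as a Python expression, with math's names available."""
    # Copy the module's namespace so eval cannot modify the math module itself.
    namespace = dict(vars(math))
    return eval(expression, namespace)

for text in ("2+2", "1.23**4.56", "sqrt(16)"):
    print(text, "=", evaluate(text))
```

Passing the namespace as an argument has the same effect on name lookup as the star import in the listing, but leaves the script’s own global names untouched.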

You may notice that print displayed the result of 1.23**4.56 with fewer digits than the interactive interpreter showed; floating-point arithmetic is inexact, and print and the interactive interpreter use different defaults for the number of digits to display. This is adjustable if you so desire.
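As one way to adjust the number of digits, the % formatting operator lets you state the precision explicitly. The sketch below assumes a current Python 3 interpreter; the choice of 12 and 4 significant digits is only for illustration.

```python
value = 1.23 ** 4.56

# %g formats with the given number of significant digits.
twelve_digits = "%.12g" % value
four_digits = "%.4g" % value

print(twelve_digits)
print(four_digits)
```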

Python and the Web

The shell is not the only way to run scripts. Web servers such as Apache are another popular way to run them, via the CGI standard. Listing Two contains a CGI script to evaluate expressions.




Listing Two: A CGI Script: exp.cgi


 1 #!/usr/local/bin/python
 2 from math import *
 3 import cgi, sys
 4
 5 expression = ''
 6 form = cgi.FieldStorage()
 7 if form.has_key('expr'):
 8     expression = form['expr'].value
 9     try: result = eval(expression)
10     except:
11         error, detail, etcetc = sys.exc_info()
12         result = ': error: '+str(error)+', '+str(detail)
13
14 print 'Content-type: text/html'
15 print
16 if expression: print '<p>',expression,'=',result,'</p>'
17
18 print '<p><form action="./exp.cgi">'
19 print 'Expression: <input type="text" name="expr"></input>'
20 print '</form></p>'

After placing this script in your cgi-bin directory, visiting http://localhost/cgi-bin/exp.cgi from a browser will activate the script.

This script is more complicated than the first because it takes precautions against irregular input. Line 5 binds the name expression to '', the empty string. Line 6 calls the function FieldStorage of the cgi module, binding the result to the name form, allowing access to the form data. The object bound to the name form behaves like a Python “dictionary” (similar to a Perl “hash” or a C++ “std::map”). Using the name of a field from the form as the “index” into this object retrieves the field’s entry, whose value attribute holds the submitted text.

Line 7 checks to see if the form included a field named “expr”; if not, the name expression remains bound to the empty string, so that the if statement on line 16 later evaluates it as false and makes no attempt to emit the expression and the result of its evaluation.
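The lookup pattern in lines 7-8 can be tried on an ordinary dictionary. In this sketch a plain dict named form stands in for the FieldStorage object (an assumption made so the example runs without a Web server), the helper name field_or_empty is hypothetical, and the in operator is the current Python 3 spelling of has_key.

```python
# A plain dictionary standing in for the submitted form data.
form = {"expr": "2+2"}

def field_or_empty(form, name):
    """Return the value stored under name, or '' when the field is absent."""
    if name in form:
        return form[name]
    return ""

print(repr(field_or_empty(form, "expr")))
print(repr(field_or_empty(form, "missing")))
```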

If the form does have a field named “expr,” line 8 binds its value to the name expression, and line 9 tries to evaluate the expression with the eval function and binds the result to the name result. Line 9 starts with a try clause, so any error it might raise is caught by the corresponding except clause in line 10, in which case lines 11-12 bind result to an error message. The + operators on line 12 perform string concatenation, not addition, since the objects they operate on are strings, not numbers.
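The try/except pattern of lines 9-12 can be isolated into a helper. This is a sketch for a current Python 3 interpreter; the name evaluate_or_report is hypothetical, and it catches Exception rather than using a bare except so that a keyboard interrupt still stops the program.

```python
import sys

def evaluate_or_report(expression):
    """Return the value of expression, or an error message if evaluation fails."""
    try:
        return eval(expression)
    except Exception:
        # exc_info() describes the exception currently being handled.
        error, detail, _traceback = sys.exc_info()
        return "error: " + str(error) + ", " + str(detail)

print(evaluate_or_report("2+2"))
print(evaluate_or_report("1/0"))
```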

Finally, lines 14-15 emit the HTTP header and lines 18-20 unconditionally emit the HTML form needed to access this script. Thus, the first time you visit the script’s URL, you’ll just get the form; when you fill in and submit the form, you get the result and the form again, in case you want to ask for another result.

You may have noticed that neither the if statement nor the except clause has any punctuation to delimit what is contained within its scope. This is by design; all groupings of statements, such as the guarded block of an if statement or the statements within an except clause, are done by indentation; grouped statements are aligned with each other and shifted rightwards. Spaces and tabs can be intermixed and Python considers tabs to be equivalent to eight spaces, but it’s generally best to use all spaces.
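A small sketch of grouping by indentation, written for a current Python 3 interpreter with a hypothetical function name: the two indented statements under the if belong to it, while the statements that line up with the if itself do not.

```python
def describe(number):
    """Classify a number, collecting notes along the way."""
    notes = []
    if number < 0:
        # Both statements at this depth form the body of the if.
        notes.append("negative")
        number = -number
    # Back at the if's own depth: this runs whether or not the test was true.
    notes.append("magnitude %d" % number)
    return notes

print(describe(-3))
print(describe(5))
```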

Python uses neither keywords nor punctuation for statement grouping, a syntactic minimalism shared by a few other languages, including Haskell (named after mathematician Haskell Curry) and Occam (named in honor of medieval philosopher William of Occam, known for his principle, “Occam’s Razor”). Python, in case you didn’t know, is named in honor of Monty Python’s Flying Circus.

A Simple Search Engine in Python

Let’s take a look at some more of the modules in Python’s library. Say you have a directory full of important text files and often grep through them looking for certain words. Let’s use Python to index these files, then build a small search engine so we can search through them more quickly. Take a look at Listing Three.




Listing Three: Creating An Index


 1 import glob, fileinput, re, shelve
 2
 3 aword = re.compile(r'\b[\w-]+\b')
 4 index = {}
 5
 6 for line in fileinput.input(glob.glob('*.txt')):
 7     location = fileinput.filename(), fileinput.filelineno()
 8     for word in aword.findall(line.lower()):
 9         index.setdefault(word,[]).append(location)
10
11 shelf = shelve.open('shelf','n')
12 for word in index:
13     shelf[word] = index[word]
14 shelf.close()

Line 3 binds the name aword to a regular expression object identifying a word: one or more word characters or hyphens between word boundaries. Line 4 binds the name index to an empty dictionary.
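To see what the regular expression from line 3 picks out, it can be run on a single string. The sample sentence is made up for illustration, and the snippet assumes a current Python 3 interpreter.

```python
import re

# Same pattern as line 3 of Listing Three: word characters or hyphens.
aword = re.compile(r'\b[\w-]+\b')

sample = "A well-known fact: Python is easy-to-read."
words = aword.findall(sample.lower())
print(words)
```

Hyphenated terms come back as single words, and the punctuation is left out.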

Line 6 loops over every line of all files that end with .txt in the current directory. Line 7 binds the name location to the current filename and line number. Line 8 loops over all the words in the line (using a lowercased copy of the string in line, as we are interested in case-independent searching).
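The bookkeeping that fileinput does in lines 6-7 can be observed with two small temporary files. The file names and contents are invented for this sketch, an explicit list of paths replaces the glob call so the order is predictable, and a current Python 3 interpreter is assumed.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Write two short files to iterate over.
    names = []
    for name, text in (("a.txt", "one\ntwo\n"), ("b.txt", "three\n")):
        path = os.path.join(folder, name)
        with open(path, "w") as stream:
            stream.write(text)
        names.append(path)

    locations = []
    for line in fileinput.input(names):
        # filelineno() restarts at 1 for every file in the sequence.
        locations.append((os.path.basename(fileinput.filename()),
                          fileinput.filelineno(),
                          line.strip()))
    fileinput.close()

print(locations)
```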

Line 9 is executed for each word in each line. The setdefault method of the dictionary index returns the existing index entry for the word, or if the word was not yet in the index, setdefault binds a new entry to the default value [] (an empty list), and returns it. The append method adds the location value to the list.
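The setdefault idiom is easiest to follow with a few hand-made entries. The words and locations below are invented for illustration; the snippet assumes a current Python 3 interpreter.

```python
index = {}

# (word, (filename, line number)) pairs, as the indexing loop would produce them.
occurrences = [
    ("python", ("notes.txt", 1)),
    ("simple", ("notes.txt", 1)),
    ("python", ("todo.txt", 7)),
]

for word, location in occurrences:
    # First sight of a word stores [] under it; later sights reuse that list.
    index.setdefault(word, []).append(location)

print(index)
```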

When the loop is done, our index is stored in the dictionary index. We need to persist the index to disk in an easily searchable form. This is done with the shelve module. Line 11 opens a new shelf object in the file shelf and binds the name shelf to the resulting Python object. Lines 12-13 persist the dictionary to the shelf, element by element. Finally, line 14 closes the shelf object, and we’re done; our index is on disk, ready for searching.
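A round trip through shelve shows that what comes back from disk equals what went in. This sketch assumes a current Python 3 interpreter and writes to a temporary directory rather than the current one; the index contents are invented.

```python
import os
import shelve
import tempfile

index = {"python": [("notes.txt", 1)], "simple": [("notes.txt", 1)]}

# Work in a temporary directory so no stray files are left behind.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "shelf")

    shelf = shelve.open(path, "n")   # "n": always create a new, empty shelf
    for word in index:
        shelf[word] = index[word]
    shelf.close()

    shelf = shelve.open(path, "r")   # "r": reopen read-only
    restored = dict(shelf)
    shelf.close()

print(restored)
```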

Listing Four contains a simple script that uses the index to search for words and displays three lines centered around each occurrence.




Listing Four: Using An Index


 1 import shelve, sys, linecache
 2
 3 shelf = shelve.open('shelf', 'r')
 4
 5 for word in sys.argv[1:]:
 6     try:
 7         locations = shelf[word.lower()]
 8     except KeyError:
 9         print word+': not found'
10     else:
11         print word+':'
12         for file, line in locations:
13             print ' In file',file+':'
14             for delta in -1,0,1:
15                 aline = linecache.getline(file, line+delta)
16                 if aline: print ' ',aline,

Line 3 opens the ‘shelf’ file in read-only mode and binds the resulting Python object to the name shelf. Line 5 loops over each word passed as an argument to the script; the construct sys.argv[1:], called a list slice, returns all the command-line arguments passed to the program, skipping sys.argv[0], the name of the script itself. Line 7 looks up the list of locations for the word (lower-cased, again, to make the search case insensitive). The actual lookup is within the try statement in line 6, so that if the word is not in the index, the except clause in lines 8 and 9 handles the error by printing the “not found” message. The else clause in line 10 executes if no error was raised (i.e., if the lookup succeeded).
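The slice and the try/except/else structure can be exercised without a real shelf. In this sketch, for a current Python 3 interpreter, the list argv stands in for sys.argv and a plain dictionary stands in for the opened shelf; both are assumptions made to keep the example self-contained.

```python
argv = ["search.py", "Python", "Perl"]   # stand-in for sys.argv
shelf = {"python": [("notes.txt", 1)]}   # stand-in for the opened shelf

# The slice [1:] skips element 0, the script name.
words = argv[1:]

report = []
for word in words:
    try:
        locations = shelf[word.lower()]
    except KeyError:
        report.append(word + ": not found")
    else:
        # Reached only when the lookup raised no exception.
        report.append(word + ": %d location(s)" % len(locations))

print(report)
```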

In this case, we print the word (line 11), then loop over the list of locations (line 12), printing the file name for each (line 13). The loop in lines 14-16 gets and prints the line in which the word occurs together with the lines immediately before and after it, applying deltas of -1, 0, and 1 to the line number bound to line. The getline function from the linecache module is called with a filename and line number as arguments and returns the requested line (as a string, complete with trailing newline). If the requested line is not found (as might happen in our script, when a word is found on the first or last line of a file) then getline returns an empty string.

The if statement in line 16 prevents the print statement from printing nonexistent lines. The final comma causes the print statement to not emit a newline because the aline string already ends with one.
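The behavior of getline at the edges of a file can be checked with a three-line temporary file. The helper name context and the file contents are hypothetical, and the sketch assumes a current Python 3 interpreter.

```python
import linecache
import os
import tempfile

# Create a three-line file to read from (the contents are only an example).
handle, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(handle, "w") as stream:
    stream.write("first\nsecond\nthird\n")

def context(filename, lineno):
    """Return the existing lines among lineno-1, lineno, and lineno+1."""
    found = []
    for delta in (-1, 0, 1):
        aline = linecache.getline(filename, lineno + delta)
        if aline:  # getline returns '' for a line that does not exist
            found.append(aline)
    return found

around_first = context(path, 1)
around_second = context(path, 2)
print(around_first)
print(around_second)

# Tidy up: drop the cached lines and delete the temporary file.
linecache.clearcache()
os.remove(path)
```

For the first line there is no line 0, so only two lines come back; for the middle line all three do.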

Extending and Embedding

Another feature that makes Python fun to use is its C-level API. It’s easy to extend Python with new functionality and to embed a Python interpreter in another application. There are a lot of extension modules written in C you can download and use, and you can find Python embedded as the scripting language of many applications.

The Simplified Wrapper and Interface Generator, SWIG (http://www.swig.org), makes it easy to wrap existing C libraries into Python extensions. If you like C++, you have even more choices. The Boost Python library (http://www.boost.org) lets you turn your C++ libraries into Python extensions using all the power of C++’s templates.

But I Need 100 Percent Pure Java

And if you want to access Java classes from within Python, you’re in luck! Jython (http://www.jython.org) implements the Python language in 100 percent pure Java. You need a highly compliant JVM (Kaffe doesn’t work, for example, but Javasoft releases do), because Jython exercises the Java specs to the limit. But what you get is awesome; your Python code can import and use any existing Java class just as if it were a Python module. No wrapping, no adaptation; the Jython runtime does it for you via Java reflection. A simple Jython servlet is shown in Listing Five.




Listing Five: A Jython Servlet


import javax

class hello(javax.servlet.http.HttpServlet):
    def doGet(self, request, response):
        response.setContentType('text/html')
        out = response.getOutputStream()
        out.write('''<html><head><title>Hello World</title></head>
<body><p>Now <b>this</b> is simplicity!</p></body></html>''')
        out.close()

Check It Out!

As we have seen in this article, Python can be used to perform many different kinds of tasks. Python’s rich library of modules lets you apply it to almost any kind of programming endeavor: database munging, number-crunching, image processing, games, Web servers, even cryptography.

It’s good that Python can do so much, because that has a practical benefit: ease of software maintenance. Programs do need to be maintained, even if you thought of them as “throwaway” when you wrote them. If you use a scripting language that emphasizes concision, variety, and cleverness, going back to a script written months ago can be a harrowing experience. Python emphasizes clarity, simplicity, and readability so that revisiting your old scripts is fun, not a chore. The proof of the pudding is in the eating, and the proof of Python is in the programming. Give Python a try. You deserve it!




Resources

Main site, chock full of both material and links:

http://www.python.org

Mailing lists and newsgroups:

http://www.python.org/psa/MailingLists.html

http://mail.python.org/mailman/listinfo/tutor

http://mail.python.org/mailman/listinfo/python-list

news://comp.lang.python

You can also mail any help request to help@python.org (volunteer helpers watch this address and will give you personalized help).

IDLE: http://www.python.org/idle

Tkinter: http://www.python.org/topics/tkinter

Some of the popular applications that embed and/or are written in Python:

Zope: http://www.zope.org

Cooledit: http://cooledit.sourceforge.net

Blender: http://www.blender3d.com

PySol: http://www.oberhumer.com/opensource/pysol

Extending Python:

Extending manual: http://python.sourceforge.net/devel-docs/ext/ext.html

C API reference: http://python.sourceforge.net/devel-docs/api/api.html

Boost Python: http://www.boost.org/libs/python/doc

SWIG: http://www.swig.org

Jython (Python on the Java Virtual Machine): http://www.jython.org



Alex Martelli lives in Italy and is a System Developer for AB Strakt, Sweden. Alex co-edited the Python Cookbook and is writing the forthcoming book Python in a Nutshell. He can be reached at martl@aleax.it.
