Exposing your APIs to Python

April's "API Spy" introduced Python's C API and showed how a Python interpreter can be embedded in a C program. For many tasks where you need to run a Python script from within C code, last month's technique is sufficient. However, as your C programs and Python scripts evolve, you may want or need more advanced interaction between the two languages.

For example, in addition to passing data to your Python scripts, you may also want to expose your C libraries to Python, so that Python can call back to your application, creating a true, two-way channel of communication between your C and Python code.

In fact, exposing C libraries to the Python interpreter is eminently practical:

* C code is often organized into one or more “libraries” of functions that are useful to an entire application. Exposing these libraries to embedded scripting languages allows an entire application to be scripted.

* Many existing, third-party C libraries accomplish certain tasks that no Python library can. So, instead of duplicating the implementation of those C functions in Python, it’s often faster (faster to develop and faster to execute) to “wrap” the existing C library.

* Existing Python code is often optimized by re-implementing it as a C library that’s then wrapped with a Python module. This provides the best of both worlds — rapid prototyping in Python and optimized performance in C — as needs dictate.

This month, the “API Spy” shows you how to wrap C functions and import them into Python. To focus on the process rather than the code, we’ll use a common C library function (the POSIX function mkdir()) that’s already supported by Python. Although the C code is simple, the technique can be applied to any C function.

Creating New Python Modules in C

To a Python programmer, there’s no distinction between Python modules written in C and Python modules written in Python, other than raw execution speed. (C code is generally several times faster than equivalent Python code.)

C’s transparency comes from a little “boilerplate” code that’s used in all extension modules. With that boilerplate in place, new extensions written in C follow a very straightforward pattern:

* All of the code to wrap a C library is (typically) contained in one .c file. This file contains the standard boilerplate code and the library wrapping code.

* C functions are wrapped by Python functions that translate Python parameters to C parameters and back.

As with all C programs, compiling and linking are important details. Let’s leave those to the end, and jump into some code.

Listing One shows a C function, my_mkdir(), that creates a directory. Python callers must provide a name for the new directory, and can optionally provide a permissions mode (such as 0600). Given one or two arguments, my_mkdir() simply calls the standard POSIX function mkdir() to do the work. (Again, the Python function os.mkdir() already exists and provides the same features, but we’re ignoring it.)

Listing One: demo1.c creates a new Python module

#include <Python.h>
#include <sys/stat.h>
#include <sys/types.h>

static PyObject *my_mkdir(PyObject *self, PyObject *args)
{
    int sts;
    char *path;
    int mode = 0777;  /* an int, to match the "i" format unit */

    /* path is required; mode is optional and keeps its default if omitted */
    if (!PyArg_ParseTuple(args, "s|i", &path, &mode))
        return NULL;
    sts = mkdir(path, (mode_t) mode);

    return Py_BuildValue("i", sts);
}

static PyMethodDef myModuleMethods[] = {
    {"mkdir", my_mkdir, METH_VARARGS, "Create a new directory."},
    {NULL, NULL, 0, NULL} /* Sentinel */
};

PyMODINIT_FUNC initmyModule(void)
{
    (void) Py_InitModule("myModule", myModuleMethods);
}

There are three important stanzas of code in Listing One:

1. The first is the definition of my_mkdir(). my_mkdir() uses the standard Python functions (provided in Python’s C API) PyArg_ParseTuple() and Py_BuildValue() to parse the arguments to my_mkdir() and construct a return value, respectively. The latter function was discussed last time. PyArg_ParseTuple() is described in more detail in the next section.

2. Next is the definition of myModuleMethods, a static array of PyMethodDef structures. This is part of the boilerplate necessary to interface C with Python. This data structure is passed to the interpreter to define top-level functions like my_mkdir(). Notice how the last element of the array is all NULL values. This is a sentinel, or marker, that tells Python it has reached the end of the list.

3. Finally, look at the initmyModule() function. This is also part of the boilerplate code needed to interface your new module with Python. This function is called by the Python interpreter when the module is imported. It uses the standard Python C API function Py_InitModule() to initialize the module, passing in the new module name and list of module methods.

Extracting Parameters

In most cases, you won’t be able to just pass Python arguments to a C library, because most libraries know nothing about Python or Python data types. To pass data from a Python module to a C library, you must first translate Python objects into C data variables and structures. Thankfully, the Python C API provides a handy function that does just this, PyArg_ParseTuple().

PyArg_ParseTuple() expects a tuple containing Python function arguments, a special string called a format specifier, and the addresses of zero or more C variables that are assigned translated values.

Listing One shows how you can call a C function from a Python module, passing in a string and getting an integer result (but there are many other combinations of arguments that can be passed and returned). In the code, the Python function that wraps the C function has a PyObject * argument named args. This argument is a pointer to the Python tuple object containing the arguments for this Python function.

For example, this call expects one integer argument:

int i;
PyArg_ParseTuple(args, "i", &i);

The format specifier string “i” contains one i, indicating that the function expects one integer argument. This argument is “parsed” out of the tuple and assigned to the C integer variable i.

In Listing One, the format specifier string is s|i, meaning that the Python function expects at least one string argument, but will also accept an optional integer argument.

PyArg_ParseTuple(args, "s|i", &path, &mode)

Every argument to the left of the pipe is required; everything to the right is optional.

So, if a second integer argument to the Python function is omitted, the C variable mode will not be assigned a new value. (Instead, it retains the default value 0777 that it was initialized to.)
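Seen from the Python side, the “s|i” format gives the wrapped function roughly the same calling convention as a Python function with one required parameter and one optional parameter that has a default. The sketch below is only an illustration of that convention, not the extension itself: the pure-Python mkdir() here is a hypothetical stand-in that delegates to os.mkdir() and mimics the 0 and -1 status values that the C mkdir() call returns.

```python
import os
import tempfile

def mkdir(path, mode=0o777):
    """Pure-Python stand-in for the wrapper in Listing One (illustration only).

    path is required and mode is optional, which is the calling
    convention the "s|i" format describes.
    """
    try:
        os.mkdir(path, mode)
    except OSError:
        return -1  # the C mkdir() reports failure as -1
    return 0       # and success as 0

# Try it in a scratch directory so nothing is left behind.
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "foobar")
    print(mkdir(target))         # first call: the directory is created
    print(mkdir(target, 0o700))  # second call: it already exists, so this fails
```

Omitting the second argument leaves mode at its default, just as the C variable keeps its initial value when the optional argument is not supplied.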

As another example, this call to PyArg_ParseTuple() expects no arguments:

PyArg_ParseTuple(args, "");

Here, if any arguments are contained in args, the call fails and Python raises a TypeError exception reporting that too many arguments were passed.

PyArg_ParseTuple() fails if the arguments provided do not match the format specifier. If PyArg_ParseTuple() fails, it returns a false value. If that occurs, you must return NULL from your method to indicate that an error occurred.
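To a Python caller, a failed argument parse shows up as an ordinary exception. The short sketch below shows this from the calling side. Because myModule is not built here, it uses the built-in os.mkdir() as a stand-in for a function implemented in C; the helper call_and_report() is a hypothetical name introduced only for this illustration.

```python
import os

def call_and_report(func, *args):
    """Call func(*args) and return the name of the exception raised, or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None

# os.mkdir stands in for myModule.mkdir here. Neither call creates anything,
# because the arguments are rejected before any directory is made.
print(call_and_report(os.mkdir))        # no path argument at all
print(call_and_report(os.mkdir, 12.5))  # path is not a string
```

Both calls are rejected with a TypeError before any C library code runs, which is the behavior a wrapper gets when it returns NULL after a failed parse.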

Keep in mind that although PyArg_ParseTuple() checks to make sure the Python arguments are valid, it cannot verify that the addresses of the C variables are valid, so be careful and make sure the C variables you are passing to PyArg_ParseTuple() are of the correct type.

Compiling and Linking

In most cases, Linux systems use dynamic loading, which means that you will build your Python extension module as a shared library with (typically) a .so (Unix) or .pyd (Windows) file extension. The resulting shared library is loaded by your operating system for the Python interpreter at run-time, so you don’t need to modify the Python interpreter in any way.

(If dynamic loading isn’t available, you must use static loading to link your module directly into the Python interpreter. This is rarely needed, as most modern operating systems and C compilers support dynamic shared library loading, and is beyond the scope of this article.)

Prior to Python 2.0, building a shared library for Python was a bit tricky, but these days you can use the distutils package to easily get the job done. The distutils package determines what’s necessary to compile your new module and even detects your platform and C compiler for you. All you have to do is specify the name of your new module, where the .c file can be found, the location of any necessary include files, and the location of any necessary libraries to link the new module against.

All of this information is specified in a Python file called setup.py, as shown in Listing Two. setup.py builds the library code in Listing One.

Listing Two: a distutils setup program

from distutils.core import setup, Extension

mod = Extension("myModule", ["demo1.c"])

setup(name="myModule", version="0.1", ext_modules=[mod])

Listing One does not need any additional include files or libraries, but if your code does, the information can be easily specified in the constructor for the Extension object:

module1 = Extension('myModule',
                    sources=['demo1.c'],
                    libraries=["mylibrary1", "mylibrary2"])

The keyword arguments include_dirs and library_dirs specify where header files and libraries can be located and the keyword argument libraries specifies the libraries to link against the extension module. distutils passes this information to your C compiler.

For example, if you were compiling the extension module on Linux using the gcc compiler, distutils would ensure the correct -I arguments for compilation and the proper -l and -L arguments for linking.
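The mapping from keyword arguments to compiler options can be pictured with a small sketch. This is not distutils’ actual code, only a rough model of the idea: the function gcc_style_flags() and the /opt/mylib paths are hypothetical names chosen for illustration, and a real build adds many other options.

```python
def gcc_style_flags(include_dirs=(), library_dirs=(), libraries=()):
    """Return (compile_flags, link_flags) in the gcc style described above."""
    # Each header directory becomes a -I option at compile time.
    compile_flags = ["-I" + d for d in include_dirs]
    # Each library directory becomes -L and each library becomes -l at link time.
    link_flags = ["-L" + d for d in library_dirs]
    link_flags += ["-l" + lib for lib in libraries]
    return compile_flags, link_flags

compile_flags, link_flags = gcc_style_flags(
    include_dirs=["/opt/mylib/include"],   # hypothetical header location
    library_dirs=["/opt/mylib/lib"],       # hypothetical library location
    libraries=["mylibrary1", "mylibrary2"],
)
print(compile_flags)
print(link_flags)
```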

Now, to build your new module, just run the command python setup.py build. This command causes distutils to compile your new module into a shared library in the build sub-directory of your source tree.

distutils has many other cool features as well. If you plan on distributing your extension module to other users, you can have distutils build a distribution for you. distutils can build a Windows installer distribution for Windows or an RPM distribution for Linux. These features and others are covered in the distutils documentation.

Using the Module

Once your module is compiled and linked, distutils deposits the new .so file under the build subdirectory of your source tree. The .so should then be placed in a directory on Python’s module search path (sys.path), or you can run the Python interpreter in the same directory as the .so file.
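Python locates extension modules through its module search path, sys.path, so one way to try a freshly built module without copying it anywhere is to add the build output directory to that path. The sketch below assumes the build step left the shared library in a directory whose name starts with lib under build (the exact name depends on the platform and Python version); add_build_dirs() is a hypothetical helper, not part of distutils.

```python
import glob
import os
import sys

def add_build_dirs(build_root="build"):
    """Put any lib* directories under build_root at the front of sys.path."""
    added = []
    for path in sorted(glob.glob(os.path.join(build_root, "lib*"))):
        full = os.path.abspath(path)
        if os.path.isdir(full) and full not in sys.path:
            sys.path.insert(0, full)
            added.append(full)
    return added

# Run from the directory that holds setup.py; prints the directories added.
print(add_build_dirs())
```

After this call, an import statement looks in the build directory first, so the new module can be imported straight from the source tree.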

Once Python starts, you can use your new extension module immediately using import myModule, and you can call your extension’s functions using the new module name:

myModule.mkdir("foobar")

The line above creates the directory foobar in the current working directory with the default mode specified in Listing One. If necessary, you can use the optional second argument to set the mode for the newly created directory:

myModule.mkdir("foobar", 0700)

Using Your New Module in an Embedded Interpreter

When you import myModule from a standalone Python interpreter, the initmyModule() function is automatically called.

However, when you try to import the module from an embedded interpreter, you must call initmyModule() explicitly.

As shown last month, here is a main() function that calls the module initializer function after it calls Py_Initialize():

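int main(int argc, char *argv[])
{
    Py_Initialize();
    initmyModule();  /* register myModule with the embedded interpreter */
    /* ... */
}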

Now, the myModule module is properly initialized and can be used from an embedded interpreter.

Better than Snake Oil

By making it easy to extend and embed the Python interpreter, the Python developers have given C programmers a new and powerful tool to use anytime they please.

For example, instead of re-implementing your own lists, dictionaries, tuples, strings, and other high-level data structures, you can simply re-use Python’s.

In addition, you get the benefits of having one of the most popular scripting languages available to your application.

Later in this series you’ll see even more powerful synergies between Python and C, including how to define new types in C and how to subclass C types in Python. Stay tuned to Linux Magazine and the “API Spy” for more.

Michel Pelletier is a Python, Java, and C programmer, and author of The Zope Book and the upcoming Definitive Guide To Zope 3 to be published by Apress.
