Finding and Defining Features

In September, we discussed the significant advantages of re-implementing desired, but less common functions: if you use a feature of your local operating system, but discover it doesn't exist on other platforms, write your own implementation, and make that code a part of your distribution.

In that column we also discussed the benefit of testing for features in your code: feature test macros make code easier to port, and far easier to read.

This month, we’ll discuss four ways to generate feature definitions on any Unix-like platform. The four techniques are: by hand, derived from OS definitions, using metaconfig, and using autoconf.

Feature Test Macros Made By Hand

One of the most common mechanisms for configuring software is downright old-fashioned: setting features by hand. This technique requires you to edit the project’s or package’s configuration file (usually a .h file) and comb through it line by line, changing #define values as needed. Listing One shows features.h, an include file that configures logger.

Listing One: A simple include file that declares what system features are available

/* features.h - features used by logger(1l) from the local system */

/* HAS_SYSLOG - does the system have a syslog implementation? */
#define HAS_SYSLOG 0

/*
 * HAS_SNPRINTF - does the system have a working snprintf() function?
 * If not, we'll use our local version. snprintf() is a bounds
 * checking version of sprintf(3s).
 */
#define HAS_SNPRINTF 0

/*
 * HAS_STRFTIME - does the system have an strftime() implementation
 * we can use to generate time stamps? If not, we'll fall back
 * on a version pulled from the *BSD source trees.
 */
#define HAS_STRFTIME 0

/* PATH_LOGFILE - where to drop the logfile. */
#define PATH_LOGFILE "/var/log/syslog"

/* PATH_CONSOLE - path to the console device */
#define PATH_CONSOLE "/dev/console"

Why use this technique? It’s simple, and it’s an effective choice if you don’t have the time or ability to create a complex self-configuration script. Tweaking and modifying the configuration is easy, and the familiarity gained while configuring makes compilation and runtime problems easier to debug.

However, while generating feature definitions by hand may be easy for you, the developer of the software, it’s terribly inconvenient for your “customer,” who could be another programmer, but could also be a system administrator, or a true end-user. In this case, whoever wants to build your code has to know or learn which features are available and suitable for their local system — something not everyone will be able or willing to do.

If you must distribute code like this, isolate all of the configuration information in a single include file. The file should describe each of the prerequisites and each feature used, and describe what to do if the feature doesn’t exist on the end user’s platform.
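
To make the idea concrete, here is a minimal sketch of how a source file might consume definitions like those in Listing One. The HAS_STRFTIME default is inlined so the fragment compiles on its own (a real project would include features.h instead), and format_stamp() is a hypothetical helper, not part of logger.

```c
#include <stdio.h>
#include <time.h>

/* Normally supplied by features.h; inlined here so the sketch stands alone. */
#ifndef HAS_STRFTIME
#define HAS_STRFTIME 1
#endif

/*
 * format_stamp (hypothetical helper): write "YYYY-MM-DD HH:MM:SS" for *tm
 * into buf. Returns the number of characters written, or 0 if buf is
 * too small to hold the result.
 */
size_t format_stamp(char *buf, size_t size, const struct tm *tm)
{
#if HAS_STRFTIME
    return strftime(buf, size, "%Y-%m-%d %H:%M:%S", tm);
#else
    /* Fallback for systems without strftime(): 19 characters plus the NUL. */
    if (size < 20)
        return 0;
    return (size_t)sprintf(buf, "%04d-%02d-%02d %02d:%02d:%02d",
                           tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday,
                           tm->tm_hour, tm->tm_min, tm->tm_sec);
#endif
}
```

Because the #if lives inside one function, the rest of the code can call format_stamp() without caring which branch was compiled.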

In general, it’s good practice in any project to document the features and prerequisites required or expected by your software. For example, imagine someone trying to port your Linux code to Windows without Cygwin. A little bit of well-written and insightful documentation can greatly ease the porting effort should your software move to a dissimilar platform. This documentation should also explain how to extend the project’s Makefile, if that’s needed.

By the way, you use a variation of configuring by hand every time you type make config in /usr/src/linux to configure your Linux kernel.

Features by OS Definition

The second method of generating feature definitions is “deriving” them from the OS definition. Early implementations of this technique showed up as lots of #ifdef OSNAME littered throughout source code. But, we all know that’s bad style. So, it has mutated into a single source file that converts OS names into feature definition lists.

To use OS definitions to configure your software package, create an include file containing only the feature definitions for the supported operating systems. Then, for each supported operating system, create the feature definition block protected by a preprocessor macro that identifies the operating system. For many platforms, the C compiler provides a nice single definition to uniquely identify the system without outside intervention. However, on some systems (such as the many System V Release 4 systems for Intel released in the early ’90s), a new macro may be needed to differentiate between variations (this may also be needed to support various Linux distributions).

Inside each OS-specific preprocessor block, turn features on or off as the case may be for that OS. Listing Two shows an excerpt of one such configuration.

Listing Two: Using OS definitions to configure packages

/* determine features by looking at compiler definitions */

#include "features.h"

/* Linux, in its various incarnations */
#if defined(__linux__)
#  define HAS_SYSLOG 1
#  define HAS_STRFTIME 1
#  if defined(__GLIBC__)
#    if __GLIBC__ > 1 && __GLIBC_MINOR__ > 1
#      define HAS_SNPRINTF 1
#    endif
#  endif
#endif /* defined (__linux__) */

/* EmbeddedOS - supports ANSI C, but only barely looks like UNIX */
#if defined(__EmbeddedOS__)
#  define HAS_SYSLOG 0
#  define HAS_STRFTIME 1
#  define HAS_SNPRINTF 0
#  define PATH_CONSOLE "cons:"
#  define PATH_LOGFILE "DUA0:[0,0]LOG.TXT"
#endif /* defined (__EmbeddedOS__) */

This technique comes closest to providing a true “compile and go” software distribution to the end user. An example of a project that uses this method is sendmail.

The disadvantages? You need access to a large number of systems (or access to a large, responsive user community) to create and set the correct feature definitions for each and every OS and operating system version. That’s a lot of work and quite a commitment, and chances are you won’t be able to keep up with new OS releases fast enough.
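
One hedge against falling behind, sketched below, is to follow the OS-specific blocks with conservative defaults, so that an unrecognized platform still builds by using the bundled replacements. The compiler macros tested here (__linux__, __FreeBSD__, and so on) are ones the respective compilers predefine; the FEAT_* names and feature_mask() are hypothetical and exist only for illustration.

```c
/* OS-specific blocks: set only what is known about each platform. */
#if defined(__linux__)
#  define HAS_SYSLOG 1
#  define HAS_STRFTIME 1
#elif defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__)
#  define HAS_SYSLOG 1
#  define HAS_STRFTIME 1
#  define HAS_SNPRINTF 1
#endif

/* Conservative defaults: anything not claimed above is assumed missing. */
#ifndef HAS_SYSLOG
#  define HAS_SYSLOG 0
#endif
#ifndef HAS_STRFTIME
#  define HAS_STRFTIME 0
#endif
#ifndef HAS_SNPRINTF
#  define HAS_SNPRINTF 0
#endif

/* Hypothetical bit flags, one per feature. */
enum { FEAT_SYSLOG = 1, FEAT_STRFTIME = 2, FEAT_SNPRINTF = 4 };

/* feature_mask (hypothetical): report which features were enabled at compile time. */
unsigned feature_mask(void)
{
    unsigned mask = 0;
#if HAS_SYSLOG
    mask |= FEAT_SYSLOG;
#endif
#if HAS_STRFTIME
    mask |= FEAT_STRFTIME;
#endif
#if HAS_SNPRINTF
    mask |= FEAT_SNPRINTF;
#endif
    return mask;
}
```

With defaults in place, every feature macro is always defined, so the rest of the code can use plain #if tests without first checking whether the macro exists.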

Auto-configuration with metaconfig

metaconfig is the automatic configuration file generator developed by Larry Wall (and a cast of others). Wall used metaconfig early on in his news reader, rn, and later in trn. metaconfig is also used to build Perl.

metaconfig was one of the earliest (perhaps the earliest) “singing and dancing” scripts to automatically deduce features available on an OS. It’s been in regular use since 1987 (at least). Strangely enough, metaconfig’s package is known as dist, probably because it provides a strong set of distribution tools in addition to generating a Configure script.

metaconfig-generated Configure scripts are interactive scripts that provide directions to the end-user. If metaconfig can’t figure something out, it typically asks the end-user for confirmation. For the average user, on a relatively normal system, running Configure scripts is mostly an exercise in hitting the RETURN key. metaconfig also allows you to ask questions that might be site-specific. Examples of such questions are the organization name and the email address of the site administrator.

metaconfig provides a rich, standard set of feature definitions to the developer. At the time of this writing, there are 465 feature macros for use in C language modules, and 468 feature macros for use in shell scripts.

Here’s a typical use of metaconfig to configure a new project:

  • Run packinit to generate packaging and patching information. packinit asks a few questions about your project, including the email address for support/enhancement requests, the hostname of the distribution’s FTP server, the path to the FTP server’s public space, where to post patches, and others. The directory where you run packinit is considered the top level directory of the project.

  • Update your C modules and shell scripts to use the feature macros defined in the metaconfig glossary. You may need to add local metaconfig modules to your project to allow it to sense/determine things that metaconfig doesn’t currently know about.

  • Generate any .SH scripts needed to write the Makefile or other shell scripts. A program called makeSH can make this easier.

  • Write a MANIFEST.new file in the top level directory. The format for MANIFEST.new is filename description. List only source files in the MANIFEST.new because metaconfig uses the files listed in MANIFEST.new to determine what feature tests to include in the final Configure program. MANIFEST.new is not distributed with your final package. However, you can provide a MANIFEST file (which should be listed in both itself and MANIFEST.new) that lists all the files that should be delivered. The resulting Configure program knows how to use MANIFEST to verify that the package is complete.

  • Create local metaconfig modules, if required. metaconfig will look in a local .U directory to pull in locally defined modules needed to generate the Configure program.

  • Run metaconfig to generate Configure, config_h.SH, Makefile.SH, and any other .SH files you defined in your MANIFEST.new. metaconfig updates MANIFEST.new to include any generated programs. Remember to update MANIFEST in a similar fashion.

Listing Three is an initial MANIFEST.new for a project that includes the features.h shown in Listing One. Running metaconfig adds the final two lines (Configure and config_h.SH) to MANIFEST.new.

Listing Three: A MANIFEST.new file

MANIFEST      This file, a listing of all included sources
Makefile.SH   Generates into a Makefile to build logger(1l)
features.h    Feature definitions
logger.1      Manual page, in *BSD mdoc format
logger.c      logger(1m) main module
snprintf.c    re-implementation of snprintf(), from sendmail (from UCB)
strftime.c    re-implementation of strftime(), from UCB
syslog.c      file based re-implementation of syslog(3)
syslog.h      supporting include file for re-implementation of syslog(3)

Configure     Portability tool
config_h.SH   Produces config.h

If config_h.SH is run through the shell, it becomes config.h.
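
Once config.h exists, a C module might consume it roughly as follows. One difference from Listing One is worth noting: metaconfig-generated headers typically leave a feature symbol either defined or undefined, so modules test with #ifdef rather than comparing against 0 or 1. HAS_STRERROR is assumed here to match the glossary name (check the Glossary shipped with your copy of dist), the #define stands in for the generated config.h, and error_text() is a hypothetical helper.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Stand-in for #include "config.h". In a generated header the symbol is
 * either defined or left out entirely, so it carries no value.
 */
#define HAS_STRERROR

/* error_text (hypothetical helper): return a printable message for errnum. */
const char *error_text(int errnum)
{
#ifdef HAS_STRERROR
    return strerror(errnum);
#else
    /* Fallback: just report the number. Not reentrant (static buffer). */
    static char buf[32];
    sprintf(buf, "error %d", errnum);
    return buf;
#endif
}
```

A caller would simply write something like fprintf(stderr, "%s\n", error_text(errno)) and let the generated header decide which branch is compiled.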

Once the software tests correctly, you can use makedist to generate distribution sets. makedist will gladly try to generate shell archives suitable for posting to the USENET comp.sources.* groups. Example packages that use metaconfig are trn and Perl.

Probably the most significant advantage to using metaconfig is its interactivity. It allows you to ask policy-type questions at the time of install. trn uses this feature to determine which USENET news distributions to support, to set the preferred news server, to get your organization name, etc. metaconfig also automatically generates the Configure script directly from the sources, and provides the tools to greatly ease packaging and distribution.

The biggest disadvantage? metaconfig-generated Configure scripts are extremely interactive. This annoys many people. A second disadvantage is that the metaconfig distributions appear to have forked. At the moment, there does not seem to be a single metaconfig maintainer, so each project that uses it is probably maintaining its own distribution, usually based on the current Perl distribution.

GNU autoconf

The final feature definition generation system we’ll look at is the Free Software Foundation’s autoconf utility. autoconf is frequently used with automake and libtool, although we won’t delve deeply into either of those tools for now. (automake and libtool are topics suitable for columns of their own.)

autoconf is the newest of the feature definition generation tools. It’s been around in its current form for about 10 years.

autoconf-generated configure scripts attempt to determine everything needed in a batch fashion. Any information that cannot be determined by poking and prodding the system must be provided by command line switches to the generated configure script.

To use autoconf to generate a configure script, you create a configure.in file using various autoconf macros to describe the features to be tested. autoconf has no intrinsic knowledge of the files in the product distribution. This means that autoconf doesn’t provide a mechanism to automatically search source files for feature definitions.

For example, setting up logger to use autoconf includes the following steps:

  • Execute autoscan over the contents of the source directory, to generate a configure.scan file. configure.scan can be used as a starting point for your configure.in. For the logger project, autoscan generated Listing Five. The 26 lines in Listing Five are converted by autoconf into over 4,000 lines of shell script.

  • Using configure.scan as a starting point, write configure.in using autoconf m4 macros and inline shell scripting. configure.in is effectively a Bourne shell script that is preprocessed via m4 to expand the autoconf-provided macros. Since it’s a Bourne shell script, any Bourne shell constructs may be used in testing the system for feature availability. In general, you want to limit yourself to the autoconf-provided macros as much as possible, but the ability to include Bourne shell code does allow easy extension. If you find yourself using the same Bourne shell code repeatedly, you might consider writing a macro to implement the code.

  • For the logger, we need to update AC_INIT to have the correct arguments. Then we need to remove some of the header checks that shouldn’t be there (and the corresponding code needs to be updated as well).

  • Conditional replacement for syslog.c, snprintf.c and strftime.c needs to be added. syslog.[ch] is needed if the OS doesn’t have a syslog(); snprintf() and strftime() are needed by syslog() if the OS doesn’t provide those supporting functions.

  • Note that autoscan guessed that the primary source file for the project was syslog.c, when in reality, it’s logger.c.

  • configure.in still needs to be taught how to gather and propagate the paths for the console and the log file. Given autoconf’s batch orientation, this would be gathered by a switch to the resulting configure program.

  • Write Makefile.am in terms of automake m4 macros. Listing Six provides the source of Makefile.am, which will be processed into Makefile.in. automake can support multiple executables per directory by using the executable name as part of a later variable name. In Listing Six, we define the program to be built as logger. We define the sources to build logger as logger.c. We further define that snprintf.c, strftime.c, and syslog.c may be needed to build. Finally, we define the manual pages to be installed. These six lines of source will expand into about 175 lines of resulting Makefile.in. The resulting Makefile.in will also conform rather closely to the well-known GNU Coding Standards.

  • Execute aclocal, autoheader, automake, and autoconf to generate Makefile.in, config.h.in, and configure.
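
On the C side, the console and log file paths mentioned in the steps above can be given compiled-in defaults that a configure-supplied definition overrides. This is only a sketch: it assumes the generated config.h (or a -D flag on the compiler command line) defines PATH_LOGFILE and PATH_CONSOLE when the user asks for non-default locations, and log_target() is a hypothetical helper. HAVE_CONFIG_H is the macro that autoconf-generated builds conventionally define when a config.h is in use.

```c
#ifdef HAVE_CONFIG_H
#  include "config.h"   /* generated by configure from config.h.in */
#endif

/* Defaults apply only when configure (or the builder) supplied nothing. */
#ifndef PATH_LOGFILE
#  define PATH_LOGFILE "/var/log/syslog"
#endif
#ifndef PATH_CONSOLE
#  define PATH_CONSOLE "/dev/console"
#endif

/* log_target (hypothetical helper): choose where a message should go. */
const char *log_target(int to_console)
{
    return to_console ? PATH_CONSOLE : PATH_LOGFILE;
}
```

Keeping the defaults next to the #ifndef guards means a plain, switch-free run of configure still produces a working program.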

Listing Five: A sample configure.scan file created by autoscan

# Process this file with autoconf to produce a configure script.

# Checks for programs.

# Checks for libraries.

# Checks for header files.
AC_CHECK_HEADERS([fcntl.h netdb.h paths.h stdlib.h string.h sys/socket.h syslog.h unistd.h])

# Checks for typedefs, structures, and compiler characteristics.

# Checks for library functions.
AC_CHECK_FUNCS([localtime_r memmove strcasecmp strchr strerror tzset])
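
Each name passed to AC_CHECK_FUNCS becomes a HAVE_ macro in config.h when the function is found, so the localtime_r check above would yield HAVE_LOCALTIME_R. A minimal sketch of code that uses the result follows; local_time() is a hypothetical wrapper, and the HAVE_CONFIG_H guard simply lets the fragment compile without a generated config.h.

```c
#ifdef HAVE_CONFIG_H
#  include "config.h"
#endif
#include <stddef.h>
#include <time.h>

/*
 * local_time (hypothetical wrapper): break *when into local time, storing
 * the result in *result. Uses the reentrant call when configure found it.
 * Returns result, or NULL on failure.
 */
struct tm *local_time(const time_t *when, struct tm *result)
{
#ifdef HAVE_LOCALTIME_R
    return localtime_r(when, result);
#else
    struct tm *tmp = localtime(when);   /* points at a shared static buffer */
    if (tmp == NULL)
        return NULL;
    *result = *tmp;                     /* copy out before it is overwritten */
    return result;
#endif
}
```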


As with all the feature definition generation techniques we’ve discussed, autoconf and automake have their advantages and disadvantages.

Listing Six: Makefile.am for the logger project

bin_PROGRAMS = logger
logger_SOURCES = logger.c
EXTRA_logger_SOURCES = snprintf.c strftime.c syslog.c
man_MANS = logger.1

It’s nice that autoconf and automake automatically generate feature testing scripts for a software package, allowing the end user to get by with knowing less about their system.

autoconf and automake also provide an easy, non-interactive, two step process to get from raw source to compiled executable.

Two of the largest disadvantages are the steep learning curve required to generate the configuration script, and the resulting scripts’ batch nature (which we also considered a big advantage above).

Go Forth and Configure

We’ve looked at four different methods of determining the features of a given platform. The current favorite mechanism for generating feature definitions is autoconf-generated configure scripts. However, an extremely popular package, namely Perl, uses a metaconfig-generated Configure script.

So, evaluate the different tools, and pick the one most appropriate for your package.

Eric Schnoebelen is a software developer with 20 years’ experience in both operating system and applications development. Eric can be reached at compiletime@cirr.com. Complete sample projects for metaconfig and autoconf can be found at ftp://ftp.cirr.com/pub/compiletime/2002-Dec and http://www.linux-mag.com/downloads/2002-12/compile.
