Building and Using Shared Libraries

Over the past several months, this column has shown you how to use gcc and g++ language extensions, how to link objects and functions, and how to build executables. We will continue this month with a discussion about a very specific type of object -- a shared (or dynamic) library -- and how to take advantage of it in your programs.

A shared library is an object that has been specifically built to be loaded at runtime by other applications. On Microsoft Windows, shared libraries are called DLLs (dynamic-link libraries). On Linux, run /sbin/ldconfig -v as root to see all the shared libraries that applications make use of day in and day out (/sbin/ldconfig -p prints the existing cache without rebuilding it and does not need root).

The main advantage of using shared libraries is the potential for code reuse they provide. An example is the library that contains the printf() function. It’s safe to say that many (if not all) Linux utilities use this function. If it were impossible to load libraries dynamically, that function would have to be linked into every program that needed it, duplicating that piece of code all over your system. Obviously, this would consume disk space and make updates much more difficult. By using shared libraries, the printf() function can be stored in one library that is still available to any program that needs to use it.

Let’s assume that the printf() function was found to have a huge security hole. Without the benefit of shared libraries, every program that uses printf() would need to be relinked with an updated, bug-free version of the library. However, with shared libraries, only one copy of the library needs to be updated.

This column will attempt to show you how to create shared libraries, load them into your programs, and call the functions in them.

Building a Shared Library

It’s easy to turn your own set of functions into a shared library. Believe it or not, it’s as simple as compiling it in a different way. Figure One shows a simple “Hello, World” function; all it does is print out a message. Now, let’s see how to compile and link this file (named b.c) so that you get a shared library:




Figure One: A “Hello, World” Shared Library Source File, ‘b.c’


#include <stdio.h>

void b_printer (void)
{
    printf ("Printing something from a shared library.\n");
}


gcc -fPIC -shared -o b.so b.c

As discussed in last month’s column, the -fPIC option to gcc tells the compiler to create code that is “position independent.” This means that the final machine code does not assume that it knows where in memory it resides. Any time the code performs a branch or jump, it will not be to an absolute address (e.g., “jump to the instruction at address 1234”), but rather to a relative address (e.g., “jump to the instruction eight words away from here”). The -shared option is actually a flag to the linker, not the compiler. It tells the linker to produce a shared object rather than an executable, so it is all right if there are unresolved function calls: the library will be loaded at runtime, and the unresolved calls will be taken care of at that point.

If you try to compile and link the example above without the -shared option (gcc -fPIC -o b.so b.c), the linker tries to build an ordinary executable and complains that there is no main() function. With the -shared option, the linker no longer expects a main() function.

Using the Library

Now that you have a shared library, how do you use it? The prototypes for the four main functions that allow you to do so are shown in Figure Two. All of these prototypes are found in <dlfcn.h>. These functions are all very straightforward. The steps for using a function located in a shared library are as follows:




Figure Two: Prototypes for the Shared Library Functions Found in dlfcn.h


void* dlopen (const char* filename, int flag);
char* dlerror (void);
void* dlsym (void* handle, const char* symbol);
int dlclose (void* handle);


  1. Open the shared library by using dlopen().

  2. Check that the open succeeded by using dlerror().

  3. Get the function that you want by using dlsym().

  4. Make sure that the function lookup succeeded by using dlerror().

  5. When done with the library, close it by using dlclose().

You must include -ldl on the gcc command line when compiling and linking programs that call these functions, placed after the source files that need it (gcc sample.c -ldl). This tells the linker to link in the library that contains these dl (dynamic library) functions, enabling your program to call the shared library’s functions at runtime. On recent glibc releases (2.34 and later) these functions live in the C library itself, so the flag is still accepted but no longer required.

Figure Three shows an example program that uses the above steps to open the newly created shared library and call the b_printer() function. The call to dlopen() takes two arguments: the name of the shared library to open and flags to indicate how to open the file. The RTLD_LAZY flag indicates that the linker should not resolve functions until they are needed. If you want all functions to be resolved before dlopen() returns, pass it the RTLD_NOW flag instead. If there is still an unresolved function at the time the library is loaded, using the RTLD_NOW flag will cause dlopen() to return an error. If the RTLD_LAZY flag had been used, dlopen() would have succeeded, and the problem would surface only when the unresolved function is first called. It’s usually best to use the RTLD_LAZY flag.




Figure Three: An Example Program, ‘sample.c’


#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

/* using the dl* functions to load and use a shared library */

int main (void)
{
    void* handle;
    void (*printer)(void);
    char* error;

    /* Open the shared library, 'b.so'. */
    handle = dlopen ("/home/chelf/linuxmag/0202/b.so", RTLD_LAZY);

    /* Check to see if there were any errors in opening the library. */
    error = dlerror ();
    if (error)
    {
        /* If so, print the error message and exit. */
        printf ("%s\n", error);
        exit (1);
    }

    /* Get the 'b_printer' function out of the library. */
    printer = dlsym (handle, "b_printer");

    /* Check to see if there were any errors in getting the function. */
    error = dlerror ();
    if (error)
    {
        /* If so, print the error message and exit. */
        printf ("%s\n", error);
        exit (1);
    }

    /* Call the function. */
    printer ();

    /* Close the shared library. */
    dlclose (handle);

    return 0;
}

Either of these flags may be bitwise ORed with the RTLD_GLOBAL flag to indicate that any of the functions from this library may be used to resolve missing functions from other shared libraries. Since this is our only one, we don’t need this flag; however, if you have many shared libraries that work together, you may choose to always use this flag.
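To make the flag choices above concrete, here is a minimal sketch of a helper that picks between RTLD_LAZY and RTLD_NOW and optionally ORs in RTLD_GLOBAL. The name open_library() and its two integer parameters are hypothetical, invented for this illustration; only dlopen(), dlerror(), and the RTLD_* flags come from <dlfcn.h>.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: open a shared library with the chosen binding mode.
 * resolve_now != 0 selects RTLD_NOW, otherwise RTLD_LAZY is used.
 * make_global != 0 adds RTLD_GLOBAL so that libraries loaded later
 * can resolve their missing functions against this one.
 * Returns the handle from dlopen(), or NULL after printing the error. */
void *open_library(const char *path, int resolve_now, int make_global)
{
    int flags = resolve_now ? RTLD_NOW : RTLD_LAZY;
    void *handle;

    if (make_global)
        flags |= RTLD_GLOBAL;

    handle = dlopen(path, flags);
    if (handle == NULL)
        fprintf(stderr, "dlopen failed: %s\n", dlerror());

    return handle;
}
```

A call such as open_library("./b.so", 1, 0) would ask for every function to be resolved up front, so a missing dependency shows up as a NULL handle at load time rather than as a failure later, when the function is first called.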

If the call to dlopen() succeeds, a handle to the library will be returned. Otherwise, the function will return NULL and the next call to dlerror() will return a string that describes the error. Once you call dlerror(), the error message is erased, so be sure to only call it once after each call to either dlopen() or dlsym(). (This is why it’s wise to save away the message to be printed out in the error variable.) If there is no error, dlerror() will return NULL.

Next, the program pulls out the exact function that you wish to call. It does this using the dlsym() function, which simply takes a handle to a shared library (the same as returned by dlopen()) as well as the symbol name (e.g., a function name) to look up. On success, it returns the address of the symbol that you looked up. The address, of course, is a virtual address, which ultimately maps to the appropriate part of the shared library. Because this happens at runtime, the operating system retrieves it the fastest way it knows how (cache, general memory, or disk file), all in the background. For more details on virtual memory allocation, see Compile Time from June and July 2001 (http://www.linux-mag.com/2001-06/compile_01.html and http://www.linux-mag.com/2001-07/compile_01.html).

If the symbol represents a function, the program may then simply call that function as shown near the end of the main() function. Again, you can use dlerror() to make sure that there was no error in the lookup of the symbol. Finally, you can call the function from the shared library and then use dlclose() to indicate that you are finished with the library, allowing the operating system to free up any memory allocated.

When you run this program, it does the following:


machine:~> ./a.out
Printing something from a shared library.
machine:~>
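The dlsym() and dlerror() steps described above can be made a little more defensive. The sketch below is one possible way to wrap them, not the only one: it clears any stale message with dlerror() before the lookup, so that a non-NULL message afterwards can only come from this dlsym() call. The names void_fn and lookup_function() are hypothetical and exist only for this example.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Function-pointer type for routines like b_printer(): no arguments, no result. */
typedef void (*void_fn)(void);

/* Hypothetical helper: look up `name` in an already opened library.
 * On success, returns the function pointer and sets *error to NULL.
 * On failure, returns NULL and sets *error to the dlerror() message. */
void_fn lookup_function(void *handle, const char *name, const char **error)
{
    void_fn fn = NULL;

    dlerror();                              /* discard any stale error message */
    *(void **)(&fn) = dlsym(handle, name);  /* store the address without a pointer cast warning */
    *error = dlerror();                     /* NULL means the lookup succeeded */

    return *error ? NULL : fn;
}
```

With a handle obtained from dlopen(), a caller would write something like printer = lookup_function(handle, "b_printer", &error); and call printer() only if error is NULL.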

Predefined Special Functions

Sometimes, you may need to perform some work in your shared library before any of the functions in it are called. Similarly, you might also have some clean-up work to perform after a given program has finished calling the functions in your library. The special functions _init() and _fini() provide you with this functionality. Any code found in the _init() function will be run before the return of dlopen(), and any code found in the _fini() function will be run once the library has been closed. Figure Four shows the b.c shared library modified to include the _init() and _fini() functions. However, if you compile and link this file the same way that you did the original version of the shared library, as shown in the following:


gcc -fPIC -shared -o b.so b.c

you get the following errors:


/tmp/ccBTfFgm.o: In function ‘_init’:
/tmp/ccBTfFgm.o(.text+0x4): multiple definition of ‘_init’
/usr/lib/crti.o(.init+0x0): first defined here
/tmp/ccBTfFgm.o: In function ‘_fini’:
/tmp/ccBTfFgm.o(.text+0x60): multiple definition of ‘_fini’
/usr/lib/crti.o(.fini+0x0): first defined here
collect2: ld returned 1 exit status




Figure Four: b.c with the _init and _fini Functions


#include <stdio.h>

void _init (void)
{
    printf ("init print.\n");
}

void b_printer (void)
{
    printf ("Printing something from a shared library.\n");
}

void _fini (void)
{
    printf ("fini print.\n");
}

It turns out that gcc added those functions for you automatically when it linked your library: the startup file /usr/lib/crti.o named in the error messages already defines _init and _fini. In fact, gcc adds a lot of extra functions when it links, most of which only executables need. How do you keep it from adding these extra functions when you don’t need the entire functionality of the executable? You tell gcc to generate only an object file with the -c flag, and then run the linker yourself. As discussed in the February 2002 Compile Time (http://www.linux-mag.com/2002-02/compile_01.html), the -c flag will generate machine code for the shared library but won’t link it with any other code.

Instead, the compiler simply generates a relocatable object file that will be linked later (in this case, dynamically). Note that relocatable and position-independent are different concepts. An object file can be relocatable without being position-independent. Relocatable simply means that a linker could take the object file and successfully link it with other object files. Let’s take a look at how that changes our compilation/link process:


gcc -fPIC -c b.c

This will generate a file, b.o, which you can then link with:


ld -shared -o b.so b.o

Now there are no link errors. You might have also noticed that the size of this new shared library is much smaller than the original. This is because you haven’t allowed gcc to put in all of the extra functions necessary for executables as you would if you had used gcc -shared -o b.so b.o as the final step.

After you compile and link the new version of the b.so shared library, if you run your original program again, the output changes to:

init print.
Printing something from a shared library.
fini print.

You didn’t even have to recompile or relink the executable. Since it was programmed to use the shared library dynamically, the library could be updated without having to recompile the programs that use it.
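The dlopen() manual page on current Linux systems describes _init() and _fini() as obsolete and recommends gcc’s constructor and destructor function attributes instead. As a hedged sketch of that alternative, b.c might be rewritten as follows. The names b_setup(), b_teardown(), b_initialized, and b_is_initialized() are hypothetical; the flag and its accessor were added only so that the effect of the constructor can be observed.

```c
#include <stdio.h>

/* Hypothetical flag, added only so the constructor's effect is observable. */
static int b_initialized = 0;

/* Runs when the library is loaded, before dlopen() returns. */
__attribute__((constructor))
static void b_setup(void)
{
    b_initialized = 1;
    printf("init print.\n");
}

/* Runs when the library is unloaded, or at program exit. */
__attribute__((destructor))
static void b_teardown(void)
{
    printf("fini print.\n");
}

void b_printer(void)
{
    printf("Printing something from a shared library.\n");
}

/* Hypothetical accessor: returns 1 once b_setup() has run. */
int b_is_initialized(void)
{
    return b_initialized;
}
```

Because this version does not define _init or _fini itself, it should not collide with the definitions in crti.o, so the original one-step command (gcc -fPIC -shared -o b.so b.c) ought to link it without the separate ld step.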

Debugging

There are a couple of tools that are worth mentioning when discussing shared libraries. Sometimes it can be hard to track down all the shared library dependencies that a file contains. The utility ldd can help you to do just that. It lists all the shared libraries the given program depends on to run, along with the file each one resolves to on your system.

Another useful tool is nm. This program lists all of the symbols from an object file. Figure Five shows the output of nm when run on the shared library b.so. You’ll see that nm displays a great deal of information (certainly enough to write an entire article about), but the interesting pieces are the lines that have a T or a U in the second column. T means that the symbol is defined in this object file (_fini(), _init(), and b_printer() are the three functions in your shared library), and U means the symbol is undefined (printf() is undefined because it will be linked in from another shared library).




Figure Five: The ‘nm’ Utility


machine:~> nm b.so

00001448 A _DYNAMIC
00001438 A _GLOBAL_OFFSET_TABLE_
000014a8 A __bss_start
000014a8 A _edata
000014a8 A _end
000003a0 T _fini
00000344 T _init
00000370 T b_printer
00000340 t gcc2_compiled.
         U printf

It can be very useful to see exactly which functions are defined and undefined when you end up having many shared libraries that are each calling functions from other shared libraries.

It’s Nice to Share

Hopefully, you now know enough about shared libraries to start using them throughout your code. Keep in mind that any significant piece of code that will be needed by many different applications might be a good candidate for becoming a shared library.

As they say in kindergarten, “have fun sharing,” and, as always, happy hacking!



Benjamin Chelf is an author and engineer at CodeSourcery. He can be reached at chelf@codesourcery.com.
