Going Native

Java is great for solving many kinds of problems, but it isn't the first programming language that comes to mind when application performance is a critical issue. Sure, you can try to work around this with more powerful hardware, but at some point, a program written in Java just isn't going to run any faster.

One way to speed things up is to recode parts of your application into C, C++, or even assembly language and then call these functions via JNI, the Java Native Interface (http://java.sun.com/j2se/1.3/docs/guide/jni). However, this would take costly development time and would jeopardize the portability that is Java’s hallmark.

In a way, the design of Java itself works against you. The Java compiler javac does not generate truly native instructions like gcc and g++ do for C and C++. Instead, the Java compiler generates bytecodes, which must then be converted into native instructions by the Just In Time (JIT) compiler of the Java Virtual Machine (JVM). This step takes time that native executables do not require.

So is there a way to “go native” with Java and to compile your Java source code into a native executable? Yes, there is. gcj, which is a part of the GNU Compiler Collection (GCC), is a direct-to-native compiler for Java that takes Java source code as input and produces machine-native binaries.

gcj comes with a whole bunch of support utilities, most of which are described in Table One; in addition, lots of information on gcj itself is available on the Web at http://gcc.gnu.org/java. But for now, let’s just dive in and see how we can put gcj to work.

Table One: gcj Components and Support Applications

gcj consists of a frontend to the GNU Compiler Collection (GCC), two shared libraries that it always links applications against, and some additional applications. The current version of gcj is 3.0.2, which supports the Java Language Specification version 1.1. You can run gcj -dumpversion to print the version number.

Application  Description
gij          A front-end to the Java interpreter that can be used to run a traditional .class file, or even a shared library converted by gcj as long as it has a main() method.
gcjh         A program that reads .class files and generates C++ header files and/or “stub” files for use by JNI (Java Native Interface) or CNI (Cygnus Native Interface) programs. For more information about CNI, see the Cygnus Native Interface sidebar.
jar          A program to manipulate jar files.
jcf-dump     A program that displays the contents of a .class file in human-readable form.
jv-scan      A program that scans a .java source file for information such as a list of the classes defined in the file or the class that has a main() method.
jvgenmain    A program that adds a main() method to a Java class. This method is needed so that gij can execute the class.

Library      Description
libgcj       A shared library that contains all core Java classes. gcj automatically links applications with this library.
libgcjgc     A shared library that contains the default Java garbage collector. gcj automatically links applications with this library.

The Cygnus Native Interface

The Cygnus Native Interface (CNI) is a convenient way to write native Java methods in C++. It’s similar to the Java Native Interface (JNI) in that it provides a way to hook native code into your Java applications, but it’s easier to use and more efficient.

CNI uses C++ namespaces to implement Java packages, leading to a relatively intuitive way to refer to a specific class. For example, the Java class java.lang.String maps to java::lang::String in C++.

While CNI does make it easier to write native Java methods, it has the major drawback of not being portable. If you want to use CNI, you must use a C++ compiler that understands the CNI interface (such as g++).

Building a Simple Application

To get an idea of what you can do with gcj, let’s start by creating a simple “Hello World” program. You can take the Java code in Listing One and turn it into a native executable by running the following command:

Listing One: HelloWorld.java

public class HelloWorld {
    public static void main( String[] args ) {
        System.out.println( "Hello, world!" );
    }
}

gcj --main=HelloWorld -o HelloWorld HelloWorld.java

Because gcj is a frontend to gcc, it shares all the options of gcc, including the -o flag that specifies the name of the resulting native executable.

However, there are a few new flags that deal with Java-specific issues. One of them is the --main flag. In C and C++, the main() function is considered to be the entry point for the application, and it’s an error to have more than one main() function defined.

Java, however, does not have such restrictions; any Java class can have a method called main(), and that method is not considered special. The --main flag is needed to tell gcj which class’s main() method should be the entry point for the application.
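To make the point concrete, here is a minimal sketch of one source file holding two classes that each define main(). The class names Greeter and Farewell are invented for illustration and do not come from any real project.

```java
// Hypothetical example: two classes, each with its own main().
// Nothing in the language marks either one as "the" entry point.
class Greeter {
    static String greeting() {
        return "Hello from Greeter";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}

class Farewell {
    static String farewell() {
        return "Goodbye from Farewell";
    }

    public static void main(String[] args) {
        System.out.println(farewell());
    }
}
```

With a JVM you choose the entry point when you launch the program, by naming Greeter or Farewell on the java command line. A native executable has to have that choice fixed at link time, which is the job the --main flag does for gcj.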

What About Libraries?

You might ask why we started with such a simple and contrived example. The reason is that a more complicated program probably would not have compiled. If you’re a programmer who believes in modularity and code reuse, your critical application probably uses Java Archive (jar) files. As you know, jar files are collections of .class files that contain the Java bytecode for the corresponding .java file.

Whenever a Java program calls a method in a class that’s stored in a jar file, the Java VM automatically opens and loads it. But a Java application that’s been compiled with gcj can’t open a jar file. It can only work with shared libraries (.so files).

Thus when compiling a Java application into a native executable, all the jar files it uses must be available as shared libraries. The regular Java classes are in the libgcj library (as mentioned in Table One) but if your application relies on a jar file you’ve created, you’re going to have to convert it into a shared library, and gcj can do this too. (Although see the Going Half-Native sidebar for an example of when you don’t need to do this.)
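Before converting a jar file, it can help to see exactly which classes it holds, since every one of them has to end up in the shared library. The following is a rough sketch using the standard java.util.jar API on a current JDK; the class name JarClassLister is made up for this example, and the code is not part of gcj.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: lists the .class entries stored in a jar file.
class JarClassLister {
    // Returns the names of all .class entries, in the order the archive reports them.
    static List<String> classEntries(File jar) {
        List<String> names = new ArrayList<>();
        try (JarFile jf = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jf.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    names.add(name);
                }
            }
        } catch (IOException e) {
            // Wrap the checked exception so callers need no throws clause.
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: JarClassLister <path-to-jar>");
            return;
        }
        for (String name : classEntries(new File(args[0]))) {
            System.out.println(name);
        }
    }
}
```

Run against one of your own jar files, this prints one line per class, which gives a quick inventory of what the converted shared library will need to provide.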

Going Half-Native

This feature only works on Linux running on an x86 box, but it’s pretty neat. If an application compiled with gcj attempts to use a Java class that hasn’t been compiled into the application or linked against as a converted shared library, the libgcj library will search your CLASSPATH for the appropriate .class file and load it via the built-in bytecode interpreter, just like the Java VM does. This can help when converting your Java application into a native executable, since you needn’t identify all the jar file dependencies and convert them into shared libraries before using gcj.

In order for this functionality to work, the libgcj library must have been compiled with the --enable-interpreter option.
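One common way a program ends up needing a class that was never compiled or linked in is by naming it only at run time. The sketch below shows that pattern with the standard reflection API on a current JDK. The class name LateBound is hypothetical, and whether a particular gcj build can satisfy such a request through its interpreter depends on how libgcj was configured, as described above.

```java
// Hypothetical example: the class to use is known only as a string at run
// time, so a compiler cannot resolve or link it ahead of time.
class LateBound {
    // Loads the named class and creates an instance with its no-argument constructor.
    static Object loadByName(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load " + className, e);
        }
    }

    public static void main(String[] args) {
        // Falls back to a core class so the sketch runs without arguments.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        Object instance = loadByName(name);
        System.out.println("Loaded " + instance.getClass().getName());
    }
}
```

Run without arguments, it prints "Loaded java.util.ArrayList". Passing the name of a class that exists only as a .class file on your CLASSPATH exercises the run-time lookup path.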

So if your critical Java application depends on some jar files you have lying around, let’s begin converting them into shared libraries. If you don’t have a jar file handy, you might want to convert the jar file that comes from the Collections subproject of the Jakarta Commons (http://jakarta.apache.org/commons/collections.html). This is a set of Java classes that provide interfaces to different kinds of collections.

To convert your jar file into a shared library, you can run the command below. Notice that you use the same command-line options as you would when converting an object module (.o file) or archive library (.a file) into a shared library. Don’t forget that the shared library’s name should start with “lib”.

gcj -shared -Wall -o

If you’re converting the Collections jar file into a shared library, you’d use the following command instead (the choice of libccjava is arbitrary and stands for Java Commons Collections):

gcj -shared -Wall -o

Putting it Together

Now that we’ve converted the jar files into shared libraries, we can link them into an executable and use gcj to “go native” with our Java application.

If you converted the Collections jar file into a shared library, you can use the sample program in Listing Two that is taken from the Collections tutorial at http://java.sun.com/docs/books/tutorial/collections/interfaces/set.html/.

Listing Two: FindDups.java

import java.util.*;

public class FindDups {
    public static void main(String args[]) {
        Set s = new HashSet();
        for (int i=0; i<args.length; i++)
            if (!s.add(args[i]))
                System.out.println("Duplicate detected: "+args[i]);

        System.out.println(s.size()+" distinct words detected: "+s);
    }
}

To compile the FindDups.java application into a native executable, use the following command:

gcj -o FindDups -Wall -L.

The --CLASSPATH option is new; it specifies the location of the original jar file, which gcj needs to compile the FindDups application. We’ve covered the --main flag previously, and the familiar -L and -l options locate the converted shared library libccjava.so.

We can now run the native executable FindDups and give it a list of words as arguments. The program prints each duplicated word as it is detected, followed by the number of distinct words and the set of those words.
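If you want to check the behaviour before or after compiling, the variation below may help. It is a sketch, not the tutorial’s code: it applies the same Set.add() test as Listing Two but returns the duplicates instead of printing them, and it uses generics, so it assumes a current JDK rather than gcj 3.0.2. The class name FindDupsCheck and the sample words are chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical variation on Listing Two that collects duplicates for checking.
class FindDupsCheck {
    // Adds each word to 'seen'; any word already present is recorded as a duplicate.
    static List<String> duplicates(String[] words, Set<String> seen) {
        List<String> dups = new ArrayList<>();
        for (String word : words) {
            if (!seen.add(word)) {
                dups.add(word);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        String[] sample = {"i", "came", "i", "saw", "i", "left"};
        Set<String> seen = new HashSet<>();
        List<String> dups = duplicates(sample, seen);
        System.out.println("Duplicates: " + dups);          // prints "Duplicates: [i, i]"
        System.out.println(seen.size() + " distinct words"); // prints "4 distinct words"
    }
}
```

Given the same six words as arguments, the native FindDups executable should report "i" as a duplicate twice and count four distinct words. The order in which a HashSet lists its contents is unspecified, so the final set may print in any order.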

Gone Native

This has been a quick overview of gcj, but you should now be able to convert all your Java applications into native executables, thereby achieving maximum performance. A particularly astute reader will have noticed that the native executable can even be deployed on a platform that doesn’t have a JVM on it (though if any jar files have been converted, those shared libraries must be installed).

If you need to debug your new native executable, you can use GDB. Consult the Web at http://gcc.gnu.org/java/gdb.html for more info. See if gcj makes your critical applications perform better.

Glenn McAllister is a part-time committer on the Jakarta Ant project. He can be reached at glenn@somanetworks.com.
