Let There Be Jabber

Jabber is about more than Instant Messaging and chat.

Instant Messaging (IM) software has mushroomed into one of the Internet’s latest “killer apps,” but most IM software is produced by companies like AOL, Microsoft, and Yahoo. Each of those companies would like us to use their IM client software to chat with users on their network — and only their network.

For a while, open source hackers worked to reverse engineer the proprietary protocols of the commercial IM world so they could build their own clients. However, a few IM fans chose not to engage in this perpetual catch-up game and began to work on their own IM system — one which could interoperate with other IM systems. The project became known as Jabber.

While Jabber is often contrasted with proprietary instant messenger services such as AOL’s ICQ and AIM or Microsoft’s MSN Messenger, it differs from those consumer-oriented services in several crucial respects.

For one, Jabber has captured the attention of system administrators and security-conscious individuals everywhere, because anyone can run their own Jabber server, determine who can have accounts on that server, and even control whether or not the server is connected to the Internet. Combined with SSL encryption of client-server connections and end-to-end encryption of messages using PGP or GnuPG, Jabber gives organizations and individuals full control over instant messaging – something that’s not possible when all your traffic goes through the central servers of a large corporation based in Redmond, Washington or Reston, Virginia.

However, Jabber is about more than just Open Source IM. One of the reasons Jabber has begun to attract serious developer interest is that it can be used to send structured data between any two entities that can connect to the Net, whether those entities are people or applications. And because Jabber’s protocol is both open and fully based on XML, Jabber can be easily extended in ways that other IM systems can only dream about.

At a philosophical level, Jabber is quite the opposite of the closed source and closed networks of the proprietary IM systems. Just as Apache is committed to freedom of publication, Jabber is committed to freedom of conversation — the freedom of people and applications to communicate regardless of medium, network, platform, or device, and without being locked into anyone’s proprietary “extensions” to common Internet protocols.

XML in Jabber

Unfortunately there are no common standards for near-real-time messaging, nor does it look as if any will be developed within the next few years. While Apache has HTTP, there is no standard protocol to which Jabber can adhere. In the absence of such a standard, the Jabber team has developed a straightforward protocol for messaging and presence. (We’ll talk more about these in a moment.) What’s more, the team made that protocol openly available on the Internet, along with a reference Jabber server implementation (written in C and runnable on Linux and various other Unices) as well as Jabber clients for all major operating systems (including several Linux/Unix clients, such as the Perl/Tk client Jarl and the GNOME client Gabber).

The protocol developed by the Jabber community is, as mentioned, 100 percent XML. Those new to Jabber often wonder why the team chose such a “heavy” format for its protocol. But the fact that XML (suitably documented, of course) is human-readable and openly-understandable means that XML is fully in line with Jabber’s mission of free and open communication. And the base Jabber protocol is quite simple, since it consists of three basic elements: <message/> for sending messages, <presence/> for knowing when people and applications are online, and <iq/> (info/query) for searching, registering, authenticating, and the like.
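To make the three element types concrete, here is a minimal sketch in Python (chosen only because its standard library can parse XML). The `describe` helper, the one-line summaries, and the sample stanzas are illustrative assumptions and are not part of any Jabber library.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-line summaries for the three base elements.
KINDS = {
    "message": "carries a message",
    "presence": "announces availability",
    "iq": "info/query request or response",
}

def describe(stanza_xml):
    """Return (tag, summary) for one top-level Jabber element."""
    root = ET.fromstring(stanza_xml)
    # Drop any '{namespace}' prefix that ElementTree puts on the tag.
    tag = root.tag.rsplit("}", 1)[-1]
    if tag not in KINDS:
        raise ValueError("not a base Jabber element: " + tag)
    return tag, KINDS[tag]

if __name__ == "__main__":
    for sample in (
        "<message to='romeo@montague.net'><body>Hi</body></message>",
        "<presence><show>away</show></presence>",
        "<iq type='get' id='A0'><query xmlns='jabber:iq:auth'/></iq>",
    ):
        print(describe(sample))
```

Everything more specialized rides inside one of these three elements, usually as a child element in its own namespace.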

Furthermore, the base Jabber protocol is highly extensible through the addition of XML namespaces, which means that if you want to enable people or applications to communicate in a particular way or about a specialized domain such as customer service or remote instrument monitoring, you’re free to do so. (If you’re not familiar with XML namespaces, check out our Introduction to XML article located on the Web at http://www.linux-mag.com/2001-07/xml_basics_01.html.)

There are two key concepts that make the near-real-time exchange of XML-based messages a reality within Jabber. One is presence — the ability to know when another user or application is online and ready to engage in conversation. Obviously you don’t want just anyone to know when you’re online, so in Jabber you have full control over that knowledge; in order for someone to know when you are online, you must explicitly allow them to subscribe to your presence, which you can choose to revoke at any time. However, once you and a friend can know when each other is online, you will be able to engage in nearly instantaneous communication over the net.

At a more technical level, the other key to near-real-time communication in Jabber is the concept of XML streams. While XML was born in the documentation community as a slimmed down version of SGML (or, depending on how you look at it, as a smarter version of HTML), Jabber sends not static XML documents but streams of XML data between clients and servers, and by extension between servers. These data streams are the technical counterparts of the conversations that people and applications engage in through Jabber.

Before delving further into Jabber’s XML protocol, it might be helpful to understand a bit about its network architecture. In essence, Jabber is a client-server system quite similar architecturally to a rather successful messaging platform called e-mail. In fact, the similarity extends to Jabber’s addressing scheme, which follows the familiar user@domain.net format. So let’s say that Juliet wants to send the following message to Romeo:

<message from='juliet@capulet.com' to='romeo@montague.net'>
  <body>Wherefore art thou Romeo?</body>
</message>

FIGURE ONE: The route that an instant message from Juliet to Romeo would take.

What happens within the Jabber system to ensure that Romeo gets the message? First, Juliet connects to the Jabber server at capulet.com using the Jabber client that she has installed on the Linux laptop she uses out on the balcony. She types, “Wherefore art thou Romeo?” and her client transforms that into the correct Jabber XML, which it then sends to the server. The capulet.com server then opens a connection to the montague.net server (if a connection is not already open) and sends the XML across that connection. The Jabber server at montague.net then checks to see if Romeo is online and, if so, routes the message to the Jabber client running on the PDA Romeo has with him out in the Capulets’ orchard.
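The routing decision each server makes can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the reference server's code. The function names and the returned strings are assumptions, and the "store" branch reflects the server's ability to hold messages for users who are offline.

```python
def split_jid(jid):
    """Split 'user@domain/resource' into (user, domain, resource)."""
    bare, _, resource = jid.partition("/")
    user, at, domain = bare.partition("@")
    if not at:
        # No '@' present: treat the whole address as a server name.
        user, domain = "", bare
    return user, domain, resource

def next_hop(local_domain, to_jid, online_users):
    """Describe what the server for local_domain does with a message.

    online_users is the set of local user names with a connected client.
    """
    user, domain, _ = split_jid(to_jid)
    if domain != local_domain:
        return "forward to server " + domain
    if user in online_users:
        return "deliver to client of " + user
    return "store for later delivery"

if __name__ == "__main__":
    # Juliet's server hands the message to Romeo's server...
    print(next_hop("capulet.com", "romeo@montague.net", {"juliet"}))
    # ...which delivers it because Romeo is online.
    print(next_hop("montague.net", "romeo@montague.net", {"romeo"}))
```

The sender's client never needs to know where Romeo is connected; it only needs to reach its own server.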

A Jabber Session

Now that we have a sense of how Jabber works, let’s look more closely at the actual XML data that is exchanged in a typical Jabber session. In essence, a Jabber session is defined by two intertwining XML streams — one from the client to the server and a complementary one from the server to the client.

The first step is for a Jabber client or other Jabber-aware application to open a TCP connection to a Jabber server (in this example, a test server running at jabber.to) and initiate an XML stream:


The server then responds by opening an XML stream of its own so that it will be able to communicate with the client or application:
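As a rough illustration of this two-way handshake, the Python sketch below builds a client stream header and reads the attributes from a server's reply. It assumes the `jabber:client` and `http://etherx.jabber.org/streams` namespaces used by the Jabber protocol of this era. The helper names and the session id in the sample reply are made up for the example.

```python
import xml.etree.ElementTree as ET

# Stream namespace used by the classic Jabber protocol.
STREAM_NS = "http://etherx.jabber.org/streams"

def client_stream_header(server):
    """Build the opening tag a client sends right after connecting."""
    return ("<stream:stream to='%s' xmlns='jabber:client' "
            "xmlns:stream='%s'>" % (server, STREAM_NS))

def read_stream_header(header):
    """Return the attributes of a stream header as a dict."""
    # The header stays open for the life of the session, so append
    # the closing tag to obtain a document the parser will accept.
    root = ET.fromstring(header + "</stream:stream>")
    if root.tag != "{%s}stream" % STREAM_NS:
        raise ValueError("not a stream header")
    return dict(root.attrib)

if __name__ == "__main__":
    print(client_stream_header("jabber.to"))
    # Hypothetical server reply; the id value is invented.
    reply = ("<stream:stream from='jabber.to' id='1A2B3C' "
             "xmlns='jabber:client' xmlns:stream='%s'>" % STREAM_NS)
    print(read_stream_header(reply))
```

Neither side closes its `<stream:stream>` element until the session ends, which is why the helper has to add the closing tag before parsing.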


Jabber servers won’t indefinitely keep these streams and their associated TCP sockets open, however, since that would quickly become highly inefficient. In particular, a client must authenticate with the server within a certain amount of time (set in the server configuration file) or the server will close the connection.

Jabber uses several different methods of authentication, including plaintext passwords (discouraged for obvious reasons), encrypted passwords using the SHA1 hash algorithm, and zero-knowledge authentication. In order for a client to know which methods are allowable on this server, it must query the server, which it does using an info/query element containing a query in the jabber:iq:auth namespace:

<iq id='A0' type='get'>
  <query xmlns='jabber:iq:auth'>
  </query>
</iq>

The server then replies with information regarding which authentication methods are supported:

<iq id='A0' type='result'>
  <query xmlns='jabber:iq:auth'>
    <password/>              <!-- supports plaintext -->
    <digest/>                <!-- supports SHA1 digest -->
    <sequence/>              <!-- supports zero-knowledge -->
    <token>3A7F8004</token>  <!-- a token for 0k -->
  </query>
</iq>

Next, the client provides its authentication information, in this case an encrypted password:

<iq id='A1' type='set'>
  <query xmlns='jabber:iq:auth'>
  </query>
</iq>

The server then lets the client know that authentication was successful by responding with an iq packet of type ‘result’:

<iq id='A1' type='result'/>
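The digest step can be sketched in Python as follows. The sketch assumes the digest is the hexadecimal SHA1 hash of the stream id (sent by the server in its stream header) followed by the password, as the jabber:iq:auth protocol documents describe. The function names, user name, resource, stream id, and password are all invented for illustration.

```python
import hashlib
import xml.etree.ElementTree as ET

AUTH_NS = "jabber:iq:auth"

def offered_methods(reply_xml):
    """List the child element names in the server's auth reply."""
    query = ET.fromstring(reply_xml).find("{%s}query" % AUTH_NS)
    return [child.tag.rsplit("}", 1)[-1] for child in query]

def digest_password(stream_id, password):
    """Hex SHA1 of the stream id followed by the password (assumed rule)."""
    return hashlib.sha1((stream_id + password).encode("utf-8")).hexdigest()

def auth_request(iq_id, username, resource, stream_id, password):
    """Build an <iq type='set'> carrying a digest instead of the password."""
    iq = ET.Element("iq", {"id": iq_id, "type": "set"})
    # Writing xmlns as a plain attribute makes the serialized children
    # inherit the jabber:iq:auth namespace.
    query = ET.SubElement(iq, "query", {"xmlns": AUTH_NS})
    ET.SubElement(query, "username").text = username
    ET.SubElement(query, "digest").text = digest_password(stream_id, password)
    ET.SubElement(query, "resource").text = resource
    return ET.tostring(iq, encoding="unicode")

if __name__ == "__main__":
    # All values below are hypothetical.
    print(auth_request("A1", "juliet", "balcony", "1A2B3C", "secret"))
```

Because the stream id changes with every session, the digest is different each time even though the password stays the same.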

Now that the complexity of starting a Jabber session has been taken care of, the Jabber user can chat with friends, join conference rooms, search user directories, go off to lunch and change his online status to ‘away’, add people to his contact list (called a “roster” in Jabber), and so on. All of these activities use the three basic Jabber XML elements of <message/>, <presence/>, and <iq/> in various combinations and with various namespaces attached (all of these features are captured in the protocol documents located at http://docs.jabber.org). When you want to log off, your Jabber client will close the XML stream it opened (by sending </stream:stream>) and the server will close the TCP connection.

Building an Application

Fortunately, you don’t have to know all the details of Jabber’s XML format in order to put Jabber to work for you. Thanks to some of the core Jabber contributors, several Jabber code libraries exist to handle almost all of the low-level functionality, such as opening a connection to the server and creating the XML that the server expects. For a full list of Jabber programming libraries, see http://www.jabber.org/?oid=71/. You’ll find C/C++, Java, and Perl code which you can use to write your own Jabber client or to write more specialized applications.


Because surveyor is a bot (slang for “robot” in the on-line chat world), it will in essence masquerade as a Jabber client, so we can use Net::Jabber’s client-related facilities as we build it. The first step is to create a client, connect to the server, and authenticate (making use of an account that we created with a regular Jabber client):

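# SERVER, PORT, USER, PASSWORD, and RESOURCE are assumed to be
# constants defined earlier in the program
my $connection = Net::Jabber::Client->new();

$connection->Connect( "hostname" => SERVER,
                      "port"     => PORT )
    or die "Cannot connect ($!)\n";

my @result = $connection->AuthSend( "username" => USER,
                                    "password" => PASSWORD,
                                    "resource" => RESOURCE );

# AuthSend returns a status ("ok" on success) followed by a message
die "Authentication failed: $result[0] - $result[1]\n"
    unless $result[0] eq "ok";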

Next, we will set up some callbacks for handling messages and presence packets. We will also set some up to perform several of the standard functions of a Jabber client, such as getting our contact list and sending presence to the server:

$connection->SetCallBacks( "presence" => \&handle_presence );
$connection->SetCallBacks( "message" => \&handle_message );
while (defined($connection->Process())) { }

The last while statement means that as long as we’re connected to the server, the program will continue handling any messages and presence packets that we happen to receive. In other words, we want the bot to be “always on.”

Since the bot won’t talk with anyone who isn’t subscribed to it (it was taught not to chat with strangers), it needs a way to handle subscription requests from Jabber users. The handle_presence() method will do this:

sub handle_presence {
    my $presence = new Net::Jabber::Presence(@_);
    my $jid = $presence->GetFrom();
    my $show = $presence->GetShow();
    my $type = $presence->GetType();

    # we need to remove any resource suffix from the Jabber ID
    $jid =~ s!\/.*$!!;

    if ($type eq "subscribe") {
        log3("$jid requests subscription");
        # accept the request, then ask to subscribe to the requester in turn
        $connection->Send($presence->Reply(type => 'subscribed'));
        $connection->Send($presence->Reply(type => 'subscribe'));
    }

    # etc.
}

The method that does the heavy lifting in our surveys, however, is handle_message() (in fact it probably does too much, but the code will most likely be divided up in a more efficient fashion when it is “refactored” in the future). Our bot keeps track of who is talking with it and where they are in the survey process through a hash (%action) that functions as a primitive state machine; users move from (1) merely being subscribed to (2) starting a session by saying “hello” to the bot to (3) requesting to complete a survey to (4) actively completing a specific survey. Figures Two, Three, and Four illustrate the interaction between a human user and surveyor.

While users are completing a survey, we keep track of which questions they have answered by utilizing the %question hash, collecting all of the users’ answers in a multidimensional %answer hash and prompting them by sending the next question in the survey. Finally, once a user has answered the last question, we send them a thank-you message and reset the value for their Jabber ID in the %action hash back to subscribed.

Behind the scenes, we save this user’s answers, record the fact that they have completed this survey, and disallow them from filling it out again.
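The state machine described above can be sketched compactly in Python. This is not surveyor's actual code. The class, the state names, the "survey" command, the reply texts, and the sample survey are assumptions made for illustration. Only the four stages and the roles of the %action, %question, and %answer hashes come from the description.

```python
SUBSCRIBED, GREETED, CHOOSING, ANSWERING = (
    "subscribed", "greeted", "choosing", "answering")

# Hypothetical survey data: name -> list of questions.
SURVEYS = {"lunch": ["Pizza or salad?", "Noon or one o'clock?"]}

class SurveyBot:
    def __init__(self, surveys):
        self.surveys = surveys
        self.action = {}        # JID -> current state (like %action)
        self.question = {}      # JID -> (survey, next index) (like %question)
        self.answer = {}        # JID -> survey -> answers (like %answer)
        self.completed = set()  # (JID, survey) pairs already filled out

    def subscribe(self, jid):
        self.action[jid] = SUBSCRIBED

    def handle_message(self, jid, body):
        """Return the bot's reply, or None for unsubscribed senders."""
        state = self.action.get(jid)
        text = body.strip()
        if state is None:
            return None  # the bot does not chat with strangers
        if state == SUBSCRIBED:
            if text.lower() == "hello":
                self.action[jid] = GREETED
                return "Hello! Say 'survey' to see the available surveys."
            return "Say 'hello' to begin."
        if state == GREETED:
            if text.lower() == "survey":
                self.action[jid] = CHOOSING
                return "Available surveys: " + ", ".join(sorted(self.surveys))
            return "Say 'survey' to see the available surveys."
        if state == CHOOSING:
            name = text.lower()
            if name not in self.surveys:
                return "No such survey."
            if (jid, name) in self.completed:
                return "You have already completed that survey."
            self.action[jid] = ANSWERING
            self.question[jid] = (name, 0)
            self.answer.setdefault(jid, {})[name] = []
            return self.surveys[name][0]
        # ANSWERING: record the answer, then ask the next question or finish.
        name, index = self.question[jid]
        self.answer[jid][name].append(text)
        index += 1
        if index < len(self.surveys[name]):
            self.question[jid] = (name, index)
            return self.surveys[name][index]
        self.completed.add((jid, name))
        del self.question[jid]
        self.action[jid] = SUBSCRIBED
        return "Thank you!"
```

Keeping the state in plain dictionaries keyed by Jabber ID mirrors the hash-based approach and makes it easy to see where a more elaborate state machine would plug in.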

The result is a simple but fairly effective method for managing the user’s conversation with our survey bot. All we’re really doing here is creating different replies based on convoluted if-then logic and then sending them to the person we got the message from using one of the methods provided by the Net::Jabber modules:

$msg = Net::Jabber::Message->new();
$msg->SetMessage( "to"   => $from,
                  "body" => $sendbody );
$connection->Send($msg);

Of course you could extend this example in various ways. As mentioned, you could enable people to create surveys using this tool (though you might want to prompt users for a password to avoid spam). That way, every time someone creates a survey, it would send a message to subscribers telling them about the new survey. Integrating the survey results with a Web page or e-mail list might be handy too. In addition, it would be good to be able to capture more complicated questions and to build in a more complex state machine so that if, say, a person answers “yes” to question #2, we can present some in-depth follow-up questions.

The Future of Jabber

As we’ve seen with surveyor, Jabber enables you to develop interactive applications quite easily. Other simple applications that developers have created with Jabber include:

  • being informed when a certain file is modified on the filesystem or in CVS, or when particular events occur in an error log
  • receiving notification when a new e-mail message comes in on a certain thread or from a certain person
  • getting headlines sent to you when a news story is posted and the accompanying RSS file is updated

However, these small-scale applications are just the proverbial tip of the iceberg. Some of the most interesting extensions of Jabber involve using its generic XML routing capabilities to deliver data between two systems or applications, without a human involved.

An excellent example is the monitoring of remote scientific instruments; one such system involved instruments connecting to a Jabber server periodically, sending their latest information to a central data-gathering application via a custom namespace embedded within the <message/> element, then immediately logging off. Indeed, because the Jabber server can store messages for future delivery if a certain Jabber ID is not currently online, such a system will work even if the central application is down for some reason.
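As an illustration of a custom namespace carried inside `<message/>`, here is a hedged Python sketch. The namespace URI, the `reading` and `value` element names, and the Jabber IDs are hypothetical; the real monitoring system's format is not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for instrument data.
READING_NS = "http://example.org/instrument-reading"

def reading_message(sender, collector, values):
    """Wrap a dict of measurements in a <message/> element."""
    msg = ET.Element("message", {"from": sender, "to": collector})
    # Writing xmlns as a plain attribute puts the payload, and its
    # children, in the custom namespace when serialized.
    reading = ET.SubElement(msg, "reading", {"xmlns": READING_NS})
    for name, value in sorted(values.items()):
        ET.SubElement(reading, "value", {"name": name}).text = str(value)
    return ET.tostring(msg, encoding="unicode")

def extract_reading(message_xml):
    """Return the measurements as a dict of strings, or None if absent."""
    msg = ET.fromstring(message_xml)
    reading = msg.find("{%s}reading" % READING_NS)
    if reading is None:
        return None
    return {v.get("name"): v.text
            for v in reading.findall("{%s}value" % READING_NS)}

if __name__ == "__main__":
    xml = reading_message("probe1@lab.example", "collector@lab.example",
                          {"temp": 21.5, "rh": 40})
    print(xml)
    print(extract_reading(xml))
```

A server that knows nothing about the payload can still route the message, because routing depends only on the `to` address of the enclosing element.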

So, while Jabber has its roots in Instant Messaging, it is about much more than just Internet chat. With gateways being added in Jabber to just about every imaginable method of message delivery (from IRC to various IM systems to cell phones), the Jabber community is working hard to ensure freedom of conversation among people and applications across the Internet.

The following Web sites contain helpful information for those interested in exploring Jabber in more depth:


  • Home of the Jabber Foundation, including a developers’ community and extensive documentation
  • A news site devoted to Jabber (includes links to Jabber clients)
  • Ryan Eatmon’s Net::Jabber Perl modules
  • Some helpful programming examples created by D.J. Adams
  • The full source code to surveyor

Peter Saint-Andre is a systems analyst for Jabber.com and chief evangelist for the Jabber.org open-source project.
