Jabber is about more than Instant Messaging and chat. It is emerging as a
Instant Messaging (IM) software has mushroomed into one of the Internet’s latest “killer apps,” but most IM software is produced by companies like AOL, Microsoft, and Yahoo. Each of those companies would like us to use their IM client software to chat with users on their network — and only their network.
For a while, open source hackers worked to reverse engineer the proprietary protocols of the commercial IM world so they could build their own clients. However, a few IM fans chose not to engage in this perpetual catch-up game and began to work on their own IM system — one which could interoperate with other IM systems. The project became known as Jabber.
While Jabber is often contrasted with proprietary instant messenger services such as AOL’s ICQ and AIM or Microsoft’s MSN Messenger, it differs from those consumer-oriented services in several crucial respects.
For one, Jabber has captured the attention of system administrators and security-conscious individuals everywhere, because anyone can run their own Jabber server, determine who can have accounts on that server, and even control whether or not the server is connected to the Internet. Combined with SSL encryption of client-server connections and end-to-end encryption of messages using PGP or GnuPG, Jabber gives organizations and individuals full control over instant messaging – something that’s not possible when all your traffic goes through the central servers of a large corporation based in Redmond, Washington or Reston, Virginia.
However, Jabber is about more than just Open Source IM. One of the reasons Jabber has begun to attract serious developer interest is that it can be used to send structured data between any two entities that can connect to the Net, whether those entities are people or applications. And because Jabber’s protocol is both open and fully based on XML, Jabber can be easily extended in ways that other IM systems can only dream about.
At a philosophical level, Jabber is quite the opposite of the closed source and closed networks of the proprietary IM systems. Just as Apache is committed to freedom of publication, Jabber is committed to freedom of conversation — the freedom of people and applications to communicate regardless of medium, network, platform, or device, and without being locked into anyone’s proprietary “extensions” to common Internet protocols.
XML in Jabber
Unfortunately there are no common standards for near-real-time messaging, nor does it look as if any will be developed within the next few years. While Apache has HTTP, there is no standard protocol to which Jabber can adhere. In the absence of such a standard, the Jabber team has developed a straightforward protocol for messaging and presence. (We’ll talk more about these in a moment.) What’s more, the team made that protocol openly available on the Internet, along with a reference Jabber server implementation (written in C and runnable on Linux and various other Unices) as well as Jabber clients for all major operating systems (including several Linux/Unix clients, such as the Perl/Tk client Jarl and the GNOME client Gabber).
The protocol developed by the Jabber community is, as mentioned, 100 percent XML. Those new to Jabber often wonder why the team chose such a “heavy” format for its protocol. But the fact that XML (suitably documented, of course) is human-readable and openly-understandable means that XML is fully in line with Jabber’s mission of free and open communication. And the base Jabber protocol is quite simple, since it consists of three basic elements: <message/> for sending messages, <presence/> for knowing when people and applications are online, and <iq/> (info/query) for searching, registering, authenticating, and the like.
Furthermore, the base Jabber protocol is highly extensible through the addition of XML namespaces, which means that if you want to enable people or applications to communicate in a particular way or about a specialized domain such as customer service or remote instrument monitoring, you’re free to do so. (If you’re not familiar with XML namespaces, check out our Introduction to XML article located on the Web at http://www.linux-mag.com/2001-07/xml_basics_01.html.)
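To make the idea of namespace-based extension concrete, here is a minimal Python sketch of a message that carries an application-specific payload alongside its ordinary body. The namespace `urn:example:instrument`, the element names `reading` and `temperature`, and the address are all invented for illustration and are not part of the Jabber protocol.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for a remote instrument monitoring extension.
INSTRUMENT_NS = "urn:example:instrument"

def build_reading_message(recipient, celsius):
    """Build a <message/> carrying both a human-readable body and an
    extension element in its own XML namespace."""
    message = ET.Element("message", {"to": recipient})
    ET.SubElement(message, "body").text = f"Temperature is {celsius} C"
    # Declaring xmlns on the child puts it (and its children) in the
    # extension namespace; software that does not know it can skip it.
    reading = ET.SubElement(message, "reading", {"xmlns": INSTRUMENT_NS})
    ET.SubElement(reading, "temperature").text = str(celsius)
    return ET.tostring(message, encoding="unicode")

packet = build_reading_message("monitor@example.org", 21.5)
print(packet)
```

A receiver that understands the extension looks for the namespaced element, while an ordinary chat client can simply display the body.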
There are two key concepts that make the near-real-time exchange of XML-based messages a reality within Jabber. One is presence: the ability to know when another user or application is online and ready to engage in conversation. Obviously you don’t want just anyone to know when you’re online, so in Jabber you have full control over that knowledge; in order for someone to know when you are online, you must explicitly allow them to subscribe to your presence, and you can revoke that permission at any time. Once you and a friend can each see when the other is online, you will be able to engage in nearly instantaneous communication over the Net.
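The subscription handshake described above can be sketched as a handful of `<presence/>` packets. This is an illustrative Python sketch rather than code from any Jabber library; the addresses are hypothetical, and the `type` values shown are the subscription-related ones as I understand the Jabber protocol.

```python
import xml.etree.ElementTree as ET

# Subscription-related values for the type attribute of <presence/>.
SUBSCRIPTION_TYPES = {"subscribe", "subscribed", "unsubscribe", "unsubscribed"}

def subscription_packet(to, kind):
    """Return a <presence/> packet that manages a presence subscription."""
    if kind not in SUBSCRIPTION_TYPES:
        raise ValueError(f"unknown subscription type: {kind!r}")
    presence = ET.Element("presence", {"to": to, "type": kind})
    return ET.tostring(presence, encoding="unicode")

# A friend asks to see when you are online...
request = subscription_packet("you@example.org", "subscribe")
# ...you explicitly allow it...
approval = subscription_packet("friend@example.org", "subscribed")
# ...and you can revoke that permission later.
revocation = subscription_packet("friend@example.org", "unsubscribed")
```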
At a more technical level, the other key to near-real-time communication in Jabber is the concept of XML streams. While XML was born in the documentation community as a slimmed down version of SGML (or, depending on how you look at it, as a smarter version of HTML), Jabber sends not static XML documents but streams of XML data between clients and servers, and by extension between servers. These data streams are the technical counterparts of the conversations that people and applications engage in through Jabber.
Before delving further into Jabber’s XML protocol, it might be helpful to understand a bit about its network architecture. In essence, Jabber is a client-server system quite similar architecturally to a rather successful messaging platform called e-mail. In fact, the similarity extends to Jabber’s addressing scheme, which follows the familiar user@domain.net format. So let’s say that Juliet wants to send the following message to Romeo:
What happens within the Jabber system to ensure that Romeo gets the message? First, Juliet connects to the Jabber server at capulet.com using the Jabber client that she has installed on the Linux laptop she uses out on the balcony. She types, “Wherefore art thou Romeo?” and her client transforms that into the correct Jabber XML, which it then sends to the server. The capulet.com server then opens a connection to the montague.net server (if a connection is not already open) and sends the XML across that connection. The Jabber server at montague.net then checks to see if Romeo is online and, if so, routes the message to the Jabber client running on the PDA Romeo has with him out in the Capulets’ orchard.
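The routing walk-through above can be sketched in a few lines of Python. The addresses `juliet@capulet.com` and `romeo@montague.net` are inferred from the server names in the text, and the helper names are hypothetical; a real client library would handle this for you.

```python
import xml.etree.ElementTree as ET

def split_jid(jid):
    """Split a user@domain address; the domain names the server to route to."""
    user, _, domain = jid.partition("@")
    if not user or not domain:
        raise ValueError(f"not a user@domain address: {jid!r}")
    return user, domain

def build_message(sender, recipient, text):
    """Wrap typed text in the <message/> element a client sends to its server."""
    message = ET.Element("message", {"from": sender, "to": recipient})
    ET.SubElement(message, "body").text = text
    return ET.tostring(message, encoding="unicode")

xml_text = build_message(
    "juliet@capulet.com", "romeo@montague.net", "Wherefore art thou Romeo?"
)
# Juliet's server inspects the recipient's domain to decide where to forward.
_, next_hop = split_jid("romeo@montague.net")
print(next_hop, xml_text)
```

As with e-mail, only the domain part matters to the sending server; delivery to the right user is the job of the server at the far end.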
A Jabber Session
Now that we have a sense of how Jabber works, let’s look more closely at the actual XML data that is exchanged in a typical Jabber session. In essence, a Jabber session is defined by two intertwining XML streams — one from the client to the server and a complementary one from the server to the client.
The first step is for a Jabber client or other Jabber-aware application to open a TCP connection to a Jabber server (in this example, a test server running at jabber.to) and initiate an XML stream:
The server then responds by opening an XML stream of its own so that it will be able to communicate with the client or application:
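As a rough illustration of what opening a stream looks like on the wire, the Python sketch below builds the opening tag a client might send. The namespace values are the ones I understand the original Jabber protocol to use, and the function name is hypothetical. Note that the opening tag is deliberately left unclosed: every later packet in the session is a child of this one long-lived element.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

STREAM_NS = "http://etherx.jabber.org/streams"
CLOSE_STREAM = "</stream:stream>"

def open_stream(server):
    """Return the opening <stream:stream> tag addressed to a server."""
    return (
        "<stream:stream"
        f" to={quoteattr(server)}"
        " xmlns='jabber:client'"
        f" xmlns:stream='{STREAM_NS}'>"
    )

header = open_stream("jabber.to")
# Sanity check: header plus the closing tag forms one well-formed document,
# which is exactly what a finished session looks like in hindsight.
root = ET.fromstring(header + CLOSE_STREAM)
print(root.tag, root.get("to"))
```

The server's own opening tag has the same shape and, as far as I know, additionally carries an `id` attribute identifying the session.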
Jabber servers won’t indefinitely keep these streams and their associated TCP sockets open, however, since that would quickly become highly inefficient. In particular, a client must authenticate with the server within a certain amount of time (set in the server configuration file) or the server will close the connection.
Jabber supports several different methods of authentication, including plaintext passwords (discouraged for obvious reasons), hashed passwords using the SHA1 digest algorithm, and zero-knowledge authentication. In order for a client to know which methods are allowable on this server, it must query the server, which it does using an info/query element containing a query in the jabber:iq:auth namespace:
<iq id='A0' type='get'>
  <query xmlns='jabber:iq:auth'>
    <username>stpeter</username>
  </query>
</iq>
The server then replies with information regarding which authentication methods are supported:
<iq id='A0' type='result'>
  <query xmlns='jabber:iq:auth'>
    <username>stpeter</username>
    <password/> <!-- supports plaintext -->
    <digest/> <!-- supports SHA1 digest -->
    <sequence/> <!-- supports zero-knowledge -->
    <token>3A7F8004</token> <!-- a token for 0k -->
  </query>
</iq>
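A client has to turn that reply into a decision about how to log in. The following Python sketch shows one way to do so with the standard library's XML parser; the function name and the human-readable labels are hypothetical, while the element names come from the reply shown above.

```python
import xml.etree.ElementTree as ET

AUTH_NS = "jabber:iq:auth"

# The server reply from the listing above, without the annotations.
REPLY = """<iq id='A0' type='result'>
  <query xmlns='jabber:iq:auth'>
    <username>stpeter</username>
    <password/>
    <digest/>
    <sequence/>
    <token>3A7F8004</token>
  </query>
</iq>"""

def supported_methods(reply_xml):
    """List the authentication methods advertised in an iq result."""
    iq = ET.fromstring(reply_xml)
    query = iq.find(f"{{{AUTH_NS}}}query")
    if iq.get("type") != "result" or query is None:
        return []
    labels = {
        "password": "plaintext",
        "digest": "SHA1 digest",
        "sequence": "zero-knowledge",
    }
    # Child elements inherit the default namespace declared on <query/>.
    return [
        label
        for tag, label in labels.items()
        if query.find(f"{{{AUTH_NS}}}{tag}") is not None
    ]

methods = supported_methods(REPLY)
print(methods)
```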
Next, the client provides its authentication information, in this case a hashed password:
<iq id='A1' type='set'>
  <query xmlns='jabber:iq:auth'>
    <username>stpeter</username>
    <resource>Gabber</resource>
    <hash>61bc90893f75927906eac4337897413e1171</hash>
  </query>
</iq>
The server then lets the client know that authentication was successful by responding with an iq packet of type ‘result’:
<iq id='A1' type='result'/>
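For readers curious how a client might produce a hashed password, here is a hedged Python sketch of the SHA1 digest method. It assumes, as I understand the jabber:iq:auth scheme, that the digest is the hex-encoded SHA1 hash of the server's stream ID concatenated with the password, and that it travels in a `<digest/>` element (the name advertised in the server's reply). The stream ID and password below are made up, and the function names are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET

def auth_digest(stream_id, password):
    """Hex SHA1 of stream ID + password (assumed digest construction)."""
    return hashlib.sha1((stream_id + password).encode("utf-8")).hexdigest()

def build_auth_iq(username, resource, stream_id, password, iq_id="A1"):
    """Build an iq 'set' packet carrying the digest instead of the password."""
    iq = ET.Element("iq", {"id": iq_id, "type": "set"})
    query = ET.SubElement(iq, "query", {"xmlns": "jabber:iq:auth"})
    ET.SubElement(query, "username").text = username
    ET.SubElement(query, "resource").text = resource
    ET.SubElement(query, "digest").text = auth_digest(stream_id, password)
    return ET.tostring(iq, encoding="unicode")

# Hypothetical values: the stream ID would come from the server's stream header.
packet = build_auth_iq("stpeter", "Gabber", "3B5C7A21", "secret")
print(packet)
```

Because the stream ID changes with every session, a digest captured on one connection should not be reusable on another, which is the main advantage over sending the password itself.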
Now that the complexity of starting a Jabber session has been taken care of, the Jabber user can chat with friends, join conference rooms, search user directories, go off to lunch and change his online status to ‘away’, add people to his contact list (called a “roster” in Jabber), and so on. All of these activities use the three basic Jabber XML elements of <message/>, <presence/>, and <iq/> in various combinations and with various namespaces attached (all of these features are captured in the protocol documents located at http://docs.jabber.org). When you want to log off, your Jabber client will close the XML stream it opened (by sending </stream:stream>) and the server will close the TCP connection.
Building an Application
Fortunately, you don’t have to know all the details of Jabber’s XML format in order to put Jabber to work for you. Thanks to some of the core Jabber contributors, several Jabber code libraries exist to handle almost all of the low-level functionality, such as opening a connection to the server and creating the XML that the server expects. For a full list of Jabber programming libraries, see http://www.jabber.org/?oid=71/. You’ll find C/C++, Java, and Perl code which you can use to write your own Jabber client or to write more specialized applications.