Here’s a scenario: say you’re interested in writing a Web-based application that scours the Web looking for the cheapest available airfares. You have a list of URLs where you can find fare prices for various airlines, and your application will need to search those sites to look for the best possible deals.
Unfortunately, you find that the important information your Web application needs is drowned in a sea of extraneous HTML. To extract the relevant data, your program would have to parse the HTML and also know enough about the formatting conventions of the document to know where the information is — not fun.
Even worse, without warning, URLs or page layouts can change. When they do, the application you worked so long and hard on is thrown into a state of dysfunction, and you feel helpless when faced with the task of maintaining it.
This is exactly the kind of problem that will be solved by the next generation of Web applications — Web Services. The basic idea behind Web Services is that a site publishes its content as objects instead of HTML pages. So instead of having to parse HTML for information, you can just invoke a method on an object that will return the information you want with minimal hassle.
The underlying technology that makes this sort of interaction possible is called SOAP (the Simple Object Access Protocol). SOAP is an XML-based protocol that allows objects on a remote server to be used as if they were local objects from the client’s point of view. Some of you may be thinking, “Hmmm…that sounds a lot like a traditional RPC (Remote Procedure Call).” Or, maybe it sounds a lot like XML-RPC (see the What About XML-RPC? sidebar). There are similarities among them all. However, SOAP is a new twist on RPC, and it has two big advantages going for it.
What About XML-RPC?
XML-RPC, like SOAP, is an XML-based protocol for remote procedure calls that runs on top of HTTP. Unlike SOAP, XML-RPC is very lightweight. It is concerned only with remote procedure calls, not with sending full-blown objects over the network. In the true minimalist spirit of Unix, XML-RPC was designed to perform one task and do it well.
For simple applications, like the one presented in this article, SOAP and XML-RPC are equally effective solutions. XML-RPC does the job with less overhead, but SOAP offers more flexibility as the service grows and gains complexity. That’s not to say that SOAP is “better” than XML-RPC. They are technologies that fill similar roles. In some cases, one just happens to be a more natural fit than the other.
Its simplicity has helped make XML-RPC libraries available for most programming languages (Perl included) for quite some time. The availability of these libraries has given XML-RPC wider adoption than SOAP. As time goes on, however, it’s likely that SOAP will surpass XML-RPC in popularity. SOAP has industry heavyweights like Microsoft and IBM backing it and will probably scale better as Web Services (and their APIs) become more complex.
First, because it uses XML, SOAP is platform- and language-independent. Traditional RPC methods were tied to specific languages and platforms, making interoperability difficult at best. Meanwhile, there are already more than 60 implementations of SOAP available for just about any programming language. The fact that it’s easy for everybody to use SOAP should help it to achieve widespread adoption.
SOAP’s second advantage is that its main transport mechanism is HTTP. This may seem odd if you are used to more traditional forms of RPC that directly use TCP/IP and employ a binary messaging format. However, to put this into perspective, think about all the Web servers out there. Now, imagine if all those Web servers could be turned into SOAP servers. SOAP’s use of HTTP as a transport mechanism actually makes this a very easy thing to do. SOAP has an unprecedented opportunity to gain wide and rapid adoption, because it is able to leverage the existing infrastructure of the Web to bootstrap itself. The people behind SOAP are working to bring RPC to the masses by making it as easy to use as possible.
Finally, SOAP is all about making objects “just work” across the network. It goes above and beyond normal RPC systems (including XML-RPC), which are only concerned with function calls working across the network. SOAP also handles the encoding and decoding of the various data types and complex structures that objects may employ.
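To make the "XML over HTTP" idea concrete, here is a minimal sketch of roughly what a SOAP 1.1 request for a getQuote call could look like on the wire. The helper name build_soap_request, the parameter name symbol, and the SOAPAction value are assumptions made for illustration; a library such as SOAP::Lite generates an equivalent (but not identical) message for you.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of any library): builds the text of an
# HTTP POST that carries a SOAP 1.1 envelope for a single method call.
sub build_soap_request {
    my ($host, $path, $namespace, $method, %params) = @_;

    # Each parameter becomes a child element of the method element.
    # (No XML escaping is done here; real code must escape the values.)
    my $args = join '', map { "<$_>$params{$_}</$_>" } sort keys %params;

    my $envelope = join '',
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<SOAP-ENV:Envelope',
        ' xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">',
        '<SOAP-ENV:Body>',
        qq{<m:$method xmlns:m="$namespace">$args</m:$method>},
        '</SOAP-ENV:Body>',
        '</SOAP-ENV:Envelope>';

    # The envelope travels as the body of an ordinary HTTP POST.
    return join "\r\n",
        "POST $path HTTP/1.0",
        "Host: $host",
        'Content-Type: text/xml; charset=utf-8',
        'Content-Length: ' . length($envelope),
        qq{SOAPAction: "$namespace#$method"},   # assumed value, for illustration
        '',
        $envelope;
}

my $request = build_soap_request(
    'services.xmethods.net', '/soap',
    'urn:xmethods-delayed-quotes', 'getQuote',
    symbol => 'SUNW',
);
print $request, "\n";
```

Nothing here touches the network; the script only prints the message so you can see that a SOAP call is plain text that any Web server or proxy can carry.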
Okay, all this talk about Web Services and bringing RPC to the masses is nice, but how hard will it be to actually implement and work with this stuff? Well, let’s dive in and find out…
Using SOAP::Lite, Part One
In order to build an example of how to work with Web Services, we first need to locate a Web Service we can call upon. These technologies are still in their very early stages of adoption, so there’s not a lot out there just yet. However, the XMethods Web site (http://www.xmethods.net) conveniently lists publicly accessible Web Services that are actually up and running and available for our use.
For the sake of our example code, let’s assume we want to write an application that fetches stock quotes. To do this, we’ll use the Delayed Stock Quotes service provided by xmethods.net in conjunction with the SOAP::Lite Perl module (available from http://www.soaplite.com). Our application would be supplied a ticker symbol for a publicly traded stock, and it would return the current share price for that stock. Hopefully, these examples demonstrate how easy and straightforward SOAP programming can be when using SOAP::Lite.
Listing One shows a very simple program that uses SOAP::Lite class methods to access a Web Service. In order to access a Web Service, we need to specify where to find the service, what the name of the object we want to use is, and the name of the method we want to invoke.
Listing One: A Simple SOAP Client Using SOAP::Lite
 1 #!/usr/bin/perl -w
 2 use strict;
 3 use SOAP::Lite;
 4
 5 ## Method 1: Class Methods
 6 print SOAP::Lite
 7   ->proxy('http://services.xmethods.net:80/soap')  ## where?
 8   ->uri('urn:xmethods-delayed-quotes')             ## what object?
 9   ->getQuote('SUNW')                               ## what method?
10   ->result;                                        ## extract the returned value
11
12 print "\n";
Line 7 contains a call to the SOAP::Lite proxy method, which identifies the URL that we will use to access the SOAP component. Line 8 contains a call to the uri method that uses a namespace URI in order to specify the object we want to use. Line 9 contains a call to getQuote, which causes SOAP::Lite to send a SOAP request to the proxy, wait for a response, and return a value to us. Note that getQuote is the name of a method that will be invoked on the server, and the parameters to getQuote are sent as part of the SOAP message. Finally, line 10 inspects the SOAP response that it received and returns it in a meaningful way.
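A remote call can fail in ways a local one cannot: the server may answer with a SOAP fault, or may not answer at all. What follows is a minimal sketch of a more defensive client, assuming the same endpoint and namespace as Listing One (the service may no longer be reachable). The helper names describe_response and fetch_quote are hypothetical; fault, faultstring, and result are methods of the response object that SOAP::Lite returns from a remote call.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turns a SOAP::Lite response object (or anything
# offering the same fault/faultstring/result methods) into a printable line.
sub describe_response {
    my ($symbol, $response) = @_;
    return "$symbol: fault: " . $response->faultstring if $response->fault;
    return "$symbol: " . $response->result;
}

# Performs the remote call from Listing One. SOAP::Lite is loaded lazily
# so the helper above can be tried without the module or a network.
sub fetch_quote {
    my ($symbol) = @_;
    require SOAP::Lite;
    my $response = eval {
        SOAP::Lite
            ->proxy('http://services.xmethods.net:80/soap')
            ->uri('urn:xmethods-delayed-quotes')
            ->getQuote($symbol);
    };
    # A transport-level failure raises an exception, caught by eval above.
    return "$symbol: transport error: $@" unless defined $response;
    return describe_response($symbol, $response);
}

# Only contact the network when ticker symbols are given on the command line.
print fetch_quote($_), "\n" for @ARGV;
```

Run it with one or more ticker symbols as arguments; with no arguments it does nothing, which makes the helper easy to exercise on its own.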
As it turns out, the delayed quote service is written in Java, but using the service was as easy as calling native Perl code.
Using SOAP::Lite, Part Two
The code we saw in Listing One is fine for occasional usage, but it is inefficient for repeated use. Each time we want to talk to a remote SOAP service, we have to specify at least three pieces of information: the SOAP proxy, the namespace URI of the SOAP object we want to talk to, and the name of the method we want to invoke. The first two values are static and will never change while we’re talking to the same SOAP service. Listing Two demonstrates another way — specifying the values once and saving them in a SOAP::Lite object.
Listing Two: Creating SOAP::Lite Objects
1 #!/usr/bin/perl -w
2 use strict;
3 use SOAP::Lite;
4
5 my $quote = new SOAP::Lite
6   proxy => 'http://services.xmethods.net:80/soap',
7   uri   => 'urn:xmethods-delayed-quotes';
8
9 print "SUNW: ", $quote->getQuote('SUNW')->result(), "\n";
On line 5, we create a local SOAP::Lite object, which is initialized with the proxy and namespace URI for our remote SOAP service. These values can be modified later using the $quote->proxy() and $quote->uri() method calls.
The call to $quote->getQuote() on line 9 generates the same SOAP request we sent in Listing One. The big advantage here is that since we’ve saved the initialization information in the $quote object, we can use it over and over again without having to specify the proxy and namespace URI each time.
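The payoff of keeping the proxy and namespace URI in one object shows up as soon as we ask for more than one quote. Here is a minimal sketch of that reuse; the helper name fetch_quotes is hypothetical, and the endpoint and namespace are the ones from Listing Two (which may no longer be reachable).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: asks one client object for several quotes and
# returns a hash of symbol => price. The client only needs a getQuote
# method whose return value has a result method, as in Listing Two.
sub fetch_quotes {
    my ($client, @symbols) = @_;
    my %price;
    $price{$_} = $client->getQuote($_)->result for @symbols;
    return %price;
}

# Contact the real service only when symbols are given on the command line.
if (@ARGV) {
    require SOAP::Lite;
    my $quote = SOAP::Lite->new(
        proxy => 'http://services.xmethods.net:80/soap',
        uri   => 'urn:xmethods-delayed-quotes',
    );
    my %price = fetch_quotes($quote, @ARGV);
    printf "%-6s %s\n", $_, $price{$_} // 'n/a' for sort keys %price;
}
```

The object is configured once and then handed to the helper, so adding another symbol costs one more method call and nothing else.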
SOAP Service Definitions
The service we have been using is described on an HTML page at http://www.xmethods.net/detail.html?id=2. This page lists all the information required to use this service in a human-readable fashion. This information includes things such as a proxy URL, a namespace URI, and a list of methods with the parameters they require. The same information is also presented in a machine-readable format called WSDL (Web Services Description Language). The role WSDL plays with SOAP is critical, since it provides a clear, precise, and complete description of a SOAP service’s interface. This is information that can be used directly by SOAP::Lite when instantiating objects. (For more information on WSDL, check out the SOAP, WSDL, and UDDI sidebar.)
SOAP, WSDL, and UDDI
Many parties are taking active steps to see Web Services become more widespread. While SOAP was designed to leverage XML and HTTP, other technologies have been created to make Web Services easier to use. The other major technological components of Web Services are WSDL (Web Services Description Language) and UDDI (Universal Description, Discovery, and Integration).
WSDL is an XML-based language that describes everything that needs to be known in order to programmatically use a Web Service. SOAP libraries can use this information directly to instantiate objects.
The purpose of UDDI is to be a CPAN for Web Services. Just as people search through CPAN (Comprehensive Perl Archive Network) to find useful Perl modules, people can search through UDDI registries to find useful Web Services. Once you’ve found the service you’re looking for, the WSDL document that describes that service will be right there, so you can fetch it and get started with your project.
So while SOAP is generally considered the “backbone” of the Web Services architecture, when people talk about Web Services, they are often talking about the combination of these three technologies working together.
The main benefit of using a WSDL file to instantiate a SOAP::Lite object is that some error checking that would otherwise take place on the server side can now be performed locally. One example of this is the checking of method names. Because a WSDL file contains a list of methods supported by a given Web Service, SOAP::Lite can use that information to prevent invalid methods from being called. When one does not use a WSDL file for instantiating SOAP::Lite objects, a SOAP request is made regardless of whether the method is supported by the Web Service or not. It then becomes the responsibility of the server to check for this error. Leaving this job to the server is inefficient and can produce unpredictable results, so using WSDL files is a good idea.
Listing Three shows a slightly different program that uses a WSDL document to instantiate a SOAP::Lite object.
Listing Three: Using WSDL to Create SOAP::Lite Objects
On line 5, we create a SOAP::Lite object by simply specifying a WSDL interface to the service we wish to use. SOAP::Lite fetches this file and processes it before returning an object that knows all of the little details about this SOAP service.
On line 8, we call the getQuote() method on the $quote object we just created. The difference this time is that this $quote object will only support method calls that the SOAP service supports; if we were to try to invoke an unsupported method, it would return immediately with a Perl error instead of blindly sending a SOAP request. This saves a little bandwidth.
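The following is a minimal sketch of the WSDL-based approach just described; it is not a reproduction of Listing Three. It relies on the service method of SOAP::Lite, and it takes the WSDL location from the command line rather than assuming one. The helper try_call is a hypothetical name, added only to show that an unsupported method fails as an ordinary Perl error.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: calls $method on $object and reports a Perl-level
# error (such as an unsupported method) instead of letting it kill the script.
sub try_call {
    my ($object, $method, @args) = @_;
    my $value = eval { $object->$method(@args) };
    return defined $value
        ? "ok: $value"
        : "error: " . ($@ || 'no value returned');
}

# Usage: script.pl WSDL_URL SYMBOL
# The WSDL location is deliberately left to the caller.
if (@ARGV == 2) {
    my ($wsdl_url, $symbol) = @ARGV;
    require SOAP::Lite;

    # service() downloads and parses the WSDL, then returns an object
    # whose methods are the operations that the WSDL describes.
    my $quote = SOAP::Lite->service($wsdl_url);

    print try_call($quote, 'getQuote', $symbol), "\n";
    print try_call($quote, 'noSuchMethod', $symbol), "\n";  # should fail locally
}
```

Note that an object created this way hands back the value itself, so there is no separate result call as there was in Listing Two.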
Unfortunately, even this technique is not without its drawbacks. Every time we instantiate an object using a WSDL file, it has to be downloaded over the Web and parsed. If we instantiate a given object many times, that’s a lot of downloading and parsing that has to be done. What makes this worse is that WSDL files rarely change, so parsing the same data over and over again is wasteful. Over time, all this inefficiency can add up.
Fortunately, there is a simple solution to this problem. The SOAP::Lite distribution comes with a utility program called stubmaker.pl. This program is used to fetch a WSDL file, analyze it, and produce a stub Perl module. This Perl module remembers all of the information that was in the WSDL file, so it doesn’t have to be downloaded ever again. Using stubmaker.pl is very simple, as shown in Listing Four.
## Object Interface
my $quote = new StockQuoteService;
print $quote->getQuote('SUNW'), "\n";
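To see why a stub module avoids the repeated download, it helps to picture what such a module has to remember. The sketch below is a hand-written stand-in and not actual stubmaker.pl output: the package name StockQuoteStub, the call method, and the layout are all hypothetical, while the endpoint and namespace are the ones from Listing Two.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hand-written stand-in for a stub module (NOT actual stubmaker.pl output).
# Everything the WSDL would have told us is recorded here once.
package StockQuoteStub;

my %CONFIG = (
    proxy   => 'http://services.xmethods.net:80/soap',
    uri     => 'urn:xmethods-delayed-quotes',
    methods => { getQuote => 1 },   # operations the service is known to offer
);

sub new { return bless {}, shift }

# Rejects unknown methods locally; otherwise forwards the call.
sub call {
    my ($self, $method, @args) = @_;
    die "Unsupported method '$method'\n" unless $CONFIG{methods}{$method};

    require SOAP::Lite;   # loaded only when a real call is made
    $self->{client} ||= SOAP::Lite->new(
        proxy => $CONFIG{proxy},
        uri   => $CONFIG{uri},
    );
    return $self->{client}->$method(@args)->result;
}

package main;

my $quote = StockQuoteStub->new;

# Pass --live on the command line to contact the real service.
print 'SUNW: ', $quote->call('getQuote', 'SUNW'), "\n"
    if @ARGV && $ARGV[0] eq '--live';
```

Because the method table lives in the module, a misspelled method name is caught before any request is sent, and no WSDL file is fetched at run time.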
This brings us to the end of our examples. Admittedly, they have been very simple, and some of you may be looking for something more challenging to try. It’s a good investment to spend some time over at http://www.xmethods.net and look through the directory of publicly accessible Web Services. The services listed there are quite eclectic, ranging from trivial to amusing to actually useful. The SOAP::Lite distribution also comes with plenty of example scripts that demonstrate a lot of practical uses for SOAP. For anyone who wants to start exploring Web Services and SOAP, these should prove to be very useful resources.
Recycle, Reuse, Redeploy…
In the grand scheme of things, perhaps one of the most important aspects of the Web Services approach is that it brings the concept of software reusability to an area where it has been practically non-existent. Software reuse has been a major focus of traditional application development for quite some time now, but until recently, it was difficult to easily apply many of those principles to Web application development. Fortunately, all that was required was a bit more infrastructure, and now that infrastructure has finally arrived.
Security and SOAP
As we already know, HTTP is the primary transport mechanism for SOAP. This has some security-minded people worried. It is common practice for sysadmins to utilize a firewall to block off every port except 80 (which is the port that HTTP uses). The ability of SOAP to penetrate firewalls is perhaps its most controversial feature.
Some argue that a powerful protocol like SOAP shouldn’t be sneaking in by attaching itself to the simpler HTTP protocol. To them, SOAP appears to be a dangerous security hole. Others counter-argue that these fears are misguided. They point out that SOAP is nothing more than an embellished CGI request. Are people trying to imply that CGI programmers never make security mistakes, but those careless SOAP programmers make them all the time? They must be, because the amount of security that SOAP has is exactly the same as the amount of security that CGI has.
In an attempt to resolve this issue, the SOAP specification requires that a SOAPAction header be present among the HTTP headers of every SOAP request. This header states the intent of the client making the request. A SOAP service may require that the SOAPAction header contain a specific value before it will process a request. Similarly, a firewall may be configured to disallow SOAP requests based on the content, or even the mere existence, of a SOAPAction header. Use and processing of the SOAPAction header may aid in setting up more secure Web Services. Should processing based on this header be insufficient, HTTPS and HTTP authentication may be a good next step toward adding more security to SOAP.
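As a rough illustration of header-based filtering, here is a minimal sketch of the kind of check a firewall or a SOAP service might apply. The function name check_soap_action, the policy of passing non-SOAP requests through, and the sample action values are all assumptions; real products will differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical filter: decides what to do with an HTTP request based only
# on its SOAPAction header. $headers maps header names to values;
# $allowed lists the SOAPAction values to accept.
sub check_soap_action {
    my ($headers, $allowed) = @_;

    # HTTP header names are case-insensitive, so normalize them first.
    my %h = map { ( lc($_), $headers->{$_} ) } keys %$headers;

    return 'pass: not a SOAP request' unless exists $h{soapaction};

    (my $action = $h{soapaction}) =~ s/^"(.*)"$/$1/;   # strip optional quotes
    my $known = grep { $_ eq $action } @$allowed;
    return $known ? "allow: $action" : "deny: $action";
}

my @allowed = ('urn:xmethods-delayed-quotes#getQuote');

print check_soap_action(
    { 'SOAPAction' => '"urn:xmethods-delayed-quotes#getQuote"' }, \@allowed), "\n";
print check_soap_action(
    { 'SOAPAction' => '"urn:example#deleteAll"' }, \@allowed), "\n";
print check_soap_action(
    { 'Content-Type' => 'text/html' }, \@allowed), "\n";
```

A stricter policy could just as easily deny every request that carries a SOAPAction header, which is the "mere existence" case mentioned above.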