March 05, 2005
REST vs SOAP
I've been having an interesting discussion with a friend of mine who is researching which service protocol to use for his company's service layer. We've been talking about all sorts of things, including JGroups, Spread, CORBA, REST, ARREST, SOAP, and a few commercial middleware layers. Messaging middleware is a fun area; there are so many pros and cons to consider.
Amazon uses a lot of multicast messaging (search Google for '+amazon +tibco +multicast'), but multicast is hard to get right. You have to closely monitor the network, and multicast storms are a real possibility. I view multicast messaging the same way I view writing in assembler: it's high-bandwidth, but fragile and suitable mostly for limited problem domains. I think multicast makes good sense for low-velocity OOB messages (like machine configuration, metrics broadcasting, host discovery, etc.), but using it as a messaging backbone seems like overkill for most folks. I've heard anecdotally that one of the largest multicast installs in the world runs the network that passes trade orders on Wall Street between the trading floor and the brokerage firms. That would be an interesting domain to work in, to say the least.
My friend has basically narrowed down his choices to SOAP or REST. First of all, I think that using HTTP as a middleware transport is a smart choice. You get drop-in load balancing and caching of service calls with several choices of implementation (free and commercial). Scaling becomes a pretty simple exercise, and you can use tools for HTTP introspection to debug your transport layer if you run into problems. Prototyping is easy too, since most languages have SOAP/REST bindings.
SOAP is nice because WSDL lets you advertise supported service calls, but there's a definite overhead when using SOAP envelopes (both on the send and receive side), and while it uses HTTP as its transport, it is harder to debug than REST. With REST, you just paste the URL into your web browser and look at the results. Because the REST calls are simple URLs with args encoded in the URI string, you can build all sorts of abstractions on top of your service. The most popular calls can be exposed on your developer intranet as an RSS feed and marked as candidates for optimization. You can add a debugging field to the calls (append /debug to them, perhaps), and make your service print out detailed query tracing for SQL calls. Append /pretty to indent the returned XML and display it as beautified HTML. It's much easier to cache REST than it is to cache SOAP: SOAP calls typically travel as POST requests, and many cache implementations out there don't even cache POST calls.
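As a rough illustration of the /debug and /pretty idea, here is a minimal Python sketch. The function names (split_modifiers, render) and the /orders/42 path are made up for the example, and it only covers the URL handling and the indentation step, not the SQL tracing a real /debug would switch on.

```python
from xml.dom import minidom

# Trailing path segments treated as modifiers (an assumption for this
# sketch; a real service would choose its own names).
MODIFIERS = {"debug", "pretty"}


def split_modifiers(path):
    """Split trailing /debug and /pretty segments off a request path.

    Returns (resource_path, flags), where flags is a set of modifier names.
    """
    segments = [s for s in path.split("/") if s]
    flags = set()
    while segments and segments[-1] in MODIFIERS:
        flags.add(segments.pop())
    return "/" + "/".join(segments), flags


def render(xml_text, flags):
    """Return the response body, indented when 'pretty' was requested."""
    if "pretty" in flags:
        return minidom.parseString(xml_text).toprettyxml(indent="  ")
    return xml_text
```

A request for /orders/42/pretty would then resolve to the same resource as /orders/42, with only the rendering changed, so the underlying call stays just as cacheable.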
Another optimization that REST enables is easier routing of service requests. Typically, developers deploy web services and place the boxes behind a load balancer or a cache, which means that the requests get spread evenly across the servers and everyone's cache is fairly lukewarm. Modern load balancers and caches let you horizontally partition requests and route them to different servers based on the contents of the URI, which lets you partition your servers using the primary key namespace from your request. Each server host works on a slice of the total working set, and its caches (buffer, memory, and any software-based caching) all get scorching hit rates. Routing based on the contents of a SOAP envelope from a POST to the server requires more overhead.
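To make the partitioning idea concrete, here is a small Python sketch of what such a routing rule might look like. Everything in it is an assumption for illustration: the backend host names are invented, and the URL layout /<resource>/<key>/... is just one plausible convention. A real load balancer would express this in its own configuration language rather than in Python.

```python
import hashlib

# Hypothetical pool of service hosts sitting behind the router.
BACKENDS = ["svc-a.internal", "svc-b.internal", "svc-c.internal"]


def primary_key(path):
    """Pull the key out of a path shaped like /<resource>/<key>/..."""
    segments = [s for s in path.split("/") if s]
    return segments[1] if len(segments) > 1 else ""


def pick_backend(path, backends=BACKENDS):
    """Map a request path to one backend based on its primary key.

    A stable hash is used (not Python's built-in hash(), which is
    randomized per process for strings) so that every router instance
    sends a given key to the same host.
    """
    digest = hashlib.sha256(primary_key(path).encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]
```

Because the choice depends only on the key, /orders/42 and /orders/42/debug land on the same host, which is what keeps that host's caches warm for its slice of the keyspace. Simple modulo hashing reshuffles most keys when a host is added or removed; consistent hashing is the usual refinement if that matters.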
The big downside of REST is that advertisement of service APIs is not built into the system. This is a really nice feature of SOAP/WSDL, but if you're implementing an internal service layer, then the benefits of WSDL (broadcast of APIs and programmatic versioning) aren't a hard requirement. You can afford to place a thin wrapper abstraction on top of your service layer that serves as a contract between internal service clients and the services themselves. And while the arguments are encoded in the URL, there's nothing saying you can't use a DTD or schema for the results coming back and verify that the document is well-formed and valid.
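One way such a thin wrapper might look, sketched in Python: a small table of supported calls that clients go through instead of building URLs by hand. The call names, URL templates, and root element names here are all hypothetical. The response check only parses the XML (which catches documents that are not well-formed) and compares the root element; full DTD or schema validation would need a validating parser, which the Python standard library does not include.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Hypothetical contract: call name -> (URL template, expected root element).
CONTRACT = {
    "get_order": ("/v1/orders/{order_id}", "order"),
    "get_customer": ("/v1/customers/{customer_id}", "customer"),
}


def build_path(call, **args):
    """Build the request path for a supported call, URL-quoting each arg."""
    if call not in CONTRACT:
        raise KeyError(f"unsupported service call: {call}")
    template, _ = CONTRACT[call]
    quoted = {name: quote(str(value), safe="") for name, value in args.items()}
    return template.format(**quoted)


def check_response(call, xml_text):
    """Parse a response and confirm it has the root element the contract names.

    ET.fromstring raises ParseError if the document is not well-formed.
    """
    root = ET.fromstring(xml_text)
    expected = CONTRACT[call][1]
    if root.tag != expected:
        raise ValueError(f"expected <{expected}>, got <{root.tag}>")
    return root
```

Keeping the version in the template (the /v1/ prefix) is one simple place to hang the out-of-band versioning that WSDL would otherwise provide.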
I'm a big believer in using HTTP as a generic transport. It works for most types of request/response paradigms, and if you want to get fancy, the ARREST framework gives you true async messaging with nice performance. There is a per-request overhead you get with HTTP versus rolling your own server layer and doing some sort of binary encoding (XDR, etc.), but I think that extra per-request cost is more than made up for by the easy debugging, scaling and development of HTTP-based services. To me, in most cases, the choice is really REST for internal services, and SOAP/WSDL for customer-facing ones. REST is quicker to develop with, although you do need a little bit of OOB structure in order to manage the supported calls and versioning of your service layer. I think in most cases though, the benefits of REST over SOAP/WSDL outweigh the drawbacks.
One last reason I dig HTTP for service layers is that you get trivial wrapping with SSL if you want to make the transport layer opaque. And SSL accelerators remove the need to spend server host cycles on crypto routines.
Posted by djb at March 5, 2005 09:50 AM
Comments
Made me think of this as well:
Don't Be Afraid to Drop the SOAP.
You've got me hot to mess around with Ruby again too.
Posted by: Ashley at March 17, 2005 03:48 PM