The evolution of the in-memory concept of a message in the managed Microsoft Web Services stack(s) is quite interesting to look at. When you compare the concepts of System.Web.Services (ASMX), Microsoft.Web.Services (WSE) and System.MessageBus (Indigo M4), you'll find that this most fundamental element has undergone some interesting changes and that the Indigo M4 incarnation of "Message" is actually a bit surprising in its design.

ASMX

In the core ASP.NET Web Services model (nicknamed ASMX), the concept of an in-memory message doesn't really surface anywhere in the programming model unless you use the ASMX extensibility mechanism. The abstract SoapMessage class, which comes in concrete SoapClientMessage and SoapServerMessage flavors, has two fundamental states that depend on the stage in which the message is inspected: the message is either unparsed or parsed (some say "cracked").

If it's parsed, you can get at the parameters that are being passed to the server or are about to be returned to the client, but the original XML data stream of the message is no longer available and all headers have likewise either been mapped onto objects or lumped into an "unknown headers" array. If the message is unparsed, all you get is a text stream that you'll have to parse yourself. If you want to add, remove or modify headers while processing a message in an extension, you will have to read and parse your copy of the input stream (the message text) and write the resulting message to an output stream that's handed onwards to the next extension or to the infrastructure. In essence that means that if you had three ASMX-style SOAP extensions that implement security, addressing and routing functionality, you'd be parsing the message three times and serializing it three times just so that the infrastructure would parse it yet again. Not so good.
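
To make the cost concrete, here is a minimal sketch of that chained parse/serialize pattern. It is written in Python purely as an analogy and is not the ASMX SoapExtension API; the names parse, make_extension and the three header names are made up for illustration.

```python
import xml.etree.ElementTree as ET

parse_count = 0  # how often the message text gets turned into a tree


def parse(text):
    """Parse message text and count the parse (hypothetical helper)."""
    global parse_count
    parse_count += 1
    return ET.fromstring(text)


def make_extension(header_name):
    """Build a stream-style extension: text in, text out."""
    def extension(message_text):
        envelope = parse(message_text)                    # parse own copy
        ET.SubElement(envelope.find("Header"), header_name)
        return ET.tostring(envelope, encoding="unicode")  # serialize again
    return extension


extensions = [make_extension(n) for n in ("Security", "Addressing", "Routing")]

text = "<Envelope><Header/><Body><Order id='42'/></Body></Envelope>"
for extension in extensions:
    text = extension(text)

envelope = parse(text)  # the infrastructure parses the text yet again
print(parse_count)      # 4
```

Three extensions plus the infrastructure add up to four parses and three serializations of the same message.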

WSE

The Web Services Enhancements (WSE) have a simple but very effective fix for that problem. The WSE team needed to use the ASMX extensibility point but found that if they built all their required extensions using the ASMX model, they'd run into that obvious performance problem. Therefore, WSE has its own pipeline and its own extensibility mechanism that plugs into ASMX as one big extension. When you write extensions (handlers) for WSE, you don't get a stream but an in-memory infoset in the form of a SoapEnvelope (which is derived from System.Xml.XmlDocument and is therefore a DOM). Parsing the XML text just once and having all processing steps work on a shared in-memory object model seems optimal. Can it really get any better than "parse once" as WSE does it?
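
The "parse once" idea can be sketched as follows. This is a Python analogy using xml.dom.minidom, not the WSE handler API; parse_once, make_handler and the header names are assumptions made for illustration.

```python
from xml.dom import minidom

parse_count = 0  # how often the message text gets turned into a DOM


def parse_once(text):
    """Parse the message text into a DOM and count the parse."""
    global parse_count
    parse_count += 1
    return minidom.parseString(text)


def make_handler(header_name):
    """Build a handler that works on the shared DOM instead of on text."""
    def handler(envelope):
        header = envelope.getElementsByTagName("Header")[0]
        header.appendChild(envelope.createElement(header_name))
    return handler


handlers = [make_handler(n) for n in ("Security", "Addressing", "Routing")]

envelope = parse_once("<Envelope><Header/><Body><Order id='42'/></Body></Envelope>")
for handler in handlers:
    handler(envelope)  # no reparse, no serialization between handlers

header = envelope.getElementsByTagName("Header")[0]
header_names = [node.tagName for node in header.childNodes]
print(parse_count, header_names)
```

All three handlers see and modify the same in-memory tree, so the text is parsed exactly once no matter how long the chain gets.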

Indigo

When you look at the Indigo concept of Message (the Message class in the next milestone will be the same in spirit, similar in concept and different in detail and simpler as a result), you'll find that it doesn't contain a reference to an XmlDocument or some other DOM-like structure. The Indigo message contains a collection of headers (which in the M4 milestone also come in an "in-memory only" flavor) and a content object, which has, as its most important member, an XmlReader-typed Reader property.

When I learned about this design decision a while ago, I was a bit puzzled why that's so. It appeared clear to me that if you kept the message parsed in a DOM, you'd have a good solution if you want to hand the message down a chain of extensibility points, because you don't need to reparse. The magic sentence that woke me up was "We need to support streaming". And then it clicked.

Assume you want to receive a 1GB video stream over an Indigo TCP multicast or UDP connection (even if you think that's a silly idea - work with me here). Because Indigo will represent the message containing that video as an XML Infoset (mind that this doesn't imply that we're talking about base64-encoded content in a UTF-8 angle bracket document and therefore 2GB on the wire), we'd have some problems with a DOM-based solution. A DOM like XmlDocument is only ready for business when it has seen the end tag of its source stream. This is not so good for streams of that size, because you surely would want to see the video stream as it downloads and, if the video stream is a live broadcast, there may simply be no defined end: The message may have a virtually infinite size with the "end-tag" being expected just shortly before judgment day.
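
The difference is easy to demonstrate with a message that never ends. The sketch below is a Python analogy, not Indigo code: endless_broadcast is a made-up live feed that emits an opening tag and then frames forever. A DOM parser handed this source would never return, while an incremental parser hands out frames as soon as each one is complete.

```python
import itertools
import xml.etree.ElementTree as ET


def endless_broadcast():
    """Hypothetical live feed: an opening tag, then frames, never an end tag."""
    yield "<video>"
    for n in itertools.count():
        yield f"<frame>{n}</frame>"


def first_frames(chunks, wanted):
    """Collect the first `wanted` frames without waiting for the end tag."""
    parser = ET.XMLPullParser(events=("end",))
    frames = []
    for chunk in chunks:
        parser.feed(chunk)  # hand over only what has "arrived" so far
        for _event, elem in parser.read_events():
            if elem.tag == "frame":
                frames.append(elem.text)
                elem.clear()  # release the frame's payload once it is used
            if len(frames) == wanted:
                return frames
    return frames


frames = first_frames(endless_broadcast(), 3)
print(frames)
```

The closing video tag is never seen, yet the consumer has already worked through the first frames.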

There's something philosophically interesting about a message relaying a 24*7*365 video stream where the binary content inside the message body starts with the current video broadcast bits as of the time the message is generated and then never ends. The message can indeed be treated as being well-formed XML because there is always a theoretical end to it. The end-tag just happens to be a couple of "bit-years" away.

Back to the message design: When Indigo gets its hands on a transport stream, it layers a Message object over the raw bits available on that stream using an XmlReader. Then it peeks into the message and parses soap:Envelope and everything inside soap:Header. The headers it finds go into the in-memory header collection. Once it sees soap:Body, Indigo stops and backs off. The result of this is a partially parsed in-memory message for which all headers are available in memory and the body of the message is left sitting in an XmlReader. When the XmlReader sits on top of a NetworkStream, we now have a construct where Indigo can already work on the message and its control information (headers) while the network socket is still open and the rest of the message is still arriving (or portions haven't even been sent by the other party).
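
Here is a rough sketch of that "parse the headers, stop at the body" behavior. It is a Python analogy built on xml.etree.ElementTree.XMLPullParser and does not show Indigo's actual implementation. The read_headers function, the chunk size, the sample headers and the SOAP 1.2 namespace used here are all assumptions chosen for illustration.

```python
import io
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # assumed envelope namespace


def read_headers(stream, chunk_size=64):
    """Read just enough of the stream to collect the header blocks."""
    parser = ET.XMLPullParser(events=("start", "end"))
    headers, depth = [], 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            raise ValueError("stream ended before soap:Body")
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
                if elem.tag == f"{{{SOAP_NS}}}Body":
                    # Back off: the rest of the body stays on the stream.
                    return headers, parser
            else:
                depth -= 1
                if depth == 2:  # a direct child of soap:Header just closed
                    headers.append(elem)


body = "".join(f"<frame>{n}</frame>" for n in range(5000))
message = (
    f'<s:Envelope xmlns:s="{SOAP_NS}">'
    "<s:Header><To>urn:example:player</To><Action>urn:example:play</Action></s:Header>"
    f"<s:Body>{body}</s:Body></s:Envelope>"
).encode("utf-8")

stream = io.BytesIO(message)
headers, parser = read_headers(stream)
header_tags = [h.tag for h in headers]
print(header_tags, stream.tell(), "of", len(message), "bytes read")
```

After read_headers returns, the header blocks are in memory while most of the body has not even been read from the stream, which is the state the paragraph above describes.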

Unless an infrastructure extension must touch the body (in-message body encryption or signature do indeed spoil the party here), Indigo can process the message, just ignore the body portion and hand it to the application endpoint for processing as-is. When the application endpoint reads the message through the XmlReader, it therefore pulls the bits directly off the wire. Another variant of this, and the case where it really gets interesting, is that using this technique, arbitrarily large data streams can be routed over multiple Indigo hops using virtualized WS-Addressing addressing where every intermediary server just forwards the bits to the next hop as they arrive. Combine this with publish and subscribe services and Indigo's broadcasting abilities and this is getting really sexy for all sorts of applications that need to traverse transport-level obstacles such as firewalls or where you simply can't use IP.
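
The forwarding step at an intermediary boils down to a bounded copy loop. The following Python sketch is only an illustration of that idea with in-memory streams standing in for the inbound and outbound connections; relay and its chunk size are hypothetical and not part of Indigo.

```python
import io


def relay(source, sink, chunk_size=4096):
    """Forward source to sink chunk by chunk; return the largest chunk held."""
    largest = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            return largest
        sink.write(chunk)  # pass the bits on as soon as they arrive
        largest = max(largest, len(chunk))


payload = bytes(1_000_000)     # stand-in for a large message body
inbound = io.BytesIO(payload)  # stand-in for the connection from the sender
next_hop = io.BytesIO()        # stand-in for the connection to the next hop

largest = relay(inbound, next_hop)
print(largest, len(next_hop.getvalue()))
```

However large the message is, the intermediary never holds more than one chunk of the body at a time.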

For business applications, this support for very large messages is not only very interesting but actually vital. In our BizTalk workshops we've had quite a few customers who exchange catalogs for engineering parts with other parties. These catalogs easily exceed 1GB in size on the wire. If you want to expand those messages into a DOM, you've got a problem. Consequently, neither WSE nor ASMX nor BizTalk Server nor any other DOM-based solution that isn't running on a well-equipped 64-bit box can successfully handle such real-customer-scenario messages. Once messages support streaming, you have that sort of flexibility.

The problem that remains with XmlReader is that once you touch the body, things get a bit more complex than with a DOM representation. The XmlReader is a "read once" construct that usually can't be reset to its initial state. That is specifically true if the reader sits on top of a network stream and returns the translated bits as they arrive. Once you touch the message content in the infrastructure, the message is therefore "consumed" and can't be used for further processing. The good news is, though, that if you buffer the message content into a DOM, you can layer an XmlNodeReader over the DOM's document element and forward the message with that reader. If you only need to read parts of the message or if you don't want to use the DOM, you can layer a custom XML reader over a combination of your buffered data and the original XmlReader.
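
The last option, a reader layered over buffered data plus the original reader, can be sketched like this. It is a Python analogy in which a forward-only iterparse event stream stands in for the XmlReader; the peek helper and the sample message are made up for illustration.

```python
import io
import itertools
import xml.etree.ElementTree as ET


def peek(events, count):
    """Consume `count` events, then return them plus a replaying view."""
    buffered = list(itertools.islice(events, count))
    # The replay serves the buffered events first, then continues with
    # whatever the original forward-only source still has to offer.
    return buffered, itertools.chain(buffered, events)


xml_text = b"<Body><Catalog><Part id='1'/><Part id='2'/></Catalog></Body>"
events = ET.iterparse(io.BytesIO(xml_text), events=("start",))

buffered, replay = peek(events, 2)  # the infrastructure touches the body
peeked_tags = [elem.tag for _event, elem in buffered]
replayed_tags = [elem.tag for _event, elem in replay]  # downstream sees it all
leftover = list(events)  # the original source is now consumed

print(peeked_tags, replayed_tags, leftover)
```

The downstream consumer gets the complete sequence even though the first events were already read, while the original source itself is used up and yields nothing more.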
