[This is a follow-up post to "Internet of Things or Thing on the Internet?"]

The metaphor "Internet of Things" stands for the next wave of expansion of scope for distributed systems.

We started the journey with centralized systems, single computers that you had to walk up to and control with switches, and that were later able to be fed with batches of punch cards, allowing distributed creation of jobs with centralized processing. We then took the step of introducing the notion of terminals: remote screens that allowed immediate interaction with the central computer through interactive composition of jobs that were then fed into processing. The advent of PCs and PC-technology based servers, and later smart phones, then led to a decentralized landscape where personal functions are personal and shared functions tend to live at an appropriate scope for the respective audience, be that a work group sharing on a department server, a company sharing in a datacenter, or the general public sharing on a public web site.

Cloud-based systems are increasingly challenging this model as personal data gets held and processed in the cloud since people now increasingly own multiple digital devices, work groups are becoming less dependent on particular locations, and companies realize the advantages of lower operating cost when they delegate work to cloud providers.

What has remained stable across most of the waypoints in this journey since the introduction of terminals is that there is, in the majority of systems, some form of human interaction through a human-machine interface, motivating actions in a program at an appropriate scope, and the resulting output is presented to the same person or someone else through another such interface. Information flows. That information flow can be fairly immediate, as with myriad database-frontend applications, or far decoupled, as when a cash register clerk's input (even by way of a scanner) ultimately rolls up into a cell of a financial balance sheet in Excel. Ultimately, the vast majority of software developers have so far built pure information technology systems. People put data in, and people get the same or a transformation of that data out. Put differently, information technology systems are intermediated from the physical world through people.

"Internet of Things" is a metaphor for an evolved kind of system in which that intermediation is removed.

Instead of a human observing the state of the physical world and submitting that observation into a system – which obviously can take the form of pointing a camera at an object, so we're not talking about keyboard input – we allow systems to make such observations for themselves and on a continuous basis. We're giving systems eyes to see, ears to hear, and noses to smell and sense pollution, and other senses to feel temperature, humidity, acidity, atmospheric pressure, vibration, acceleration, orientation, altitude, or geographic position.

These senses manifest in devices, aptly named sensors.

We're also giving systems the power to change the state of physical world objects as a result of these observations and additional inputs. Aircraft auto-pilot systems have long been implementing actuation of control surfaces based on sensor observations and many advanced military aircraft types would not be flyable at all without such digital avionics. Autonomous vessels and vehicles operate in the same fashion.

But even in scenarios that seem to be human-controlled at first blush, such as unlocking a vehicle just borrowed from a car sharing service with a smart phone app gesture, the decision whether the car will indeed unlock is made by the car sharing system based on an authorization verification and subsequent command routing decision to the right vehicle. A person pushes a button, but the actual unlock command is issued by the system based on a decision sequence.

Having (remote) systems be the judge in decision making, especially around authorization, will also be important in many scenarios where the mainline communication occurs peer-to-peer. You may interface with digital tools like a projector and a digital whiteboard in a conference room, or a game console in your entertainment rack, in a peer-to-peer fashion to optimize latency, but the matchmaking will commonly be aided by a system that helps ensure that only authorized and trustworthy people can participate in the peer mesh, even if they all happen to sit in the same room.

The role that these devices, including the car's telematics box interfacing with the CAN bus, play towards the systems is that of "peripherals". That is obviously a very well-known concept for which we have well-understood models of how to attach input sensors like mice and keyboards or actuators like printers. What the "Internet of Things" changes is that these peripherals often become attached over long-haul links, and are attached not to singular computers but to distributed systems. But in principle, the telematics box in the car or a light pole on the street is not different from a printer from an architectural perspective.

What they also share with contemporary printers is the ability to communicate their current condition. A modern ink-jet or laser printer will always let you know when it is a good time to go to the store and buy fresh Original Brand™ ink or toner as the supply runs low, and it will do so via telemetry information sent to the computer hosting the driver.

What "Internet of Things" changes quite radically is the ecosystem breadth and diversity. There are many protocols and standards and systems and it's not like the operating systems made by two or three dominant players get to call the shots on how all devices are communicating, because there are very many modes of communication and broadly varying scenarios. Diversity will be the norm and there will be plenty of innovation on the communication front challenging the status quo.

The key innovation of the "Internet of Things" concept is that we're equipping distributed systems with senses that allow them (their programs) to acquire information in a self-motivated fashion, to make decisions, and to actuate things in the physical world as a result. Systems are the focus, not the things. The things are peripherals.

There is good reason to be worried about the "Internet of Things" on current course and trajectory. Both the IT industry and manufacturers of "smart products" seem to look at connected special-purpose devices and sensors as a mere variation of information technology assets like servers, PCs, tablets, or phones. That stance is problematic as it neglects important differences between the kinds of interactions that we're having with a phone or PC, and the interactions we're having with a special-purpose device like a gas valve, a water heater, a glass-break sensor, a vehicle immobilizer, or a key fob.

Before I get to a proposal for how to address the differences, let's take a look at the state of things on the Web and elsewhere.

Information Devices

PCs, phones, and tablets are primarily interactive information devices. Phones and tablets are explicitly optimized around maximizing battery lifetime, and they preferably turn off partially when not immediately interacting with a person, or when not providing services like playing music or guiding their owner to a particular location. From a systems perspective, these information technology devices are largely acting as proxies towards people. They are "people actuators" suggesting actions and "people sensors" collecting input.

People can, for the most part, tell when something is grossly silly and/or could even put them into a dangerous situation. Even though there is precedent of someone driving off a cliff when told to do so by their navigation system, those cases are the rarest exceptions.

Their role as information-gathering devices, allowing people to browse the Web and use a broad variety of services, requires these devices to be "promiscuous" towards network services. The design of the Web, our key information tool, centers on aggregating, combining, and cross-referencing information from a myriad of different systems. As a result, the Web's foundation for secure communication is aligned with the goal of this architecture. At the transport protocol level, Web security largely focuses on providing confidentiality and integrity for fairly short-lived connections.

User authentication and authorization are layered on top, mostly at the application layer. The basic transport layer security model, including server authentication, builds on a notion of federated trust, anchored in everyone (implicitly and largely involuntarily) trusting a dozen handfuls of certification authorities (CAs) chosen by their favorite operating system or browser vendor. If one of those CAs deems an organization trustworthy, it can issue a certificate that will then be used to facilitate secure connections, which also expresses an assurance to the user that they are indeed talking to the site they expect to be talking to. To that end, the certificate can be inspected by the user. If they know and care where to look.

This federated trust system is not without issues. First, if the signing key of one of the certification authorities were to be compromised, potentially undetected, whoever is in possession of the key can now make technically authentic and yet forged certificates and use those to intercept and log communication that is meant to be protected. Second, the system is fairly corrupt as it takes all of $3 per year to buy a certification authority's trust with minimal documentation requirements. Third, the vast majority of users have no idea that this system even exists.

Yet, it all somehow works out halfway acceptably, because people do, for the most part, have common sense enough to know when something's not quite right, and it takes quite a bit of work to trick people into scams in huge numbers. You will trap a few victims, but not very many and not for very long. The system is flawed and some people get tricked, but that can also happen at the street corner. Ultimately, the worst that can happen – without any intent to belittle the consequences – is that people get separated from some of their money, or their identities get abused until the situation is corrected by intervention and, often, some insurance steps in to rectify these not entirely unexpected damages.

Special-Purpose Devices

Special-purpose devices, from simple temperature sensors to complex factory production lines with thousands of components inside them, are different. These devices are much more scoped in purpose, and even if they may provide some level of a people interface, they're largely scoped to interfacing with assets in the physical world. They measure and report environmental circumstances, turn valves, control servos, sound alarms, switch lights, and do many other tasks. They help do work for which an information device is either too generic, too expensive, too big, or too brittle.

If something goes wrong with automated or remote controllable devices that can influence the physical world, buildings may burn down and people may die. That's a different class of damage than someone maxing out a stolen credit-card's limit. The security bar for commands that make things move, and also for sensor data that eventually results in commands that cause things to move, ought to be, arguably, higher than in an e-commerce or banking scenario.

What doesn't help on the security front is that machines, unlike most people, don't have a ton of common sense. A device that goes about its day in its programmed and scheduled ways has no notion of figuring out when something is not quite right. If you can trick a device into talking to a malicious server or intermediary, or into following a network protocol redirection to one, it'll dutifully continue doing its work unless it's explicitly told to never do so.

Herein lies one of the challenges. A lot of today's network programming stacks and Web protocols are geared towards the information-oriented Web and excellently enable building promiscuous clients by default. In fact, the whole notion of REST rests on the assumption that the discovery and traversal of resources is performed through hypertext links included in the returned data. As the Web stacks are geared towards that model, there is extra work required to make a Web client faithful to a particular service and to validate, for instance, the thumbprint of the TLS certificate returned by the permitted servers. As long as you get to interact with the Web stack directly, that's usually okay, but the more magic libraries you use on top of the Web stack basics, the harder that might get. And you have, of course, to teach the device the right thumbprint(s) and thus effectively manage and distribute an allow-list – a task whose complexity is not to be underestimated.
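To make the "faithful client" idea concrete, here is a minimal Python sketch of pinning by certificate thumbprint: the device trusts only servers whose certificate hashes to a value on a provisioned allow-list, instead of relying on the CA federation. The allow-list value and function name are hypothetical; real deployments would also need thumbprint rotation.

```python
import hashlib
import socket
import ssl

# Hypothetical allow-list of SHA-256 certificate thumbprints, provisioned
# out of band; the placeholder below stands in for a real value.
ALLOWED_THUMBPRINTS = {
    "0000000000000000000000000000000000000000000000000000000000000000",
}

def connect_pinned(host: str, port: int) -> ssl.SSLSocket:
    """Open a TLS connection, then verify the server certificate against
    the pinned thumbprint list rather than the public CA federation."""
    ctx = ssl.create_default_context()
    # Disable CA-chain and hostname checks; trust rests solely on the pin.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    raw = socket.create_connection((host, port))
    tls = ctx.wrap_socket(raw, server_hostname=host)
    der = tls.getpeercert(binary_form=True)  # DER-encoded certificate
    thumbprint = hashlib.sha256(der).hexdigest()
    if thumbprint not in ALLOWED_THUMBPRINTS:
        tls.close()
        raise ssl.SSLError("server certificate not in device allow-list")
    return tls
```

Note that the extra work all sits on the application: the stack itself happily trusts any federation-blessed certificate unless you intervene like this.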

Generally, device operators will not want to allow unobserved and non-interactive devices that emit telemetry and receive remote commands to be able to stray from a very well-defined set of services they're peered with. They should not be promiscuous. Quite the opposite.

Now – if the design goal is to peer a device with a particular service, the federated certificate circus turns from a desired protocol-suite feature into a burden. As the basic assumptions about promiscuity towards services are turned on their head, the 3-6 KByte and 2 network roundtrips of certificate exchange chatter slow things down and may also cost quite a bit of real money in precious, metered wireless data volume. Even though everyone currently seems to assume that Transport Layer Security (TLS) is the only secure channel protocol we'll ever need, it's far from ideal for the 'faithful' connected devices scenario.

If you allow me to take you into the protocol basement for a second: That may be somewhat different if we could seed clients with TLS RFC5077 session resumption tickets in an out-of-band fashion, and have a TLS mode that never falls back to certs. Alas, we do not.

Bi-Directional Addressing

Connected and non-interactive devices not only differ in terms of the depth of their relationship with backend services, they also differ very much in terms of the interaction patterns with these services when compared to information-centric devices. I generally classify the interaction patterns for special-purpose devices into the categories Telemetry, Inquiries, Commands, and Notifications.

  • Telemetry is unidirectionally flowing information which the device volunteers to a collecting service, either on a schedule or based on particular circumstances. That information represents the current or temporally aggregated state of the device or the state of its environment, like readings from sensors that are associated with it.
  • With Inquiries, the device solicits information about the state of the world beyond its own reach and based on its current needs; an inquiry can be a singular request, but might also ask a service to supply ongoing updates about a particular information scope. A vehicle might supply a set of geo-coordinates for a route and ask for continuous traffic alert updates about that particular route until it arrives at the destination.
  • Commands are service-initiated instructions sent to the device. Commands can tell a device to provide information about its state, or to change the state of the device, including activities with effects on the physical world. That includes, for instance, sending a command from a smartphone app to unlock the doors of your vehicle, whereby the command first flows to an intermediating service and from there it's routed to the vehicle's onboard control system.
  • Notifications are one-way, service-initiated messages that inform a device or a group of devices about some environmental state they'd otherwise not be aware of. Wind parks will be fed weather forecast information, cities may broadcast information about air pollution suggesting that fossil-fueled systems throttle CO2 output, or a vehicle may want to show weather or news alerts or text messages to the driver.
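The four interaction patterns can be sketched as simple message shapes. The Python below is purely illustrative – the type names and fields are made up, not any particular protocol – but it makes the device-initiated vs. service-initiated split explicit:

```python
from dataclasses import dataclass, field
from enum import Enum
import time

class Initiator(Enum):
    DEVICE = "device"
    SERVICE = "service"

@dataclass
class Message:
    kind: str
    initiator: Initiator
    payload: dict
    timestamp: float = field(default_factory=time.time)

def telemetry(payload: dict) -> Message:
    """Device volunteers state to the collecting service."""
    return Message("telemetry", Initiator.DEVICE, payload)

def inquiry(payload: dict) -> Message:
    """Device solicits information beyond its own reach."""
    return Message("inquiry", Initiator.DEVICE, payload)

def command(payload: dict) -> Message:
    """Service instructs the device, possibly with physical effects."""
    return Message("command", Initiator.SERVICE, payload)

def notification(payload: dict) -> Message:
    """Service informs devices about environmental state."""
    return Message("notification", Initiator.SERVICE, payload)
```

The initiator field is the crux: for the two service-initiated kinds, some path from service to device must exist, which is exactly the problem discussed next.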

While Telemetry and Inquiries are device-initiated, their mirrored pattern counterparts, Commands and Notifications, are service-initiated – which means that there must be a network path for messages to flow from the service to the device, and that requirement bubbles up a set of important technical questions:

  • How can I address a device on a network in order to route commands and notifications to it?
  • How can I address a roaming and/or mobile device on a network in order to route commands and notifications to it?
  • How can I address a power constrained device on a network in order to route commands and notifications to it?
  • How can I send commands or notifications with latency that's acceptable for my scenario?
  • How can I ensure that the device only accepts legitimate commands and trustworthy notifications?
  • How can I ensure that the device is not easily susceptible to denial-of-service attacks that render it inoperable towards the greater system? (not good for building security sensors, for instance)
  • How can I do this with several hundred thousand or millions of devices attached to a telemetry and control system?

Most current approaches that I'm running into are trying to answer the basic addressing question with traditional network techniques. That means that the device either gets a public network address or it is made part of a virtual network and then listens for incoming traffic using that address, acting like a server. For using public addresses the available options are to give the device a proper public IPv4 or IPv6 address or to map it uniquely to a well-known port on a network address translation (NAT) gateway that has a public address. As the available pool of IPv4 addresses has been exhausted and network operators are increasingly under pressure to move towards providing subscribers with IPv6 addresses, there's hope that every device could eventually have its very own routable IPv6 address. The virtual network approach is somewhat similar, but relies on the device first connecting to some virtual network gateway via the underlying native network, and then getting an address assigned within the scope of the virtual network, which it shares with the control system that will use the virtual network address to get to the device.

Both of those approaches are reasonable from the perspective of answering the first, basic addressing question raised above – if you pretend for a moment that opening inbound ports through a residential edge firewall is acceptable. However, things get tricky enough once we start considering the other questions, like devices not being in the house, but on the road.

Roaming is tricky for addressing, and even trickier if the device is switching networks or is fully mobile and thus hopping through networks and occasionally dropping connections as it gets out of radio range. There are "Mobile IP" roaming standards for both IPv4 (RFC3344) and IPv6 (RFC6275), but those standards rely on a notion of traffic relaying through agents, and those agents are problematic at scale with very large device populations, as the relay has to manage and relay traffic for very many routes and also needs to keep track of the devices hopping foreign networks. Relaying obviously also has significant latency implications with global roaming. What even the best implementations of these standards-based approaches for roaming can't solve is that you can't connect to a device that's outside of radio coverage and therefore not connected at all.

The very same applies to the challenge of how to reliably deliver commands and notifications to power-constrained devices. Those devices may need to survive on battery power for extended periods (in some cases for years) between battery recharges, or their external power source, like "power stealing" circuits employed in home building automation devices, may not yield sufficient power for sustained radio connectivity to a base station. Even a vehicle battery isn't going to like powering an always-on radio when parked in the long-term airport garage while you're on vacation for 2 weeks.

So if a device design aims to conserve power by only running the radio occasionally, or if the device is mobile and frequently in and out of radio coverage or hopping networks, it gets increasingly difficult to reach it naively by opening a network connection to it and then hoping for that connection to remain stable, even if you're lucky enough to catch a moment when the device is indeed ready to talk. That all assumes the device indeed has a stable network address provided by one of the cited "Mobile IP" standards, or that it registers with an address registration/lookup service every time it comes online with a new address so that the control service can locate it.

All these approaches that aim to provide end-to-end network routes between devices and their control services are almost necessarily brittle. To execute a command, the service needs to locate the device, establish a connection to it, issue the command, and collect the command feedback, all while, say, a vehicle drives through a series of tunnels. Not only does this model rely on the device being online and available at the required moment, it also introduces a high number of tricky-to-diagnose failure points (such as the device flipping networks right after the service resolved its address) with associated security implications (who gets that newly orphaned address next?). It also has inherent reliability issues at the application layer, since any fault that occurs after the control system has sent the command introduces doubt in the control system about whether the command was successfully executed; and not all commands are safe to just blindly retry, especially when they have physical consequences.
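One common way to defuse the retry ambiguity – sketched below with hypothetical names, not any specific product's mechanism – is to tag every command with a unique id so the device deduplicates on receipt. A timeout-triggered resend then becomes safe even for commands with physical consequences:

```python
class Device:
    """Toy device-side command handler that deduplicates by command id."""

    def __init__(self) -> None:
        self.seen_ids: set[str] = set()  # in practice: bounded, persisted
        self.unlock_count = 0            # stands in for a physical effect

    def handle_command(self, command_id: str, op: str) -> str:
        if command_id in self.seen_ids:
            # Already executed: acknowledge again, but don't repeat the action.
            return "duplicate-ignored"
        self.seen_ids.add(command_id)
        if op == "unlock":
            self.unlock_count += 1
        return "executed"
```

With this in place, the control system's recovery rule after a lost acknowledgment is simply "resend with the same id", which removes the doubt without risking a double actuation.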

For stationary power-constrained or wirelessly connected devices, the common approach to bridging the last meters/yards is a hub device that's wired to the main network and can bridge to the devices that live on a local network. The WLAN hub(s) in many homes and buildings are examples of this, as there is obviously a need to bridge between devices roaming around the house and the ISP network. From an addressing perspective, these hubs don't change the general challenge much, as they themselves need to be addressable for the commands they then ought to forward to the targeted device, and that means you're still opening up a hole in the residential firewall, either by explicit configuration or via (don't do this) UPnP.

If all this isn't yet challenging enough for your taste, there's still security. Sadly, we can't have nice and simple things without someone trying to exploit them for malice or stupid "fun".

Trustworthy Communication

All information that's being received from and sent to a device must be trustworthy if anything depends on that information – and why would you send it otherwise? "Trustworthy communication" means that information is of verifiable origin, correct, unaltered, timely, and cannot be abused by unauthorized parties in any fashion. Even telemetry from a simple sensor that reports a room's temperature every five minutes can't be left unsecured. If you have a control system reacting on that input or do anything else with that data, the device and the communication paths from and to it must be trustworthy.

"Why would anyone hack temperature sensors?" – sometimes "because they can", sometimes because they want to inflict monetary harm on the operator or physical harm on the facility and what's in it. Neglecting to protect even one communication path in a system opens it up for manipulation and consequential harm.

If you want to believe the often-cited projection of 50 billion connected devices by 2020, the vast majority of those will not be classic information devices, and they will not be $500 or even $200 gadgets. Very many of these connected devices will rather be common consumer or industry goods that have been enriched with digital service capabilities. Or they might even just be super inexpensive sensors hung off the side of buildings to collect environmental information. Unlike apps on information devices, most of these services will have auxiliary functions. Some of these capabilities may even be largely invisible. If you have a device with built-in telemetry delivery that allows the manufacturer or service provider to sense an oncoming failure and proactively get in touch with you for service – which is something manufacturers plan to do – and then the device just never breaks, you may not even know such a capability exists, especially if the device doesn't rely on connectivity through your own network. In most cases, these digital services will have to be priced into the purchase price of the product, or monetized through companion apps and services, as it seems unlikely that consumers will pay for 20 different monthly subscriptions for connected appliances. It's also reasonable to expect that many devices sold will have the ability to connect, but their users will never intentionally take advantage of these features.

On the cost side, a necessary result of all this is that the logic built into many products will (continue to) use microcontrollers that require little power, have a small footprint, and are significantly less expensive than the high-powered processors and ample memory in today's information devices – trading compute power for much reduced cost. But trading compute power and memory for cost savings also means trading away cryptographic capability and, more generally, resilience against potential attacks.

The horror-story meme "if you're deep in the forest nobody will hear your screams" is perfectly applicable to unobserved field-deployed devices under attack. If a device were to listen for unsolicited traffic, meaning it listens for incoming TCP connections or UDP datagrams or some form of UDP-datagram based sessions and is thus acting as a server, it would have to accept and then triage those connection attempts into legitimate and illegitimate ones.

With TCP, even enticing the device to accept a connection is already a very fine attack vector, because a TCP connection burns memory in the form of a receive buffer. So if the device were to use a network protocol circuit like, for instance, the WizNet W5100 used on the popular enthusiast tinker platform Arduino Ethernet, the device's communication capability is saturated at just 4 concurrent connections, which an attacker could then service in a slow byte-per-packet fashion and thus effectively take the device out. As that happens, the device now also wouldn't have a path to scream for help through, unless it made – assuming the circuit supports it – an a priori reservation of resources for an outbound connection to whoever plays the cavalry.
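The exhaustion math is easy to model. The class below is a toy stand-in for a network chip with a hard four-socket budget, in the spirit of the limit just described; it is not driver code for any real chip:

```python
class DeviceSocketPool:
    """Toy model of a network chip with a fixed connection budget."""

    MAX_CONNECTIONS = 4

    def __init__(self) -> None:
        self.active: list[str] = []

    def accept(self, peer: str) -> bool:
        if len(self.active) >= self.MAX_CONNECTIONS:
            return False  # no slot left: the device is effectively mute
        self.active.append(peer)
        return True

    def close(self, peer: str) -> None:
        self.active.remove(peer)
```

An attacker who opens four connections and drips one byte per packet holds all slots indefinitely; the legitimate control service is locked out until a slot is freed, and the device has no spare channel to report the attack unless one was reserved up front.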

If we were to leave the TCP-based resource exhaustion vector out of the picture, the next hurdle is to establish a secure baseline over the connection and then to triage connections into good and bad. As the protocol world stands, TLS (RFC5246) and DTLS (RFC6347) are the kings of the security protocol hill, and I've discussed the issues with their inherent client promiscuity assumption above. If we were indeed connecting from a control service to a device in an outbound fashion, with the device acting as server, the model may be somewhat suitable, as the control service will indeed have to speak to very many, potentially millions of, devices. But contrary to the Web model where the browser has no idea where the user will send it, the control system has a very firm notion of the devices it wants to speak to. There are many of those, but there is no promiscuity going on. If they play server, each device needs its own PKI certificate (there is a specified option to use TLS without certificates, but it does not matter much in practice) with its own private key, since devices act as servers and you can't leak shared private keys into untrusted physical space, which is where most of the devices will end up living.

The strategy of using the standard TLS model and having the device play server has a number of consequences. First, whoever provisions the devices will have to be a root or intermediate PKI certification authority. That's easy to do, unless there were any need to tie into the grand PKI trust federation of today's Web, which is largely anchored in the root certificate store contents of today's dominant client platforms. If you had the notion that "Internet of Things" were to mean that every device could be a web server to everyone, you would have to buy yourself into the elite circle of intermediate CA authorities by purchasing the necessary signing certificates or services from a trusted CA, and that may end up being fairly expensive, as the oligopoly is protective of its revenues. Second, those certificates need to be renewed, and the renewed ones need to be distributed securely. And when devices get stolen or compromised, or the customer opts out of the service, these certificates also need to be revoked, and that revocation service needs to be managed and run and will have to be consulted quite a bit.

Also, the standard TLS configuration of most application protocol stacks ties certificate validation into DNS, and it's not obvious that DNS is the best choice for associating name and network address for devices that rapidly hop networks when roaming – unless of course you had a stable "home network" address as per IPv6 Mobile IP. But that would mean you are now running an IPv6 Mobile relay. The alternative is to validate the certificate by some other means, but then you'll be using a different validation criterion in the certificate subject and will no longer be aligned with the grand PKI trust federation model. Thus, you're back to effectively managing an isolated PKI infrastructure, with all the bells and whistles like a revocation service, and you will do so while you're looking for the exact opposite of the promiscuous security session model all that enables.

Let's still assume none of that would matter and (D)TLS with PKI dragged in its wake were okay and the device could use those and indeed act as a server accepting inbound connections. Then we're still faced with the fact that cryptography computation is not cheap. Moving crypto into hardware is very possible, but impacts the device cost. Doing crypto in software requires that the device deals with it inside of the application or underlying frameworks. And for a microcontroller that costs a few dollars that's non-negligible work. So the next vector to keep the device from doing its actual work is to keep it busy with crypto. Present it with untrusted or falsely signed client certificates (if it were to expect those). Create a TLS link (even IPSec) and abandon it right after the handshake. Nice ways to burn some Watts.

Let's still pretend none of this were a problem. We're now up at the application level with transport layer security underneath. Who is authorized to talk to the device and which of the connections that pop up through that transport layer are legitimate? And if there is an illegitimate connection attempt, where do you log these and if that happens a thousand times a minute, where do you hold the log and how do you even scream for help if you're pegged on compute by crypto? Are you keeping an account store in the device? Quite certainly not in a system whose scope is more than one device. Are you then relying on an external authentication and authorization authority issuing authorization tokens? That's more likely, but then you're already running a token server.
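The token-server route mentioned above can be sketched compactly: the authorization service and the device share a per-device key, and the device verifies short-lived signed tokens instead of keeping an account store. Everything here – the key, the token format, the field names – is illustrative, not a real protocol:

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical per-device secret, provisioned at manufacture and also
# known to the authorization service.
DEVICE_KEY = b"per-device-secret-provisioned-at-manufacture"

def issue_token(claims: dict, key: bytes = DEVICE_KEY) -> str:
    """Run by the authorization service: sign claims for one device."""
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def verify_token(token: str, key: bytes = DEVICE_KEY) -> Optional[dict]:
    """Run on the device: accept only authentic, unexpired tokens."""
    body, _, sig = token.rpartition(".")
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # forged or corrupted
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims.get("exp", 0) < time.time():
        return None  # expired
    return claims
```

Verification is one HMAC plus a clock check, which is cheap enough for a microcontroller; the hard parts – issuing, scoping, and revoking tokens – stay with the external token service, which is exactly the point.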

The truth, however inconvenient, is that non-interactive special-purpose devices residing in untrusted physical spaces are, without external help from services, essentially indefensible when acting as network servers. And this is all just on top of the basic fact that devices that live in untrusted physical space are generally susceptible to physical exploitation and that protecting secrets like key material is generally difficult.

Here's the recipe to eradicate most of the mess I've laid out so far: Devices don't actively listen on the network for inbound connections. Devices act as clients. Mostly.

Link vs. Network vs. Transport vs. Application

What I've discussed so far are considerations around the Network and Transport layers (RFC1122, 1.1.3), as I'm making a few general assumptions about connectivity between devices and control and telemetry collection systems, as well as about the connectivity between devices when they're talking in a peer-to-peer fashion.

First, I have so far assumed that devices talk to other systems and devices through a routable (inter-)network infrastructure whose scope goes beyond a single Ethernet hub, WLAN hotspot, Bluetooth PAN, or cellular network tower. Therefore I am also assuming the use of the only viable routable network protocol suite, the Internet Protocol (v4 and v6), and with it the common overlaid transport protocols UDP and TCP.

Second, I have so far assumed that the devices establish a transport-level and then also application-level network relationship with their communication peers, meaning that the device commits resources to accepting, preprocessing, and then maintaining the connection or relationship. That is specifically true for TCP connections (and anything riding on top of them), but is also true for network-level links like IPSec and for session-inducing protocols overlaid on UDP, such as DTLS, which sets up an agreement to secure subsequent datagrams.

The reason for assuming a standards-based Network and Transport protocol layer is that everything at the Link Layer (including physical bits on wire or through space) is quite the zoo, and one that I see growing rather than shrinking. The Link Layer will likely continue to be a space of massive proprietary innovation around creative use of radio frequencies, even beyond what we've seen in cellular network technology, where bandwidth has grown from basic GSM's 9.6 Kbit/s to today's 100+ MBit/s on LTE over the last 25 years. There are initiatives to leverage the new "white space" spectrum opened up by the shutdown of analog TV, there are services leveraging the ISM frequency bands, and there might be well-funded contenders for licensed spectrum emerging that use wholly new stacks. There is also plenty of action on the short-range radio front, specifically around suitable protocols for ultra-low-power devices. And there are obviously also many "wired" transport options over fiber and copper that have made significant progress, will continue to do so, and are essential for device scenarios, often in conjunction with a short-range radio hop for the last few meters/yards. Just as it was a losing gamble to bet specifically on Token Ring or ARCNET over Ethernet in the early days of Local Area Networking, it isn't yet clear which protocols and communication service infrastructures to bet on as the winners for the "Internet of Things" – not even on today's mobile network operators.

Betting on a particular link technology for inter-device communication is obviously reasonable for many scenarios: where the network is naturally scoped by physical means such as radio frequency reach and transmission power, where the devices are homogeneous and follow a common and often regulation-imposed standard, and where latency requirements are very narrow, bandwidth requirements are very high, or there is no tolerance for failure of intermediaries. Examples of this are in-house device networks for home automation and security, emerging standards for Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication, or Automatic Dependent Surveillance (ADS, mostly ADS-B) in Air Traffic Control. Those digital radio protocols essentially form peer meshes where everyone listens to everything in range and filters out what they find interesting or addressed specifically at them. And if the use of the frequencies gets particularly busy, coordinated protocols impose time slices on senders.

What such link-layer or direct radio information transfers have generally struggled with is trustworthiness – allow me to repeat: verifiable origin, correct, unaltered, timely, and cannot be abused by unauthorized parties in any fashion.

Of course, by its nature, all radio-based communication is vulnerable to jamming and spoofing, which has a grand, colorful military history as an offensive or defensive electronic warfare measure, along with fitting countermeasures (ECM) and even counter-countermeasures (ECCM). Radio is also, especially when used in an uncoordinated fashion, subject to unintended interference and therefore distortion.

ADS-B, which is meant to replace radar in Air Traffic Control, doesn't even have any security features in its protocol. The stance of the FAA is that it will detect spoofing by triangulation of the signals, meaning it can tell whether a plane that says it's at a particular position is actually there. We should assume they have done their ECM and ECCM homework.

IEEE 1609 for Wireless Access in Vehicular Environments, which aims to facilitate ad-hoc V2V and V2I communication, spells out an elaborate scheme to manage, use, and roll over X.509 certificates, but relies on the broad distribution of certificate revocation lists to ban once-issued certificates from the system. Vehicles are sold, have their telematics units replaced due to malfunction or crash damage, may be tampered with, or might be stolen. I can see the PKI's generally overly optimistic stance on revocations becoming challenging at the scale of tens if not hundreds of millions of vehicles, where churn will be very significant. The Online Certificate Status Protocol (OCSP, RFC6960) might help IEEE 1609 deal with the looming CRL caching issues due to size, but it then requires a very scalable validation server infrastructure that needs to be reachable whenever two vehicles want to talk, which is also not acceptable.

Local radio link protocols such as Bluetooth, WLAN (802.11x with 802.11i/WPA2-PSK), or ZigBee often assume that participants in a local link network share a common secret, and can keep that secret secret. If the secret leaks, all participants need to be rolled over to a new key. IEEE 802.1X, which is the foundation for RADIUS-based authentication and authorization of participants in a network and the basis of "WPA2 Enterprise", offers a way out of the dilemma of having to rely either on a federated trust scheme that has a hard time dealing with revocations of trust at scale, or on brittle pre-shared keys. 802.1X introduces the notion of an Authentication (and Authorization) server, a neutral third party that makes decisions about who gets to access the network.

Unfortunately, many local radio link protocols are not only weak at managing access, they also have a broad history of weak traffic protection. WLAN's issues got largely cleaned up with WPA2, but there are plenty of examples across radio link protocols where the broken WEP model or equivalent schemes are in active use, or where the picture is even worse. Regarding the inherent security of cellular network link-level protection, it ought to be sufficient to look at the recent scolding of politicians in Europe for absent-mindedly using regular GSM/UMTS phones without extra protection measures – and the seemingly obvious result of dead-easy eavesdropping by foreign intelligence services. Ironically, mobile operators make some handsome revenue by selling "private access points" (private APNs) that terminate cellular device data traffic in a VPN, which the customer then tunnels into across the hostile Internet to meet the devices on this fenced-off network – somehow pretending that the mobile network isn't just another operator-managed public network and therefore more trustworthy.

Link-layer protection mechanisms are largely only suitable for keeping unauthorized local participants (i.e. intruders) from getting link-layer data frames up to any higher-level network logic. In link-layer-scoped peer-to-peer network environments, the line between link-layer data frames and what's being propagated to the application is largely blurred, but the previous observation stays true. Even when employed, link-layer security mechanisms are not much help in providing security at the network and transport layers, as many companies are learning the hard way when worms and other exploits sweep through the inside of their triply-firewalled, WPA2-protected, TPM-tied-IPSec-protected networks, or as travelers can learn when they don't have a local firewall up on their machine or use plaintext communication when connecting to the public network at a café, airport, or hotel.

Of course, the insight that public networks are not trustworthy has led many companies interconnecting sites and devices down the path of using virtual private network (VPN) technology. VPN technology, especially when coming in the form of a shiny appliance, makes it very easy to put a network tunnel terminator on either end of a communication path made up of a chain of untrustworthy links and networks. The terminator on either end conveniently surfaces as a link-layer network adapter. VPN can fuse multiple sites into a single link-layer network, and it is a fantastic technology for that. But like all the other technologies I discussed above, link-layer protection is a zoning mechanism; the security mechanisms that matter for protecting digital assets and devices sit at the layers above it. There is no "S" for Security in "VPN". VPN provides secure virtual network cables; it doesn't make the virtual hub they plug into any more secure. Also, in the context of small devices as discussed above, VPN is effectively a non-starter due to its complexity.

What none of these link-layer protection mechanisms help with, including VPN, is establishing any notion of authentication and authorization beyond their immediate scope. A network application that sits on the other end of a TCP socket, where a portion of the route is facilitated by any of these link-layer mechanisms, is and must be oblivious to their existence. What matters for the trustworthiness of the information that travels from the logic on the device to a remote control system not residing on the same network, as well as for commands that travel back up to the device, is solely a fully protected end-to-end communication path spanning networks, where the identity of the parties is established at the application layer, and nothing else. The protection of the route at the transport layer by ways of signature and encryption is established as a service for the application layer either after the application has given its permission (e.g. certificate validation hooks) or just before the application layer performs an authorization handshake, prior to entering into any conversations. Establishing end-to-end trust is the job of application infrastructure and services, not of networks.

Service Assisted Communication

The findings from this discussion so far can be summarized in a few points:

  • Remote controllable special-purpose devices have a fundamentally different relationship to network services compared to information devices like phones and tablets and require an approach to security that enables exclusive peering with a set of services or a gateway.
  • Devices that take a naïve approach to connectivity by acting like servers and expecting to accept inbound connections pose a number of network-related issues around addressing and naming, and even greater problems around security, exposing themselves to a broad range of attack vectors.
  • Link-layer security measures have varying effectiveness at protecting communication between devices at a single network scope, but none is sufficient to provide a trustworthy communication path between the device and a cloud-based control system or application gateway.
  • The PKI trust model is fundamentally flawed in a variety of ways, including being too static and geared towards long-lived certificates, and it's too optimistic about how well certificates are and can be protected by their bearers. Its use in the TLS context specifically enables the promiscuous client model, which is the opposite of the desired model for special-purpose devices.
  • Approaches to security that provide a reasonable balance between system throughput, scalability, and security protection generally rely on third-party network services that validate user credentials against a central pool, issue security tokens, or validate assurances made by an authority for their continued validity.

The conclusion I draw from these findings is an approach I call "Service Assisted Communication" (SAC). I'm not at all claiming that the principles and techniques are an invention, as most are already broadly implemented and used. But I do believe there is value in putting them together here and giving them a name so that they can be effectively juxtaposed with the approaches I've discussed above.

The goal of Service Assisted Communication is to establish trustworthy and bi-directional communication paths between control systems and special-purpose devices that are deployed in untrusted physical space. To that end, the following principles are established:

  • Security trumps all other capabilities. If you can't implement a capability securely, you must not implement it. You identify threats and mitigate them, or you don't ship product. If you employ a mitigation without knowing what the threat is, you don't ship product, either.
  • Devices do not accept unsolicited network information. All connections and routes are established in an outbound-only fashion.
  • Devices generally only connect to or establish routes to well-known services that they are peered with. In case they need to feed information to or receive commands from a multitude of services, devices are peered with a gateway that takes care of routing information downstream and of ensuring that commands are only accepted from authorized parties before routing them to the device.
  • The communication path between device and service or device and gateway is secured at the application protocol layer, mutually authenticating the device to the service or gateway and vice versa. Device applications do not trust the link-layer network.
  • System-level authorization and authentication must be based on per-device identities, and access credentials and permissions must be near-instantly revocable in case of device abuse.
  • Bi-directional communication for devices that are connected sporadically due to power or connectivity concerns may be facilitated through holding commands and notifications to the devices until they connect to pick those up.
  • Application payload data may be separately secured for protected transit through gateways to a particular service.

The manifestation of these principles is the simple diagram on the right. Devices generally live in local networks with limited scope. Those networks are reasonably secured, with link-layer access control mechanisms, against intrusion to prevent low-level brute-force attacks such as flooding them with packets and, for that purpose, also employ traffic protection. The devices will obviously observe link-layer traffic in order to triage out solicited traffic, but they do not react to unsolicited connection attempts that would cause any sort of work or resource consumption from the network layer on up.

All connections to and from the device are made via, or at least facilitated via, a gateway, unless the device is peered with a single service, in which case that service takes on the role of the gateway. Peer-to-peer connections are acceptable, but only if the gateway permits them and facilitates a secure handshake. The gateway that the device peers with may live on the local network and thus govern local connections. Towards external networks, the local gateway acts as a bridge towards the devices and is itself connected by the same set of principles discussed here, meaning it acts like a device connected to an external gateway.

When the device connects to an external gateway, it does so by creating and maintaining an outbound TCP socket across a network address translation boundary (RFC2663), or by establishing a bi-directional UDP route, potentially utilizing the Session Traversal Utilities for NAT (STUN, RFC5389). Even though I shouldn't have to, I will explicitly note that the WebSocket protocol (RFC6455) rides on top of TCP and gets its bi-directional flow capability from there. There's quite a bit of bizarre information on the Interwebs about how the WebSocket protocol somehow newly and uniquely enables bi-directional communication, which is obviously rubbish. What it does is allow port sharing, so that WebSocket-aware protocols can share the standard HTTP/S ports 80 (RFC2616) and 443 (RFC2818) with regular web traffic and also piggyback on the respective firewall and proxy permissions for web traffic. The in-progress HTTP 2.0 specification will expand this capability further.

By only relying on outbound connectivity, the NAT/Firewall device at the edge of the local network will never have to be opened up for any unsolicited inbound traffic.

The outbound connection or route is maintained by either client or gateway in such a fashion that intermediaries like NATs will not drop it due to inactivity. That means that either side may periodically send some form of keep-alive packet or, even better, a payload packet that doubles as a keep-alive. Under most circumstances it will be preferable for the device to send keep-alive traffic, as it is the originator of the connection or route and can and should react to a failure by establishing a new one.
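To make this concrete, here is a minimal sketch of a device-side connection loop in Python. The gateway endpoint, the `KEEPALIVE` marker, and the newline-delimited framing are all hypothetical assumptions for illustration; any real deployment would use its actual application protocol on top of the socket.

```python
import socket
import time

GATEWAY = ("gateway.example.com", 5671)  # hypothetical gateway endpoint
KEEPALIVE_INTERVAL = 240.0  # seconds; must undercut the NAT's idle timeout

def keepalive_due(last_sent: float, now: float,
                  interval: float = KEEPALIVE_INTERVAL) -> bool:
    """True when the idle timer demands a keep-alive packet."""
    return now - last_sent >= interval

def run_device_loop():
    """Maintain a single outbound connection; reconnect on any failure."""
    while True:
        try:
            with socket.create_connection(GATEWAY, timeout=30) as conn:
                last_sent = time.monotonic()
                while True:
                    # Any payload packet doubles as a keep-alive; absent
                    # payload, send an explicit one before the NAT times out.
                    if keepalive_due(last_sent, time.monotonic()):
                        conn.sendall(b"KEEPALIVE\n")
                        last_sent = time.monotonic()
                    time.sleep(1)
        except OSError:
            # Route collapsed or the NAT dropped its mapping: the device,
            # as originator of the connection, establishes a new one.
            time.sleep(5)  # simple backoff before reconnecting
```

Note that the device never listens; all recovery is initiated from the inside out, which is exactly what keeps the NAT/firewall closed to unsolicited traffic.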

As TCP connections are endpoint concepts, a connection will only be declared dead once the route is considered collapsed, and detecting that fact requires packet flow. A device and its gateway may therefore sit idle for quite a while believing that the route and connection are still intact, until the lack of acknowledgement of the next packet proves that assumption incorrect. There is a tricky tradeoff decision to be made here. So-called carrier-grade NATs (or Large Scale NATs) employed by mobile network operators permit very long periods of connection inactivity, and mobile devices that get direct IPv6 address allocations are not forced through a NAT at all. The push notification mechanisms employed by all popular smartphone platforms utilize this to dramatically reduce the power consumption of the devices: by maintaining the route very infrequently, once every 20 minutes or more, the devices can largely remain in sleep mode, with most systems turned off, while idly waiting for payload traffic. The downside of infrequent keep-alive traffic is that the time to detection of a bad route is, in the worst case, as long as the keep-alive interval. Ultimately it's a tradeoff between battery power and traffic-volume cost (on metered subscriptions) on the one hand and acceptable latency for commands and notifications in case of failures on the other. The device can obviously be proactive in detecting potential issues and abandon the connection and create a new one when, for instance, it hops to a different network or recovers from signal loss.
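The tradeoff can be made tangible with a quick back-of-the-envelope calculation. The 64-byte packet size is an assumed illustrative figure, not a measured one:

```python
def keepalive_tradeoff(interval_s: float, packet_bytes: int = 64) -> dict:
    """Worst-case failure-detection latency and daily keep-alive volume
    for a given keep-alive interval (illustrative figures only)."""
    packets_per_day = 86400 / interval_s
    return {
        "worst_case_detection_s": interval_s,
        "daily_traffic_bytes": int(packets_per_day * packet_bytes),
    }

# A 20-minute interval (as used by phone push services) vs. 30 seconds:
# 72 packets/day at ~4.6 KB total vs. 2880 packets/day at ~184 KB total,
# but failure detection may take up to 20 minutes in the former case.
relaxed = keepalive_tradeoff(1200)
aggressive = keepalive_tradeoff(30)
```

The forty-fold difference in traffic (and the corresponding radio wake-ups) is precisely what the phone platforms are buying with their long intervals, at the price of slow failure detection.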

The connection from the device to the gateway is protected end-to-end, ignoring any underlying link-level protection measures. The gateway authenticates with the device and the device authenticates with the gateway, so neither is anonymous towards the other. In the simplest case, this can occur through the exchange of some proof of possession of a previously shared key. It can also happen via a (heavy) X.509 certificate exchange as performed by TLS, or a combination of a TLS handshake with server authentication where the device subsequently supplies credentials or an authorization token at the application level. The privacy and integrity protection of the route is also established end-to-end, ideally as a byproduct of the authentication handshake, so that a potential attacker cannot waste cryptographic resources on either side without producing proof of authorization.

The current best option is a combination of the simple authentication model of SSH (pre-shared keys) with the established foundation of TLS. Luckily, this exists in the form of TLS-PSK (RFC4279), which enables pre-shared keys as credentials and eliminates the weight of the X.509 certificate handling and wire-level exchange. The pre-shared key can be used as the session key proper (in the simplest case) or can be used as a credential and basis for a Diffie-Hellman session key exchange. The result is a fairly lightweight mechanism that can build on a narrow set of algorithms (like AES-256, SHA-256) on compute- and library-footprint-constrained devices, and still is compatible with all application layer protocols that rely on TLS.
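To illustrate the two things a pre-shared key buys – proof of possession without ever transmitting the key, and a fresh per-session key so the long-lived secret never protects traffic directly – here is a conceptual Python sketch using HMAC. This is not the TLS-PSK wire protocol itself; the handshake framing, nonce sizes, and derivation label are assumptions for illustration:

```python
import hmac
import hashlib
import os

def proof(psk: bytes, challenge: bytes) -> bytes:
    """Prove possession of the PSK without transmitting it."""
    return hmac.new(psk, challenge, hashlib.sha256).digest()

def session_key(psk: bytes, client_nonce: bytes, server_nonce: bytes) -> bytes:
    """Derive a per-session key from the PSK and both parties' nonces,
    so the long-lived key never protects traffic directly."""
    return hmac.new(psk, b"session" + client_nonce + server_nonce,
                    hashlib.sha256).digest()

# Hypothetical handshake: the gateway challenges, the device answers.
psk = os.urandom(32)                   # provisioned out-of-band
challenge = os.urandom(16)             # gateway picks a fresh nonce
device_answer = proof(psk, challenge)  # computed on the device
# Gateway recomputes and compares in constant time:
assert hmac.compare_digest(device_answer, proof(psk, challenge))
```

A real implementation would let TLS-PSK handle all of this inside the record protocol; the point of the sketch is only that the mechanism needs nothing heavier than a keyed hash, which is why it suits small devices.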

The result of the application-level handshake is a secure peer connection between the device and a gateway that only the gateway can feed. The gateway can, in turn, now provide one or even several different APIs and protocol surfaces that can be translated to the primary bi-directional protocol used by the device. The gateway also provides the device with a stable address, in the form of an address projected onto the gateway's protocol surface, and therefore also with location transparency and location hiding.

The device may speak only AMQP or MQTT or some proprietary protocol, and yet have a full HTTP/REST interface projection at the gateway, with the gateway taking care of the required translation and also of enrichment, where responses from the device can be augmented with reference data, for instance. The device can connect from any context and can even switch contexts, yet its projection into the gateway and its address remain completely stable. The gateway can also be federated with external identity and authorization services, so that only callers acting on behalf of particular users or systems can invoke particular device functions. The gateway therefore provides basic network defense, API virtualization, and authorization services all combined in one.

The gateway model gets even better when it includes or is based on an intermediary messaging infrastructure that provides a scalable queuing model for both ingress and egress traffic.

Without this intermediary infrastructure, the gateway approach would still suffer from the issue that devices must be online and available to receive commands and notifications when the control system sends them. With a per-device queue or per-device subscription on a publish/subscribe infrastructure, the control system can drop a command at any time, and the device can pick it up whenever it's online. If the queue provides time-to-live expiration alongside a dead-lettering mechanism for such expired messages, the control system can also know immediately when a message has not been picked up and processed by the device in the allotted time.
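The queue semantics just described – time-to-live expiration, dead-lettering for unconsumed commands, and (as discussed below) a backlog threshold – can be sketched in a few lines. This is an illustrative in-memory model, not a real broker; a production system would add durability, acknowledgements, and concurrency control:

```python
import time
from collections import deque

class DeviceQueue:
    """Per-device command queue with time-to-live expiration and a
    dead-letter queue for commands the device never picked up."""

    def __init__(self, ttl_seconds: float, max_backlog: int = 100):
        self.ttl = ttl_seconds
        self.max_backlog = max_backlog
        self.pending = deque()      # (enqueued_at, command)
        self.dead_letter = deque()  # expired commands, visible to the sender

    def enqueue(self, command, now=None) -> bool:
        """Accept a command unless the backlog threshold is exceeded,
        in which case the control system is told immediately."""
        now = now if now is not None else time.monotonic()
        self._expire(now)
        if len(self.pending) >= self.max_backlog:
            return False  # device overtaxed; reject on its behalf
        self.pending.append((now, command))
        return True

    def dequeue(self, now=None):
        """Called when the device connects and fetches on its own schedule."""
        now = now if now is not None else time.monotonic()
        self._expire(now)
        return self.pending.popleft()[1] if self.pending else None

    def _expire(self, now):
        # Move messages past their time-to-live into the dead-letter queue,
        # so the sender learns the command was never processed in time.
        while self.pending and now - self.pending[0][0] > self.ttl:
            self.dead_letter.append(self.pending.popleft()[1])
```

The `now` parameter exists only to make the sketch deterministic; the essential point is that the device pulls at its own pace while the gateway absorbs and polices the backlog.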

The queue also ensures that the device can never be overtaxed with commands or notifications. The device maintains one connection into the gateway and fetches commands and notifications on its own schedule. Any backlog forms in the gateway and can be handled there accordingly. The gateway can start rejecting commands on the device's behalf if the backlog grows beyond a threshold, or the cited expiration mechanism kicks in, and the control system gets notified that the command cannot be processed at the moment.

On the ingress side (from the gateway's perspective), using a queue has the same kind of advantages for the backend systems. If devices are connected at scale and input from the devices comes in bursts or has significant spikes around certain hours of the day – as with telematics systems in passenger cars during rush hour – having the gateway absorb the traffic spikes is a great way to keep the backend system robust. The ingestion queue also allows telemetry and other data to be held temporarily when the backend systems or their dependencies are taken down for service or suffer from service degradation of any kind. You can find more on the usage of brokered messaging infrastructures for these scenarios in an MSDN Magazine article I wrote a year back.


An "Internet of Things" where devices reside in unprotected physical space and where they can interact with the physical world is a very scary proposition if we solely rely on naïve link and network-level approaches to connectivity and security, which are the two deeply interwoven core aspects of the "I" in "IoT". Special-purpose devices don't benefit from constant human oversight as phones and tablets and PCs do, and we struggle even to keep those secure. We have to do a better job, as an industry, to keep the devices secure that we want to install in the world without constant supervision.

"Trustworthy communication" means that information exchanged between devices and control systems is of verifiable origin, correct, unaltered, timely, and cannot be abused by unauthorized parties in any fashion. Such trust cannot be established at scale without employing systems that are designed for the purpose and keep the "bad guys" out. If we want smarter devices around us that help to improve our lives and are yet power-efficient and affordable, we can't leave them alone in untrustworthy physical space to take care of their own defenses, because they won't be able to.

Does this mean that the refrigerator cannot talk to the laundry washing machine on the local network? Yes, that is precisely what that means. Aside from that idea being somewhat ludicrous, how else does the washing machine defend itself from a malicious refrigerator if not through a gateway that can? Devices that are unrelated and are not part of a deeply integrated system meet where they ought to meet: on the open Internet, not "behind the firewall".

Categories: Architecture | Technology

Terminology that loosely ring-fences a group of related technologies is often very helpful in engineering discussions – until the hype machine gets a hold of them. “Cloud” is a fairly obvious victim of this. Initially conceived to describe large-scale, highly-available, geo-redundant, and professionally-managed Internet-based services that are “up there and far away” without the user knowing of or caring about particular machines or even datacenter locations, it’s now come so far that a hard drive manufacturer sells a network attached drive as a “cloud” that allows storing content “safely at home”. Thank you very much. For “cloud”, the dilution of the usefulness of the term took probably a few years and included milestones like the labeling of datacenter virtualization as “private cloud” and more recently the broad relabeling of practically all managed hosting services or even outsourced data center operations as “cloud”.

The term “Internet of Things” is being diluted into near nonsense even faster. It was initially meant to describe, as a sort of visionary lighthouse, the interconnection of sensors and physical devices of all kinds into a network much like the Internet, in order to allow for gaining new insights about and allow new automated interaction with the physical world – juxtaposed with today’s Internet that is primarily oriented towards human-machine interaction. What we’ve ended up with in today’s discussions is that the term has been made synonymous with what I have started to call “Thing on the Internet”.

A refrigerator with a display and a built-in browser that allows browsing the nearest supermarket's special offers, including the ability to order them, may be cool (at least on the inside, even when the gadget novelty has worn off), but it's conceptually and even technically not different from a tablet or phone – and that would even be true if it had a bar code scanner with which one could obsessively check the milk and margarine in and out (in which case professional help may be in order). The same is true for the city guide or weather information functions in a fancy connected-car multimedia system, or today's top news headline being burnt into a slice of bread by the mythical Internet toaster. Those things are things on the Internet. They're the long-oxidized fuel of the 1990s dotcom boom and fall. Technically and conceptually boring. Islands. Solved problems.

The challenge is elsewhere.

“Internet of Things” ought to be about internetworked things, about (responsibly) gathering and distributing information from and about the physical world, about temperature and pollution, about heartbeats and blood pressure, about humidity and mineralization, about voltages and amperes, about liquid and gas pressures and volumes, about seismic activity and tides, about velocity, acceleration, and altitude – it’s about learning about the world’s circumstances, drawing conclusions, and then acting on those conclusions, often again affecting the physical world. That may include the “Smart TV”, but not today’s.

The “Internet of Things” isn’t really about things. It’s about systems. It’s about gathering information in certain contexts or even finding out about new contexts and then improving the system as a result. You could, for instance, run a bus line from the suburbs into town on a sleepy Sunday morning with a promise that no passenger ever waits more than, say, 10 minutes – instead of running on a fixed schedule of every 60-90 minutes that morning – and make public transport vastly more attractive, if the bus system only knew where the prospective passengers were and could dynamically dispatch and route a few buses along a loose route.

“Let’s make an app” is today’s knee-jerk approach to realizing such an idea. I would consider it fair if someone were to call that discriminatory and elitist, as it excludes people too poor to afford a $200 pocket computer with a service plan, as well as many children, and very many elderly people who went through their lives without always-on Internet and have no interest in dealing with it now.

It’s also an unnecessary complication, because the bus stop itself can, with a fairly simple (thermographic) camera setup, tell the system whether anyone’s waiting and also easily tell whether they’re actually staying around or end up wandering away, and the system can feed back the currently projected arrival time to a display at the bus stop – which can be reasonably protected against vandalism attempts by shock and glass-break sensors triggering alarms, as well as by remote-recording any such incidents with the camera. The thermographic camera won’t tell us which bus line the prospective passenger wants to take, but a simple button might. It also helps to easily tell when a rambunctious 10-year-old pushes all the buttons and runs away.

Projecting the bus’ arrival time and planning the optimal route can be aided by city-supplied traffic information collected by induction loops and camera systems in streets and on traffic lights at crossings, which can yield statistical projections by day and time of day, as well as ad-hoc data about current traffic disturbances or diversions and about street conditions due to rain, ice, or fog – data which is also supplied by the buses themselves (‘floating car data’) as they move along in traffic. It’s also informed by the bus driver’s shift information, the legal and work-agreement-based needs for rest times during the day, as well as the bus’ fuel or battery level, or other operational health parameters that may require a stop at a depot.

All that data informs the computation of the optimal route, which is provided to the bus stops, to the bus (and its driver), and to those lucky passengers who can afford a $200 pocket computer with a service plan and have asked to be notified when it’s time to leave the corner coffee shop in order to catch the next bus in comfort. What we have in this scenario is a set of bidirectional communication paths from and to bus, bus driver, bus stop, and passengers, aided by sensor data in streets and lights, all connecting up to an interconnected set of control and information systems that make decisions based on a combination of current input and past experience. Such systems need to ingest, process, and distribute information from and to tens of thousands of sources at the municipal level, and for them to be economically viable for operators and vendors they need to scale across thousands of municipalities. And the scenario I just laid out here is just one slice out of one particular vertical.
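At its core, the dispatch decision in this scenario is an assignment problem: which free bus serves which waiting stop so that the 10-minute promise holds. The following is a minimal, hypothetical sketch of a greedy version of that decision – the data shapes, names, and strategy are illustrative assumptions, not any real transit system’s algorithm:

```python
# Hypothetical sketch: greedily dispatch the nearest free bus to the stop
# whose passengers have been waiting longest, honoring a max-wait promise.
def dispatch(buses, stops, max_wait_min=10):
    """buses: {bus_id: position}, stops: {stop_id: (position, minutes_waited)}.
    Positions are simplified to minutes-of-travel along a 1-D route."""
    assignments = {}
    free_buses = dict(buses)
    # Serve the longest-waiting stops first.
    for stop_id, (pos, waited) in sorted(stops.items(), key=lambda s: -s[1][1]):
        if not free_buses:
            break
        # Pick the bus that can reach this stop soonest.
        bus_id = min(free_buses, key=lambda b: abs(free_buses[b] - pos))
        eta = abs(free_buses[bus_id] - pos)
        if waited + eta <= max_wait_min:   # only dispatch if the promise holds
            assignments[stop_id] = (bus_id, eta)
            del free_buses[bus_id]
    return assignments

buses = {"bus1": 0, "bus2": 12}
stops = {"stopA": (3, 6), "stopB": (11, 2)}
print(dispatch(buses, stops))  # {'stopA': ('bus1', 3), 'stopB': ('bus2', 1)}
```

A production system would of course replace the greedy pass with a proper routing optimizer fed by the traffic and telemetry inputs described above; the sketch only shows where those inputs land.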

Those systems are hard, complex, and pose challenges in terms of system capacity, scalability, reliability, and – towering above all – security that are at the cutting edge or often still beyond the combined manufacturing (and IT) industry’s current abilities and maturity.

“Internet of Things” is not about having a little Arduino box fetch the TV schedule and sounding an alarm when your favorite show is coming on.

That is cool, but it’s just a thing on the Internet.

Categories: Architecture

I published a new video over on Subscribe about the "Internet of Things". Check it out.

Categories: Architecture

We're talking a lot about "Mobile" solutions in the industry, but the umbrella that this moniker casts has become far too big to be useful and doesn't represent any particular scenario subset that's useful for planning services for "mobile" devices. Nearly every personal computing scenario that consumers encounter today is "mobile".

This post is a personal perspective on "mobile" applications and how applications that run on devices labeled under this umbrella really enable a range of very different scenarios, and do require a different set of backend services, depending on the core scenario.

From this perspective, I present a taxonomy for these experiences that may be helpful with regard to their relationship to cloud services: Mobile, Outside, Inside, and Attached.

  • Mobile applies to all scenarios where I am literally mobile. I'm moving.
  • Outside applies to all scenarios where I'm away from my office or my house, but at rest.
  • Inside applies to all scenarios where I'm at the office or at the house, but potentially roaming.
  • Attached applies to all scenarios where the experience is immovably attached to a physical location or device, appliance, or other machinery.


As soon as I get into my car and drive, my $700 phone is not really all that useful. Things will beep and ring and try to catch my attention, but they do that without respecting my physical world situation where I probably shouldn't pay much if any attention.

That said – the phone does integrate with my car's entertainment system, so that I can listen to music, podcasts, and Internet streams, and the phone functionality also integrates for a hands-free experience. It also reads text messages to me out loud as they arrive when the phone is paired with my car. That all happens because the OS supports these core functions directly.

In the case of my particular car, a 2013 Audi A6 with MMI Touch and Audi Connect services (I'm not at all meaning to boast here), the phone/entertainment system even has its own SIM, so all my phone gets to contribute is its address book and the audio link for playing songs from the phone. Text messages, phone communication, and the car's built-in navigation features, including live traffic data, are all natively supported by the vehicle without needing the phone's help.

To help make sure that the traffic data is accurate, the vehicle sends – today; this isn't science fiction – anonymous motion information, so-called "floating car data" telemetry, into a data pool where it gets analyzed and yields real-time information about slowdowns and traffic jams, complementing stationary systems.

If you need to catch my attention while I am mobile in the sense of 'in motion' you either have to call me and hope that I choose to pick up and am not talking to someone else already, leave me a voice mail, send me a text message and hope I'll call back or, otherwise, wait. A text message will reach me right in the central dashboard display of my car.

If you send me a Twitter message, something on Facebook, or an email, I most certainly won't see it until it's safe for me to take my eyes off the street – because those channels are much less targeted and not integrated into the experience that my personal safety depends on in that situation.

When I'm walking on the Microsoft campus or I'm at an airport in line to board a plane, it's very similar. You can try to reach me via any of these channels, but it's not too unlikely that I'll make you wait when the immediate circumstances demand my attention. Boarding that plane or getting to the next building in time for a meeting with 20 people while it's raining outside is likely of higher urgency than your message – I'm sorry.

A 'mobile' experience is one that supports my mobility and the fact that my primary focus is and must be elsewhere. It can augment that experience but it must not attempt to take center stage because that ought not to be its role. The "My Trips" app for TripIt.com on Windows Phone is a near perfect example of an experience that is truly tailored to mobility. The app doesn't make me ask questions. It knows my itinerary and anticipates what info I will need the next time I look at the live tile.

When I'm arriving at an airport, it will have looked up my connecting flight and will have sent a notification – or will repeatedly try to send one – to fill the Live Tile with information about the connecting flight's status and gate. I don't even have to open the app. If there are critical disruptions it will send me a Toast notification that comes with an audible alarm and vibration to help get my attention.

Avis, the rental car company, does the same for me via email and their app, since I'm a "Preferred" customer. Just before the scheduled pick-up time, which they can also adjust since I give them my flight info, I get a timely email with all the information I need to proceed straight to the stall where my rental car is parked – and I'll find it within the last handful of emails as my plane lands. I proceed to the rental car facility, get into the vehicle, and receive the rental agreement slip as I exit the facility, presenting my driver's license. No need to ask for anything; the system anticipates what I'll need and it excels at that.

The phone's calendar is obviously similar. It will show me the next relevant appointment including the location info so that's available at a glance when I just look at the phone while I'm walking to another building; and it will provide the most recent updates so if the meeting gets moved between rooms as I'm on my way then I'll see that reflected on the lock screen.

All these mobile experiences that I'm using today as I'm traveling share that they are decoupled, asynchronous, often time-driven, and message-based. I don't ask for things. I respond and react to what needs my urgent attention, and otherwise I observe and then "get to it" when I truly have time to focus on something other than getting from A to B and being mobile. Mobility is driven by messaging, not by request/response.


Being on the road doesn't literally mean driving all the time, of course. Once I sit down and indeed start interacting with a device in order to read email, go through my other messages, read/watch news, or get some work done, I am still outside of the office or the house, but I am not yet on the move. I am at rest in relative safety and can pay closer attention to the interaction with my information device.

The shape of that interaction differs from the pure mobile experience in that I commonly ask questions and interact with the device, with focus on the device experience. That includes everything from browsing the news and researching on Wikipedia to watching training videos or enjoying a movie. Listening to podcasts and/or radio is also one of those experiences, even if we're often doing so while on the move, i.e. walking or driving, since we're instantly able to turn our attention to more important matters as needed – like a nearing ambulance – if we're managing the audio volume as appropriate for the situation.

The outside experience is one where I can indeed get at most of my data assets, as much of it is readily accessible from anywhere since it's stored on the cloud or networked and accessible via VPN. Whether the device I am using to access that data is connected via 3G, LTE, WLAN, or wired Ethernet, and whether the screen is 5" or 27" is largely a question of what sort of an experience I'm looking for, and how big of a device I want to carry to where I'm going.

For many, if not most consumers, this outside experience is often the preferred interaction mode with their devices – and when they own only a single device it's largely indistinguishable from the Inside experience that I'll expand on in the next section. They sit in a cafe or elsewhere comfortable with connectivity, make notes, write email, hatch plans, capture snippets of their life in photos or videos and share them with friends through Instagram, Twitter, or Facebook.

For me, the Outside experience is however quite different from the Inside experience because it's constrained in two key ways: First, while and when connectivity is available, it's commonly either metered or provided on someone else's terms, free or paid, which means I don't get a say on bandwidth and quality – and the bandwidth may be seriously constrained, as it is, for instance, in most hotels.

What Outside also often means is that connectivity is sparse or non-existent. If I'm traveling and outside the country where I have my primary data contract, I will pay a platinum-coated-bits premium for data. Therefore I find myself Hotspot-hopping quite a bit. Outside may also mean that I'm going away from the core coverage zones of wireless networks, which means that I might quite well end up with no reliable access to network services because I'm either in a remote valley or inside the Faraday-Cage hull of a ship. It might also mean that I am in a stadium with 52,000 other people who are trying to use the same set of cell towers – which is the case about every two weeks for me.

Second, what I am connecting to is a shared network that I cannot trust, which is not well suited for easy discovery and sharing scenarios that rely on UPnP/SSDP and similar protocols.

From an infrastructure perspective, apps that focus on Outside experiences work best if they can deal with varying quality and availability of connectivity, and if they are built to hold and/or access data in a way that is independent – for better and worse – of the scope and sandboxing provided by the local network that I'm connecting to. Thus, Outside experiences are best suited for using cloud-based services.
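The store-and-forward posture this implies – keep working locally, sync opportunistically – can be sketched in a few lines. This is an illustrative buffer, not any particular framework's API; the class and method names are assumptions:

```python
# Illustrative store-and-forward buffer for an Outside-pattern app: writes
# accumulate locally, and a flush pass drains them whenever connectivity
# happens to be available.
class OutboundBuffer:
    def __init__(self, send):
        self.send = send          # callable that raises ConnectionError offline
        self.pending = []

    def enqueue(self, message):
        self.pending.append(message)

    def flush(self):
        """Try to drain the buffer; stop at the first failure and retry later."""
        while self.pending:
            try:
                self.send(self.pending[0])
            except ConnectionError:
                break             # still offline; keep the message queued
            self.pending.pop(0)
        return len(self.pending)  # messages still waiting

sent = []
buf = OutboundBuffer(sent.append)
buf.enqueue("photo-1")
buf.enqueue("note-2")
print(buf.flush(), sent)   # 0 ['photo-1', 'note-2']
```

The key design point is that the app's correctness never depends on the network being up at the moment of the user's action – only on the flush eventually succeeding.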


The Inside experience is much like the Outside one but with the key difference that I either directly own the environment that I'm connecting into or that I at least have reason to trust the owners and anyone else they allow to connect to the environment. That's true for my home network and it's also true, even though with a few caveats, for the office network.

The Outside/Inside split and a further differentiation of Inside into work and home environments is also what the Windows Firewall uses to categorize networks. The public, outside networks are on the lowest trust level, domain networks are a notch higher, and private networks are most trusted.

The experiences that I use on my Inside network at home are indeed different from the experiences I use when Outside. Xbox SmartGlass is a pure Inside experience that pairs my mobile device with my Xbox as a companion experience. The Xbox connects to my Windows Media Center to make my DVB-S2 SmartCard tuner available in the guest room, I have a remote control on my phone for my Onkyo A/V receiver, I have IPTV apps with which I can tune into HDTV streams available on my Internet service, and I use file sharing to access my multi-TB local photo archive.

A great Inside-experience needs services that are very similar to those of Outside experiences, including state-roaming between devices, and even more so support for seamless multi-device "continuous client" experiences – but they are not necessarily cloud-bound.


Some of the latter Inside experiences, especially the photo archive, are on the brink of being Attached scenarios. Since I'm shooting photos in RAW and video in 1080p/50, I easily bring home 30GB or more from a day out at a museum or air show, and I tend to keep everything. That much data develops quite a bit of gravitational pull, meaning to say that it's not easily moved around.

What's not easily moved around, at all, are experiences that depend on a particular physical asset that is located at a particular place. The satellite dish at my house is something I need to be close to or go to (in the network sense) in order to get at content that is exclusively delivered via that channel. It also has to be decoded with that one precious smart card that I rent from the Pay-TV provider.

If I had surveillance cameras and motion sensors around the house (I'll let you speculate on whether I really do), those cameras and sensors are location locked and I need to go to them. I can conceivably take a WLAN hub and my Xbox when I go on a vacation trip (and some people do) to make an Inside experience at a hotel room, but I can hardly take the satellite dish and the cameras.

In the business world, even when interacting with consumers, there are plenty of these immobile experiences. An ATM is a big and heavy money safe designed to be as immobile as possible that is equipped with a computer that controls how much cash I can take from that safe. A check-in terminal at an airport makes sense there as a shared experience because it gives me a printout of a document – the boarding pass – that I can use to document my authorization to travel on a particular flight. That's convenient, since paper doesn't run out of battery.

What's particularly noteworthy is that some attached experiences, such as the huge center screen in Tesla Motors' Model S, are attached and inseparable from the larger context, and yet fulfill a mobility role at times – and at other times they function like an outside appliance.

We encounter "attached" experiences while we are mobile, but they're stationary in their own context. That context may, however, be mobile if the attached experience is an in-flight entertainment system or an information terminal on a train.


The Mobile, Inside, Outside, Attached terminology may be a tad factual and dry, but I believe it's a useful taxonomy nevertheless. If you have a set of catchier monikers, I'm all ears. Let me know whether you find this useful.

Categories: Architecture

"Internet of Things" (IoT) is the grand catchphrase for network-enabling everyday objects and leveraging the new connectivity to collect information from the devices, allowing network-side control, and supplying information to those objects that allows them to do new tricks – like telling a toaster about the day's weather forecast so that it can burn a sun or a cloud into your morning slice of bread.

The opportunities and the use-cases in this space are almost limitless. Network-enabled commercial vehicles, or even subsystems like engines or brake systems, can leverage the connectivity for conveying servicing information and predictive failure analysis, for route optimization, and for driver-safety programs. Devices attached to power grid components can provide deeper insight into the health of even decades-old equipment, and they can help with managing capacity now that consumers turn into producers with wind-power generators and the flow of electricity has become two-way. Smart devices will also help consumers closely track their own energy consumption and, obviously, automate aspects of their households.

The "Internet of Things" wave will span many more industries and I could easily go on for many pages with more examples of which none are science-fiction. They're real and either already deployed in small scale or on the drawing boards of engineers under active development.

What's decidedly different about this new Internet wave is that it is driven by an entirely different class of people and companies than the wave of the consumer Web. The "Internet of Things" is (surprise!) about "Things", and the drivers are the makers of such things: everyday devices and machines that consumers and businesses are buying today and that the manufacturers are looking to make better for tomorrow.

The Web is the grand success that it is because it was built for and on people-centric, general-purpose computing devices. It's also largely focused on people's interaction with information stores and sources, meaning that the impulse for interaction usually comes from the end-user. Where user-initiated requests and the subsequent, hypertext-driven requests rooted in the same impulse are the overwhelming traffic motivation, the dominant HTTP protocol, with its focus on request/response exchanges initiated by the client, is a logical best fit.

Web technology also provides for an enormously low barrier of entry for new commercial providers of software-based services, as those are running on commodity hardware and solution builders can pick from a great choice of commoditized software platforms.

The Internet of Things is not the Web

There are some good reasons to believe that the Internet of Things will be different from the Web.

To play in this space, companies typically start with a set of existing physical products, a set of use-cases around these products, and very often an established and loyal customer base that's looking for new capabilities in the things they already use for their business or personal lives. There are a good number of startups operating in this space, but disrupting and displacing incumbents in the maritime industries, commercial vehicles, specialized production machinery, or high-speed train manufacturing may quite well happen at the component supplier level, but is harder to see coming at the product level.

If you are operating fleets of seagoing vessels, taxis, or trucks, or you run an electricity grid or manage street-lamps, there's a clear set of primary use-cases, like shipping things from A to B, along with well-established industries supplying machinery for these use-cases, and the digital component is a value add, not a purpose in itself.

From a 40,000-foot view, the topology of a network of devices may look very similar to the topology of the Web, with an interconnected core of services and a peripheral cloud of clients. As you get closer, though, the similarities start waning. While the Web is geared towards a primary interaction pattern where the clients initiate all activities and interaction is typically some form of information exchange – may that be a query being traded for a result list or a data-set update traded for a receipt – the interaction patterns are more differentiated for special-purpose devices where direct, human interaction is not in focus.

I generally classify the interaction patterns for 'things' into four major categories: Telemetry, Inquiries, Commands, and Notifications.

  • Telemetry is the flow of information about the current or temporally aggregated state of the device or the state of its environment (e.g. readings from its sensors) from the device to some other party. The information flow is unidirectional and away from the device.
  • Inquiries are questions that the device has about the state of the outside world based on its current circumstances; an inquiry can be a singular query akin to a database lookup, but it might also ask a service to supply a steady flow of information. For instance, the aforementioned toaster will ask for the weather and get a singular response, but a vehicle might supply a set of geo-coordinates for a route and ask for continuous traffic alert updates about that particular route until it arrives at the destination. Only the former of these cases is the regular request/response case that HTTP is geared towards.
  • Commands are service-initiated instructions sent to the device. Commands can tell a device to send information about its state – either as a point-in-time observation or continuously over some period – or to change the state of the device, including performing activities with effects in the physical world. That includes, for instance, sending a command from a smartphone app to unlock the doors of your vehicle, whereby the command first flows to an intermediating service and from there is routed to the vehicle's onboard control system.
  • Notifications are one-way, service-initiated messages that inform a device or a group of devices about some environment state they're otherwise not aware of. Cities may broadcast information about air pollution alerts suggesting that fossil-fuel-burning systems throttle their CO2 output – or, more simply, a car may want to show weather or news alerts or text messages to the driver.
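The four categories differ mainly in who initiates the exchange and which way the payload flows, which can be captured in a small schema. A hypothetical sketch – the envelope shape and field names are illustrative assumptions, not a standard wire format:

```python
# Hypothetical message envelope capturing the four interaction categories.
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    TELEMETRY = "telemetry"        # device -> service, one-way
    INQUIRY = "inquiry"            # device -> service, expects response(s)
    COMMAND = "command"            # service -> device, state-changing
    NOTIFICATION = "notification"  # service -> device(s), one-way

@dataclass
class Envelope:
    kind: Kind
    source: str
    target: str
    body: dict

    @property
    def device_initiated(self):
        # Telemetry and Inquiries originate at the device; Commands and
        # Notifications originate at the service.
        return self.kind in (Kind.TELEMETRY, Kind.INQUIRY)

reading = Envelope(Kind.TELEMETRY, "sensor-17", "ingest", {"temp_c": 21.5})
unlock = Envelope(Kind.COMMAND, "app-gw", "vehicle-42", {"op": "unlock"})
print(reading.device_initiated, unlock.device_initiated)  # True False
```

Making the initiator explicit in the schema is what forces the bi-directionality requirement discussed next to the surface.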

For all four interaction categories it is, except for that one request-response sub-scenario, a clear requirement to have bi-directional information flow that can be client-initiated or server-initiated, depending on the particular function's need.

A requirement that indirectly grows out of the need for server-initiated flow (like telling your vehicle to toot a tune on its horn using a smartphone app when you've forgotten in which corner of which level of the airport parking structure you left it a week ago) is that there must be a continuously maintained traffic route towards the client. That route may only have to carry a few messages per week, but whenever a message is sent, the expectation is that latency is on the order of a few seconds.

Because the scenarios around 'things' are quite different from those of the Web, where the focus is on people and their interactions, there's quite a bit of a risk that the Web's well-known technologies turn into false friends. Are they a fit, or merely ubiquitous?

VPN to the Rescue?

The standard Web interaction model, where the client initiates and the service responds, is just one of several different patterns that are required for connected and non-interactive 'things', and this presents quite a bit of a challenge. How would I send a service-originated command or a notification to a connected device?

One option would indeed be HTTP long-polling or Web Sockets. The client could establish such a connection and the service would hold on to it and subsequently route all service-originated messages for the client through the established channel. That's a reasonable strategy, even if that introduces a solvable, but tricky service-side routing challenge of how the service will route to that ephemeral socket or pending request across a multi-node fabric.
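The tricky part is the service-side lookup from a device identity to wherever its parked connection currently lives. On a single node that is just a dictionary; the hard part is that, across a multi-node fabric, this dictionary would have to become a shared, consistent registry. A single-node sketch, with illustrative names (an in-memory queue stands in for the held socket or pending request):

```python
# Single-node sketch of the long-polling/Web Sockets routing problem:
# map a device id to its currently parked connection so that
# service-originated messages can be pushed down it.
import queue

class ConnectionRegistry:
    def __init__(self):
        self.connections = {}   # device_id -> queue standing in for a socket

    def connect(self, device_id):
        """Device parks a connection; the service holds on to it."""
        q = queue.Queue()
        self.connections[device_id] = q
        return q

    def push(self, device_id, message):
        """Service-originated send; fails if the device isn't parked *here*."""
        conn = self.connections.get(device_id)
        if conn is None:
            return False        # not connected to this node -- who has it?
        conn.put(message)
        return True

registry = ConnectionRegistry()
socket = registry.connect("truck-7")
print(registry.push("truck-7", "reroute"), registry.push("truck-9", "ping"))
# True False
```

The `False` case is exactly where the multi-node routing challenge begins: some other gateway node may hold the connection, or the device may be offline entirely.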

But because that's tricky, many people in the devices space seem to go down a different route today: They're turning the devices into servers – and suddenly that routing problem is magically solved. If I can send a notification or command to a device by way of issuing an HTTP request to it, I could use off-the-shelf components and HTTP to implement both the client-to-service Telemetry/Inquiry path and the service-to-client Command/Notification path. Or even if I'm not into HTTP, I could just use whatever standard or proprietary protocol I like, and still treat either party as a server from the respective other party's perspective.

That's enormously attractive. It also poses a new challenge. How do I turn a truck or an off-shore wind-turbine into an addressable server endpoint? The answer to that question is, across many industries and companies, in unison, "VPN".

Virtual Private Networks (VPN) provide a link layer integration model between network participants. Expressed in a more pedestrian fashion, a VPN is akin to hooking everyone connected to the VPN onto the same Ethernet hub, whereby secured public Internet connections act as the network cables. Because the VPN illusion is created down at the link level and largely equivalent to having a network adapter on that network, participants on the VPN can speak practically any protocol, including but not limited to IPv4 and IPv6, and all protocols that ride on top of those two.

The steps for making a field-deployed device network addressable – assuming it supports VPN – are fairly straightforward: The device first establishes an external network identity that allows it to connect either to the public Internet or, as is sometimes done for GPRS/3G/LTE devices, to a carrier-provided closed network by way of a dedicated in-network access point. Then it establishes the VPN tunnel by connecting to the VPN gateway's endpoint, which either resides on the Internet or on the closed network. Once the tunnel is established, the device is now connected to a separate, second network: the VPN.

Assuming that network is an IP network, the device either already has a pre-assigned address or requests an address lease from the network's DHCP service and is then a fully addressable network participant within the private address space of the VPN. If we further assume that the service that wants to address the device has direct or routed access to the VPN's address space, the service can now directly address the device and talk to any endpoints the device may be listening on. And because all of the tunnels into the VPN are secure, all the traffic exchanged between any of the parties is automatically secure without taking any extra precautions. Perfect solution. Is it?

Where's the Catch?

The biggest issue with the VPN approach for field-deployed devices lies where many people would expect it least: security. That might be a surprise, as VPN is often seen as almost synonymous with a "secure network space" – which is not a proper way to look at it.

VPN provides a virtualized and private (isolated) network space. The secure tunnels are a mechanism to achieve an appropriately protected path into that space, but the space per se is not secured at all. It is indeed a feature that the established VPN space is fully transparent to all protocols and traffic above the link layer.

In the two predominant use-cases for VPN technology, its transparency is clearly desirable: The first use-case is the integration of corporate satellite assets like notebooks into secure networks. The second key use-case is inter-datacenter links fusing datacenter or application-scope networks over the public Internet. In the latter case, the connected parties are presumably following datacenter best-practices for physical and network access control. In the former case, the client is commonly in the possession of an authorized employee or vendor, requires individual user credentials for access, is often protected with a smartcard, is subject to device-level encryption, and often allows some degree of remote control, including remote wipe if the asset becomes compromised. In both cases, the assets connecting into the VPN are either under the immediate control of personnel authorized by the VPN owner, or there are several layers of safeguards in place to prevent access to the VPN should the assets become compromised.

The security of a virtual network space solely depends on controlling and securing all assets that connect into it, which obviously includes physical access security.

Now imagine you're an energy utility company planting a farm of wind-turbines into a field on a remote hill. Or imagine you're a city planting environmental sensors for pollution, humidity, barometric pressure, and temperature onto rooftops. Or imagine you're a manufacturer selling network-attachable kitchen appliances to the general public.

And now imagine that the way you're creating bi-directional connectivity to these devices and making them addressable is by mapping them into a VPN, together with your services and any other such device – at the link layer.

How much can you trust or control that these devices don't get physically hijacked and compromised? What's the attack surface area of your services and the neighboring devices in case that were to happen?

It's one of the key principles of security that whoever has physical possession of a device owns ("pwns") the device from a security perspective. If you're handing complete strangers networked devices that can log themselves into your VPN based on secrets present on the device, you should expect that you'll eventually have unauthorized link-level visitors in the private network that you will have to be prepared to defend against – and you'll have to defend the device's neighbors as much as the services you map into the same private network space.

The security measures you'll have to put in place for this eventuality are largely equivalent to securing the services and devices as if they were directly attached to the public Internet. If you get uninvited visitors who exploit a device, you will have to assume malicious intent; these intruders will not show up by accident. Therefore, you'll have to firewall all devices, and you have to put authentication and access control measures on all exposed service endpoints. You'll also have to ensure that whatever service software stack is running on the device is "Internet hardened" and that you have an appropriate avenue to promptly distribute security updates, should that become necessary.

In addition to the security challenges, only some advanced VPN protocols, such as IPsec IKEv2 with the MOBIKE extension (RFC 4555), allow for seamless handling of connection failure scenarios, client network roaming, and reconnect. With devices on unreliable or highly congested networks, or devices used in mobility scenarios where connections may be interrupted by signal loss, a VPN client without this support will incur the cost of having to reestablish the tunnel and the VPN session whenever the connection drops. That, in turn, can lead to routing confusion when a client drops, reconnects, and shows up on a different load-balanced VPN router while some service-side component wants to send data to the device.

Lastly, VPN is very resource-hungry when establishing data tunnels to hundreds of thousands or more small devices that each send and receive relatively few and usually fairly small messages. It's demanding on the client in terms of the required stack and processing needs, which may be a problem for small embedded devices. It's also enormously resource-consuming on the service side. Current, dedicated hardware for managing 10,000 simultaneous VPN tunnels can easily cost over $100,000 USD with single redundancy for one site.

As you contemplate the complexity consequences, it is possible you'll come to the conclusion that creating a VPN for the connected devices scenario may not be the obvious best choice that it seemed to be at first.


Even with the constraints laid out above, VPN might be a viable model for enabling two-way connectivity, if you're willing to make the right security investments on top, and if the devices are capable enough.

If a VPN solution and its consequent complexities indeed turn out to be too heavyweight for you, you'll again have the problem of how to make the devices individually addressable in order to send a service-originated command or a notification to a connected device.

As mentioned, one possible alternative is a gateway based on long polling or Web Sockets, where the client establishes a connection that the service holds on to, and through which service-originated messages are subsequently routed back up to the client.

The advantage of this model is that the client does not have to be directly addressable. If the gateway is hosted on a public-facing Internet address, the client can establish a connection through any number of layers of NATs, proxies, and firewalls, and the service can route information back over that established link through those intermediary infrastructures.
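The gateway's core bookkeeping – remembering which device is parked on which open socket so that service-originated messages can ride back over that link – can be sketched in a few lines. The following Python sketch is purely illustrative; `ConnectionRegistry` and its methods are hypothetical names, not any real gateway API:

```python
class ConnectionRegistry:
    """Tracks which device is parked on which open socket (illustrative sketch)."""

    def __init__(self):
        self._sockets = {}  # device_id -> socket-like object

    def on_connect(self, device_id, socket):
        # The device dialed out through NATs/proxies; we only remember the link.
        self._sockets[device_id] = socket

    def on_disconnect(self, device_id):
        self._sockets.pop(device_id, None)

    def route(self, device_id, message):
        # A service-originated message rides back over the established channel.
        socket = self._sockets.get(device_id)
        if socket is None:
            return False  # device unreachable; this is the open question below
        socket.send(message)
        return True
```

Note that `route` returning `False` is exactly the unresolved case the next paragraphs discuss: the device is momentarily out of reach and the gateway has nowhere durable to put the message.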

What this model still doesn't solve well is the case of clients that get occasionally disconnected due to weak wireless signals or congested networks. The client can park one of these sockets on the gateway and make itself known, but if the connection collapses, the device is out of reach for a little while, and a message arrives for it in the meantime, where does that message go?

Also, once the client comes back, it may connect to a different gateway machine by way of a load balancer, so if you retain that message on the original gateway node, you now have to route it to the current gateway node. Because the 'current' node may keep shifting as the device repeatedly connects and disconnects, for instance when it sits at the edge of the wireless coverage area, chasing the client gets fairly complicated. And that's something you'd have to build.


A very practical and fairly straightforward solution to the entire problem space is to use a scalable, Internet-facing messaging system with a bidirectional, multiplexing protocol like AMQP to carry the outbound as well as the inbound device traffic.

If each device is assigned an exclusive queue, or a filtered subscription on a pub/sub topic, for messages addressed to it, the addressing problem moves from the edge – where the device (VPN) or its connection (Web Sockets) must be identified – to one where messages for a device are routed to a well-known, stable location in the system, from which the device picks them up as its connectivity state allows. When a message is sent while the device is connected and waiting, it can be delivered within a few milliseconds. If the device is temporarily offline, it can pick up messages whenever it regains network access – unless the messages expire first, an option in most common messaging systems that prevents a command like "unlock door" from being executed to everyone's surprise a day later if the device was disconnected for that long.
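The per-device mailbox with message expiry can be illustrated in a few lines. This is a hedged, in-memory Python sketch of the concept only – real messaging systems enforce TTL on the broker, and `DeviceQueue` is a hypothetical name:

```python
import time
from collections import deque


class DeviceQueue:
    """In-memory sketch of a per-device mailbox with message expiry (TTL).

    Mirrors the 'unlock door' example: a command that outlives its TTL
    is silently dropped instead of being delivered a day late.
    """

    def __init__(self):
        self._messages = deque()  # (body, absolute expiry time)

    def send(self, body, ttl_seconds, now=None):
        now = time.time() if now is None else now
        self._messages.append((body, now + ttl_seconds))

    def receive(self, now=None):
        now = time.time() if now is None else now
        while self._messages:
            body, expires_at = self._messages.popleft()
            if expires_at > now:  # still valid: deliver it
                return body
            # expired: discard and look at the next message
        return None
```

The `now` parameter exists only to make the sketch deterministic; a broker would use its own clock.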

Since the devices pull messages from the messaging system and send messages along the same path – with AMQP even over the very same multiplexed connection – the communication path, typically enveloped in SSL/TLS, is as secure as a VPN tunnel or an HTTPS-wrapped Web Socket, and it has the same advantage as the Web Socket path of not exposing the client to unwanted traffic, because all connections are outbound and originate from behind the existing protection layers.

From a scalability perspective, a pub/sub system with addressable entities and well-known scale characteristics also provides a good structure for cleanly partitioning devices and device groups across as many queues and topics as needed to accommodate a large device population.
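One simple way to do such partitioning is to derive the queue or topic name from a stable hash of the device identity, so that each device's mailbox stays at a well-known location no matter which gateway node the device talks to. A sketch, with an assumed `devices-NNNN` naming scheme that is purely illustrative:

```python
import hashlib


def queue_for_device(device_id: str, queue_count: int) -> str:
    """Deterministically map a device to one of N queues/topics.

    A stable hash (not Python's randomized hash()) keeps the mapping
    identical across processes and restarts, so senders and the device
    always agree on where the mailbox lives.
    """
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    partition = int.from_bytes(digest[:8], "big") % queue_count
    return f"devices-{partition:04d}"
```

Growing the population then mostly means raising `queue_count` and migrating the affected devices, which is why consistent-hashing variants are often preferred when repartitioning must be cheap.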


Using VPNs for device connectivity is viable if the solution addresses the inherent security issues. Using a VPN does not equate to creating a secure network space. It creates a virtual network space with full fidelity at the link layer and protected paths into that network space. That's a big difference. The tax that must be paid for VPN support on the client is not insignificant, and securing the virtualized network is no smaller a challenge than securing Internet-exposed devices, especially when those devices are outside the manufacturer's or operator's immediate physical control.

After weighing these costs, a solution that builds purely on simple, client-originated connectivity with overlaid transport-layer security is not only much simpler, it also doesn't carry the same security risks or infrastructure tax.

Using a messaging system as the gateway technology for these client-originated connections, where each client has a designated 'mailbox' in the form of a queue or topic subscription, also elegantly solves the addressability issue, with the added benefit of resilience against occasional connection loss – and without significant extra latency or overhead.

For a walkthrough of how to architect a system of this kind, I recommend taking a look at my June 2012 MSDN Magazine article (which may have been published a year before its time), and you can expect more on this topic here in the coming months.

Categories: Architecture | IoT

January 15, 2013
@ 06:56 PM

The basic idea of the Enterprise Service Bus paints a wonderful picture of harmonious coexistence, integration, and collaboration of software services. Services for a particular general cause are built or procured once and reused across the Enterprise by way of publishing them and their capabilities in a corporate services repository from which they can be discovered. The repository holds contracts and policy that allow dynamically generating functional adapters to integrate with services. Collaboration and communication are virtualized through an intermediary layer that knows how to translate messages from and to any other service hooked into the ESB, like the Babel fish in The Hitchhiker’s Guide to the Galaxy. The ESB is a bus, meaning it aspires to be a smart, virtualizing, mediating, orchestrating messaging substrate permeating the Enterprise, providing uniform and mediated access anytime and anywhere throughout today’s global Enterprise. That idea is so beautiful, it rivals My Little Pony. Sadly, it’s also about as realistic. We tried regardless.

As with many utopian ideas, before we can get to the pure ideal of an ESB, there’s a less ideal and usually fairly ugly phase in which non-conformant services are made conformant. To turn them into WS-* services, CICS transactions and SAP BAPIs are fronted with translators, and as that skinning renovation takes place, there’s also some optimization of message flow, meaning messages get batched or de-batched, enriched or reduced. In that phase, the industry also learned the value and lure of central control. SOA governance is an interesting idea to get customers drunk on. That ultimately led to cheating on the ‘B’. When you look around at products proudly carrying the moniker ‘Enterprise Service Bus’, you will see hubs. In practice, the B in ESB is mostly just a lie. Some vendors sell ESB servers, some even sell ESB appliances. If you need to walk to a central place to talk to anyone, it’s a hub, not a bus.

Yet the bus does exist: the IP network is the bus, and it turns out to serve us well on the Internet. Mind that I’m explicitly saying “IP network” and not “Web”, as I do believe there are very many useful protocols beyond HTTP. The Web is obviously the banner example of a successful implementation of services on the IP network, one that does just fine without any centralized services other than the highly redundant domain name system.

Centralized control over services does not scale in any dimension. Intentionally creating a bottleneck through a centrally controlling committee of ESB machines, however far scaled out, is not a winning proposition at a time when every potential or actual customer carries a powerful computer in their pocket that lets them initiate ad-hoc transactions at any time and from anywhere, and when vehicles, machines, and devices increasingly spew out telemetry and accept remote-control commands. Central control and policy-driven governance over all services in an Enterprise also kill agility and reduce the ability to adapt services to changing needs, because governance invariably implies process and certification. Five-year plan, anyone?

If the ESB architectural ideal weren’t a failure already, the competitive pressure to adopt direct digital interaction with customers via the Web and apps – and therefore to scale not to the size of the enterprise but to the size of the enterprise’s customer base – will seal its collapse.

Service Orientation

While the ESB as a concept permeating the entire Enterprise is dead, the related notion of Service Orientation is thriving, even though the four tenets of SOA are rarely mentioned anymore. HTTP-based services on the Web embrace explicit message passing. They mostly do so over the baseline application contract and negotiated payloads that the HTTP specification provides for; in the case of SOAP or XML-RPC, they use abstractions on top that carry their own application-protocol semantics. Services are clearly understood as units of management, deployment, and versioning, and that understanding is codified in most platform-as-a-service offerings.

That said, while explicit boundaries, autonomy, and contract sharing have been clearly established, the notion of policy-driven compatibility – arguably a political addition to the list, meant to motivate WS-Policy at the time – has generally been replaced by something even more powerful: code. JavaScript code, to be more precise. Instead of trying to tell a generic client how to adapt to service settings by handing it a complex document explaining which switches to turn, clients now get code that turns the switches outright. The successful alternative is to simply provide no choice: there’s one way to gain access authorization for a service, period. The “policy” is in the docs.

The REST architecture model is service oriented – and I am not meaning to imply that it is so because of any particular influence. The foundational principles were becoming common sense around the time these terms were coined, as the notion of broadly interoperable programmable services started to gain traction in the late 1990s. The subsequent grand dissent that arose was around whether pure HTTP was sufficient to build these services, or whether the ambitious multi-protocol abstraction of WS-* would be needed. I think it’s fairly easy to declare the winner there.

Federated Autonomous Services

Windows Azure, to name a system that would surely fit the kind of solution complexity that ESBs were aimed at, is a very large distributed system with a significant number of independent multi-tenant services and deployments spread across many datacenters. In addition to the publicly exposed capabilities, there are quite a number of “invisible” services for provisioning, usage tracking and analysis, billing, diagnostics, deployment, and other purposes. Some components of these internal services integrate with external providers. Windows Azure doesn’t use an ESB. Windows Azure is a federation of autonomous services.

The basic shape of each of these services is effectively identical, and that’s not owing – at least not to my knowledge – to any central architectural directive, even though the services that shipped after the initial wave certainly took a good look at the patterns that had emerged. Practically all services have a gateway, whose purpose is to handle, dispatch, and sometimes preprocess incoming network requests or sessions, and a backend that ultimately fulfills the requests. The services interact through public IP space, meaning that if Service Bus wants to talk to its SQL Database backend, it uses a public IP address and not some private address. The Internet is the bus. The backend and its structure are entirely a private implementation matter; it could be a single role or many roles.

Any gateway’s job is to provide network request management, which includes establishing and maintaining sessions, session security and authorization, API versioning (multiple variants of the same API are often provided in parallel), usage tracking, defense mechanisms, and diagnostics for its area of responsibility. This functionality is specific and inherent to the service. And it’s not all HTTP: SQL Database has a gateway that speaks the Tabular Data Stream (TDS) protocol over TCP, for instance, and Service Bus has a gateway that speaks AMQP as well as the proprietary binary Relay and Messaging protocols.

Governance and diagnostics don’t work by putting a man in the middle and watching the traffic go by, which is akin to trying to tell whether a business is healthy by counting the trucks going to its warehouse. Instead, we integrate the data feeds that come out of the respective services – feeds generated with full knowledge of internal state – and concentrate these data streams, like the billing stream, in yet other services that are also autonomous and have their own gateways. All these services interact and integrate even though they’re built by a composite team far exceeding the scale of most enterprises’ largest projects, and even though the teams run on separate schedules, with deployments into the overall system happening multiple times daily. It works because each service owns its gateway, is explicit about its versioning strategy, and has a very clear mandate to honor published contracts, which includes explicit regression testing. It would be unfathomable to maintain a system of this scale through a centrally governed switchboard service like an ESB.

So where does that leave “ESB technologies” like BizTalk Server? The answer is simply that they’re being used for what they’re commonly used for in practice: as gateway technology. If a service in such a federation had to adhere to a particular industry standard for commerce – for instance, if it had to understand EDIFACT or X12 messages sent to it – the gateway would employ an appropriate, proven implementation and would thus likely rely on BizTalk if implemented on the Microsoft stack. If a service had to talk to an external service for which it had to build EDI exchanges, it would likely be very cost-effective to also use BizTalk as the appropriate tool for that outbound integration. Likewise, if data had to be extracted from backend-internal message traffic for tracking purposes and BizTalk’s BAM capabilities were a fit, BizTalk might be a reasonable component to use for that. If there’s a long-running process around exchanging electronic documents, BizTalk Orchestration might be appropriate; if there’s a document exchange involving humans, then SharePoint and/or Workflow would be good candidates from the toolset.

For most services, the gateway technology of choice is HTTP, using frameworks like ASP.NET Web API, probably paired with IIS features like Application Request Routing, and the gateway is largely stateless.

In this context, Windows Azure Service Bus is, in fact, a technology choice for implementing application gateways. A Service Bus namespace thus forms a message bus for “a service” and not for “all services”; it’s as scoped to a service or a set of related services as an IIS site usually is. The Relay is a way to place a gateway into the cloud for services whose backend resides outside the cloud environment, and it also allows multiple systems, e.g. branch systems, to be federated into a single gateway addressable from other systems, thus forming a gateway of gateways. The messaging capabilities, with queues and pub/sub topics, provide a way for inbound traffic to be authorized and queued up on behalf of the service, with Service Bus acting as mediator and first line of defense; a service never gets a message from the outside world unless it explicitly fetches it from Service Bus. The service can’t be overstressed, and it can’t be accessed except by sending it a message.

The next logical step on that journey is to provide federation capabilities with reliable handoff of messages between services, meaning that you can safely enqueue a message within a service and then have Service Bus replicate that message (or one copy in the case of pub/sub) over to another service’s gateway – across namespaces, and across datacenters or your own sites – using the open AMQP protocol. You can do that today with a few lines of code, but this will become inherent to the system later this year.
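The "few lines of code" for such a handoff amount to a receive–forward–complete loop: lock the message at the local queue, enqueue it at the remote gateway, and only then delete it locally, so the message is never lost in between. The following Python sketch illustrates the idea against in-memory stand-ins; the peek-lock/complete/abandon names mirror common messaging semantics, but these classes are purely illustrative, not any real client API:

```python
class _Msg:
    """A locked message: invisible to other receivers until settled."""

    def __init__(self, body, queue):
        self.body, self._q = body, queue

    def complete(self):
        # settle: the message is gone from the source queue for good
        self._q.locked.remove(self)

    def abandon(self):
        # unsettle: make the message visible again for a retry
        self._q.locked.remove(self)
        self._q.items.insert(0, self.body)


class InMemoryQueue:
    """Stand-in for a peek-lock-capable queue client (illustrative only)."""

    def __init__(self):
        self.items, self.locked = [], []

    def send(self, body):
        self.items.append(body)

    def peek_lock(self):
        if not self.items:
            return None
        m = _Msg(self.items.pop(0), self)
        self.locked.append(m)
        return m


def forward(source, destination):
    """One hop of reliable handoff between two queues/gateways."""
    msg = source.peek_lock()
    if msg is None:
        return False
    try:
        destination.send(msg.body)
        msg.complete()   # delete locally only after the remote accepted it
        return True
    except Exception:
        msg.abandon()    # leave it for retry; at-least-once, not exactly-once
        raise
```

Because the local delete happens only after the remote enqueue succeeds, a crash in between yields a duplicate rather than a loss – the usual at-least-once trade-off.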

Categories: Architecture | SOA | Azure

Over on my new Channel 9 blog I've started a series that will (hopefully) help novices with getting started developing applications that leverage Windows Azure Service Bus (and, in coming episodes, also Service Bus for Windows Server).

The first two episodes are up:

There's much more to come, and the best way to get at it as it comes out is to subscribe to the RSS feed or bookmark the landing page.

Categories: Architecture

Here's a short video explaining where I’m at now (here’s a map) and what I’m up to. Meanwhile I’ve also figured out how to put sound on both channels with the setup that I have, but here it’s still just on the left channel, and it also doesn’t sound as good as it should, since I haven’t yet mastered all the white-noise-correcting Jedi motions.

Spoiler: I’ll start doing a video show on a regular basis and need input for content planning, so if you have any ideas, don’t hesitate to send me email or Tweet me at @clemensv.

Categories: Architecture | Blog | Talks

I just got off the call with a customer and had a bit of a déjà vu from a meeting at the beginning of the week, so it looks like the misconception I'll explain here is a bit more common than I expected.

In both cases, the folks I talked to had roughly the equivalent of the following code in their app:

var qc = factory.CreateQueueClient(…);
for (int i = 0; i < 1000; i++)
{
    … create message …
    qc.BeginSend(msg, null, null);
}

In both cases, the complaint was that messages were lost and strange exceptions occurred in the logs – which is because, well, this doesn't do what they thought it does.

BeginSend in the Service Bus APIs – like other networking APIs, and just like BeginWrite on the file system – isn't really doing the requested work. It is putting a job into a job queue: the job queue of the I/O thread scheduler.

That means that by the time the code reaches qc.Close(), a few messages may indeed have been sent if you've been mighty lucky, but the remaining messages still sit in that job queue, scheduled against an object the code just forced to close. The result is that every subsequent send operation that was queued but hasn't been scheduled yet will throw, because you're trying to send on a disposed object. Those messages fail and are lost inside the sender's process.

What's worse, writing such code stuffs a queue that is both out of the app's control and out of the app's sight, and all the arguments (which can be pretty big when we're talking about messages) dangle on those jobs, filling up memory. Also, since the app never calls EndSend(), it doesn't pick up whatever exceptions the Send operation may raise, and it flies completely blind. If there is an EndXXX method for an async operation, you _must_ call that method even if it doesn't return any values, because it may quite well throw you back what went wrong.

So how should you do it? Don't throw messages blindly into the job queue. It's ok to queue up a few to make sure there's a job in the queue as another one completes (which is just slightly trickier than what I want to illustrate here), but generally you should make subsequent sends depend on previous sends completing. In .NET 4.5 with async/await that's a lot easier now:

var qc = factory.CreateQueueClient(…);
for (int i = 0; i < 1000; i++)
{
    … create message …
    await Task.Factory.FromAsync(qc.BeginSend, qc.EndSend, msg, null);
}

Keep in mind that the primary goal of async I/O is to avoid wasting threads, and to avoid losing time through excessive thread switching, while threads hang on I/O operations. It doesn't make the I/O magically faster per se. We achieve that goal in the example above because the compiler breaks the code up into distinct methods, so the loop continues on an I/O thread callback once each Send operation has completed.
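The "queue up a few" variant mentioned above – keeping a bounded number of sends in flight while still observing every completion – can be sketched with a semaphore. This is a Python/asyncio illustration of the pattern, not Service Bus API code:

```python
import asyncio


async def send_all(send_async, messages, max_in_flight=10):
    """Keep at most max_in_flight sends pending at once.

    Unlike blindly queuing 1000 BeginSend jobs, every send is awaited,
    so failures surface as exceptions instead of being silently lost,
    and memory for pending arguments stays bounded.
    """
    semaphore = asyncio.Semaphore(max_in_flight)

    async def send_one(msg):
        async with semaphore:       # wait for a free in-flight slot
            await send_async(msg)   # observe completion (and any error)

    # gather awaits every send; any exception propagates to the caller
    await asyncio.gather(*(send_one(m) for m in messages))
```

The same shape works in C# with `SemaphoreSlim.WaitAsync` around each send and `Task.WhenAll` at the end.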


  1. Don't stuff the I/O scheduler queue with loads of blind calls to BeginXXX without considering how the work gets done and completed – and that it can actually fail.
  2. Always call EndXXX, and think about how many operations you want to have in flight and what happens to the objects attached to the in-flight jobs.
Categories: Architecture | Technology

September 1, 2012
@ 04:49 AM

Today has been a lively day in some parts of the Twitterverse debating the Saga pattern. As it stands, there are a few frameworks for .NET out there that use the term "Saga" for some framework implementation of a state machine or workflow. Trouble is, that's not what a Saga is. A Saga is a failure management pattern.

Sagas come out of the realization that long-lived transactions in particular (originally even just inside databases), but also far-distributed transactions across location and/or trust boundaries, can't easily be handled using the classic ACID model with two-phase commit and holding locks for the duration of the work. Instead, a Saga splits the work into individual transactions whose effects can be, somehow, reversed after the work has been performed and committed.


The picture shows a simple Saga. If you book a travel itinerary, you want a car and a hotel and a flight. If you can't get all of them, it's probably not worth going. It's also fairly certain that you can't enlist all of these providers into a distributed ACID transaction. Instead, you'll have an activity for booking rental cars that knows both how to perform a reservation and how to cancel it – and one for the hotel and one for flights.

The activities are grouped in a composite job (a routing slip) that is handed along the activity chain. If you want, you can sign/encrypt the routing-slip items so that they can only be understood and manipulated by the intended receiver. When an activity completes, it adds a record of the completion to the routing slip, along with information on where its compensating operation can be reached (e.g. via a queue). When an activity fails, it cleans up locally and then sends the routing slip backwards to the last completed activity's compensation address to unwind the transaction outcome.

If you're a bit familiar with travel, you'll also notice that I've ordered the steps by risk. Reserving a rental car almost always succeeds if you book in advance, because the rental car company can move more cars on site if there is high demand. Reserving a hotel is slightly more risky, but you can commonly back out of a reservation without penalty until 24 hours before the stay. Airfare often comes with refund restrictions, so you'll want to do that last.

I created a Gist on GitHub that you can run as a console application; it illustrates this model in code. Mind that it is a mockup and not a framework. I wrote it in less than 90 minutes, so don't expect to reuse it.

The main program sets up an exemplary routing slip (all the classes are in the one file) and creates three completely independent "processes" (activity hosts) that are each responsible for handling a particular kind of work. The "processes" are linked by a "network", and each kind of activity has an address for forward-progress work and one for compensation work. The network resolution is simulated by 'Send'.

static ActivityHost[] processes;

static void Main(string[] args)
{
    var routingSlip = new RoutingSlip(new WorkItem[]
        {
            new WorkItem<ReserveCarActivity>(new WorkItemArguments{{"vehicleType", "Compact"}}),
            new WorkItem<ReserveHotelActivity>(new WorkItemArguments{{"roomType", "Suite"}}),
            new WorkItem<ReserveFlightActivity>(new WorkItemArguments{{"destination", "DUS"}})
        });

    // imagine these being completely separate processes with queues between them
    processes = new ActivityHost[]
                        {
                            new ActivityHost<ReserveCarActivity>(Send),
                            new ActivityHost<ReserveHotelActivity>(Send),
                            new ActivityHost<ReserveFlightActivity>(Send)
                        };

    // hand off to the first address
    Send(routingSlip.ProgressUri, routingSlip);
}

static void Send(Uri uri, RoutingSlip routingSlip)
{
    // this is effectively the network dispatch
    foreach (var process in processes)
    {
        if (process.AcceptMessage(uri, routingSlip))
        {
            break;
        }
    }
}

The activities each implement a reservation step and an undo step. Here's the one for cars:

class ReserveCarActivity : Activity
{
    static Random rnd = new Random(2);

    public override WorkLog DoWork(WorkItem workItem)
    {
        Console.WriteLine("Reserving car");
        var car = workItem.Arguments["vehicleType"];
        var reservationId = rnd.Next(100000);
        Console.WriteLine("Reserved car {0}", reservationId);
        return new WorkLog(this, new WorkResult { { "reservationId", reservationId } });
    }

    public override bool Compensate(WorkLog item, RoutingSlip routingSlip)
    {
        var reservationId = item.Result["reservationId"];
        Console.WriteLine("Cancelled car {0}", reservationId);
        return true;
    }

    public override Uri WorkItemQueueAddress
    {
        get { return new Uri("sb://./carReservations"); }
    }

    public override Uri CompensationQueueAddress
    {
        get { return new Uri("sb://./carCancellations"); }
    }
}

The chaining happens solely through the routing slip. The routing slip is "serializable" (it's not; pretend that it is), and it's the only piece of information that flows between the collaborating activities. There is no central coordination. All work is local on the nodes, and once a node is done, it either hands the routing slip forward (on success) or backward (on failure). For forward progress, the routing slip maintains a queue of pending work items, and for compensation it maintains a stack of completed work. The routing slip also handles resolving and invoking whatever the "next" thing to call is, on the way forward and backward.

class RoutingSlip
{
    readonly Stack<WorkLog> completedWorkLogs = new Stack<WorkLog>();
    readonly Queue<WorkItem> nextWorkItem = new Queue<WorkItem>();

    public RoutingSlip()
    {
    }

    public RoutingSlip(IEnumerable<WorkItem> workItems)
    {
        foreach (var workItem in workItems)
        {
            this.nextWorkItem.Enqueue(workItem);
        }
    }

    public bool IsCompleted
    {
        get { return this.nextWorkItem.Count == 0; }
    }

    public bool IsInProgress
    {
        get { return this.completedWorkLogs.Count > 0; }
    }

    public bool ProcessNext()
    {
        if (this.IsCompleted)
        {
            throw new InvalidOperationException();
        }

        var currentItem = this.nextWorkItem.Dequeue();
        var activity = (Activity)Activator.CreateInstance(currentItem.ActivityType);
        try
        {
            var result = activity.DoWork(currentItem);
            if (result != null)
            {
                this.completedWorkLogs.Push(result);
                return true;
            }
        }
        catch (Exception e)
        {
            Console.WriteLine("Exception {0}", e.Message);
        }
        return false;
    }

    public Uri ProgressUri
    {
        get
        {
            if (IsCompleted)
            {
                return null;
            }
            else
            {
                return
                    ((Activity)Activator.CreateInstance(this.nextWorkItem.Peek().ActivityType)).
                        WorkItemQueueAddress;
            }
        }
    }

    public Uri CompensationUri
    {
        get
        {
            if (!IsInProgress)
            {
                return null;
            }
            else
            {
                return
                    ((Activity)Activator.CreateInstance(this.completedWorkLogs.Peek().ActivityType)).
                        CompensationQueueAddress;
            }
        }
    }

    public bool UndoLast()
    {
        if (!this.IsInProgress)
        {
            throw new InvalidOperationException();
        }

        var currentItem = this.completedWorkLogs.Pop();
        var activity = (Activity)Activator.CreateInstance(currentItem.ActivityType);
        try
        {
            return activity.Compensate(currentItem, this);
        }
        catch (Exception e)
        {
            Console.WriteLine("Exception {0}", e.Message);
            throw;
        }
    }
}

The local work and the decision making are encapsulated in the ActivityHost, which calls ProcessNext() on the routing slip to resolve the next activity and call its DoWork() function on the way forward, or resolves the last executed activity on the way back and invokes its Compensate() function. Again, there's nothing centralized here; all the work hinges on the routing slip, and the three activities and their execution are completely disjoint.

abstract class ActivityHost
{
    Action<Uri, RoutingSlip> send;

    public ActivityHost(Action<Uri, RoutingSlip> send)
    {
        this.send = send;
    }

    public void ProcessForwardMessage(RoutingSlip routingSlip)
    {
        if (!routingSlip.IsCompleted)
        {
            // if the current step is successful, proceed
            // otherwise go to the unwind path
            if (routingSlip.ProcessNext())
            {
                // recursion stands for passing context via message;
                // the routing slip can be fully serialized and passed
                // between systems
                this.send(routingSlip.ProgressUri, routingSlip);
            }
            else
            {
                // pass message to the unwind message route
                this.send(routingSlip.CompensationUri, routingSlip);
            }
        }
    }

    public void ProcessBackwardMessage(RoutingSlip routingSlip)
    {
        if (routingSlip.IsInProgress)
        {
            // UndoLast can put new work on the routing slip
            // and return false to go back on the forward path
            if (routingSlip.UndoLast())
            {
                // recursion stands for passing context via message;
                // the routing slip can be fully serialized and passed
                // between systems
                this.send(routingSlip.CompensationUri, routingSlip);
            }
            else
            {
                this.send(routingSlip.ProgressUri, routingSlip);
            }
        }
    }

    public abstract bool AcceptMessage(Uri uri, RoutingSlip routingSlip);
}


That's a Saga.
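The C# above leans on a RoutingSlip whose implementation isn't shown. Here's a minimal sketch of the idea in Python; the member names mirror ProcessNext, UndoLast, ProgressUri, and CompensationUri from the host above, but the internals are my assumption of one plausible shape, not the actual implementation:

```python
class Activity:
    """A hypothetical saga step: a URI for routing plus do/undo callables."""
    def __init__(self, uri, do_work, compensate):
        self.uri = uri
        self.do_work = do_work
        self.compensate = compensate

class RoutingSlip:
    def __init__(self, activities):
        self.pending = list(activities)   # forward path, in order
        self.completed = []               # done so far, for the unwind path

    @property
    def is_completed(self):
        return not self.pending

    @property
    def is_in_progress(self):
        return bool(self.completed)

    @property
    def progress_uri(self):
        # where the slip travels next on the forward path
        return self.pending[0].uri if self.pending else None

    @property
    def compensation_uri(self):
        # where the slip travels next on the unwind path
        return self.completed[-1].uri if self.completed else None

    def process_next(self):
        # run the next activity; on success it moves to the completed stack
        activity = self.pending[0]
        if activity.do_work():
            self.completed.append(self.pending.pop(0))
            return True
        return False

    def undo_last(self):
        # compensate the most recently completed activity;
        # returning False could put the slip back on the forward path
        self.completed.pop().compensate()
        return True
```

The slip itself is just data plus the activity list, so it can be serialized and carried in a message from host to host, which is the whole point.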

Categories: Architecture | SOA

We get a ton of inquiries along the lines of “I want to program my firewall using IP ranges to allow outbound access only to my cloud-based apps”. If you (or the IT department) insist on doing this with Windows Azure, there is even a downloadable and fairly regularly updated list of the IP ranges on the Microsoft Download Center in a straightforward XML format.

Now, we do know that there are a lot of customers who keep insisting on using IP address ranges for that purpose, but that strategy is not a recipe for success.

The IP ranges shift and expand on a very frequent basis and cover all of the Windows Azure services. Thus, a customer will open their firewall for traffic to the entire multitenant range of Azure, which means that the customer’s environment can reach their own apps and the backend services for the “Whack A Panda” game just the same. With apps in the cloud, there is no actual security gain from these sorts of constraints. Pretty much all the advantages of automated, self-service cloud environments stem from shared resources – shared networking, shared gateways, and the ability to do dynamic failover, including cross-DC failover – which means there are no reservations at the IP level that last forever.

The best way to handle this is to do the exact inverse of what’s being tried with these rules, and rather limit access to outside resources to a constrained set of services based on the services’ or users’ identity as it is done on our Microsoft corporate network. At Microsoft, you can’t get out through the NAT/Proxy unless you have an account that has external network privileges. If you are worried about a service or user abusing access to the Internet, don’t give them Internet. If you think you need to have tight control, make a DMZ – in the opposite direction of how you usually think about a DMZ.

Using IP-address-based outbound firewall rules to constrain access to public cloud computing resources probably gets a box ticked on a checklist, but it doesn’t add anything from a security perspective. It’s theater. IMHO.

Categories: Architecture | Technology

I had an email discussion late last weekend and through this weekend on the topic of transactions in Windows Azure. One of our technical account managers asked me, on behalf of a client, how the client could migrate their solution to Windows Azure without having to make very significant changes to their error management strategy – a.k.a. transactions. In the respective solution, the customer has numerous transactions that are interconnected by queuing, and they’re looking for a way to preserve the model of taking data from a queue or elsewhere, performing an operation on a data store, and writing to a queue as a result – as an atomic operation.

I’ve boiled down the question part of the discussion into single sentences and edited out the customer specific pieces, but left my answers mostly intact, so this isn’t written as a blog article. 

The bottom line is that Service Bus, specifically with its de-duplication features for sending and with its reliable delivery support using Peek-Lock (which we didn’t discuss in the thread, but see here and also here) is a great tool to compensate for the lack of coordinator support in the cloud. I also discuss why using DTC even in IaaS may not be an ideal choice:

Q: How do I perform distributed, coordinated transactions in Windows Azure?

2PC in the cloud is hard for all sorts of reasons. 2PC as implemented by DTC effectively depends on the coordinator and its log and connectivity to the coordinator to be very highly available. It also depends on all parties cooperating on a positive outcome in an expedient fashion. To that end, you need to run DTC in a failover cluster, because it’s the Achilles heel of the whole system and any transaction depends on DTC clearing it.

In cloud environments, it’s a very tall order to create a cluster that’s designed in a way similar to what you can do on-premises by putting a set of machines side-by-side and interconnecting them redundantly. Even then, use of DTC still puts you into a CAP-like tradeoff situation as you need to scale up.

Since the systems will be running in a commoditized environment where the clustered assets may quite well be subject to occasional network partitions or at least significant congestion and the system will always require – per 2PC rules – full consensus by all parties about the transaction outcome, the system will inevitably grind to a halt whenever there are sporadic network partitions. That risk increases significantly as the scale of the solution and the number of participating nodes increases.

There are two routes out of the dilemma. The first is to localize any 2PC work onto a node and scale up, which lets you stay in the classic model, but will limit the benefits of using the cloud to having externalized hosting. The second is to give up on 2PC and use per-resource transaction support (i.e. transactions in SQL or transactions in Service Bus) as a foundation and knit components together using reliable messaging, sagas/compensation for error management and, with that, scale out. 

Q: Essentially you are saying that there is absolutely no way of building a coordinator in the cloud?

I’m not saying it’s absolutely impossible. I’m saying you’d generally be trading a lot of what people expect out of cloud (HA, scale) for a classic notion of strong consistency unless you do a lot of work to support it.

The Azure storage folks implement their clusters in a very particular way to provide highly-scalable, highly-available, and strongly consistent storage – and they are using a quorum-based protocol (Paxos) rather than a classic atomic transaction protocol to reach consensus. And they do so while having special clusters designed specifically for that architecture – because they are part of the base platform. The paper explains that well.

Since neither the storage system nor any of the other components trust external parties to be in control of their internal consistency model and operations – which would be the case if they enlisted in distributed transactions – any architecture built on top of those primitives will either have to follow a similar path to what the storage folks have done, or start making trades.

You can stick to the classic DTC model with IaaS; but you will have to give up using the PaaS services that do not support it, and you may face challenges around availability traded for consistency as your resources get distributed across the datacenter and fault domains for – ironically – availability. So ultimately you’ll be hosting a classic workload in IaaS without having the option of controlling the hardware environment tightly to increase intra-cluster reliability.

The alternative is to do what the majority of large web properties do and that is to deal with these constraints and achieve reliability by combining per-resource transactions, sagas, idempotency, at-least-once messaging, and eventual consistency.

Q: What are the chances that you will build something that will support at least transactional handoffs between Service Bus and the Azure SQL database?

We can’t directly couple a SQL DB and Service Bus because SQL, like storage, doesn’t allow transactions that span databases for the reasons I cited earlier.

But there is a workaround using Service Bus that gets you very close. If the customer’s solution DB has a table called “outbox” and the transactions write messages into that table (including the destination queue name and the desired message-id), they get full ACID around their DB transactions. With storage, you can achieve a similar model with batched writes into singular partitions.

We can’t provide that “outbox” table ourselves, because it needs to live in the solution’s own DB and inside their schema. A background worker can then poll that table (or get a post-processing handoff from the transaction component) and replicate the message into SB.

If SB has duplicate detection turned on, even intermittent send failures or commit issues on deleting sent messages from the outbox won’t be a problem. This simple message transfer doesn’t require 2PC: the message is 100% replicable, including its message-id, so the send is idempotent towards SB – which sending to SB in the context of the original transaction wouldn’t be.

With that, they can get away without compensation support, but they need to keep the transactions local to SQL and the “outbox” model gives the necessary escape hatch to do that.

Q: How does that work with the duplicate detection?

The message-id is a free-form string that the app can decide on and set as it likes. So that can be an order-id, some contextual transaction identifier or just a Guid. That id needs to go into the outbox as the message is written.

If duplicate detection in Service Bus is turned on for the particular Queue, we will route the first message and drop any subsequent message with the same message-id during the duplicate detection time window. Any such messages are swallowed by Service Bus, and we don’t specifically report that fact.

With that, you can make the transfer sufficiently robust.
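To make the time-window behavior concrete, here's a small Python sketch of how such a dedup gate behaves from the sender's perspective. The window bookkeeping is my illustration of the described semantics; Service Bus does this internally per queue:

```python
import time

class DedupQueue:
    """First message with a given message-id inside the detection window
    is routed; later ones with the same id are silently dropped."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.seen = {}        # message_id -> time first seen
        self.messages = []    # what actually gets routed to receivers

    def send(self, message_id, body):
        now = self.clock()
        # forget ids whose detection window has elapsed
        self.seen = {m: t for m, t in self.seen.items() if now - t < self.window}
        if message_id in self.seen:
            return            # duplicate: swallowed, not reported
        self.seen[message_id] = now
        self.messages.append(body)
```

Note the consequence for the app: pick the message-id deliberately (order-id, logical transaction id) so that a retry of the same business operation carries the same id and gets suppressed, while a genuinely new operation gets a new id.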

Duplicate Detection Sample: http://code.msdn.microsoft.com/windowsazure/Brokered-Messaging-c0acea25#content

Categories: Architecture

July 26, 2012
@ 06:25 PM

There’s a lot of talk about “Push” notifications both in web and mobile scenarios. “Push” is often positioned as something entirely different to “Pull” (or polling). The reality is that “Push” in the sense that it is used with Web Sockets or Apple/Windows/Android Push Notification systems is just a pattern nuance away from “Pull”. 

When I discuss that general realm with folks here, I use three terms. “Push”, “Solicited Push”, and “Pull”. When people talk about “Push” as an explicit thing today, they usually refer to “Solicited Push”.

“Solicited Push” and “Pull” are similar in that a message sink (client) receives messages after having established a connection to a message source. The main difference between them is how many messages the message sink asks for – and, if you want a second one, whether the message source will hold the request until messages become available or instantly respond with a negative reply. The clearly distinct third pattern is plain “Push”, where a message source sends messages to message sinks on connections that the source initiates.

  • “Push” – a message source initiates a connection (or a datagram route) to a message sink (which has previously indicated the desire to be notified by other means) and sends a message. This requires that the message sink has independent and reachable network connectivity from the perspective of the message source.
  • “Solicited Push” – a message sink initiates a connection (or a datagram route) to a message source and asks for an unbounded sequence of messages. As messages become available, they are routed via the connection/route. The connection is maintained for an indeterminate time and is reestablished once found to be broken for whatever reason.
  • “Pull” – a message sink initiates a connection (or datagram route) to a message source and asks for a bounded sequence of messages (1 to N). For a “Short Pull”, the message source immediately completes the operation, providing a negative result or a sequence of fewer than N messages; for a “Long Pull”, the source will keep the request pending until N messages have become available and have been routed to the sink. Once the overall timeout for the request is reached or N messages have been retrieved, the message source completes the operation.

Bottom line: “Pull” or short/long polling and “Solicited Push” are just variations of the same thing. The message sink (client) provides a connection onto which the message source routes messages that the sink asks for. With “Solicited Push” it’s an unbounded sequence, with “Pull” it’s a bounded sequence.
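The three variants reduce to how many messages are requested and how long the source will hold the request. A Python sketch over a local queue standing in for the connection (function names are mine, for illustration):

```python
import queue

def short_pull(source, n):
    # up to n messages; an empty (negative) result is returned immediately
    out = []
    for _ in range(n):
        try:
            out.append(source.get_nowait())
        except queue.Empty:
            break
    return out

def long_pull(source, n, timeout):
    # the request stays pending until n messages arrived or time ran out
    out = []
    try:
        while len(out) < n:
            out.append(source.get(timeout=timeout))
    except queue.Empty:
        pass
    return out

def solicited_push(source, on_message, quiet_for=0.01):
    # unbounded sequence: messages are routed as they become available;
    # this sketch stops when the source stays quiet, but a real sink keeps
    # the connection open indefinitely and reestablishes it on failure
    while True:
        try:
            on_message(source.get(timeout=quiet_for))
        except queue.Empty:
            return
```

Seen this way, "Solicited Push" is just a "Long Pull" with N unbounded, which is the point of the post.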

Categories: Architecture

I actually had to double-check whether it’s really true given all my talks and places where I’ve published articles, but the June 2012 issue of MSDN Magazine is indeed the first issue of this storied publication in which I have an article. It’s about the “Internet of Things” and an exemplary architecture for how to manage flow from and to large numbers of devices using Service Bus – and the article signals the beginning of an increasingly deeper SB engagement in that space, so I expect things to get significantly easier and supported with better abstractions in the future. The July 2012 issue of MSDN Magazine will feature a follow-up article – with code – that shows how to implement the key parts of this architecture using an actual .NET Micro Framework powered device and an Azure application via the Service Bus.

And now go here to read the article over on the MSDN Magazine site.

Categories: Architecture

May 4, 2012
@ 03:15 PM

I’m toying around with very small and very constrained embedded devices right now. When you make millions of a small thing, every byte of code footprint and every processing cycle you can save saves real money. An XML parser is a big chunk of code. So is a JSON parser. Every HTTP stack already has a key/value pair parser for headers. We can use that.

NHTTP stands for NoHyperText Transfer Protocol. Yes, I made that up. No, this is not an April Fool’s joke. Hear me out.

All rules of RFC2616 apply, except for section 7.2, meaning there must never be an entity body on any request or reply. Instead we rely entirely on section 7.1 and its extensibility rule:

  • The extension-header mechanism allows additional entity-header fields to be defined without changing the protocol, but these fields cannot be assumed to be recognizable by the recipient. Unrecognized header fields SHOULD be ignored by the recipient and MUST be forwarded by transparent proxies.

All property payloads are expressed as key/value pairs that are directly mapped onto HTTP headers. No value can exceed 2KB in size and you can’t have more than 32 values per message, so that we stay comfortably within common HTTP infrastructure quotas. To avoid collisions with existing headers and to allow for easy enumeration, each property key is prefixed with “P-”.

POST /foo HTTP/1.1
Host: example.com
Content-Length: 0
P-Name: “Clemens”

HTTP/1.1 200 OK
Content-Length: 0
P-Greeting: “Hello, Clemens”

(The fun bit is that the Windows Azure Service Bus HTTP API for sending and receiving messages already supports this exact model since we map custom message properties to headers and the HTTP entity body to the body of broker messages and those can be empty)
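The whole mapping fits in a few lines on the device side, which is the point of the exercise. A Python sketch of the rules above (the function names and error handling are mine, the limits come from the spec in this post):

```python
MAX_VALUE_BYTES = 2048   # "no value can exceed 2KB"
MAX_PROPERTIES = 32      # "no more than 32 values per message"

def properties_to_headers(props):
    # the NHTTP send side: every property becomes a "P-" prefixed header
    if len(props) > MAX_PROPERTIES:
        raise ValueError("no more than 32 values per message")
    headers = {}
    for key, value in props.items():
        if len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            raise ValueError("no value can exceed 2KB")
        headers["P-" + key] = value
    return headers

def headers_to_properties(headers):
    # the receive side: the HTTP stack's existing header parser has
    # already done the work; we just filter and strip the prefix
    return {k[2:]: v for k, v in headers.items() if k.startswith("P-")}
```

No XML parser, no JSON parser; the only parsing is the header parsing the HTTP stack does anyway.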

Categories: Architecture | Technology

I just wrote this email on a private mailing list and thought it may make sense to share it. The context of the discussion was overuse of the term “REST” in a document discussing an HTTP API:

REST is a set of architectural principles. REST describes how state flows and describes the shape of relationships between the parties in a distributed system. HTTP is a protocol with a variety of stacks supporting it, and the REST principles were born out of developing HTTP. There could, in theory, be a broad variety of protocols that also embody REST architecture, but there are, in fact, very few (if any) that aren’t just variations of HTTP.

“The client sends …”, “The server receives …”, “The server provides an interface for …” are all statements about implementation and, thus, HTTP. It commonly starts making sense to talk about REST specifically when debating whether a system actually follows the principles of the 5.3.3 “Data View” section in [1], since everything up to that point in Fielding’s dissertation you generally get for free with HTTP.

[1] http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm

Bottom line: HTTP APIs are HTTP APIs. REST is about how things hang together. The terms aren’t interchangeable. In most technical discussions about interfaces or methods or URIs and most other implementation details, HTTP API is the right term.

Categories: Architecture

March 5, 2012
@ 11:00 PM

Greg says what it’s not, and since he didn’t use the opportunity to also succinctly express what it is, I helped him out in the comments:

CQRS ("Command-Query Responsibility Segregation") is a simple pattern that strictly segregates the responsibility of handling command input into an autonomous system from the responsibility of handling side-effect-free query/read access on the same system. Consequently, the decoupling allows for any number of homogeneous or heterogeneous query/read modules to be paired with a command processor and this principle presents a very suitable foundation for event sourcing, eventual-consistency state replication/fan-out and, thus, high-scale read access. In simple terms: You don’t service queries via the same module of a service that you process commands through. For REST heads: GET wires to a different thing from what PUT/POST/DELETE wire up to.

Martin Fowler has a nice discussion here, with pictures. Udi Dahan has another nice description, also with pictures. To say it in yet another way, the key point of the pattern is that the read and write paths in a system are entirely separate.
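The definition above is compact enough to show in code. A deliberately minimal Python sketch of the segregation (the class and event shapes are mine, for illustration only): commands go through one module that owns the writes and fans events out, and any number of read modules build side-effect-free, query-optimized views from those events:

```python
class CommandProcessor:
    """The write side: the only module that processes commands."""

    def __init__(self, subscribers):
        self.balances = {}             # authoritative write-side state
        self.subscribers = subscribers # read modules fed via events

    def handle_deposit(self, account, amount):
        # process the command, then fan the resulting event out to any
        # number of homogeneous or heterogeneous read modules
        self.balances[account] = self.balances.get(account, 0) + amount
        for notify in self.subscribers:
            notify({"account": account, "amount": amount})

class BalanceReadModel:
    """A read side: side-effect-free queries over its own replicated view."""

    def __init__(self):
        self.view = {}

    def apply(self, event):
        a = event["account"]
        self.view[a] = self.view.get(a, 0) + event["amount"]

    def get_balance(self, account):
        # GET wires to this; PUT/POST/DELETE wire to the command processor
        return self.view.get(account, 0)
```

Because the read model is fed by events rather than sharing the write path, you can hang as many of them off one command processor as you need for scale, each eventually consistent with the writes.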

Categories: Architecture | SOA

I've long wanted to write an architecture book. The problem with books is that writing them robs you of at least half a year of your life, and that half a year is a very painful one. I've done it. So instead of a book, I talked to the folks at Pluralsight and made an on-demand video course.  

"The Elements of Distributed Architecture" is about the foundational elements of distributed architecture and about the ‘physics’ that affect distributed software designs. The goal of this course, which is designed to be independent of specific languages, technologies, and products, is to provide software teams with a shared baseline of concepts and terminologies in the areas of information management, communication, presentation, processing, failure management, security, and safety.

I think of this course as a baseline, and there's plenty of runway for more in-depth material. If you like this one, that will give me motivation to spend more private time (this course is not related to my Microsoft job) to create architectural material.

If you don't have a Pluralsight account, you can sign up for the free trial here.

Categories: Architecture

December 1, 2011
@ 05:38 PM

I answered 4 questions in Richard Seroter’s series of interviews with folks working on connected systems. See the Q&A here.

Categories: Architecture | SOA

Elastic and dynamic multitenant cloud environments have characteristics that make traditional failure management mechanisms using coordinated 2-phase transactions a suboptimal choice. The common 2-phase commit protocols depend on a number of parties enlisted into a transaction making hard promises on the expected outcome of their slice of the transaction. Those promises are difficult to keep in an environment where systems may go down at any time with their local state vanishing, where not all parties trust each other, where significant latency may be involved, and where network connectivity cannot be assumed to be reliable. 2-phase commit is also not a good choice for operations that take significant amounts of time and span a significant number of resources, because such a coordinated transaction may adversely affect the availability of said resources – especially in high-density multitenant solutions, where virtualized and conceptually isolated resources are collocated on the same base resources. In such a case, database locks and locks on other resources held to satisfy coordinated transaction promises may easily break the isolation model of a multitenant system and have one tenant affect another.

Therefore, failure management – and this is ultimately what transactions are about – requires a somewhat different approach in cloud environments and other scalable distributed systems with similar characteristics.

To find a suitable set of alternative approaches, let’s quickly dissect what goes on in a distributed transaction:

To start, two or more parties ‘enlist’ into a shared transaction scope performing some coordinated work that’s commonly motivated by a shared notion of a ‘job’ that needs to be executed. The goal of having a shared transaction scope is that the overall system will remain correct and consistent in both the success and the failure cases. Consistency in the success case is trivial. All participating parties could complete their slice of the job that had to be done. Consistency in the failure case is more interesting. If any party fails in doing their part of the job, the system will end up in a state that is not consistent. If you were trying to book a travel package and ticketing with the airline failed, you may end up with a hotel and a car, but no flight. In order to prevent that, a ‘classic’ distributed transaction asks the participants to make promises on the outcome of the transaction as the transaction is going on.

As all participating parties have tentatively completed but not finalized their work, the distributed transaction goes into a voting phase where every participant is asked whether it could tentatively complete its portion of the job and whether it can furthermore guarantee with a very high degree of certainty that it can finalize the job outcome and make it effective when asked to do so. Imagine a store clerk who puts an item on the counter that you’d like to purchase – you’ll show him your $10 and ask for a promise that he will hand you the item if you give him the money – and vice versa.

Finally, once all parties have made their promises and agreed that the job can be finalized, they are told to do so.

There are two big interesting things to observe about the 2-phase-commit (2PC) distributed transaction model that I just described: First, it’s incredibly simple from a developer’s perspective because the transaction outcome negotiation is externalized and happens as ‘magic’. Second, it doesn’t resemble anything that happens in real life, and that should be somewhat suspicious. You may have noticed that there was no neutral escrow agent present when you bought the item at the store for $10 two paragraphs earlier.

The grand canonical example for 2PC transactions is a bank account transfer. You debit one account and credit another. These two operations need to succeed or fail together because otherwise you are either creating or destroying money (which is illegal, by the way). So that’s the example that’s very commonly used to illustrate 2PC transactions. The catch is – that’s not how it really works, at all. Getting money from one bank account to another bank account is a fairly complicated affair that touches a ton of other accounts. More importantly, it’s not a synchronous fail-together/succeed-together scenario. Instead, principles of accounting apply (surprise!). When a transfer is initiated, let’s say in online banking, the transfer is recorded in the form of a message for submission into the accounting system, and the debit is recorded in the account as a ‘pending’ transaction that affects the displayed balance. From the user’s perspective, the transaction is ‘done’, but factually nothing has happened, yet. Eventually, the accounting system will get the message and start performing the transfer, which often causes a cascade of operations, many of them yielding further messages, including booking into clearing accounts and notifying the other bank of the transfer. The principle here is that all progress is forward. If an operation doesn’t work for some technical reason, it can be retried once the technical reason is resolved. If an operation fails for a business reason, it can be aborted – not by annihilating previous work, but by doing the inverse of the previous work. If an account was credited, that credit is annulled with a debit of the same amount. For some types of failed transactions, the ‘inverse’ operation may not be fully symmetric but may result in extra actions like imposing penalty fees. In fact, in accounting, annihilating any work is illegal – ‘delete’ and ‘update’ are a great way to end up in prison.
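The forward-only principle reads naturally as an append-only ledger. A Python sketch of the idea described above (the Ledger class and its method names are my illustration, not accounting software): a failed operation is undone by booking the inverse entry, possibly with a penalty, never by deleting or updating past entries:

```python
class Ledger:
    def __init__(self):
        self.entries = []   # append-only; no 'update', no 'delete'

    def book(self, account, amount, reason):
        self.entries.append((account, amount, reason))

    def annul(self, account, amount, reason, penalty=0):
        # the 'inverse' of a credit is a debit of the same amount;
        # the inverse may not be fully symmetric - e.g. a penalty fee
        self.book(account, -amount, "annul: " + reason)
        if penalty:
            self.book(account, -penalty, "penalty: " + reason)

    def balance(self, account):
        # balances are derived state; the entries are the truth
        return sum(amt for acct, amt, _ in self.entries if acct == account)
```

Note that after annulling, the history of what happened is fully preserved; only the derived balance reflects the correction.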

As all the operations occur that eventually lead to the completion or failure of the grand complex operation that is a bank transfer, the one thing we’ll be looking to avoid is to be in any kind of ‘doubt’ about the state of the system. All participants must have a great degree of confidence in their knowledge about the success or failure of their respective action. No shots into the dark. There’s no maybe. Succeed or fail.

That said, “fail” is a funny thing in distributed systems because it happens quite a bit. In many cases, “fail” isn’t something that a bit of patience can’t fix. Which means that teaching the system some patience and tenacity is probably a good idea instead of giving up too easily. So if an operation fails because it runs into a database deadlock, or the database is offline, or the network is down, or the local machine’s network adapter just got electrocuted, that’s all not necessarily a reason to fail the operation. That’s a reason to write an alert into a log and call for help so that someone can fix the environmental condition.

If we zoom into an ‘operation’ here, we might see a message that we retrieve from some sort of reliable queue or some other kind of message store, and subsequently an update of system state based on that message. Once the state has been successfully updated, which may mean that we’ve inserted a new database record, we can tell the message system that the message has been processed and that it can be discarded. That’s the happy case.

Let’s say we take the message and, as the process wants to walk up to the database, the power shuts off. Click. Darkness. Not a problem. Assuming the messaging system supports a ‘peek/lock’ model that allows the process to first take the message and only remove it from the queue once processing has been completed, the message will reappear on the queue after the lock has expired and the operation can be retried, possibly on a different node. That model holds true for all failures of the operation through to and in the database. If the operation fails due to some transient condition (including the network card smoking out, see above), the message is either explicitly abandoned by the process or returns into the queue by way of a lock timeout. If the operation fails because something is really logically wrong, like trying to ship a product out of the inventory that’s factually out of stock, we’ll have to take some forward action to deal with that. We’ll get to that in a bit.
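The peek/lock behavior is easy to model. A Python sketch of the semantics just described (the class is my illustration, not the Service Bus implementation): receiving locks a message rather than removing it, completing removes it, and a crash simply lets the lock expire so the message reappears:

```python
import time

class PeekLockQueue:
    def __init__(self, lock_seconds, clock=time.monotonic):
        self.lock_seconds = lock_seconds
        self.clock = clock
        self.messages = []            # entries of [body, locked_until]

    def send(self, body):
        self.messages.append([body, 0.0])

    def receive(self):
        # hand out the first message that is unlocked or whose lock expired
        now = self.clock()
        for entry in self.messages:
            if entry[1] <= now:
                entry[1] = now + self.lock_seconds
                return entry[0]
        return None

    def complete(self, body):
        # processing succeeded: only now is the message really removed
        self.messages = [e for e in self.messages if e[0] != body]

    def abandon(self, body):
        # transient failure: make the message visible again immediately
        for e in self.messages:
            if e[0] == body:
                e[1] = 0.0
```

The crash case needs no code at all: if the processor dies between receive and complete, the lock simply times out and the message is redelivered, possibly to another node.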

Assuming the operation succeeded, the next tricky waypoint is failure after success, meaning that the database operation succeeded, but the message subsequently can’t be flagged as completed and thus can’t be removed from the queue. That situation would potentially lead to another delivery of the message even though the job has already been completed and therefore would cause the job to be executed again – which is only a problem if the system isn’t expecting that, or, in fancier terms, if it’s not ‘idempotent’. If the job is updating a record to absolute values and the particular process/module/procedure is the only avenue to perform that update (meaning there are no competing writers elsewhere), doing that update again and again and again is just fine. That’s natural idempotency. If the job is inserting a record, the job should contain enough information, such as a causality or case or logical transaction identifier, that allows the process to figure out whether the desired record has already been inserted; if that’s the case, it should do nothing, consider its own action a duplicate, and just act as if it succeeded.
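The insert case boils down to a few lines. A Python sketch of the rule above (the job shape and key names are mine): the job carries a logical transaction identifier, and finding the record already present is treated as our own earlier success, not as an error:

```python
def process_job(store, job):
    # the job carries a causality / logical transaction identifier
    tx_id = job["tx_id"]
    if tx_id in store:
        # the record is already there: consider this a duplicate of our
        # own earlier action and act as if we succeeded
        return "duplicate"
    store[tx_id] = job["record"]
    return "inserted"
```

With this in place, a redelivery after a failure-after-success is harmless: the second execution finds the record, plays dead, and the message can finally be completed.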

Checkpoint: With what I said in the last two paragraphs, you can establish pretty good confidence about failure or success of individual operations that are driven by messages. You fail and retry, you fail and take forward action, or you succeed and take steps to avoid retrying even if the system presents the same job again. There’s very little room for doubt. So that’s good.

The ‘forward action’ that results from failure is often referred to as ‘compensation’, but that’s a bit simplistic. The forward action resulting from running into the warehouse with the belief that there’s still product present while the shelf is factually empty isn’t to back out and cancel the order (unless you’re doing a firesale of a touch tablet your management just killed). Instead, you notify the customer of the shipping delay, flag a correction of the inventory levels, and put the item on backorder. For the most part, pure ‘compensation’ doesn’t really exist. With every action, the system ends up in a consistent state. It’s just that some states are more convenient than others, and there are some states for which the system has a good answer and some for which it doesn’t. If the system ends up in a dead-end street and just wants to sit down and cry because nobody told it what to do now, it should phone home and ask for human intervention. That’s fine and likely a wise strategy in weird edge cases.

Initiating the ‘forward action’ and, really, any action in a system that’s using messaging as its lifeline and as a backplane for failure resilience as I’m describing it here is not entirely without failure risk in itself. It’s possible that you want to initiate an action and can’t reach the messaging system, or sending the message fails for some other reason. Here again, patience and tenacity are a good idea. If we can’t send, our overall operation is considered failed and we won’t flag the initiating message as completed. That will cause the job to show up again, but since we’ve got idempotency in the database, that operation will again succeed (even if by playing dead) or fail, and we will have the same outcome, allowing us to retry the send. If it looks like we can send but sending fails sometime during the operation, there might be doubt about whether we sent the message. Since doubt is a problem and we shouldn’t send the same message twice, duplicate detection in the messaging system can help suppress a duplicate so that it never shows up at the receiver. That allows the sender to confidently resend if it’s in doubt about success in a prior incarnation of processing the same message.

Checkpoint: We can now also establish pretty good confidence about initiating a forward action – or any other action in the system – provided the ‘current’ action follows the principles described above.

So far I’ve talked about individual actions and also about chains of actions, albeit just in the failure case. Obviously the same applies to success cases where you want to do something ‘next’ once you’re done with ‘this’.

Now let’s assume you want to do multiple things in parallel, like updating multiple stores as part of executing a single job – which gets us back to the distributed transaction scenario discussed earlier. What helps in these cases is if the messaging system supports ‘topics’ that allow dropping a message (the job) into the messaging system once and serving the message to each participant in the composite activity via their own subscription on the topic. Since the messaging system is internally transactional, it will guarantee that each message that is successfully submitted will indeed appear on each subscription, so it ensures the distribution. With that, the failure handling story for each slice of the composite job turns into the same model that I’ve been explaining above. Each participant can be patient and tenacious when it comes to transient error conditions. In hard failure cases, the forward action can be a notification to the initiator, which will then have to decide how to progress, including annulling or otherwise invoking forward actions on activities that have been executed in parallel. In the aforementioned case of a ticketing failure, that means that the ticketing module throws its hands up and the module responsible for booking the travel package either decides to bubble the case up to the customer or an operator, leaving the remaining reservations intact, or to cancel the reservations for the car and the hotel that were made in parallel. Should two out of three or more participants’ operations fail and each report up to the initiator, the initiator can either keep track of whether it already took corrective forward action on the third participant or, in doubt, rely on the idempotency rule to avoid doing the same thing twice.
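The topic fan-out can be sketched in a handful of lines. A Python illustration of the behavior described above (the class is mine, standing in for the broker): the job is submitted once, and the broker guarantees it lands on every subscription, so each participant gets its own copy to process and retry under the rules discussed earlier:

```python
class Topic:
    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, name):
        # each participant in the composite activity gets its own
        # subscription, i.e. its own independent copy of the job
        self.subscriptions[name] = []
        return self.subscriptions[name]

    def send(self, message):
        # the broker is internally transactional: a successfully
        # submitted message appears on every subscription
        for sub in self.subscriptions.values():
            sub.append(message)
```

From here on, each subscription is just an independent queue, so the peek/lock, idempotency, and forward-action machinery applies per participant without any cross-participant coordination.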

The model described here is loosely based on the notion of ‘Sagas’, which were first described in a 1987 ACM paper by Hector Garcia-Molina and Kenneth Salem, so this isn’t grand news. However, the notion of such Sagas is only now really gaining momentum with long-running and far-distributed transactions becoming more commonplace, so it’s well worth dragging the model further out into the limelight and giving it coverage. The original paper on Sagas assumes that the individual steps can be encapsulated in a regular transaction, which may not even be the case in the cloud and with infrastructures that don’t have inherent transaction support. The role of a messaging system with the capabilities mentioned above is to help compensate for the absence of that support.

… to be continued …

Categories: Architecture | SOA

Load Balancing on the Service Bus Relay is by far our #1 most requested feature now that we’ve finally got Queues and Topics in production. It’s a reasonable expectation for us to deliver that capability in one of the next production updates, and the good news is that we will. I’m not going to promise any concrete ship dates here, but it’d be sorely disappointing if that didn’t happen while the calendar still says 2011.

I just completed writing the functional spec for the feature and it’s worth communicating how the feature will show up, since there is a tiny chance that the behavioral change may affect implementations that rely on a particular exception to drive the strategy of how to perform failover.

The gist of the Load Balancing spec is that the required changes in your code and config to get load balancing are zero. With either the NetTcpRelayBinding or any of the HTTP bindings (WebHttpRelayBinding, etc) as well as the underlying transport binding elements, you’ll just open up a second (and third and fourth … up to 25) listener on the same name and instead of getting an AddressAlreadyInUseException as you get today, you’ll just get automatic load balancing. When a request for your endpoints shows up at Service Bus, the system will roll the dice on which of the connected listeners to route the request or connection/session to and perform the necessary handshake to make that happen.

The bottom line is that we’re effectively making the AddressAlreadyInUseException go away for the most part. It’ll still be thrown when the listeners’ policy settings don’t match up, i.e. when one listener wants to have Access Control enabled and the other one doesn’t, but otherwise you just won’t see it anymore.

The only way this way of just lighting up the feature may get anyone into trouble is if your application relies on that exception in a situation where you’ve got an active listener on Service Bus on one node and a ‘standby’ listener on another node that keeps trying to open a listener on the same address to create a hot/warm cluster failover scheme – and the two nodes would trip over each other if they got traffic concurrently. That doesn’t seem too likely. If you have questions about this, drop me a line here in the comments, by email to clemensv at microsoft.com, or on Twitter @clemensv.

Categories: .NET Services | AppFabric | Architecture

From //build in Anaheim

Categories: AppFabric | Architecture | SOA | ISB | Web Services

Our team’s Development Manager MK (Murali Krishnaprasad) and I were interviewed by Michael Washam about the May 2011 CTP release of Windows Azure AppFabric. We discuss new technologies such as Topics, Queues, and Subscriptions, and how this relates to doing async development in the cloud.


Republished from Channel 9

Categories: AppFabric | Architecture | ISB | Web Services

As our team was starting to transform our parts of the Azure Services Platform from a CTP ‘labs’ service exploring features into a full-on commercial service, it started to dawn on us that we had set ourselves up for writing a bunch of ‘enterprise apps’. The shiny parts of Service Bus and Access Control that we parade around are all about user-facing features, but if I look back at the work it took to go from a toy service to a commercial offering, I’d guess that 80%-90% of the effort went into aspects like infrastructure, deployment, upgradeability, billing, provisioning, throttling, quotas, security hardening, and service optimization. The lesson there was: when you’re boarding the train to shipping a V1, you don’t load new features on that train – you rather throw some off.

The most interesting challenge for these infrastructure apps sitting on the backend was that we didn’t have much solid ground to stand on. Remember – these were very early days, so we couldn’t use SQL Azure since the folks over in SQL were on a pretty heroic schedule themselves and didn’t want to take on any external dependencies, even from close friends. We also couldn’t use any of the capabilities of our own bits, because building infrastructure for your features on your features would just be plain dumb. And while we could use capabilities of the Windows Azure platform we were building on, a lot of those parts still had rough edges, as those folks were going through much of the same that we were. In those days, the table store would be very moody, the queue store would sometimes swallow or duplicate messages, and the Azure fabric controller would occasionally go around and kill things. All normal – bugs.

So under those circumstances we had to figure out the architecture for some subsystems where we needed to perform a set of coordinated actions across a distributed set of resources – a distributed transaction or saga of sorts. The architecture had a few simple goals: when we get an activation request, we must not fumble that request under any circumstance; we must run the job to completion for all resources; and, at the same time, we need to minimize any potential for required operator intervention, i.e. if something goes wrong, the system had better know how to deal with it – at best it should self-heal.

My solution to that puzzle is a pattern I call “Scheduler-Agent-Supervisor Pattern” or, short, “Supervisor Pattern”. We keep finding applications for this pattern in different places, so I think it’s worth writing about it in generic terms – even without going into the details of our system.

The pattern rests on two seemingly odd and very related assumptions: ‘the system is perfect’ and ‘all error conditions are transient’. As a consequence, the architecture has some character traits of a toddler. It’s generally happily optimistic and gets very grumpy, very quickly when things go wrong – to the point that it will simply drop everything and run away screaming. It’s very precisely like that, in fact.


The first picture here shows all key pieces except the Supervisor that I’ll introduce later. At the core we have a Scheduler that manages a simple state machine made up of Jobs and those jobs have Steps. The steps may have a notion of interdependency or may be completely parallelizable. There is a Job Store that holds jobs and steps and there are Agents that execute operations on some resource.  Each Agent is (usually) fronted by a queue and the Scheduler has a queue (or service endpoint) through which it receives reply messages from the Agents.

Steps are recorded in a durable storage table of some sort that has at least the following fields: Current State (say: Disabled, Active), Desired State (say: Disabled, Active), LockedUntil (Date/Time value), and Actor plus any step specific information you want to store and eventually submit with the job to the step agent.
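As a sketch, such a step record could be shaped like this in C#; the type and field names are illustrative, not taken from our actual job store:

```csharp
using System;

enum ResourceState { Disabled, Active }

// Illustrative shape of a step record in the job store.
class StepRecord
{
    public ResourceState CurrentState { get; set; }   // e.g. Disabled
    public ResourceState DesiredState { get; set; }   // e.g. Active
    public DateTime LockedUntil { get; set; }         // ultimatum for completion
    public string Actor { get; set; }                 // which agent executes the step

    // A step needs work (or recovery) while the two states differ.
    public bool IsInconsistent => this.CurrentState != this.DesiredState;
}
```

The IsInconsistent predicate is the key invariant here: every query in the pattern – the Scheduler’s progress checks and the Supervisor’s recovery sweep alike – boils down to comparing those two state columns.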

When Things Go Right

The initial flow is as follows:

(1)a – Submit a new job into the Scheduler (and wait)
(2)a – The Scheduler creates a new job and steps with an initial current state (‘Disabled’) in the job store 
(2)b – The Scheduler sets ‘desired state’ of the job and of all schedulable steps (dependencies?) to the target state (‘Active’) and sets the ‘locked until’ timeout of the step to a value in the near future, e.g. ‘Now’ + 2 minutes.
(1)b – Job submission request unblocks and returns

If all went well, we now have a job record and, here in this example, two step records in our store. They have a current state of ‘Disabled’ and a desired state of ‘Active’. If things didn’t go well, we’d have incomplete or partially wedged records or nothing in the job store, at all. The client would also know about it since we’ve held on to the reply until we have everything done – so the client is encouraged to retry. If we have nothing in the store and the client doesn’t retry – well, then the job probably wasn’t all that important, after all. But if we have at least a job record, we can make it all right later. We’re optimists, though; let’s assume it all went well.

For the next steps we assume that there’s a notion of dependencies between the steps and the second step depends on the first. If that were not the case, the two actions would just happen in parallel.

(3) – Place a step message into the queue for the actor for the first step; Agent 1 in this case. The message contains all the information about the step, including the current and desired state and also the LockedUntil that puts an ultimatum on the activity. The message may further contain an action indicator or arguments that are taken from the step record.
(4) – After the agent has done the work, it places a completion record into the reply queue of the Scheduler.
(5) – The Scheduler records the step as complete by setting the current state from ‘Disabled’ to ‘Active’; as a result the desired and the current state are now equal.
(6) – The Scheduler sets the next step’s desired state to the target state (‘Active’) and sets the LockedUntil timeout of the step to a value in the near future, e.g. ‘Now’ + 1 minute. The lock timeout is an ultimatum for when the operation is expected to be complete and reported back as complete, even in a worst-case success scenario. The actual value therefore depends on the common latency of operations in the system. If operations usually complete in milliseconds and at worst within a second, the lock timeout can be short – but not too short. We’ll discuss this value in more detail a bit later.
(7), (8), (9) are equivalent to (3), (4), (5).

Once the last step’s current state is equal to its desired state, the job’s current state gets set to the desired state and we’re done. So that was the “99% of the time” happy path.


When Things Go Wrong

So what happens when anything goes wrong? Remember the principle ‘all errors are transient’. What we do in the error case – anywhere – is to log the error condition and then promptly drop everything and simply hope that time, a change in system conditions, human or divine intervention, or – at worst – a patch will heal matters. That’s what the second principle ‘the system is perfect’ is about; the system obviously isn’t really perfect, but if we construct it in a way that we can either wait for it to return from a wedged state into a functional state or where we enable someone to go in and apply a fix for a blocking bug while preserving the system state, we can consider the system ‘perfect’ in the sense that pretty much any conceivable job that’s already in the system can be driven to completion.

In the second picture, we have Agent 2 blowing up as it is processing the step it got handed in (7). If the agent just can’t get its work done since some external dependency isn’t available – maybe a database can’t be reached or a server it’s talking to spews out ‘server too busy’ errors – it may be able to back off for a moment and retry. However, it must not retry past the LockedUntil ultimatum that’s in the step record. When things fail and the agent is still breathing, it may, as a matter of courtesy, notify the Scheduler of the fact and report that the step was completed with no result, i.e. the desired state and the achieved state don’t match. That notification may also include diagnostic information. Once the LockedUntil ultimatum has passed, the Agent no longer owns the job and must drop it. It must not even report failure back to the Scheduler past that point.
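That retry discipline can be sketched like this; AgentRetry and its parameters are hypothetical names, and the clock is injected as a delegate so the ultimatum can be exercised without real waiting:

```csharp
using System;
using System.Threading;

// Sketch of an agent's bounded retry loop. 'attempt' returns true on success,
// false on a transient failure. The agent owns the step only until lockedUntil.
static class AgentRetry
{
    public static bool TryUntilUltimatum(
        Func<bool> attempt, DateTime lockedUntil, Func<DateTime> now, TimeSpan backoff)
    {
        while (now() < lockedUntil)
        {
            if (attempt())
            {
                return true; // success - report completion to the Scheduler
            }
            Thread.Sleep(backoff); // be patient with transient errors
        }
        return false; // past the ultimatum: drop the step, say nothing
    }
}
```

Note that the false return path deliberately does nothing – past the ultimatum the step belongs to the Supervisor, not the agent.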

If the agent keels over and dies as it is processing the step (or right before or right after), it is obviously no longer in a position to let the scheduler know about its fate. Thus, there won’t be any message flowing back to the scheduler and the job is stalled. But we expect that. In fact, we’re ok with any failure anywhere in the system. We could lose or fumble a queue message, we could get a duplicate message, we could have the scheduler die a fiery death (or just being recycled for patching at some unfortunate moment) – all of those conditions are fine since we’ve brought the doctor on board with us: the Supervisor. 


The Supervisor

The Supervisor is a schedule-driven process (or thread) of which one or a few instances may run occasionally. The frequency depends very much on the average duration of operations and the expected overall latency for completion of jobs.

The Supervisor’s job is to recover steps or jobs that have failed – and we’re assuming that failures are due to some transient condition. Whether the system should expect a transient resource failure that prevented a job from completing just a second ago to be healed two seconds later depends on the kind of system and resource. What’s described here is a pattern, not a solution, so it’s up to the concrete scenario to get the timing right for when to retry operations once they fail.

This desired back-off time manifests in the LockedUntil value. When a step gets scheduled, the Scheduler needs to state how long it is willing to wait for that step to complete; this includes some back-off time padding. Once that ultimatum has passed and the step is still in an inconsistent state (desired state doesn’t equal the current state), the Supervisor can pick it up at any time and schedule it again.

(1) – Supervisor queries the job store for any inconsistent steps whose LockedUntil value has expired.
(2) – The Supervisor schedules the step again by setting the LockedUntil value to a new timeout and submitting the step into the target actor’s queue
(3) – Once the step succeeds, the step is reported as complete on the regular path back to the Scheduler  where it completes normally as in steps (8), (9) from the happy-path scenario above. If it fails, we simply drop it again. For failures that allow reporting an error back to the Scheduler it may make sense to introduce an error counter that round-trips with the step so that the system could detect poisonous steps that fail ‘forever’ and have the Supervisor ignore those after some threshold.
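A minimal sketch of such a recovery sweep, including the poison-step threshold just mentioned, might look like this (the types and the threshold are illustrative, not from the actual system):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative step shape for the Supervisor's recovery sweep.
class Step
{
    public string CurrentState;
    public string DesiredState;
    public DateTime LockedUntil;
    public int ErrorCount; // round-trips with the step to detect poison steps
}

static class Supervisor
{
    // Pick up steps that are inconsistent, whose ultimatum has expired,
    // and that haven't failed so often that they look poisonous.
    public static IEnumerable<Step> StepsToRecover(
        IEnumerable<Step> store, DateTime now, int poisonThreshold)
    {
        return store.Where(s =>
            s.CurrentState != s.DesiredState &&
            s.LockedUntil <= now &&
            s.ErrorCount < poisonThreshold);
    }
}
```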

The Supervisor can pursue a range of strategies for recovery. It can just look at individual steps and recover them by rescheduling them – assuming the steps are implemented as idempotent operations. If it were a bit cleverer, it may consider error information that a cooperative (and breathing) agent has submitted back to the Scheduler, and may even go as far as firing an alert to an operator if the error condition requires intervention – marking the step and setting its LockedUntil value to some longer timeout so that it’s taken out of the loop and someone can take a look.

At the job-scope, the Supervisor may want to perform recovery such that it first schedules all previously executed steps to revert back to the initial state by performing compensation work (all resources that got set to active are getting disabled again here in our example) and then scheduling another attempt at getting to the desired state.

In step (2)b up above, we’ve been logging current and desired state at the job-scope as well, and with that we can also always find inconsistent jobs whose steps are all consistent and thus wouldn’t show up in the step-level recovery query. That situation can occur if the Scheduler were to crash between logging one step as complete and scheduling the next step. If we find inconsistent jobs with all-consistent steps, we just need to reschedule the next step in the dependency sequence whose desired state doesn’t match the desired state of the overall job.
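That job-level selection can be sketched as follows, again with illustrative types; the job’s own desired state plays the role of the target that the lagging step hasn’t been pointed at yet:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch: a job whose steps are all self-consistent but whose own current
// state lags its desired state was dropped between logging one step as
// complete and scheduling the next.
class JobStep
{
    public int Sequence;
    public string CurrentState;
    public string DesiredState;
}

static class JobRecovery
{
    // Reschedule the first step in dependency order whose desired state
    // hasn't been advanced to the job's desired state yet.
    public static JobStep NextStepToSchedule(
        IEnumerable<JobStep> steps, string jobDesiredState)
    {
        return steps
            .OrderBy(s => s.Sequence)
            .FirstOrDefault(s => s.DesiredState != jobDesiredState);
    }
}
```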

To be thorough, we could now take a look at all the places where things can go wrong in the system. I expect that survey to yield that, as long as we can successfully get past step (2)b from the first diagram, the Supervisor is always in a position to either detect that a job isn’t making progress and help with recovery, or at least call for help. The system always knows what its current intent is, i.e. which state transitions it wants to drive, and never forgets about that intent, since the intent is logged in the job store at all times and all progress against it is logged as well. The submission request (1) depends on the outcome of (2)a/b to guard against failures while putting a job and its steps into the system, so that a client can take corrective action. In fact, once the job record is marked as inconsistent in step (2)b, the Scheduler could already report success back to the submitting party even before the first step is scheduled, because the Supervisor would pick up that inconsistency eventually.


Categories: Architecture | SOA | Azure | Technology

This post explains an essential class for asynchronous programming that lurks in the depths of the WCF samples: InputQueue<T>. If you need to write efficient server-side apps, you should consider reading through this and adding InputQueue<T> to your arsenal.

Let me start with: This blog post is 4 years late. Sorry! – and with that out of the way:

The WCF samples ship with several copies of a class that’s marked as internal in the System.ServiceModel.dll assembly: InputQueue<T>. Why are these samples – mostly those implementing channel-model extensions – bringing local copies of this class with them? It’s an essential tool for implementing the asynchronous call paths of many aspects of channels correctly and efficiently.

If you look closely enough, the WCF channel infrastructure resembles the Berkeley Socket model quite a bit – especially on the server side. There’s a channel listener that’s constructed on the server side and when that is opened (usually under the covers of the WCF ServiceHost), that operation is largely equivalent to calling ‘listen’ on a socket – the network endpoint is ready for business. On sockets you’ll then call ‘accept’ to accept the next available socket connection from a client; in WCF you call ‘AcceptChannel’ to accept the next available (session-) channel. On sockets you then call ‘receive’ to obtain bytes; on a channel you call ‘Receive’ to obtain a message.

Before and between calls to ‘AcceptChannel’ made by the server-side logic, client-initiated connections – and thus channels – may be coming in and queue up for a bit before they are handed out to the next caller of ‘AcceptChannel’ or the asynchronous ‘Begin/EndAcceptChannel’ method pair. The number of channels that may be pending is configured in WCF with the ‘ListenBacklog’ property that’s available on most bindings.

I wrote ‘queue up’ there since that’s precisely what happens – those newly created channels on top of freshly accepted sockets or HTTP request channels are enqueued into an InputQueue<T> instance, and (Begin-)Accept is implemented as a dequeue operation on that queue. There are two particular challenges that make the regular Queue<T> class from the System.Collections.Generic namespace unsuitable for implementing that mechanism: Firstly, its Dequeue method is only available as a synchronous variant and doesn’t allow for specifying a timeout. Secondly, the queue implementation doesn’t help much with implementing the ListenBacklog quota, where not only is the length of the queue limited to some configured number of entries, but accepting further connections/channels from the underlying network must also be suspended for as long as the queue is at capacity, and must resume as soon as the pressure is relieved, i.e. a caller takes a channel out of the queue.
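To make the difference concrete, here is a greatly simplified sketch of the core idea – this is not the actual InputQueue<T> code, just an illustration of how a dequeue either completes immediately or parks a callback that the next enqueue dispatches:

```csharp
using System;
using System.Collections.Generic;

// Greatly simplified sketch of the InputQueue<T> core idea (not the WCF code):
// Dequeue(callback) either completes immediately or parks the callback;
// Enqueue dispatches to the oldest parked reader on the enqueuer's thread.
class TinyInputQueue<T>
{
    readonly Queue<T> items = new Queue<T>();
    readonly Queue<Action<T>> readers = new Queue<Action<T>>();
    readonly object syncRoot = new object();

    public void Dequeue(Action<T> callback)
    {
        lock (this.syncRoot)
        {
            if (this.items.Count > 0)
            {
                callback(this.items.Dequeue());
                return;
            }
            this.readers.Enqueue(callback); // park the reader; no thread blocks
        }
    }

    public void EnqueueAndDispatch(T item)
    {
        Action<T> reader = null;
        lock (this.syncRoot)
        {
            if (this.readers.Count > 0)
            {
                reader = this.readers.Dequeue();
            }
            else
            {
                this.items.Enqueue(item);
            }
        }
        // dispatch outside the lock, lending this thread to the reader
        if (reader != null) reader(item);
    }
}
```

The real class adds timeouts, IAsyncResult plumbing, waiters, and shutdown semantics on top of this, but the park-and-dispatch mechanic is the heart of it.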

To show that InputQueue<T> is a very useful general purpose class even outside of the context of the WCF channel infrastructure, I’ve lifted a version of it from one of the most recent WCF channel samples, made a small number of modifications that I’ll write about later, and created a little sample around it that I’ve attached to this post.

The sample I’ll discuss here simulates parsing/reading IP addresses from a log file and then performing a reverse DNS name resolution on those addresses – something that you’d do in a web-server log-analyzer or as a background task in a blog engine while preparing statistics.

Reverse DNS name resolution is quite interesting since it’s embarrassingly easy to parallelize and each resolution commonly takes a really long time (4-5 seconds) – whereby all the work is done elsewhere. The process issuing the queries is mostly sitting around idle waiting for the response. Therefore, it’s a good idea to run a number of DNS requests in parallel, but it’s a terrible idea to have any of these requests execute as a blocking call burning a thread. Since we’re assuming that we’re reading from a log file that requires some parsing, it would also be a spectacularly bad idea to have multiple concurrent threads compete for access to that file and get into each other’s way. And since it is a file and we need to lift things up from disk, we probably shouldn’t do that ‘just in time’ as a DNS resolution step completes; there should rather be some data readily waiting for processing. InputQueue<T> is enormously helpful in such a scenario.

The key file of the sample code – the implementation of the queue itself aside – is obviously Program.cs. Here’s Main() :

static void Main(string[] args)
{
    int maxItemsInQueue = 10;
    InputQueue<IPAddress> logDataQueue = new InputQueue<IPAddress>();
    int numResolverLoops = 20;
    ManualResetEvent shutdownCompleteEvent = new ManualResetEvent(false);
    List<IPAddressResolverLoop> resolverLoops = new List<IPAddressResolverLoop>();

    Console.WriteLine("You can stop the program by pressing ENTER.");

We’re setting up a new InputQueue<IPAddress> here into which we’ll throw the parsed addresses from our acquisition loop that simulates reading from the log. The queue’s capacity will be limited to just 10 entries (maxItemsInQueue is the input value) and we will run 20 'resolver loops’, which are logical threads that process IP-to-hostname resolution steps.

    // set up the loop termination callback
    WaitCallback loopTerminationCallback = o =>
    {
        if (Interlocked.Decrement(ref numResolverLoops) == 0)
        {
            shutdownCompleteEvent.Set();
        }
    };

    // set up the resolver loops
    for (int loop = 0; loop < numResolverLoops; loop++)
    {
        // add the resolver loop 'i' and set the done flag when the
        // last of them terminates
        resolverLoops.Add(
            new IPAddressResolverLoop(
                logDataQueue, loop,
                loopTerminationCallback, null));
    }

Next we’re kicking off the resolver loops – we’ll look at these in detail a bit later. We’ve got a ManualResetEvent lock object that guards the program’s exit until all these loops have completed and we’re going to set that to signaled once the last loop completes – that’s what the loopTerminationCallback anonymous method is for.  We’re registering the method with each of the loops and as they complete the method gets called and the last call sets the event. Each loop gets a reference to the logDataQueue from where it gets its work.

    // set up the acquisition loop; the loop auto-starts
    using (LogDataAcquisitionLoop acquisitionLoop =
        new LogDataAcquisitionLoop(logDataQueue, maxItemsInQueue))
    {
        // hang main thread waiting for ENTER
        Console.ReadLine();
        Console.WriteLine("*** Shutdown initiated.");
    }
Finally we’re starting the acquisition loop that gets the data from the log file. The loop gets a reference to the logDataQueue where it places the acquired items and it’s passed the maxItemsInQueue quota that governs how many items may be read ahead into the queue. Once the user presses the ENTER key, the acquisition loop object is disposed by ways of exiting the using scope, which stops the loop.

    // shut down the queue; the resolvers will auto-close
    // as the queue drains. We don't need to close them here.
    logDataQueue.Shutdown();

    // wait for all work to complete
    shutdownCompleteEvent.WaitOne();
}

Lastly, the queue is shut down (by fittingly calling Shutdown). Shutdown closes the queue (all further enqueue operations are absorbed) and causes all pending readers for which no more entries are available on the queue to unblock immediately  and return null. The resolver loops will complete their respective jobs and will terminate whenever they dequeue null from the queue. As they terminate, they call the registered termination callback (loopTerminationCallback from above) and that will eventually cause shutdownCompletedEvent to become signaled as discussed above.

The log-reader simulator isn’t particularly interesting for this sample, even though one of the goodies is that the simulation executes on an I/O completion port instead of a managed thread-pool thread – that’s another blog post. The two methods of interest are Begin/EndGetLogData – all that’s of interest here is that EndGetLogData returns an IPAddress that’s assumed to be parsed out of a log.

class IPAddressLogReaderSimulator
{
    public IAsyncResult BeginGetLogData(AsyncCallback callback, object data);
    public IPAddress EndGetLogData(IAsyncResult result);
}

The simulator is used internally by the LogDataAcquisitionLoop class – which we’ll drill into because it implements the throttling mechanism on the queue.

class LogDataAcquisitionLoop : IDisposable
{
    readonly IPAddressLogReaderSimulator ipAddressLogReaderSimulator;
    readonly InputQueue<IPAddress> logDataQueue;
    int maxItemsInQueue;
    int readingSuspended;
    bool shuttingDown;

    public LogDataAcquisitionLoop(InputQueue<IPAddress> logDataQueue, int maxItemsInQueue)
    {
        this.logDataQueue = logDataQueue;
        this.maxItemsInQueue = maxItemsInQueue;
        this.shuttingDown = false;
        this.ipAddressLogReaderSimulator = new IPAddressLogReaderSimulator();
        this.ipAddressLogReaderSimulator.BeginGetLogData(this.LogDataAcquired, null);
    }

The constructor sets up the shared state of the loop and kicks off the first read operation on the simulator. Once BeginGetLogData has acquired the first IPAddress (which will happen very quickly), the LogDataAcquired callback method will be invoked.

    void LogDataAcquired(IAsyncResult result)
    {
        IPAddress address = this.ipAddressLogReaderSimulator.EndGetLogData(result);
        Console.WriteLine("-- added {0}", address);
        this.logDataQueue.EnqueueAndDispatch(address, this.LogDataItemDequeued);
        if (!this.shuttingDown && this.logDataQueue.PendingCount < this.maxItemsInQueue)
        {
            this.ipAddressLogReaderSimulator.BeginGetLogData(this.LogDataAcquired, null);
        }
        else
        {
            // the queue will be at the defined capacity, thus abandon
            // the read loop - it'll be picked up by LogDataItemDequeued
            // as the queue pressure eases
            Interlocked.Exchange(ref this.readingSuspended, 1);
            Console.WriteLine("-- suspended reads");
        }
    }

The callback method gets the IPAddress and puts it into the queue – using the InputQueue<T>.EnqueueAndDispatch(T, Action) method. There are two aspects that are quite special about that method when compared to the regular Queue<T>.Enqueue(T) method. First, it does take a callback as the second argument alongside the item to be enqueued; second, the method name isn’t just Enqueue, it also says Dispatch.

When EnqueueAndDispatch() is called, the item and the callback get put into an internal item queue – that’s the ‘enqueue’ part. As we will see in context a bit later in this post, the ‘dequeue’ operation on the queue is the BeginDequeue/EndDequeue asynchronous method call pair. There can be any number of concurrent BeginDequeue requests pending on the queue. ‘Pending’ means that the calls – or rather their async callbacks and async state – are registered in another queue internal to InputQueue<T> that preserves the call order. Thus, BeginDequeue only puts the async callback and async state into that queue and returns. There is no thread spun or hung. That’s all it does.

As things go, the best opportunity to service a pending dequeue operation on a queue is when an item is being enqueued. Consequently, EnqueueAndDispatch() will first put the item into the internal queue and will then look whether there are registered waiters and/or readers – waiters are registered by ‘(Begin-)WaitForItem’, readers are registered by ‘(Begin-)Dequeue’. Since it’s known that there is a new item in the queue now, the operation will iterate over all waiters and complete them – it does so by invoking their async callbacks, effectively lending the enqueue operation’s thread to the waiters. If there’s at least one pending reader, it’ll then pop a message from the head of the internal item queue and call the reader’s async callback, lending the enqueue operation’s thread to the processing of the dequeue operation. If that just made your head spin – yes, the item may have been dequeued and processed by the time EnqueueAndDispatch returns.

There is an overload of EnqueueAndDispatch() that takes an extra boolean parameter letting you cause the dispatch operation to happen on a different thread, and there is also an EnqueueWithoutDispatch() method that skips dispatching, as well as a standalone Dispatch() method.

The callback supplied to EnqueueAndDispatch(), here the LogDataItemDequeued method, is an Action delegate. The queue will call this callback as the item is being dequeued – more precisely, when the item has been removed from the internal item queue, but just before it is returned to the caller. That turns out to be quite handy. If you take another look at the LogDataAcquired method, you’ll notice that we’ve got two alternate code paths after EnqueueAndDispatch(). The first branch is taken when the queue has not reached capacity and isn’t shutting down. When that’s so, we schedule getting the next log item – otherwise we don’t. Instead, we set the readingSuspended flag and quit, effectively terminating and abandoning the loop. So how does that get restarted when the queue is no longer at capacity? The LogDataItemDequeued callback!

    void LogDataItemDequeued()
    {
        // called whenever an item is dequeued. First we check
        // whether the queue is no longer full after this
        // operation and then we check whether we need to resume
        // the read loop.
        if (!this.shuttingDown &&
            this.logDataQueue.PendingCount < this.maxItemsInQueue &&
            Interlocked.CompareExchange(ref this.readingSuspended, 0, 1) == 1)
        {
            Console.WriteLine("-- resuming reads");
            this.ipAddressLogReaderSimulator.BeginGetLogData(this.LogDataAcquired, null);
        }
    }
The callback gets called for each item that gets dequeued, which means that we get an opportunity to restart the loop when it has been stalled because the queue reached capacity. So we check here whether the queue isn’t shutting down and is below capacity, and if that’s so and the readingSuspended flag is set, we restart the read loop. And that’s how the throttle works.

So now we’ve got the data from the log in the queue and we’re throttling nicely so that we don’t pull too much data into memory. How about taking a look at the DNS resolver loops that process the data?

class IPAddressResolverLoop : IDisposable
{
    readonly InputQueue<IPAddress> logDataQueue;
    readonly int loop;
    readonly WaitCallback loopCompleted;
    readonly object state;
    bool shutdown;

    public IPAddressResolverLoop(InputQueue<IPAddress> logDataQueue, int loop, WaitCallback loopCompleted, object state)
    {
        this.logDataQueue = logDataQueue;
        this.loop = loop;
        this.loopCompleted = loopCompleted;
        this.state = state;
        this.logDataQueue.BeginDequeue(TimeSpan.MaxValue, this.IPAddressDequeued, null);
    }

This loop is also implemented as a class, and the fields hold shared state that’s initialized in the constructor. This loop also auto-starts, doing so by calling BeginDequeue on the input queue. As stated above, BeginDequeue commonly just parks the callback and returns.

    void IPAddressDequeued(IAsyncResult ar)
    {
        IPAddress address = this.logDataQueue.EndDequeue(ar);
        if (!this.shutdown && address != null)
        {
            Console.WriteLine("-- took {0}", address);
            Dns.BeginGetHostEntry(address, this.IPAddressResolved, new object[] { Stopwatch.StartNew(), address });
        }
        else
        {
            // terminate the loop and report completion
            this.loopCompleted(this.state);
        }
    }

As an IPAddress becomes available on the queue, the callback is invoked – quite likely on a thread lent by EnqueueAndDispatch() and therefore, if you trace things back, on the thread the log file generator is using to call back for completion of the BeginGetLogData method. If we get an address and the value isn’t null, we proceed to schedule the DNS lookup via Dns.BeginGetHostEntry. Otherwise we terminate the loop and call the loopCompleted callback. In Main() that’s the anonymous method that counts down the loop counter and signals the event when it falls to zero.

    void IPAddressResolved(IAsyncResult ar)
    {
        var args = (object[])ar.AsyncState;
        var stopwatch = (Stopwatch)args[0];
        var address = (IPAddress)args[1];
        double msecs = stopwatch.ElapsedMilliseconds;
        try
        {
            IPHostEntry entry = Dns.EndGetHostEntry(ar);
            Console.WriteLine("{0}: {1} {2}ms", this.loop, entry.HostName, msecs);
        }
        catch (SocketException)
        {
            // couldn't resolve. print the literal address
            Console.WriteLine("{0}: {1} {2}ms", this.loop, address, msecs);
        }
        // done with this entry, get the next
        this.logDataQueue.BeginDequeue(TimeSpan.MaxValue, this.IPAddressDequeued, null);
    }

The IPAddressResolved method just deals with the mechanics of printing out the result of the lookup and then schedules another BeginDequeue call to start the next iteration.
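For context, the Main()-side countdown mentioned above – an anonymous callback decrementing a loop counter and signaling an event at zero – can be sketched roughly like this. This is a hedged reconstruction, not the sample’s literal code; the thread-pool work items below merely stand in for resolver loops reporting completion:

```csharp
using System;
using System.Threading;

public static class LoopCountdownSketch
{
    const int LoopCount = 10;
    static int pendingLoops = LoopCount;
    public static readonly ManualResetEvent AllLoopsDone = new ManualResetEvent(false);

    // Passed to each resolver loop as the loopCompleted callback;
    // decrements the counter and signals when the last loop finishes.
    public static void LoopCompleted(object state)
    {
        if (Interlocked.Decrement(ref pendingLoops) == 0)
        {
            AllLoopsDone.Set();
        }
    }

    static void Main()
    {
        // Stand-in for the resolver loops: each one "completes" on a
        // thread-pool thread and reports back via LoopCompleted.
        for (int i = 0; i < LoopCount; i++)
        {
            ThreadPool.QueueUserWorkItem(LoopCompleted);
        }
        AllLoopsDone.WaitOne();
        Console.WriteLine("all loops completed");
    }
}
```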

Summary: The enabler for and the core piece of this scenario’s implementation is InputQueue&lt;T&gt;. The dequeue callback makes it possible to implement throttling effectively, and the dispatch logic provides an efficient way to use threads in applications built on asynchronous programming patterns, especially in I/O-driven situations as illustrated here.

And last but not least – here’s teh codez; the project file is for VS2010. For VS2008, throw the files into a new console app and mark the project to allow unsafe code (needed for the I/O completion thread pool code).

UsingInputQueue.zip (13.85 KB) 

or if you'd rather have a version of InputQueue that is using the regular thread pool, download the WCF samples and look for InputQueue.cs.

[The sample code posted here is subject to the Windows SDK sample code license]

Categories: Architecture | CLR | WCF

I put the slides for my talks at NT Konferenca 2010 on SkyDrive. The major difference from my APAC slides is that I had to put compute and storage into one deck due to the conference schedule, but instead of purely consolidating and cutting down the slide count, I also incorporated some common patterns that came out of debates in Asia and added slides on predictable and dynamic scaling as well as on multitenancy. Sadly, I need to rush through all that in 45 minutes today.


Categories: AppFabric | Architecture | Azure | Talks | Technology | Web Services

I'm on a tour through several countries right now and I'm talking to ISVs about the Windows Azure platform, its capabilities and the many opportunities ISVs have to transform the way they do business by moving to the cloud. The first day of the events is an introduction to the platform at the capability level; it's not a coding class, that would be impossible to fit.

I've shared the slides on SkyDrive. Steal liberally if you find the material useful.


Categories: AppFabric | Architecture | Azure | Talks

Anyone using the .NET Service Bus should take a good look at the SocketShifter project started by Rob Blackwell and Richard Prodger from AWS in the UK. AWS stands for Active Web Solutions, not for the "other" AWS. The full project is up on Codeplex.

What makes SocketShifter significant is that it takes the network abstraction of SOAP, WS-Addressing, and the Service Bus full circle and layers the very bottom of that stack - plain TCP connections - as a virtualization on top of the stack. In other words: SocketShifter allows you to create full-fidelity, bi-directional socket connections through the .NET Service Bus.

We created something very similar to SocketShifter last year (we're using it for a few internal purposes), but haven't made it public so far. I'm glad that the AWS folks built this, so that you get to play with it.

Categories: .NET Services | Architecture | Technology

.NET Service Bus Reverse Web Proxy: Click here to download the source


Using the application/service built from the sample linked at the top of this post, you can host a publicly discoverable and accessible website or Web service from your Windows notebook or desktop machine from within most network environments, without having to open up a port on the firewall, map a port on your NAT, or use some type of dynamic DNS service to make the site discoverable. All those essential connectivity features are provided by the .NET Service Bus with the help of the included sample code.

I’m intentionally not bundling this up as a conveniently installable binary along with a nice configuration UI – that’s not my role here. If you want to grab the code and make it part of a cool personal media sharing app, provide external access to a departmental enterprise app, put a prototype out there for a client to play with, host a web service you want to show off, or provide an installable version with a nice configuration UI – go ahead.

The attached sample application/service has two key capabilities that I’ve repeatedly been asked for:

a) It is a reverse web proxy that can run either as a console application or as a Windows (NT-) service. The reverse web proxy can sit in front of any web server and forward requests to it. I’ve tested this only with IIS as the backend, but I don’t see a reason why this shouldn’t work with Apache or the Web Server built into some J2EE application server.

b) It is a scripting policy host that projects the crossdomain.xml and clientaccesspolicy.xml files required by Adobe Flash and Microsoft Silverlight into the root of a .NET Services namespace, permitting cross-domain script access from Flash and Silverlight for all endpoints hosted within the namespace. You can easily adjust the code in the sample to restrict access to particular resources within the namespace.
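For reference, wide-open versions of those two policy files look roughly like this; the sample may project different, more restrictive content, so treat these as illustrative of the well-known file formats rather than as the sample’s output:

```xml
<?xml version="1.0"?>
<!-- crossdomain.xml: Adobe Flash cross-domain policy, served from the namespace root -->
<cross-domain-policy>
  <allow-access-from domain="*" />
</cross-domain-policy>
```

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- clientaccesspolicy.xml: the Silverlight equivalent -->
<access-policy>
  <cross-domain-access>
    <policy>
      <allow-from http-request-headers="*">
        <domain uri="*" />
      </allow-from>
      <grant-to>
        <resource path="/" include-subpaths="true" />
      </grant-to>
    </policy>
  </cross-domain-access>
</access-policy>
```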

The fundamental architecture is illustrated in the picture. The web application that you want to project out to the public internet sits on some web server on your machine. “Your machine” may be a desktop machine at home or at work or a notebook in a hotel lobby or an airport on WiFi. As long as you’ve got line-of-sight to the .NET Service Bus and the TCP ports 828 and 818 are available for outbound traffic, you’re good. The reverse web proxy app will map any local HTTP server to a name in the .NET Service Bus and forward the traffic between the .NET Service Bus and the HTTP server. The client (any web browser, but also any HTTP Web Service client) will talk to the .NET Service Bus at the given name, the traffic flows to the reverse proxy on your machine and from there to the HTTP server.

I’m hosting (for a few days) a sample dasBlog site instance at http://clemensv6.servicebus.windows.net/dasblog/. The hosting machine for that blog is one of my personal machines. It’s got a local network address assigned by DHCP, it’s not listed in any NAT mappings, and its local firewall isn’t even open for inbound HTTP traffic.

How to install, build, and run

As a prerequisite you will need three things:

  1. Visual Studio 2008 SP1 with the .NET Framework 3.5 SP1.
  2. A .NET Services project account. The quickest route is to go to http://portal.ex.azure.microsoft.com and click “Sign up”. The approval/provisioning is pretty much instantaneous (plus 20 seconds for the provisioning to run through) once you provide your Windows Live ID. No more access codes.
  3. The .NET Services SDK for the March 2009 CTP. Click here to get it.

Unpack the files, and open ServiceBusReverseWebProxy.sln with Visual Studio 2008. In the ServiceBusReverseWebProxy project, find the app.config file and open it. Here’s where you need to put your project name and password and where you map your sites:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
    <configSections>
        <section name="reverseWebProxy"
                 type="Microsoft.Samples.ServiceBusReverseWebProxy.ReverseWebProxySection, ServiceBusReverseWebProxy" />
    </configSections>
    <!-- Add your .NET Services project account information here -->
    <!-- Create a project at http://portal.ex.azure.microsoft.com/ -->
    <reverseWebProxy netServicesProjectName="!!myproject!!" netServicesProjectPassword="!!mypassword!!"
                     enableSilverlightPolicy="true">
        <pathMappings>
            <add namespacePath="mysite" localUri="http://localhost/mysite/" />
        </pathMappings>
    </reverseWebProxy>
</configuration>

Put your .NET Services project/solution name into the netServicesProjectName and the password into netServicesProjectPassword.

Then pick a local HTTP server or site and give it a name in your .NET Service Bus namespace. That mapping is done in the <pathMappings> section. There are a few things that are important to note here:

  1. If your project name were ‘clemensv6’ and you map some local URI to the namespacePath ‘dasBlog’, the resulting .NET Service Bus URI would be http://clemensv6.servicebus.windows.net/dasblog/.
  2. The web application should only emit relative paths for links or, otherwise, should have a way to specify the external host address for links. That means the web application needs to be able to deal with the presence of a reverse proxy. There is no content-level URL rewriter in this sample that would make corrections to HTML or XML that’s handed upstream. DasBlog allows you to specify the blog site address as an external address and therefore satisfies that requirement.
  3. Redirects and any other HTTP responses that emit the HTTP ‘Location’ header or any other HTTP headers where URIs are returned are rewritten to map the internal view to the external view.
  4. If you set enableSilverlightPolicy to true, crossdomain.xml and clientaccesspolicy.xml endpoints will be projected into the root of your project’s namespace, i.e. http://clemensv6.servicebus.windows.net/crossdomain.xml.
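The header rewriting described in point 3 boils down to swapping the internal base URI for the external one while keeping the rest of the path and query intact. A small sketch of that mapping – this is not the sample’s actual code, and the names are mine:

```csharp
using System;

public static class LocationRewriter
{
    // Maps an internal (local web server) URI in a response header to
    // the external Service Bus view, the way a reverse proxy must for
    // 'Location' and other URI-valued headers.
    public static string RewriteHeader(string headerValue, Uri localBase, Uri externalBase)
    {
        Uri location;
        if (Uri.TryCreate(headerValue, UriKind.Absolute, out location) &&
            localBase.IsBaseOf(location))
        {
            // Swap the local base for the external base, keeping the
            // remainder of the path and query intact.
            string relative = localBase.MakeRelativeUri(location).ToString();
            return new Uri(externalBase, relative).ToString();
        }
        return headerValue; // not an internal URI; pass through unchanged
    }

    static void Main()
    {
        var local = new Uri("http://localhost/mysite/");
        var external = new Uri("http://clemensv6.servicebus.windows.net/mysite/");
        Console.WriteLine(RewriteHeader("http://localhost/mysite/posts/42", local, external));
    }
}
```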

Build. Run.

By default, the ServiceBusReverseWebProxy.exe application will simply run as a console application. If you use installutil -i ServiceBusReverseWebProxy.exe, the application will be installed as a Windows Service. The default identity that it is installed under is ‘NETWORK SERVICE’. In restricted networks with constrained IPSec policies (such as the Microsoft corporate network), you may have to use a user account instead. You may also have to use some special firewall-gateway software such as the ISA Firewall Client to allow for outbound access to ports 828 and 818.

The actual application code isn’t really all that complicated. The ‘beef’ is in ReverseWebProxy.cs. What might be surprising here is that this class doesn’t use the WCF Service Model, but is using naked WCF channels for the upstream traffic to .NET Services and it’s using HttpWebRequest for the downstream traffic to the local Web Server. The reason for using channels is that the app is never doing any processing on the messages, so the channel model is the most straightforward and efficient way. The reason for using HttpWebRequest is that you can’t suppress auto-redirects on a WCF HTTP client. Since the stack needs to be completely transparent to redirects so that it’s the browser client up on top that gets redirected instead of someone on the way, I simply couldn’t use a WCF channel downstream. Seems to be one of these edge cases that the WCF team downstairs didn’t think anyone would ever need.
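To make the redirect point concrete, here’s a tiny sketch of the downstream request setup. The helper name is mine, but AllowAutoRedirect is the actual HttpWebRequest property that the WCF HTTP client doesn’t let you control:

```csharp
using System;
using System.Net;

public static class TransparentRedirectDemo
{
    // Builds a downstream request that will NOT follow redirects on its
    // own, so a 3xx response travels back through the proxy to the
    // browser untouched.
    public static HttpWebRequest CreateDownstreamRequest(Uri localUri)
    {
        var request = (HttpWebRequest)WebRequest.Create(localUri);
        request.AllowAutoRedirect = false; // stay transparent to redirects
        return request;
    }

    static void Main()
    {
        HttpWebRequest request = CreateDownstreamRequest(new Uri("http://localhost/mysite/"));
        Console.WriteLine(request.AllowAutoRedirect); // False
    }
}
```

With AllowAutoRedirect left at its default of true, the proxy itself would chase the redirect and the browser would never see the 3xx status.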

Let me know whether and how this works for you. Share the code, improve it, re-blog, let me know. @clemensv on Twitter, same name @microsoft.com for email.

Categories: .NET Services | Architecture

seht Euch mal die Wa an, wie die Wa ta kann. Auf der Mauer auf der Lauer sitzt ‘ne kleine Wa!.

It’s a German children’s song (roughly: “look at the bedb–, how the bedb– can dance! On the wall, on the lookout, sits a little bedb–!”). The song starts out with “… sitzt ‘ne kleine Wanze” (Wanze = bedbug) and with each verse you leave off a letter: Wanz, Wan, Wa, W – silence.

I’ll do the same here, but not with a bedbug:

Let’s sing:

<soap:Envelope xmlns:soap=”” xmlns:wsaddr=”” xmlns:wsrm=”” xmlns:wsu=”” xmlns:app=””>
          <wsse:Security xmlns:wsse=”…”>
               <wsse:BinarySecurityToken ValueType="
                                         EncodingType="...#Base64Binary" wsu:Id=" MyID ">
                      <ds:CanonicalizationMethod Algorithm="
                      <ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#md5"/
                      <ds:Reference URI="#MsgBody">
                            <ds:DigestMethod  Algorithm="
                    <wsse:Reference URI="#MyID"/>
         <app:Key wsu:Id=”AppKey”>27729912882….</app:Key>
    <soap:Body wsu:Id=”MyId”>
          <app:status>Hello, I’m good</app:status>

Not a very pretty song, I’ll admit. Let’s drop some stuff. Let’s assume that we don’t need to tell the other party that we’re looking to give it an MD5 signature; let’s say that’s implied, and so is the canonicalization algorithm. Let’s also assume that the other side already knows the security token and the key. Since we only have a single signature digest here and yield a single signature, we can just collapse all of it to the signature value. Heck, you may not even know what all of that means. Verse 2:

<soap:Envelope xmlns:soap=”” xmlns:wsaddr=”” xmlns:wsrm=”” xmlns:wsu=”” xmlns:app=””>
          <wsse:Security xmlns:wsse=”…”>
         <app:Key wsu:Id=”AppKey”>27729912882….</app:Key>
    <soap:Body wsu:Id=”MyId”>
          <app:status>Hello, I’m good</app:status>

Better. Now let’s strip all these extra XML namespace decorations since there aren’t any name collisions as far as I can see. We’ll also collapse the rest of the security elements into one element since there’s no need for three levels of nesting with a single signature. Verse 3:

       <status>Hello, I’m good</status>

Much better. The whole angle-bracket stuff and the nesting seems semi-gratuitous and repetitive here, too. Let’s make that a bit simpler. Verse 4:

         status=Hello, I’m good

Much, much better. Now let’s get rid of that weird URI up there, split up the action and the version info, make some of these keys a little more terse, and turn that into a format that’s easily transmittable over HTTP. For what we have here, application/x-www-form-urlencoded would probably be best. Verse 5:


Oops. Facebook’s Status.set API. How did that happen? I thought that was REST?

Now play the song backwards. The “new thing” is largely analogous to where we started before the WS-* Web Services stack and its CORBA/DCE/DCOM predecessors came around, and there are, believe it or not, good reasons for having that additional “overhead”: a common way to frame message content and the related control data, a common way to express complex data structures and distinguish between data domains, a common way to deal with addressing in multi-hop or store-and-forward messaging scenarios, an agreed notion of sessions and message sequencing, and a solid mechanism for protecting the integrity of messages and parts of messages. This isn’t all just stupid.

It’s well worth discussing whether messages need to be expressed as XML 1.0 text on the wire at all times. I don’t think they need to and there are alternatives that aren’t as heavy. JSON is fine and encodings like the .NET Binary Encoding or Fast Infoset are viable alternatives as well. It’s also well worth discussing whether WS-Security and the myriad of related standards that were clearly built by security geniuses for security geniuses really need to be that complicated or whether we could all live with a handful of simple profiles and just cut out 80% of the options and knobs and parameters in that land.

I find it very sad that the discussion isn’t happening. Instead, people use the “REST” moniker as the escape hatch to conveniently ignore any existing open standard for tunnel-through-HTTP messaging and completely avoid the discussion.

It’s not only sad, it’s actually a bit frustrating. As one of the people responsible for the protocol surface of the .NET Service Bus, I am absolutely not at liberty to ignore what exists in the standards space. And this isn’t a mandate handed down to me, but something I do because I believe it’s the right thing to live with the constraints of the standards frameworks that exist.

When we sit down and talk about a REST API, we’re designing a set of resources – which may result in splitting a thing like a queue into two resources, head and tail – and then we put RFC 2616 on the table and try to be very precise in picking the appropriate predefined HTTP method for a given semantic and in how the HTTP 2xx, 3xx, 4xx, and 5xx status codes map to success and error conditions. We’re also trying to avoid inventing new ways to express things for which standards exist. There’s a standard for how to express and manage lists with ATOM and APP, and hence we use that as a foundation. We use the designed extension points to add data to those lists whenever necessary.

When we’re designing an RPC SOAP API, we’re intentionally trying to avoid inventing new protocol surface and will try to leverage as much from the existing and standardized stack as we possibly can – at a minimum we’ll stick with established patterns such as the Create/GetInfo/Renew/Delete pattern for endpoint factories with renewal (which is used in several standards). I’ll add that we are – ironically – a bit backlogged on the protocol documentation for our SOAP endpoints and have more info on the REST endpoints in the latest SDK, but we’ll catch up on that in the near future.
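To illustrate the Create/GetInfo/Renew/Delete lease pattern in isolation, here’s an in-memory toy implementation. All names and signatures are made up for the sketch; they are not the actual .NET Service Bus contracts:

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for an endpoint factory with renewal: endpoints
// are created with a finite lifetime and must be renewed to stay alive.
public class LeasedEndpointFactory
{
    readonly Dictionary<string, DateTime> leases = new Dictionary<string, DateTime>();

    // Create: provision an endpoint with a finite lifetime, return its id.
    public string Create(TimeSpan lifetime)
    {
        string id = Guid.NewGuid().ToString("N");
        leases[id] = DateTime.UtcNow + lifetime;
        return id;
    }

    // GetInfo: report the current expiration of the lease.
    public DateTime GetInfo(string id)
    {
        return leases[id];
    }

    // Renew: extend the lease before it expires.
    public void Renew(string id, TimeSpan extension)
    {
        leases[id] = leases[id] + extension;
    }

    // Delete: tear the endpoint down explicitly.
    public bool Delete(string id)
    {
        return leases.Remove(id);
    }
}

public static class LeaseDemo
{
    static void Main()
    {
        var factory = new LeasedEndpointFactory();
        string id = factory.Create(TimeSpan.FromMinutes(5));
        DateTime before = factory.GetInfo(id);
        factory.Renew(id, TimeSpan.FromMinutes(5));
        Console.WriteLine(factory.GetInfo(id) > before); // True
        factory.Delete(id);
    }
}
```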

So - can I build “REST” (mind the quotes) protocols that are as reduced as those of Facebook, Twitter, Flickr, etc.? Absolutely. There wouldn’t be much new work. It’s just a matter of how we put messages on and pluck messages off the wire. It’s really mostly a matter of formatting, and we have a lot of the necessary building blocks in the shipping WCF bits today. I would just omit a bunch of decoration as things go out and make a bunch of assumptions about things that come in.

I just have a sense that I’d be hung upside down from a tree by the press and the blogging, twittering, facebooking community if I, as someone at Microsoft, didn’t follow the existing open and agreed standards or at least use protocols that we’ve published under the OSP, and instead just started doing my own interpretative dance - even if that looked strikingly similar to what the folks down in the Valley are doing. At the very least, someone would call it a rip-off.

What do you think? What should I/we do?

Categories: .NET Services | Architecture | Azure | Technology | ISB | Web Services

Didn't I write that I wanted to blog more this year? It's June; you see what came out of that.

First things first: I'm flying to Orlando tomorrow for TechEd. Looking back at what my conference schedule looked like up until two years ago, it's hard to believe that this is my first (!) scheduled conference talk this year. I actually do miss life on the road a little bit. The compensation for it is that I get to see my family every day (my daughter Eva's first birthday is coming up on June 25th) and that I'm getting to work on and define the stuff that I 'just' used to be talking about. This really is the first time that I do a talk about a Microsoft technology that I own; so that's a bit of a thing:

SOA 403 Building Federated Solutions on the Internet Service Bus
Thursday, June 5, 2008 10:15AM-11:30AM
Room: S220 C (DEV)

'Own' means here that I'm the responsible Program Manager for the entire 'Messaging' feature area of BizTalk Services in what we call the '.NET Online Services' team around here. The PM title isn't entirely accurate, because I'm also writing pretty substantial amounts of product code these days. The ability to write and contribute code into the product was the primary reason why I switched jobs and joined the team I'm now in, but it turned out that the PM role was the overall better fit for me. So I'm 60% PM and 40% Dev. Or something like that.

Back to TechEd. There are two talks about what we're building. The first one is 'today' (I'm still on Pacific Time, so I realize that may be a bit late); Justin Smith will provide a broad overview of the services we're building:

SOA206 Messaging, Identity, and Workflow in the Cloud
Tuesday, June 3 10:30 AM - 11:45 AM
Room: S220 C  

The second talk is mine (above) and as you might be able to tell by the '400' classification, I've got the clear intent not to spend too much time in PowerPoint. I am going to show four common architectural issues and ways to deal with them using the cloud platform. And I'm going to show you the code for it. I also plan (we'll see how that part goes with the on-site network) to host an app for 'crowd participation' so that I'm explicitly not going to ask you to turn your laptops off. Since the BizTalk Services SDK hasn't spread very broadly yet, I'll base the majority of the demos on the SDK samples so that you can easily repro the stuff that I show you.

Now ... you say ... "BizTalk Services? I don't have anything to do with BizTalk! Do you want to sell me BizTalk Server?"

Well, it's always nice if customers decide to pick up some BizTalk Server licenses, but: No, I don't. Our stuff does actually compose with BizTalk Server 2006 R2 through the WCF Adapter, but the way to think about this code-name is that 'BizTalk' just happens to be the brand that our division has been using for Messaging. There was the BizTalk Framework, there is BizTalk Server, and now we've got BizTalk Services. It's a brand. And we're actually finding that that name isn't really a perfect fit for what we're doing; customers suggest the same. So there'll be a different name. I'm guessing we're going to talk about that new name and some other cards we hold in our hands at or around PDC.

The stuff that I own in the 'Cloud' Messaging area is Naming, Service Registry, Connectivity/NAT Traversal, Relay, Eventing, a bunch of internal, service-side infrastructure supporting those feature areas, and some feature areas that we'll talk about more at PDC. So the fun part of TechEd for me (and you) is that the 'feedback opportunity' is pretty immediate. We're updating the services (just about) every quarter and I'll probably check in my last set of stuff for the current release cycle from Orlando or the night I get back here. From there I'm switching into planning mode for the next release (aligned with PDC) and if you bring good ideas that we can fit into the next cycle, I'm very inclined to take them. Not that we'd have any shortage of feature ideas, mind you. More is better.

If you are in Orlando ... I'll have booth duty at the WCF booth in the Exhibition Hall (or whatever they call it this year) both Wednesday and Thursday from 2:30PM to closing, so come see me there, come to see my talk, or just grab me at the Attendee Party if you can recognize me. ;-)

If you are not: http://labs.biztalk.net
Categories: Architecture | TechEd US | ISB

April 3, 2008
@ 06:10 AM

Earlier today I hopefully gave a somewhat reasonable, simple answer to the question "What is a Claim?" Let's try the same with "Token":

In the WS-* security world, "Token" is really just another name the security geniuses decided to use for "handy package for all sorts of security stuff". The most popular type of token is the SAML (just say "samel") token. If the ladies and gentlemen designing and writing security platform infrastructure and frameworks are doing a good job, you might want to know about the existence of such a thing, but can otherwise be blissfully ignorant of all the gory details.

Tokens are meant to be a thing that you need to know about in much the same way you need to know about ... ummm... rebate coupons you can cut out of your local newspaper or all those funny books that you get in the mail. I have really no idea how the accounting works behind the scenes between the manufacturers and the stores, but it really doesn't interest me much, either. What matters to me is that we get $4 off that jumbo pack of diapers and we go through a lot of those these days with a 9 month old baby here at home. We cut out the coupon, present it at the store, four bucks saved. Works for me.

A token is the same kind of deal. You go to some (security) service, get a token, and present that token to some other service. The other service takes a good look at the token and figures whether it 'trusts' the token issuer and might then do some further inspection; if all is well you get four bucks off. Or you get to do the thing you want to do at the service. The latter is more likely, but I liked the idea for a moment.

Remember when I mentioned the surprising fact that people lie from time to time when I wrote about claims? Well, that's where tokens come in. The security stuff in a token is there to keep people honest and to make 'assertions' about claims. The security dudes and dudettes will say "Err, that's not the whole story", but for me it's good enough. It's actually pretty common (that'll be their objection) that there are tokens that don't carry any claims and where the security service effectively says "whoever brings this token is a fine person; they are ok to get in". It's like having a really close buddy relationship with the boss of the nightclub when you are having troubles with the monsters guarding the door. I'm getting a bit ahead of myself here, though.

In the post about claims I claimed that "I am authorized to approve corporate acquisitions with a transaction volume of up to $5Bln". That's a pretty obvious lie. If there was such a thing as a one-click shopping button for companies on some Microsoft Intranet site (there isn't, don't get any ideas) and I were to push it, I surely should not be authorized to execute the transaction. The imaginary "just one click and you own Xigg" button would surely have some sort of authorization mechanism on it.

I don't know what Xigg is assumed to be worth these days, but there is actually a second authorization gate to check. I might indeed be authorized to do one-click shopping for corporate acquisitions, but even with my made-up $5Bln limit claim, Xigg may just be worth more than I'm claiming I'm authorized to approve. I digress.

How would the one-click-merger-approval service be secured? It would expect some sort of token that absolutely, positively asserts that my claim "I am authorized to approve corporate acquisitions with a transaction volume of up to $5Bln" is truthful, and the one-click-merger-approval service would have to absolutely trust the security service that is making that assertion. The resulting token that I'm getting from the security service would contain the claim as an attribute of the assertion, and that assertion would be signed and encrypted in mysterious (to me) yet very secure and interoperable ways, so that I can't tamper with it no matter how closely I look at the token while holding it in my hands.

The service receiving the token is the only one able to crack the token (I'll get to that point in a later post) and look at its internals and the asserted attributes. So what if I were indeed authorized to spend a bit of Microsoft's reserves and I were trying to acquire Xigg at the touch of a button and, for some reason I wouldn't understand, the valuation were outside my acquisition limit? That's the service's job. It'd look at my claim, understand that I can't spend more than $5Bln and say "nope!" - and it would likely send email to SteveB under the covers. Trouble.

Bottom line: For a client application, a token is a collection of opaque (and mysterious) security stuff. The token may contain an assertion (saying "yep, that's actually true") about a claim or a set of claims that I am making. I shouldn't have to care about the further details unless I'm writing a service and I'm interested in some deeper inspection of the claims that have been asserted. I will get to that.

Before that, I notice that I talked quite a bit about some sort of "security service" here. Next post...

Categories: Architecture | SOA | CardSpace | WCF | Web Services

April 2, 2008
@ 08:20 PM

If you ask any search engine "What is a Claim?" and you mean the sort of claim used in the WS-* security space, you'll likely find an answer somewhere, but that answer is just as likely buried in a sea of complex terminology that is only really comprehensible if you have already wrapped your head around the details of the WS-* security model. I would have thought that by now there would be a simple and not too technical explanation of the concept that's easy to find on the Web, but I haven't really had success finding one. 

So "What is a Claim?" It's really simple.

A claim is just a simple statement like "I am Clemens Vasters", or "I am over 21 years of age", or "I am a Microsoft employee", or "I work in the Connected Systems Division", or "I am authorized to approve corporate acquisitions with a transaction volume of up to $5Bln". A claim set is just a bundle of such claims.

When I walk up to a service with some client program and want to do something on the service that requires authorization, the client program sends a claim set along with the request. For the client to know what claims to send along, the service lets it know about its requirements in its policy.

When a request comes in, this imaginary (U.S.) service looks at the request knowing "I'm a service for an online game promoting alcoholic beverages!". It then looks at the claim set, finds the "I am over 21 years of age" claim, and thinks "Alright, I think we've got that covered".

The service didn't really care who was trying to get at the service. And it shouldn't. To cover the liquor company's legal behind, they only need to know that you are over 21. They don't really need to know (and you probably don't want them to know) who is talking to them. From the client's perspective that's a good thing, because the client is now in a position to refuse giving out (m)any clues about the user's identity and only provide the exact data needed to pass the authorization gate. Mind that the claim isn't the date of birth for that exact reason. The claim just says "over 21".
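The authorization decision in this little story can be sketched in a few lines. The claim-type URI and the object model below are entirely made up for illustration; they have nothing to do with the actual WS-* token formats:

```csharp
using System;
using System.Collections.Generic;

public static class AgeGateDemo
{
    // A claim modeled as a simple (type, value) statement; a claim set
    // is just a bundle of these. Deliberately simplified -- not the
    // real WS-* object model, and the claim-type URI is invented.
    public struct Claim
    {
        public readonly string Type;
        public readonly string Value;
        public Claim(string type, string value) { Type = type; Value = value; }
    }

    // The imaginary liquor-promotion service: it never asks who the
    // caller is, only whether the "over 21" claim is present.
    public static bool AuthorizeRequest(IEnumerable<Claim> claimSet)
    {
        foreach (Claim claim in claimSet)
        {
            if (claim.Type == "http://example.org/claims/over21" && claim.Value == "true")
            {
                return true;
            }
        }
        return false;
    }

    static void Main()
    {
        var claims = new List<Claim> { new Claim("http://example.org/claims/over21", "true") };
        Console.WriteLine(AuthorizeRequest(claims)); // True
    }
}
```

Note that the claim set carries no name and no date of birth; the gate opens on the presence of the single asserted statement it cares about.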

Providing control over what claims are being sent to a service (I'm lumping websites, SOAP, and REST services all in the same bucket here) is one of the key reasons why Windows CardSpace exists, by the way. The service asks for a set of claims, you get to see what is being asked for, and it's ultimately your personal, interactive decision to provide or refuse to provide that information.

The only problem with relying on simple statements (claims) of that sort is that people lie. When you go to the Jack Daniel's website, you are asked to enter your date of birth before you can proceed. In reality, you can enter any date you like, and a 10-year-old kid is easily smart enough to figure that out.

All that complex security stuff is mostly there to keep people honest. Next time ...

Categories: Architecture | SOA | CardSpace | WCF | Web Services

A flock of pigs has been doing aerobatics high up over Microsoft Campus in Redmond in the past three weeks. Neither City of Redmond nor Microsoft spokespeople returned calls requesting comments in time for this article. A Microsoft worker who requested anonymity and has seen the pigs flying overhead commented that "they are as good as the Blue Angels at Seafair, just funnier" and "they seem to circle over building 42 a lot, but I wouldn't know why".

In related news ...

We wrapped up the BizTalk Services "R11" CTP this last Thursday and put the latest SDK release up on http://labs.biztalk.net/. As you may or may not know, "BizTalk Services" is the codename for Microsoft's cloud-based Identity and Connectivity services - with a significant set of further services in the pipeline. The R11 release is a major milestone for the data center side of BizTalk Services, but we've also added several new client-facing features, especially on the Identity services. You can now authenticate using a certificate in addition to username and CardSpace authentication, we have enabled support for 3rd party managed CardSpace cards, and there is extended support for claims based authorization.

Now the surprising bit:

Only about an hour before we locked down the SDK on Thursday, we checked a sample into the samples tree that has a rather unusual set of prerequisites for something coming out of Microsoft:

Runtime: Java EE 5 on Sun Glassfish v2 + Sun WSIT/Metro (JAX-WS extensions), Tool: Netbeans 6.0 IDE.

The sample shows how to use the BizTalk Services Identity Security Token Service (STS) to secure the communication between a Java client and a Java service providing federated authentication and claims-based authorization.

The sample, which you can find in ./Samples/OtherPlatforms/StandaloneAccessControl/JavaEE5 once you've installed the SDK, is a pure Java sample not requiring any of our bits on either the service or client side. The interaction with our services happens purely on the wire.

If you are a "Javahead", it might seem odd that we're shipping this sample inside a Windows-only MSI installer and I will agree that that's odd. It's simply a function of timing and the point in time when we knew that we could get it done (some more on that below). For the next BizTalk Services SDK release I expect there to be an additional .jar file for the Java samples.

It's important to note that this isn't a one-time thing we did just because we could. We have done a significant amount of work on the backend protocol implementations to start opening up a very broad set of scenarios on the BizTalk Services Connectivity services for platforms other than .NET. We already have a set of additional Java EE samples lined up for when we enable that functionality on the backend. However, since getting security and identity working is a prerequisite for making all other services work, that's where we started. There'll be more, and there'll be more platform and language choice than Java down the road.

Just to be perfectly clear: Around here we strongly believe that .NET and the Windows Communication Foundation in particular is the most advanced platform to build services, irrespective of whether they are of the WS-* or REST variety. If you care about my personal opinion, I'll say that several months of research into the capabilities of other platforms has only reaffirmed that belief for me and I don't even need to put a Microsoft hat on to say that.

But we recognize and respect that there are a great variety of individual reasons why people might not be using .NET and WCF. The obvious one is "platform". If you run on Linux or Unix and/or if your deployment target is a Java Application Server, then your platform is very likely not .NET. It's something else. If that's your world, we still think that our services are something that's useful for your applications and we want to show you why. And it is absolutely not enough for us to say "here is the wire protocol documentation; go party!". Only Code is Truth.

I'm writing "Only Code is Truth" also because we've found - perhaps not too surprisingly - that there is a significant difference between reading and implementing the WS-* specs and having things actually work. And here I get to the point where a round of public "Thank You" is due:

The Metro team over at Sun Microsystems has made a very significant contribution to making this all work. Before we started making changes to accommodate Java, there would have been very little hope for anyone to get this seemingly simple scenario to work. We had to make quite a few changes even though our service did follow the specs.

While we were adjusting our backend STS accordingly, the Sun Metro team worked on a set of issues that we identified on their end (with fantastic turnaround times) and worked those into their public nightly builds. The Sun team also 'promoted' a nightly build of Metro 1.2 to a semi-permanent download location (the first 1.2 build that got that treatment), because it is the build tested to successfully interop with our SDK release, even though that build is known to have some regressions for some of their other test scenarios. As they work towards wrapping up their 1.2 release and fix those other bugs, we'll continue to test and talk to help ensure that the interop scenarios keep working.

As a result of this collaboration, Metro 1.2 is going to be a better and more interoperable release for Sun's customers and the greater Java community, and BizTalk Services as well as our future identity products will be better and more interoperable, too. Win-win. Thank you, Sun.

As a goodie, I put some code into the Java sample that might be useful even if you don't care about our services. Since configuring the Java certificate stores for standalone applications can be really painful, I added some simple code that uses a week-old feature of the latest Metro 1.2 bits to configure the Truststores/Keystores dynamically and pull the stores from the client's .jar at runtime. The code also has an authorization utility class that shows how to get and evaluate claims on the service side by pulling the SAML token out of the context and pulling the correct attributes from the token.

Have fun.

[By the way, this is not an April Fool's joke, in case you were wondering]

Categories: Architecture | IT Strategy | Technology | CardSpace | ISB | WCF

Even though the TechEd Europe Developer Website doesn't yet clearly say so, Steve Swartz and I will "of course!" be back with a new set of Steve & Clemens talks in Barcelona for TechEd Europe Developer (November 5-9). And for the first time we'll stay for another week and also give a talk at TechEd Europe ITForum (November 12-16) this year.

What will we talk about?

Last year we started with a history lesson, did a broad and mostly technology-agnostic overview of distributed systems architecture across four talks, and closed with a talk that speculated about the future.

This year at the TechEd Developer show, we'll be significantly more concrete and zoom in on the technologies that make up the Microsoft SOA and Business Process platform and show how things are meant to fit together. We'll talk about the rise of declarative programming and composition and how that manifests in the .NET Framework and elsewhere. And as messaging dudes we'll also talk about messaging again. At TechEd ITForum we'll talk about the end-to-end lifecycle of composite applications and how to manage it effectively.

And of course there'll be "futures". Much less handwavy futures than last year, actually.

So .... We'll be in Barcelona for TechEd. You too?

Categories: Architecture | Talks | TechEd Europe

Christian Weyer shows off the few lines of pretty straightforward WCF code & config he needed to figure out in order to set up a duplex conversation through BizTalk Services.

Categories: Architecture | SOA | BizTalk | WCF | Web Services | XML

Steve has a great analysis of what BizTalk Services means for Corzen and how he views it in the broader industry context.

Categories: Architecture | SOA | IT Strategy | Technology | BizTalk | WCF | Web Services

April 25, 2007
@ 03:28 AM

"ESB" (for "Enterprise Service Bus") is an acronym that has been floating around the SOA/BPM space for quite a while now. The notion is that you have a set of shared services in an enterprise that act as a shared foundation for discovering, connecting and federating services. That's a good thing and there's not much of a debate about the usefulness, except over whether "ESB" is a term describing this service fabric in general or a concrete product with that name. Microsoft has, for instance, directory services, the UDDI registry, and our P2P resolution services that contribute to the discovery portion, we've got BizTalk Server as a scalable business process, integration and federation hub, we've got the Windows Communication Foundation for building service oriented applications and endpoints, we've got the Windows Workflow Foundation for building workflow-driven endpoint applications, and we have the Identity Platform with ILM/MIIS, ADFS, and CardSpace that provides the federated identity backplane.

Today, the division I work in (Connected Systems Division) has announced BizTalk Services, which John Shewchuk explains here and Dennis Pilarinos drills into here.

Two aspects that make the idea of a "service bus" generally very attractive are that the service bus enables identity federation and connectivity federation. This idea gets far more interesting and more broadly applicable when we remove the "Enterprise" constraint from ESB and put "Internet" in its place, thus elevating it to an "Internet Services Bus", or ISB. If we look at the most popular Internet-dependent applications outside of the browser these days, like the many Instant Messaging apps, BitTorrent, Limewire, VoIP, Orb/Slingbox, Skype, Halo, Project Gotham Racing, and others, many of them depend on one or two key services: Identity Federation (or, in absence of that, a central identity service) and some sort of message relay in order to connect up two or more application instances that each sit behind firewalls - and at the very least some stable, shared rendezvous point or directory to seed P2P connections. The question "how does Messenger work?" has, from a high-level architecture perspective, a simple answer: The Messenger "switchboard" acts as a message relay.
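The "switchboard" idea boils down to something quite small. Here's a minimal sketch — the class and method names are made up for this post and have nothing to do with Messenger's actual implementation — of how a public relay lets two parties behind firewalls exchange messages: each client opens an outbound connection (which NATs and firewalls allow) and registers an inbox, and the relay forwards between them.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Illustrative sketch of a message relay ("switchboard"). Clients never
// connect to each other directly; the publicly reachable relay does the
// forwarding on their behalf.
class MessageRelay {
    private final Map<String, Consumer<String>> endpoints = new ConcurrentHashMap<>();

    // A client registers how it wants to receive messages. In a real
    // system this registration rides on a long-lived outbound connection
    // the client initiated, which is why NAT/firewalls don't get in the way.
    void register(String clientId, Consumer<String> inbox) {
        endpoints.put(clientId, inbox);
    }

    // The sender addresses the receiver by name; the relay delivers.
    boolean relay(String toClientId, String message) {
        Consumer<String> inbox = endpoints.get(toClientId);
        if (inbox == null) return false; // receiver not currently connected
        inbox.accept(message);
        return true;
    }
}
```

The catch, as the next paragraphs explain, is that somebody has to run this relay and pay for all the traffic flowing through it.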

The problem gets really juicy when we look at the reality of what connecting such applications means if an ISV (or you!) were to come up with the next cool thing on the Internet:

You'll soon find out that you will have to run a whole lot of server infrastructure and the routing of all of that traffic goes through your pipes. If your cool thing involves moving lots of large files around (let's say you'd want to build a photo sharing app like the very unfortunately deceased Microsoft Max) you'd suddenly find yourself running some significant sets of pipes (tubes?) into your basement even though your users are just passing data from one place to the next. That's a killer for lots of good ideas as this represents a significant entry barrier. Interesting stuff can get popular very, very fast these days and sometimes faster than you can say "Venture Capital".

Messenger runs such infrastructure. And the need for such infrastructure was indeed a (not entirely unexpected) important takeaway from the cited Max project. What looked to be just a very polished and cool client app showcasing all the Vista and NETFX 3.0 goodness was just the tip of a significant iceberg of (just as cool) server functionality that was running in a Microsoft data center to make the sharing experience as seamless and easy as it was. Once you want to do cool stuff that goes beyond the request/response browser thing, you easily end up running a data center. And people will quickly think that your application sucks if that data center doesn't "just work". And that translates into several "nines" in terms of availability in my book. And that'll cost you.

As cool as Flickr and YouTube are, I don't think any of them or their brethren are nearly as disruptive in terms of architectural paradigm shift and long-term technology impact as Napster, ICQ and Skype were when they appeared on the scene. YouTube is just a place with interesting content. ICQ changed the world of collaboration. Napster's and Skype's impact changed and is changing entire industries. The Internet is far more and has more potential than just having some shared, mashed-up places where lots of people go to consume, search and upload stuff. "Personal computing" where I'm in control of MY stuff and share between MY places from wherever I happen to be and NOT giving that data to someone else so that they can decorate my stuff with ads has a future. The pendulum will swing back. I want to be able to take a family picture with my digital camera and snap that into a digital picture frame at my dad's house at the push of a button without some "place" being in the middle of that. The picture frame just has to be able to stick its head out to a place where my camera can talk to it so that it can accept that picture and know that it's me who is sending it.

Another personal, very concrete and real case in point: I am running, and I've written about that before, a custom-built (software/hardware) combo of two machines (one in Germany, one here in the US) that provide me and my family with full Windows Media Center embedded access to live and recorded TV along with electronic program guide data for 45+ German TV channels, Sports Pay-TV included. The work of getting the connectivity right (dynamic DNS, port mappings, firewall holes), dealing with the bandwidth constraints and shielding this against unwanted access was ridiculously complicated. This solution and IP telephony and video conferencing (over Messenger, Skype) are shrinking the distance to home to what's effectively just the inconvenience of the time difference of 9 hours and that we don't see family and friends in person all that often. Otherwise we're completely "plugged in" on what's going on at home and in Germany in general. That's an immediate and huge improvement of the quality of living for us, is enabled by the Internet, and has very little to do with "the Web", let alone "Web 2.0" - except that my Program Guide app for Media Center happens to be an AJAX app today. Using BizTalk Services would throw out a whole lot of complexity that I had to deal with myself, especially on the access control/identity and connectivity and discoverability fronts. Of course, as I've done it the hard way and it's working to a degree that my wife is very happy with it as it stands (which is the customer satisfaction metric that matters here), I'm not making changes for technology's sake until I'm attacking the next revision of this or I'll wait for one of the alternative and improving solutions (Orb is on a good path) to catch up with what I have.

But I digress. Just as much as the services that were just announced (and the ones that are lined up to follow) are a potential enabler for new Napster/ICQ/Skype type consumer space applications from innovative companies who don't have the capacity or expertise to run their own data center, they are also and just as importantly the "Small and Medium Enterprise Service Bus".

If you are an ISV selling shrink-wrapped business solutions to SMEs whose network infrastructure may be as simple as a DSL line (with dynamic IP) that goes into a (wireless) hub and is as locked down as it possibly can be by the local networking company that services them, then we can do as much as we want as an industry in trying to make inter-company B2B work and expand it to SMEs; your customers just aren't playing in that game if they can't get over these basic connectivity hurdles.

Your app, which lives behind the firewall shield, NAT, and a dynamic IP, doesn't have a stable, public place where it can publish its endpoints, and you have no way to federate identity (and access control) unless you do some pretty invasive surgery on their network setup or end up building and running a bunch of infrastructure on-site or for them. And that's the same problem the mentioned consumer apps have. Even more so, if you look at the list of "coming soon" services, you'll find that problems like relaying events or coordinating work with workflows are very suitable for many common use-cases in SME business applications once you imagine expanding their scope to inter-company collaboration.

So where's "Megacorp Enterprises" in that play? First of all, Megacorp isn't an island. Every Megacorp depends on lots of SME suppliers and retailers (or their equivalents in the respective lingo of the verticals). Plugging all of them directly into Megacorp's "ESB" often isn't feasible for lots of reasons and increasingly less so if the SME had a second or third (imagine that!) customer and/or supplier. 

Second, Megacorp isn't a uniform big entity. The count of "enterprise applications" running inside of Megacorp is measured in thousands rather than dozens. We're often inclined to think of SAP or Siebel when we think of enterprise applications, but the vast majority are much simpler and more scoped than that. It's not entirely ridiculous to think that some of those applications run (gasp!) under someone's desk or in a cabinet in an extra room of a department. And it's also not entirely ridiculous to think that these applications are so vertical and special that their integration into the "ESB" gets continuously overridden by someone else's higher priorities and yet, the respective business department needs a very practical way to connect with partners now and be "connectable" even though it sits deeply inside the network thicket of Megacorp. While it is likely on every CIO's goal sheet to contain that sort of IT anarchy, it's a reality that needs answers in order to keep the business bringing in the money.

Third, Megacorp needs to work with Gigacorp. To make it interesting, let's assume that Megacorp and Gigacorp don't like each other much and trust each other even less. They even compete. Yet, they've got to work on a standard and hence they need to collaborate. It turns out that this scenario is almost entirely the same as the "Panic! Our departments take IT in their own hands!" scenario described above. At most, Megacorp wants to give Gigacorp a rendezvous and identity federation point on neutral ground. So instead of letting Gigacorp on their ESB, they both hook their apps and their identity infrastructures into the ISB and let the ISB be the mediator in that play.

Bottom line: There are very many solution scenarios, of which I mentioned just a few, where "I" is a much more suitable scope than "E". Sometimes the appropriate scope is just "I", sometimes the appropriate scope is just "E". The key to achieving the agility that SOA strategies commonly promise is the ability to do the "E to I" scale-up whenever you need it in order to enable broader communication. If you need to elevate one service or a set of services from your ESB to Internet scope, you have the option to go and do so as appropriate and integrated with your identity infrastructure. And since this is all strictly WS-* standards-based, your "E" might actually be "whatever you happen to run today". BizTalk Services is the "I".

Or, in other words, this is a pretty big deal.

Categories: Architecture | SOA | IT Strategy | Microsoft | MSDN | BizTalk | WCF | Web Services

My first of two sessions this week here at TechEd is on Thursday, at 2:45pm in room 153ABC on "Designing Bindings and Contracts".

I realize that the title sounds a bit abstract; a different way to put it would be "How to choose the correct bindings and what to consider about contracts in a variety of architectural scenarios", but that would have been a bit long as a title. In the talk I'll explain the system-defined bindings that we ship in the product so that we've got stuff to work with, and then I'll get out the tablet pen and draw up a bunch of scenarios and show how our bindings (read: communication options) make sense in those. What's the best choice for N-Tier inside and outside of the corporate perimeter, what do you do for queuing-style apps, how do you implement volatile or durable 1:1 pub/sub, how do you implement broadcasts and where do they make sense, etc.

Categories: Architecture | Indigo | WCF

A question that is raised quite often in the context of "SOA" is that of how to deal with data. Specifically, people are increasingly interested in (and concerned about) appropriate caching strategies. What I see described in that context is often motivated by a fundamental misunderstanding: the SO tenet that speaks about "autonomy" is perceived to mean "autonomous computing" while it really means "avoid coupling". The former is an architecture prescription; the latter is just a statement about the quality of a network edge.

I will admit that the use of "autonomy" confused me for a while as well. Specifically, in my 5/2004 "Data Services" post, I showed principles of autonomous computing and how there is a benefit to loose coupling at the network edge when combined with autonomous computing principles, but at the time I did not yet fully understand how orthogonal those two things really are. I guess one of the aspects of blogging is that you've got to be ready to learn and evolve your knowledge in front of all people. Mind that I stand by the architectural patterns and the notion of data services that I explained in that post, except for the notion that the "Autonomy" SO tenet speaks about autonomous computing.

The picture here illustrates the difference. By autonomous computing principles the left shape of the service is "correct". The service is fully autonomous and protects its state. That's a model that strictly follows the Fiefdoms/Emissaries idea that Pat Helland formulated a few years back. Very many applications look like the shape on the right: there are a number of services sticking up that share a common backend store. That's not following autonomous computing principles. However, if you look across the top, you'll see that the endpoints (different colors, different contracts) look precisely alike from the outside for both pillars. That's the split: autonomous computing talks very much about how things are supposed to look behind your service boundary (which is not and should not be anyone's business but yours), and service orientation talks about you being able to hide any such architectural decision behind a loosely coupled network edge. The two ideas compose well, but they are not the same, at all.
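The "both pillars look alike from the outside" point can be made concrete in code. The sketch below is illustrative (the `OrderStatus` contract and both service classes are invented for this post): one implementation fully owns its state, the other shares a backend store with its siblings, and a client holding the contract cannot tell the difference.

```java
import java.util.HashMap;
import java.util.Map;

// The "contract" -- all a client ever sees of either pillar.
interface OrderStatus {
    String statusOf(String orderId);
}

// Left pillar: fully autonomous. The service owns and protects its state;
// nobody else can reach this store.
class AutonomousOrderService implements OrderStatus {
    private final Map<String, String> ownStore = new HashMap<>();
    public String statusOf(String orderId) {
        return ownStore.getOrDefault(orderId, "unknown");
    }
}

// Right pillar: one of several services sharing a common backend store.
// Not autonomous computing -- but behind an identical-looking edge.
class SharedStoreOrderService implements OrderStatus {
    private final Map<String, String> sharedStore;
    SharedStoreOrderService(Map<String, String> sharedStore) {
        this.sharedStore = sharedStore;
    }
    public String statusOf(String orderId) {
        return sharedStore.getOrDefault(orderId, "unknown");
    }
}
```

A client programmed against `OrderStatus` works with either; which storage architecture sits behind the boundary is exactly the kind of decision the loosely coupled edge is meant to hide.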

Which leads me to the greater story: In terms of software architecture, "SOA" introduces very little new. All distributed systems patterns that have evolved since the 1960s stay true. I haven't really seen any revolutionary new architecture pattern come out since we started speaking about Web Services. Brokers, intermediaries, federators, pub/sub, queuing, STP, conversations - all of that has been known for a long time. We've just commonly discovered that loose coupling is a quality that's worth something.

In all reality, the “SOA” hype is about the notion of aligning business functions with software in order to streamline integration. SOA doesn’t talk about software architecture; in other words: SOA does not talk about how to shape the mechanics of a system. From a software architecture perspective, any notion of an “SOA revolution” is complete hogwash. From a Business/IT convergence perspective – to drive analysis and high-level design – there’s meat in the whole story, but I see the SOA term being used mostly for describing technology pieces. “We are building a SOA” really means “we are building a distributed system and we’re trying to make all parts loosely coupled to the best of our abilities”. Whether that distributed system is indeed aligned with the business functions is a wholly different story.

However, I digress. Coming back to the data management issue, it’s clear that a stringent autonomous computing design introduces quite a few challenges in terms of data management. Data consolidation across separate stores for the purposes of reporting requires quite a bit of special consideration and so does caching of data. When the data for a system is dispersed across a variety of stores and comes together only through service channels without the ability to freely query across the data stores and those services are potentially “far” away in terms of bandwidth and latency, data management becomes considerably more difficult than in a monolithic app with a single store. However, this added complexity is a function of choosing to make the service architecture follow autonomous computing principles, not one of how to shape the service edge and whether you use service orientation principles to implement it.

To be clear: I continue to believe that aligning data storage with services is a good thing. It is an additional strategy for looser coupling between services and allows the sort of data patterns and flexibility that I have explained in the post I linked to above. However, “your mileage may vary” is as true here as anywhere. For some scenarios, tightly coupling services in the backyard might be the right thing to do. That’s especially true for “service-enabling” existing applications. All these architectural considerations are, however, strictly orthogonal to the tenets of SO.

Generally, my advice with respect to data management in distributed systems is to handle all data explicitly as part of the application code and not hide data management in some obscure interception layer. There are a lot of approaches that attempt to hide complex caching scenarios away from application programmers by introducing caching magic on the call/message path. That is a reasonable thing to do if the goal is to optimize message traffic and the granularity that gives you is acceptable. I had a scenario where that was just the right fit in one of my last newtelligence projects. Be that as it may, proper data management, caching included, is somewhat like the holy grail of distributed computing and unless people know what they're doing, it's dangerous to try to hide it away.

That said, I believe that it is worth a thought to make caching a first-class consideration in any distributed system where data flows across boundaries. If it's known at the data source that a particular record or set of records won't be updated until 1200h tomorrow (many banks, for instance, still do accounting batch runs just once or twice daily) then it is helpful to flow that information alongside the data to allow any receiver to determine the caching strategy for the particular data item(s). Likewise, if it's known that a record or record set is unlikely to change or even guaranteed to not change within an hour/day/week/month or if some staleness of that record is typically acceptable, the caching metadata can indicate an absolute or relative time instant at which the data has to be considered stale and possibly a time instant at which it absolutely expires and must be cleaned from any cache. Adding caching hints to each record or set of records allows clients to make a lot better informed decisions about how to deal with that data. This is ultimately about loose coupling and giving every participant of a distributed system enough information to make their own decisions about how to deal with things.
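The "caching hints flow alongside the data" idea can be sketched very simply. The types and field names below are made up for illustration; the point is that the source, which knows its own update schedule, stamps each record with staleness and expiration instants, and every receiver decides its own caching behavior from those stamps.

```java
import java.time.Duration;
import java.time.Instant;

// A record wrapper carrying caching hints alongside the payload, so that
// each receiver can make its own caching decisions (illustrative names).
record CachedRecord<T>(T data, Instant staleAt, Instant expiresAt) {

    // Past staleAt: still usable, but a refresh should be considered.
    boolean isStale(Instant now)   { return !now.isBefore(staleAt); }

    // Past expiresAt: must be evicted from any cache.
    boolean isExpired(Instant now) { return !now.isBefore(expiresAt); }
}

class BalanceSource {
    // The source knows the next batch run is tomorrow at noon, so it can
    // confidently mark the record fresh for an hour and valid for a day.
    static CachedRecord<String> accountBalance(Instant now) {
        return new CachedRecord<>("balance:1234.56",
                now.plus(Duration.ofHours(1)),
                now.plus(Duration.ofHours(24)));
    }
}
```

A receiver never needs to know *why* the hints are what they are; the knowledge about the batch schedule stays encapsulated at the source, which is exactly the loose coupling the paragraph above argues for.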

Which leaves the question of where to cache stuff. The instant "obvious best idea" is to hold stuff in memory. However, if the queries into the cached data become more complex than "select all" or reasonably simple hashtable lookups, it's not too unlikely that, if you run on Windows, a local SQL Server (-Express) instance holding the cache data will do as well as or better than (increasingly with data volume) a custom query "engine" in terms of performance - even if it serves data out from memory. That's especially true for caching frameworks that can be written within the time/budget of a typical enterprise project. Besides, long-lived cached data whose expiration window exceeds the lifetime of the application instance needs a home, too. One of the bad caching scenarios is that the network gets saturated at 8 in the morning when everybody starts up their smart client apps and tries to suck the central database dry at once - that's what in-memory database approaches cause.

Categories: Architecture

Recently, a gentleman from Switzerland wrote me an email after attending the "WinFX Tour" presentations in Zurich. He is a business consultant advising corporations on IT strategy and an IT industry veteran whose first programming work dates as far back as 1962. He was quite interested in the Workflow part of my presentation, but wrote me that he thinks that those abstraction efforts go the wrong way. He sees the fundamental gap between business and IT widening and sees very little hope for the two sides to ever find a way to communicate effectively with each other. In his view, IT isn't truly interested in the reality of business. He wrote me a very long email with several statements and questions, which I won't quote - the (very long) reply below should give you enough context:

Your main concern is, in my words, about the disconnect between the reality of the business vs. the snapshot of a perceived business reality that is translated into a software system. I say “perceived” because the capturing of the actual business reality is done by analysts who are on the fence between being business experts and IT experts and even though they would ideally be geniuses in both worlds to do that translation, they often are coming down on one side of that fence in terms of their core competencies.  

The only way to close that gap is to pull people off that fence onto the business side and enable them to capture the reality of the business and the way the business processes flow with tools that fit their needs and don’t demand that they are programmers or even have the sense of abstraction that a software designer or process analyst possesses. Our industry is only starting to understand what is required to achieve this and we are certainly thinking hard about these problems. 

You state that the Business/IT gap cannot be bridged. I do not fully agree with that assessment. I think what you are observing is a particular effect of software architecture and implementation as it exists today. As a true "industry veteran" you can certainly see much more clearly how software has evolved since you got into the trade in 1962 than I can as a relative youngling. However, my (humble) observation is that the fundamental concepts of business software design haven't changed all that much since then. A business application is a scoped set of siloed functionality built for a set of predefined purposes, and whether the user interacts with the system through batch jobs, green screens, web sites or whether the system is made up of 5000 identical fat client applications with identical logic that talk to a central database is merely an implementation detail. The tradition of (interactive) business software is very much that we've got a system with some sort of menu screen or other form of selecting the task you want to perform with the system and any number of forms/screens/dialogs with which you can interact with the system. The reason for your observation of IT conveniently neglecting at least 20% of what they are told to do is not only caused by them not understanding the business, but also by them not being in control of the monsters they create because the scope grows too big, everything is tightly coupled to everything else, and a lot of functionalities are crammed together in "multi-purpose" user interfaces that often make changes or adjustments mutually exclusive and hence "impossible".

The actual business process is often external to the software, or, if the software has workflow guidance, the workflows do not take all the “offline” activities into account and the process becomes lossy. Most applications today present the user with a grab-bag of tools and procedures and leave it up to them to navigate through it. Even worse, business processes are often changed to fit the constraints of software – and not the other way around. Testimony to this is that the organizational structure of many companies is hardly recognizable once the SAP/PeopleSoft/Oracle/etc. ERP consultants have left the building. 

So what’s changing with the SOA/BPM “hype”, as you call it? Or, better asked: what’s the opportunity?  

First, we’re working hard to get to a point where we can build interoperable, autonomous pieces of software with well-defined interfaces, where we don’t have to spend 80% of our time and money trying to make those pieces communicate with each other. Before the industry consensus around XML, SOAP and the WS-* specifications, this most fundamental requirement for breaking up the solution silos simply didn’t exist (despite previous efforts like DCE or CORBA). I am not saying that we have completely arrived at that point, but we’ve got a better foundation than ever to build composable, loosely coupled, distributed systems that can interact irrespective of their implementation specifics. 

Second, we are starting to see growing insight on the IT side into the need for a common understanding of the idea of “services”. Any employee, department, division or vendor assumes various roles within an organization and renders a certain portfolio of services towards the organization. The notion of service-oriented architecture speaks about writing software that fits into the organization instead of fitting the organization to the software. Just like an employee assumes roles, software assumes roles as well. From an architecture perspective, people and software are peers collaborating on the same business task. The former are just a lot more flexible than the latter. 

Business processes (or bureaucracy), whether formalized or ad-hoc, are driven by the flow of information. If the information flow is not central to the execution of the process you have friction and efficiency suffers. With a paper-bound “offline” process where all information flows in a (paper-) file folder and on forms carried by courier from department to department there’s arguably a better and more complete information flow than in an environment where information is scattered around dozens of different computer systems where the flow of information and the flow of tasks are disconnected and only knitted together by people shuffling mice around on their desks. 

The opportunity is right at that point. With the technology we have available and a common notion of “services”, the line between the implementation of deterministic work (done by programs) and non-deterministic work (done by humans) begins to blur. It is fairly easy today to write chat-bots, mail-bots, or speech-enabled services that allow rich computer/human interaction within their respective scope. It is likewise fairly easy to expose application-to-application services that allow exchanging rich data across system and platform boundaries using Web Services. The concrete form in which a service interacts with a peer depends on who the peer is. An address book lookup service, for instance, may have all of these capabilities at once. With that, services can be integrated into real-life ad-hoc scenarios, singly or in combination, because they are built to satisfy specific roles and their capabilities are exposed for (re)use in arbitrary contexts.  

When you have a service that’s specialized in creating complex sales offers, and you are a salesman on the road sitting in a taxi, it’s absolutely fathomable to build a solution that allows you to call in, identify yourself to a voice service, and ask for a baseline offer for 500 units of A and 300 units of B for a specific customer – and have the result, with all applicable rebates and possible delivery time frames, including shipping cost and considering the schedule of the container freighter from China, pushed onto your hotel fax or sent by email or SMS/MMS. However, how realistic it is to build that service depends purely on how easy it is to make all the necessary backend systems talk to each other and to wire up all the roles into a process that can jointly accomplish the task. And it also depends on how well any necessary human intervention can be integrated into the respective flow. Assuming the resulting offer crossed a certain threshold in terms of the total order amount, it might require approval by a manager – the manager renders a “decision service” towards the process and might do so by responding to an email that is sent to him/her by a program and will be evaluated by a program.  

Ideally, you could teach a service (through mail, speech or chat) the workflow ad-hoc just as you would tell an administrative assistant a sequence of activities. “Get A and B, do C, let Mike take a look at it and send it to me”. The information can be parsed, mapped to activities, the activities can be wired up to a one-off workflow and the workflow can be launched. The crux is that you need to have those individual capabilities and activities catalogued and available in a fashion that allows composition. And the above example of the approval manager goes to show that people’s roles and capabilities must be part of that same catalogue. 
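To make the composition idea concrete, here is a minimal sketch in Python. All of the names and activities are hypothetical, and a real implementation would sit on a workflow engine rather than a dictionary; the point is only that catalogued capabilities, including a people-based approval step, can be wired up into a one-off workflow.

```python
# Hypothetical activities; each stands in for a catalogued service.
def get_quote(state):
    state["quote"] = {"A": 500, "B": 300}  # stand-in for a pricing service
    return state

def apply_rebates(state):
    state["rebate"] = 0.05  # stand-in for a rebate-calculation service
    return state

def manager_approval(state):
    # People are catalogued as activities too: this could send an email
    # and block until the manager's "decision service" responds.
    state["approved"] = True
    return state

# The catalogue: roles and capabilities registered for composition.
catalogue = {
    "get quote": get_quote,
    "apply rebates": apply_rebates,
    "approval": manager_approval,
}

def run_adhoc_workflow(step_names, state=None):
    """Wire catalogued activities into a one-off workflow and run it."""
    state = state or {}
    for name in step_names:
        state = catalogue[name](state)
    return state

# "Get A and B, do C, let Mike take a look at it" becomes a step list:
result = run_adhoc_workflow(["get quote", "apply rebates", "approval"])
```

The interesting part is not the loop but the catalogue: composition only works if every capability, human or programmatic, is registered in a form that allows it to be looked up and chained.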

We (Microsoft) are already shipping, and will ship even more, building blocks not only for creating such services and workflows; we also have an increasingly complete federated identity and access control infrastructure that makes it possible to realize all that decentralized interaction in a secure fashion. From a purely technical perspective, the above scenario is not utopia. We have every single component in place to let customers build this, voice recognition included. However, mind that I am not saying “very easy”. 

“BPM” tools such as the Windows Workflow Foundation or BizTalk Orchestration are all about putting the process into the center. Services are all about creating easy-to-integrate, loosely coupled, business-aligned pieces of functionality that can be used and reused in as many different contexts as a role in a business can. A workflow may be a single step that you simply have in your head as you interact with the address lookup service, or it may be a more formalized workflow with several steps and intertwined people-based and program-based activities. The realization here is that while individual roles in an organization are relatively sticky, the way that the organization acts across those roles and tunes the rules for the roles is very dynamic. BPM tools are precisely about shaping and reshaping the flow and rules quickly and deploying them instantly into the business environment. 

The design of the Windows Workflow Foundation also recognizes that business processes, especially those that run for days and weeks rather than seconds or minutes, are never set in stone. If you’ve got a (very) long-running master workflow that is, for instance, tracking an insurance policy, applying rebates and handling claims as time progresses, and suddenly the client comes around and sues the insurance company, that policy certainly no longer belongs in the same bucket as the hundreds of thousands of other policies being tracked. For such unforeseeable circumstances, the foundation makes it possible to jump right in, assess the status, and redirect or reshape the respective flow on a case-by-case basis – even if the new action that you add into the flow in order to terminate it is merely handing off the entire case, with all of its status, to your legal department’s “dropoff service”. 

You write about SO/A and BPM as an “effort to extend the borders of IT”. From my perspective it’s rather the attempt by IT to humbly fit itself into the ever-changing, dynamic nature of business. On an industry level we have started to realize that we need to get away from thinking of applications as silos, and that nobody should factually care whether he/she interacts with a “CRM” or “ERP” system or needs to navigate across a dozen intranet websites to get a job done. The whole notion of “we build a loan application handling program” is indeed misguided. That’s the disconnect. 

But how alien is the notion of building a library of services that are not wired up into a thing that you can install and instantly run as a program? We hire people into organizations who have a broad education of which we only tap a fraction at any given point in time, but we can trust that we can instantly tap some other capability as the process changes. Which CEO/CFO/CIO will commit to a project that “educates” a software system to have a broad spectrum of capabilities that may or may not be used in the circumstances of “now” but which may become a pressing necessity as you need to quickly adapt to a change in the business? How strange of an idea is it that you might produce a software package that consists of hundreds of roles and thousands of activities, but none of them are connected in any way, because connecting them is the job of the space that’s intentionally left blank and undefined to host the customized business process? How does the buyer justify the expense? What can the vendor charge? Would you continually service and update a “dormant” software-based capability to the latest policies, laws and regulations so that it can be used whenever the need arises? 

All that comes back to a completely different communication breakdown: How does IT explain that sort of perspective to the business stakeholders? The great challenge is not in the bits, it’s in the heads.

Categories: Architecture | SOA

Inside the big house....

Back in December of last year, about two weeks before I publicly announced that I would be working for Microsoft, I started a nine-part series on REST/POX* programming with Indigo WCF (1, 2, 3, 4, 5, 6, 7, 8, 9). Since then, the WCF object model has seen quite a few feature and usability improvements across the board, significant enough to justify rewriting the entire series to bring it up to the February CTP level; I will keep updating it through Vista/WinFX Beta2 and as we march towards our RTM. We've got a few changes/extensions in our production pipeline to make the REST/POX story for WCF v1 stronger, and I will track those changes with yet another re-release of this series.

Except on one or two occasions, I haven't re-posted a reworked story on my blog. This one is quite a bit different, because of its sheer size and the things I learned in the process of writing it and developing the code along the way. So even though it is relatively new, it's already due for an end-to-end overhaul to represent my current thinking. It's also different because I am starting to cross-post content to http://blogs.msdn.com/clemensv with this post; however, http://friends.newtelligence.net/clemensv remains my primary blog since that runs my engine ;-)


The "current thinking" is of course very much influenced by now working for the team that builds WCF instead of being a customer looking at things from the outside. That changes the perspective quite a bit. One great insight I gained is how non-dogmatic and customer-oriented our team is. When I started the concrete REST/POX work with WCF last September (on the customer side, still working with newtelligence), the extensions to the HTTP transport that enabled this work were just showing up in the public builds, and they were sometimes referred to as the "Tim/Aaron feature". Tim Ewald and Aaron Skonnard had beaten the drum for having simple XML (non-SOAP) support in WCF so loudly that the team investigated the options and figured that some minimal changes to the HTTP transport would enable most of these scenarios**. Based on that feature, I wrote the set of dispatcher extensions that I've been presenting in the V1 of this series, and newtellivision as the applied example not only turned out to be a big hit as a demo, it also was one of many motivations to give the REST/POX scenario even deeper consideration within the team.

REST/POX is a scenario we think about as a first-class scenario alongside SOAP-based messaging - we are working with the ASP.NET Atlas team to integrate WCF with their AJAX story and we continue to tweak the core WCF product to enable those scenarios in a more straightforward fashion. Proof for that is that my talk (PPT here) at the MIX06 conference in Las Vegas two weeks ago was entirely dedicated to the non-SOAP scenarios.

What does that say about SOAP? Nothing. There are two parallel worlds of application-level network communication that live in peaceful co-existence:

  • Simple point-to-point, request/response scenarios with limited security requirements and no need for "enterprise features" along the lines of reliable messaging and transaction integration.
  • Rich messaging scenarios with support for message routing, reliable delivery, discoverable metadata, out-of-band data, transactions, one-way and duplex, etcetc.

The Faceless Web

The first scenario is the web as we know it. Almost. HTTP is an incredibly rich application protocol once you dig into RFC 2616, look at the methods in detail, and consider response codes beyond 200 and 404. HTTP is strong because it is well-defined, widely supported, and designed to scale; HTTP is weak because it is effectively constrained to request/response, has no story for server-to-client notifications, and abstracts away the inherent reliability of the Transmission Control Protocol (TCP). These pros and cons lists are not exhaustive.
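To illustrate the point about response codes beyond 200 and 404, here are a few of the RFC 2616 status codes that carry real application-level meaning, listed via Python's standard library purely as a reference:

```python
from http import HTTPStatus

# A few response codes beyond 200 and 404 that RFC 2616 defines;
# each one carries application-level semantics a service can exploit.
interesting = [
    HTTPStatus.CREATED,              # 201: resource created by PUT/POST
    HTTPStatus.NO_CONTENT,           # 204: success, nothing to return
    HTTPStatus.NOT_MODIFIED,         # 304: the client's cached copy is current
    HTTPStatus.CONFLICT,             # 409: concurrent-update conflict
    HTTPStatus.PRECONDITION_FAILED,  # 412: If-Match/ETag check failed
]
for status in interesting:
    print(status.value, status.phrase)
```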

What REST/POX does is to elevate the web model above the "you give me text/html or */* and I give you application/x-www-form-urlencoded" interaction model. Whether the server punts up markup in the form of text/html or text/xml or some other angle-bracket dialect or some raw binary isn't too interesting. What's changing the way applications are built and what is really creating the foundation for, say, AJAX is that the path back to the server is increasingly XML'ised. PUT and POST with a content-type of text/xml is significantly different from application/x-www-form-urlencoded. What we are observing is the emancipation of HTTP from HTML to a degree that the "HT" in HTTP is becoming a misnomer. Something like IXTP ("Interlinked XML Transport Protocol" - I just made that up) would be a better fit by now.
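The difference between the two request styles can be made concrete with a small sketch (standard library only; the payload fields are invented for illustration):

```python
from urllib.parse import urlencode
from xml.etree.ElementTree import Element, SubElement, tostring

# The same update expressed in the two styles contrasted above.

# application/x-www-form-urlencoded: what a browser form would POST.
form_body = urlencode({"customer": "4711", "units": "500"})

# text/xml: what an XMLHTTP-based client would PUT or POST instead.
order = Element("order")
SubElement(order, "customer").text = "4711"
SubElement(order, "units").text = "500"
xml_body = tostring(order, encoding="unicode")

print(form_body)  # customer=4711&units=500
print(xml_body)   # <order><customer>4711</customer><units>500</units></order>
```

The payloads carry the same data; what changes is that the XML body is structured, self-describing, and can grow arbitrarily rich, which is exactly what makes the path back to the server interesting for application-style clients.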

The astonishing bit in this is that there has been no fundamental technology change driving this. The only thing I can identify is that browsers other than IE now support XMLHTTP and have therefore created the critical mass for broad adoption. REST/POX rips the face off the web and enables a separation of data and presentation in a way that makes mashups easily possible, and we're driving towards a point where the browser cache becomes more of an application repository than merely a place that holds cacheable collateral. When developing the newtellivision application I spent quite a bit of time tuning the caching behavior so that HTML and script are pulled from the server only when necessary and as static resources, and all actual interaction with the backend services happens through XMLHTTP in REST/POX style. newtellivision is not really a hypertext website; it's more like a smart client application that is delivered through the web technology stack.

Distributed Enterprise Computing

All that said, the significant investments in SOAP and WS-* that were made by Microsoft and industry partners such as Sun, IBM, Tibco and BEA have their primary justification in the parallel universe of highly interoperable, feature-rich intra- and inter-application communication as well as in enterprise messaging. Even though there was a two-way split right through the industry in the 1990s, with one side adopting the Distributed Computing Environment (DCE) and the other side driving the Common Object Request Broker Architecture (CORBA), both of these camps made great advances towards rich, interoperable (within their boundaries) enterprise communication infrastructures. All of that got effectively killed by the web gold-rush starting in 1994/1995, as the focus (and investment) in the industry turned to HTML/HTTP and to building infrastructures that supported the web first and everything else as a secondary consideration. The direct consequence of the resulting (even if big) technology islands that sit underneath the web, and of the neglect of inter-application communication needs, was that inter-application communication slowly grew to become one of the greatest industry problems and cost factors. Contributing to that is that the average yearly number of corporate mergers and acquisitions has tripled compared to 10-15 years ago (even though the trend has slowed in recent years), and the information technology dependency of today's corporations has grown to become one of the deciding, if not the deciding, competitive factor for an ever increasing number of industries.

What we (the industry as a whole) are doing now and for the last few years is that we're working towards getting to a point where we're both writing the next chapter of the story of the web and we're fixing the distributed computing story at the same time by bringing them both onto a commonly agreed platform. The underpinning of that is XML; REST/POX is the simplest implementation. SOAP and the WS-* standards elevate that model up to the distributed enterprise computing realm.

If you compare the core properties of SOAP+WS-Addressing and the Internet Protocol (IP) in an interpretative fashion side-by-side, and then also compare the Transmission Control Protocol (TCP) to WS-ReliableMessaging, it may become quite clear to you what a fundamental abstraction above the networking stacks and concrete technology coupling the WS-* specification family has become. Every specification in the long list of WS-* specs is about converging and unifying formerly proprietary approaches to messaging, security, transactions, metadata, management, business process management and other aspects of distributed computing into this common platform.


The beauty of that model is that it is an implementation superset of the web. SOAP is the out-of-band metadata container for these abstractions. The key feature of SOAP is SOAP:Header, which provides a standardized facility to relay the required metadata alongside payloads. If you are willing to constrain out-of-band metadata to one transport or application protocol, you don't need SOAP.

There is really very little difference between SOAP and REST/POX in terms of the information model. SOAP carries headers and HTTP carries headers. In HTTP they are bolted to the protocol layer and in SOAP they are tunneled through whatever carries the envelope. [In that sense, SOAP is calculated abuse of HTTP as a transport protocol for the purpose of abstraction.] You can map WS-Addressing headers from and to HTTP headers.
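As a rough illustration of that symmetry, and not as a normative mapping, one could sketch the WS-Addressing-to-HTTP correspondence like this (the HTTP-side names are my own loose choices, borrowed in one case from the mail world):

```python
# Illustrative sketch: SOAP headers and HTTP headers carry the same kind
# of out-of-band metadata; here a few WS-Addressing-style headers are
# mapped onto rough HTTP-level counterparts. Not a normative mapping.

def wsa_to_http(wsa_headers):
    """Map WS-Addressing-style headers to rough HTTP equivalents."""
    mapping = {
        "wsa:To": "Request-URI",         # addressing target ~ the URI line
        "wsa:Action": "SOAPAction",      # intent ~ the SOAPAction header
        "wsa:MessageID": "Message-ID",
        "wsa:RelatesTo": "In-Reply-To",  # borrowed from email headers
    }
    return {mapping[k]: v for k, v in wsa_headers.items() if k in mapping}

http_view = wsa_to_http({
    "wsa:To": "http://example.org/addressbook",
    "wsa:Action": "urn:addressbook:lookup",
})
```

The point of the sketch is only that the information model is the same on both sides; where the metadata travels (protocol layer versus envelope) is the implementation difference.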

The SOAP/WS-* model is richer, more flexible, and more complex. The SOAP/WS-* set of specifications is about infrastructure protocols. HTTP is an application protocol and therefore naturally more constrained – but it has inherently defined qualities and features that require an explicit protocol implementation in the SOAP/WS-* world; one example is the inherent CRUD (create, read, update, delete) support in HTTP, which is matched by the explicitly composed-on-top WS-Transfer protocol in SOAP/WS-*.
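That CRUD correspondence can be tabulated in a few lines (the WS-Transfer side follows the spec's Create/Get/Put/Delete operations; treat the exact pairing as illustrative):

```python
# CRUD as it is inherent in HTTP versus explicitly composed on top of
# SOAP via WS-Transfer. Illustrative pairing, not a formal equivalence.
crud = {
    # operation    HTTP verb   WS-Transfer operation
    "create": ("POST",   "wxf:Create"),
    "read":   ("GET",    "wxf:Get"),
    "update": ("PUT",    "wxf:Put"),
    "delete": ("DELETE", "wxf:Delete"),
}
for op, (http_verb, wxf_op) in crud.items():
    print(f"{op}: {http_verb} ~ {wxf_op}")
```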

The common platform is XML. You can scale down from SOAP/WS-* to REST/POX by putting the naked payload on the wire and relying on HTTP for your metadata, error and status information if that suits your needs. You can scale up from REST/POX to SOAP/WS-* by encapsulating payloads and leveraging the WS-* infrastructure for all the flexibility and features it brings to the table. [It is fairly straightforward to go from HTTP to SOAP/WS-*, and it is harder to go the other way. That's why I say "superset".]

Doing the right thing for a given scenario is precisely what we are enabling in WCF. There is a place for REST/POX for building the surface of the mashed and faceless web, and there is a place for SOAP for building the backbone of it – and some may choose to mix and match these worlds. There are many scenarios, and many architectural models that suit them. What we want is

One Way To Program

* REST=REpresentational State Transfer; POX="Plain-Old XML" or "simple XML"

Categories: Architecture | SOA | MIX06 | Technology | Web Services

March 14, 2006
@ 02:17 PM

I kicked off quite a discussion with my recent post on O/R mapping. Some people think I am completely wrong, some say it resonates with their experience, some say I wrote it in a mean spirit, some are jubilant. I particularly liked the "Architectural Truthiness" post by David Ing and the comment by "Scott E" in my comments section, who wrote:

I've hiked up the learning curve for Hibernate (the Java flavor) only to find that what time was saved in mapping basic CRUD functionality got eaten up by out-of-band custom data access (which always seems to be required) and tuning to get performance close to what it would have been with a more specialized, hand-coded DAL.

As always, it's a matter of perspective. Here is mine: I went down the O/R mapping route in a project in '98/'99 when my group at the company I was working for at the time was building a new business framework. We wrote a complete, fully transparent O/R mapper in C++. You walked up to a factory which dehydrated objects, and you could walk along the association links and the object graph would either incrementally dehydrate or dehydrate in predefined segments. We had filtering capabilities that allowed us to constrain 1:N collections with large N's, we could auto-resolve N:M relationships, had support for inheritance, and all that jazz. The whole framework was written with code generation in mind. Our generators were fed with augmented UML class diagrams and spit out the business layer, whereby we had a "partial classes" concept where we'd keep the auto-gen'd code in one tree and the parts that were supposed to be filled in manually in another part of the code tree. Of course we'd preserve changes across re-gens. Pure OO nirvana.

While the platforms have evolved substantially in the past 7 years, the fundamental challenges for transparent (fully abstracted) mapping of data to objects remain essentially the same.

  • Given metadata to do the mapping, implementing CRUD functionality with an O/R mapper is quite easy. We had to put lots of extra metadata into our C++ classes back in the day, but with .NET and Java the metadata is all there and therefore CRUD O/R mapping is very low-hanging fruit on both platforms. That's why there's such a large number of projects and products.
  • Defining and resolving associations is difficult. 1:N is hard, because you need to know what your N looks like. You don't want to dehydrate 10000 objects to find a value in one of them or to calculate a sum over a column. That's work that's, quite frankly, best left in the database. I realize that some people worry how that leads to logic bleeding into the database, but for me that's a discussion about pureness vs. pragmatism. If the N is small, grabbing all related objects is relatively easy - unless you support polymorphism, which forces the mapper into all sorts of weird query trees. 1:N is so difficult because an object model is inherently about records, while SQL is about sets. N:M is harder.
  • "Object identity" is a dangerous lure. Every object has its own identifier. In memory that is its address; on disk it's some form of unique identifier. The idea of making the persistent identifier also the in-memory identifier often has the design consequence of an in-memory "running object table" with the goal of avoiding loading the same object twice and instead linking it appropriately into the object graph. That's a fantastic concept, but it leads to all sorts of interesting concurrency puzzles: What do you do if, as you resolve a 1:N association, you happen to find an object you have already loaded and realize that the object has meanwhile changed on disk? Another question is what the scope of the object identity is. Per appdomain/process, per machine, or even a central object server (hope not)?
  • Transactions are hard. Databases are doing a really good job with data concurrency management, especially with stored procedures. If you are loading and managing data as object-graphs, how do you manage transaction isolation? How do you identify the subtree that's being touched by a transaction? How do you manage rollbacks? What is a transaction, anyways?
  • Changing the underlying data model is hard. I've run into several situations where existing applications had to be integrated with existing data models – with the customer willing to put money on the table. O/R mapping is relatively easy if the data model falls out of the object model. If an existing data model bubbles up against an object model, you often end up writing a DAL or doing the O/R in stored procedures. 
  • Reporting and data aggregation is hard. I'll use an analogy for that: It's really easy to write an XPath query against an XML document, but it is insanely difficult to do the same navigating the DOM.
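The last bullet's analogy can be made concrete with a small sketch: the same aggregation expressed as a declarative query versus an imperative walk over the graph (the data is invented for illustration):

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<orders>"
    "<order region='EU'><total>120</total></order>"
    "<order region='US'><total>80</total></order>"
    "<order region='EU'><total>40</total></order>"
    "</orders>")

# Declarative, query-style: one expression, analogous to SQL over sets.
eu_totals = [int(t.text) for t in doc.findall("./order[@region='EU']/total")]

# Imperative, navigate-the-graph style: analogous to walking objects.
eu_totals_by_hand = []
for order in doc:
    if order.get("region") == "EU":
        for child in order:
            if child.tag == "total":
                eu_totals_by_hand.append(int(child.text))

assert sum(eu_totals) == sum(eu_totals_by_hand) == 160
```

Both produce the same numbers; the difference is that the query version states the intent and leaves the navigation to the engine, which is exactly what gets lost when aggregation is done over a dehydrated object graph.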

That said, I am not for or against O/R mapping. There are lots of use cases with a lot of CRUD work where O/R saves a lot of time. However, it is a leaky abstraction. In fact, it is so leaky that we ended up not using all that much of the funkiness we put into our framework, because "special cases" kept popping up. I am pointing out that there are a lot of fundamental differences between what an RDBMS does with data and how OOP treats data. The discussion is in part a discussion about ISAM vs. RDBMS. 

The number of brain cycles that needs to be invested for a clean O/R mapping of a complex object model, in the presence of the fundamental challenges I listed here (and that list isn't exhaustive), is not automatically smaller than for a plain-old data layer. It may be larger. YMMV.

Now you can (and some already have) ask how all of that plays with LINQ and, in particular, DLINQ. Mind that I don't work on the LINQ team, but I think I am observing a subtle but important difference between LINQ and O/R*: 

  • O/R is object->relational mapping.
  • LINQ is relational->object mapping.

LINQ acknowledges the relational nature of the vast majority of data, while O/R attempts to deny it. LINQ speaks about entities, relations and queries and maps result-sets into the realm of objects, even cooking up classes on the fly if it needs to. It's bottom up and the data (from whatever source) is king. Objects and classes are just tooling. For O/R mapping, the database is just tooling.
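A small sketch of that bottom-up direction, with the result set as the master and a class cooked up on the fly to match its shape (illustrative only; this is obviously not DLINQ):

```python
import sqlite3
from collections import namedtuple

# The query result is king: a class is synthesized on the fly to fit
# whatever shape the result set happens to have.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER, name TEXT, city TEXT)")
db.execute("INSERT INTO customer VALUES (1, 'Contoso', 'Berlin')")

cursor = db.execute("SELECT name, city FROM customer WHERE id = 1")
Row = namedtuple("Row", [col[0] for col in cursor.description])
rows = [Row(*values) for values in cursor.fetchall()]

print(rows[0].name, rows[0].city)  # the object mirrors the result set
```

Nothing about the `Row` type existed before the query ran; the relational side dictated the shape, and the object is just tooling over it.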

Categories: Architecture | Technology

To (O/R) map or not to map.

The monthly discussion about the benefits and dangers of O/R mapping is making rounds on one of the mailing lists that I am signed up to. One big problem in this space - from my experience of discussing this through with a lot of people over and over – is that O/R mapping is one of those things where the sheer wish for an elegant solution to the data/object schism obscures most of the rational argumentation. If an O/R mapper provides a nice programming or tooling experience, developers (and architects) are often willing to accept performance hits and a less-than-optimal tight coupling to the data model, because they are lured by the aesthetics of the abstraction.

Another argument I keep hearing is that O/R mapping yields a significant productivity boost. However, if that were the case and if using O/R mapping would shorten the average development cost in a departmental development project by – say – a quarter or more, O/R mapping would likely have taken over the world by now. It hasn't. And it's not that the idea is new. It’s been around for well more than a decade.

To me, O/R mapping is one of the unfortunate consequences of trying to apply OOP principles to anything and everything. For "distributed objects", we’re fixing that with the service orientation idea and the consequential constraints when we talk about the network edge of applications. It turns out that many of the same principles apply to the database edge as well. The list below is just to give you the idea; I could write a whole article about this and I wish I had the time:

  • Boundaries are explicit => Database access is explicit
  • Services avoid coupling (autonomy) => Database schema and in-process data representation are disjoint and mapped explicitly
  • Share schema not code => Query/Sproc result sets and Sproc inputs form data access schema (aliased result sets provide a degree of separation from phys. schema)
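A minimal sketch of those three bullets in code (all names invented; the aliased result set plays the role of the data access schema, while the physical table layout stays behind the boundary):

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class OpenOrder:        # in-process representation, disjoint from the schema
    order_id: int
    amount: float

def fetch_open_orders(conn):
    # Explicit boundary: all database access goes through this function.
    # The aliased result set is the shared "data access schema"; the
    # physical column names never leak past this point.
    rows = conn.execute(
        "SELECT ord_id AS order_id, ord_total AS amount "
        "FROM orders WHERE ord_state = 'open'")
    return [OpenOrder(order_id, amount) for order_id, amount in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (ord_id INTEGER, ord_total REAL, ord_state TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 99.5, 'open')")
orders = fetch_open_orders(conn)
```

The mapping is explicit and boring on purpose: the physical schema can change as long as the query keeps producing the same aliased result set, which is the whole point of sharing schema rather than code.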

In short, I think the dream of transparent O/R mapping is the same dream that fueled the development of fully transparent distributed objects in the early days of DSOM, CORBA and (D)COM when we all thought that'd just work and were neglecting the related issues of coupling, security, bandwidth, etc.

Meanwhile, we’ve learned the hard way that even though the idea was fantastic, it was rather naïve to apply local development principles to distributed systems. The same goes for database programming. Data is the most important thing in the vast majority of applications. Every class of data items (table) carries its own special considerations: read-only, read/write, insert-only; update frequency, currency and replicability; access authorization; business relevance; caching strategies; etcetc. 

Proper data management is the key to great architecture. Ignoring this and abstracting data access and data management away just to have a convenient programming model is … problematic.

And in closing: Many of the proponents of O/R mapping that I run into (and that is a generalization and I am not trying to offend anyone – just an observation) are folks who don't know SQL and RDBMS technology in any reasonable depth and/or often have no interest in doing so. It may be worth exploring how tooling can better help the SQL-challenged instead of obscuring all data access deep down in some framework and make all data look like a bunch of local objects. If you have ideas, shoot. Comment section is open for business.

Categories: Architecture | SOA

See Part 1

Before we can do anything about deadlocks or deal with similar troubles, we first need to be able to tell that we indeed have a deadlock situation. Finding this out is a matter of knowing the respective error codes that your database gives you and having a mechanism to bubble that information up to some code that will handle the situation. So before we can think about and write the handling logic for failed (or failing) but safely repeatable transactions, we need to build a few little things. The first thing we’ll need is an exception class that wraps the original exception and indicates the reason for the transaction failure. The new exception class’s identity will later serve to filter out exceptions in a “catch” statement and take the appropriate actions.

using System;
using System.Runtime.Serialization;

namespace newtelligence.EnterpriseTools.Data
{
    [Serializable]
    public class RepeatableOperationException : Exception
    {
        public RepeatableOperationException() : base()
        {
        }

        public RepeatableOperationException(Exception innerException)
            : base(innerException.Message, innerException)
        {
        }

        public RepeatableOperationException(string message, Exception innerException)
            : base(message, innerException)
        {
        }

        public RepeatableOperationException(string message) : base(message)
        {
        }

        protected RepeatableOperationException(
            SerializationInfo serializationInfo,
            StreamingContext streamingContext)
            : base(serializationInfo, streamingContext)
        {
        }

        public override void GetObjectData(
            SerializationInfo info,
            StreamingContext context)
        {
            base.GetObjectData(info, context);
        }
    }
}
Having an exception wrapper with the desired semantics, we now need to be able to figure out when to replace the original exception with this wrapper and re-throw it up the call stack. The idea is that whenever you execute a database operation – or, more generally, any operation that might be repeatable on failure – you will catch the resulting exception and run it through a factory, which will analyze the exception and wrap it in the RepeatableOperationException if the issue at hand can be resolved by re-running the transaction. The (still a little naïve) code below illustrates how to use such a factory in application code. Later we will flesh out the catch block a little more, since we will lose the original call stack if we end up re-throwing the original exception as shown here:

try
{
    sprocUpdateAndQueryStuff.Parameters["@StuffArgument"].Value = argument;
    result = this.GetResultFromReader( sprocUpdateAndQueryStuff.ExecuteReader() );
}
catch( Exception exception )
{
    throw RepeatableOperationExceptionMapper.MapException( exception );
}

The factory class itself is rather simple in structure, but a bit tricky to put together, because you have to know the right error codes for all resource managers you will ever run into. In the example below I put in what I believe to be the appropriate codes for SQL Server and Oracle (corrections are welcome) and left the ODBC and OLE DB factories (for which one would have to inspect the driver type and the respective driver-specific error codes) blank. The factory inspects the exception's data type and delegates mapping to a private method that is specialized for a specific managed provider.

using System;
using System.Data.SqlClient;
using System.Data.OleDb;
using System.Data.Odbc;
using System.Data.OracleClient;

namespace newtelligence.EnterpriseTools.Data
{
    public class RepeatableOperationExceptionMapper
    {
        /// <summary>
        /// Maps the exception to a RepeatableOperationException, if the error code
        /// indicates that the transaction is repeatable.
        /// </summary>
        /// <param name="sqlException"></param>
        /// <returns></returns>
        private static Exception MapSqlException( SqlException sqlException )
        {
            switch ( sqlException.Number )
            {
                case -2:   /* Client Timeout */
                case 701:  /* Out of Memory */
                case 1204: /* Lock Issue */
                case 1205: /* Deadlock Victim */
                case 1222: /* Lock Request Timeout */
                case 8645: /* Timeout waiting for memory resource */
                case 8651: /* Low memory condition */
                    return new RepeatableOperationException( sqlException );
                default:
                    return sqlException;
            }
        }

        private static Exception MapOleDbException( OleDbException oledbException )
        {
            switch ( oledbException.ErrorCode )
            {
                default:
                    return oledbException;
            }
        }

        private static Exception MapOdbcException( OdbcException odbcException )
        {
            return odbcException;
        }

        private static Exception MapOracleException( OracleException oracleException )
        {
            switch ( oracleException.Code )
            {
                case 104:  /* ORA-00104: Deadlock detected; all public servers blocked waiting for resources */
                case 1013: /* ORA-01013: User requested cancel of current operation */
                case 2087: /* ORA-02087: Object locked by another process in same transaction */
                case 60:   /* ORA-00060: Deadlock detected while waiting for resource */
                    return new RepeatableOperationException( oracleException );
                default:
                    return oracleException;
            }
        }

        public static Exception MapException( Exception exception )
        {
            if ( exception is SqlException )
            {
                return MapSqlException( exception as SqlException );
            }
            else if ( exception is OleDbException )
            {
                return MapOleDbException( exception as OleDbException );
            }
            else if ( exception is OdbcException )
            {
                return MapOdbcException( exception as OdbcException );
            }
            else if ( exception is OracleException )
            {
                return MapOracleException( exception as OracleException );
            }
            else
            {
                return exception;
            }
        }
    }
}

With that little framework of two classes, we can now selectively throw exceptions that convey whether a failed/failing transaction is worth repeating. Next step: How do we do actually run such repeats and make sure we neither lose data nor make the user unhappy in the process? Stay tuned.

Categories: Architecture | SOA | Enterprise Services | MSMQ

Deadlocks and other locking conflicts that cause transactional database operations to fail are things that puzzle many application developers. Sure, proper database design and careful implementation of database access (and appropriate support by the database engine) should take care of that problem, but they cannot do so in all cases. Sometimes, especially under stress and in other situations with high lock contention, a database has little choice but to pick at least one of the transactions competing for the same locks as the victim in resolving the deadlock situation and abort the chosen transaction. Generally speaking, transactions that abort and roll back are a good thing, because this behavior guarantees data integrity. In the end, we use transaction technology for those cases where data integrity is at risk. What’s interesting is that even though transactions are a technology that is explicitly about things going wrong, the strategy for dealing with failing transactions is often not much more than to bubble the problem up to the user and say “We apologize for the inconvenience. Please press OK”.

The appropriate strategy for handling a deadlock or some other recoverable reason for a transaction abort on the application level is to back out of the entire operation and to retry the transaction. Retrying is a gamble that the next time the transaction runs, it won’t run into the same deadlock situation again or that it will at least come out victorious when the database picks its victims. Eventually, it’ll work. Even if it takes a few attempts. That’s the idea. It’s quite simple.

What is not really all that simple is the implementation. Whenever you are using transactions, you must make your code aware that such “good errors” may occur at any time. Wrapping your transactional ODBC/OLEDB/ADO/ADO.NET code or calls to transactional Enterprise Services or COM+ components with a try/catch block, writing errors to log files and showing message boxes to users just isn’t the right thing to do. The right thing is to simply do the same batch of work again and again until it succeeds.
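
The retry strategy boils down to a small loop. Here is a minimal sketch, not code from this post: the RepeatableOperationException class is a stand-in for the "safe to retry" wrapper exception discussed separately, and the attempt budget is my own addition as a safety valve.

```csharp
using System;

// Stand-in for the "safe to retry" wrapper exception, so this sketch
// compiles on its own.
public class RepeatableOperationException : Exception
{
    public RepeatableOperationException(Exception innerException)
        : base("Operation is safe to repeat.", innerException) { }
}

public static class RetryRunner
{
    // Runs the transactional batch; retries whenever the failure was
    // flagged as repeatable, up to an assumed attempt budget.
    public static void RunWithRetry(Action doBatch, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                doBatch(); // the whole transactional unit of work
                return;    // success: stop retrying
            }
            catch (RepeatableOperationException)
            {
                if (attempt >= maxAttempts)
                    throw; // budget exhausted: bubble the problem up
            }
        }
    }
}
```

The point is that the loop retries only on the wrapper exception; any other failure still fails fast and reaches the caller unchanged.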

The problem that some developers seem to have with “just retry” is that it’s not so clear what should be retried. It’s a problem of finding and defining the proper transaction scope. Especially when user interaction is in the picture, things easily get very confusing. If a user has filled in a form on a web page or some dialog window and all of his/her input is complete and correct, should the user be bothered with a message that the update transaction failed due to a locking issue? Certainly not. Should the user know when the transaction fails because the database is currently unavailable? Maybe, but not necessarily. Should the user be made aware that the application he/she is using is for some sudden reason incompatible with the database schema of the backend database? Maybe, but what does Joe in the sales department do with that valuable piece of information?

If stuff fails, should we just forget about Joe’s input and tell him to come back when the system is happier to serve him? So, in other words, do we have Joe retry the job? That’s easy to program, but that sort of strategy doesn’t really make Joe happy, does it?

So what’s the right thing to do? One part of the solution is a proper separation between the things the user (or a program) does and the things that the transaction does. This gives us two layers and “a job” that can be handed from the presentation layer down to the “transaction layer”. Once this separation is in place, we can come up with a mechanism that will run those jobs in transactions and will automate how and when transactions are to be retried. Transactional MSMQ queues turn out to be a brilliant tool for making this very easy to implement. More tomorrow. Stay tuned.
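
To make the two-layer separation concrete, here is a hedged sketch with all names invented for illustration. In the real design a transactional MSMQ queue would sit where the in-memory queue is, so that receiving a job and doing the database work can share one transaction and a failed job lands back in the queue.

```csharp
using System;
using System.Collections.Generic;

// "A job": the unit of work the presentation layer hands down.
[Serializable]
public class Job
{
    public string Action;
    public string Payload;
}

public class JobChannel
{
    // Stand-in for a transactional MSMQ queue.
    private readonly Queue<Job> queue = new Queue<Job>();

    // Presentation layer: submit the job and return immediately;
    // the user's part of the interaction is complete at this point.
    public void Submit(Job job) { queue.Enqueue(job); }

    // Transaction layer: pull the next job and process it. With MSMQ,
    // the receive and the database work would run in one transaction,
    // so an aborted job is never lost.
    public bool ProcessNext(Action<Job> handler)
    {
        if (queue.Count == 0) return false;
        handler(queue.Dequeue());
        return true;
    }
}
```

The design choice to note: the user never waits on the transaction; retries happen entirely inside the transaction layer.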

Categories: Architecture | SOA | Enterprise Services | MSMQ

I feel like I have been "out of business" for a really long time and like I really got nothing done in the past 3 months, even though that's objectively not true. I guess that's "conference & travel withdrawal", because I had tons and tons of bigger events in the first half of the year and 3 smaller events since TechEd Amsterdam in July. On the upside, I am pretty relaxed and have certainly reduced my stress-related health risks ;-)

So with winter and its short days coming up, and the other half of my life living a third of the way around the planet until next spring, I can and will spend some serious time on a bunch of things:

On the new programming stuff front:
     Catch up on what has been going on in Indigo in recent months, dig deeper into "everything Whidbey", figure out the CLR aspects of SQL 2005 and familiarize myself with VS Team System.

On the existing programming stuff front:
      Consolidate my "e:\development\*" directory on my harddrive and pull together all my samples and utilities for Enterprise Services, ASP.NET Web Services and other enterprise-development technologies and create a production-quality library from them for us and our customers to use. Also, because the Indigo team has been doing quite a bit of COM/COM+ replumbing recently in order to have that programming model ride on Indigo, I have some hope that I can now file bugs/wishes against COM+ that might have a chance of being addressed. If that happens and a particular showstopper gets out of the way, I will reopen this project here and will, at the very least, release it as a toy.

On the architectural stuff front:
         Refine our SOA Workshop material, do quite a bit of additional work on the FABRIQ, evolve the Proseware architecture model, and get some pending projects done. In addition to our own SOA workshops (the next English-language workshop is held December 1-3, 2004 in Düsseldorf), there will be a series of invite-only Microsoft events on Service Orientation throughout Europe this fall/winter, and I am very happy that I will be speaking -- mostly on architecture topics -- at the Microsoft Eastern Mediterranean Developer Conference in Amman/Jordan in November and several other locations in the Middle East early next year. 

And even though I hate the effort around writing books, I am seriously considering writing a book about "Services" in the coming months. There's a lot of stuff here on the blog that should really be consolidated into a coherent story, and there are lots and lots of considerations and motivations for decisions I made for FABRIQ and Proseware and other services-related work that I should probably write down in one place. One goal of the book would be to write a pragmatic guide on how to design and build services using currently shipping (!) technologies, one that focuses on how to get stuff done and not on how to craft new, exotic SOAP headers, how to do WSDL trickery, or how to do other "cool" but not necessarily practical things. So don't expect a 1200-page monster.

In addition to the "how to" part, I would also like to incorporate and consolidate other architects' good (and bad) practical design and implementation experiences, and write about adoption accelerators and barriers, and some other aspects that are important to get the service idea past the CFO. That's a great pain point for many people thinking about services today. If you would be interested in contributing experiences (named or unnamed), I certainly would like to know about it.

And I also think about a German-to-English translation and a significant (English) update to my German-language Enterprise Services book.....

[And to preempt the question: No, I don't have a publisher for either project, yet.]

Categories: Architecture | SOA | Blog | IT Strategy | newtelligence | Other Stuff | Talks

I was a little off when I compared my problem here to a tail call. Gordon Weakliem corrected me with the term "continuation".

The fact that the post got 28 comments shows that this seems to be an interesting problem and, naming aside, it is indeed a tricky thing to implement in a framework when the programming language you use (C# in my case) doesn't support the construct. What's specifically tricky about the concrete case that I have is that I don't know where I am yielding control to at the time when I make the respective call.

I'll recap. Assume there is the following call

CustomerService cs = new CustomerService();
cs.FindCustomer(customerId);

FindCustomer is a call that will not return any result as a return value. Instead, the invoked service comes back into the caller's program at some completely different place, such as this:

public void FindCustomerReply(Customer[] result)
{
    ...
}

So what we have here is a "duplex" conversation. The result of an operation initiated by an outbound message (call) is received, some time later, through an inbound message (call), but not on the same thread and not on the same "object". You could say that this is a callback, but that's not precisely what it is, because a "callback" usually happens while the initiating call (FindCustomer above) has not yet returned to its scope, or at least while the initiating object (or an object passed by some sort of reference) is still alive. Here, instead, processing of the FindCustomer call may take a while, and the initiating thread and the initiating object may be long gone when the answer is ready.

Now, the additional issue I have is that at the time when the FindCustomer call is made, it is not known which "FindCustomerReply" message handler is going to be processing the result, and it is really not known what's happening next. The decision about what happens next and which handler is chosen depends on several factors, including the time that it takes to receive the result. If FindCustomer is called from a web page and the service providing FindCustomer drops a result at the caller's doorstep within 2-3 seconds [1], the FindCustomerReply handler can go and hijack the initial call's thread (and HTTP context) and render a page showing the result. If the reply takes longer, the web page (the caller) may lose its patience [2] and choose to continue by rendering a page that says "We are sending the result to your email account.", and the message handler will not throw HTML into an HTTP response on an open socket, but rather render it to an email and send it via SMTP and maybe even alert the user through his/her Instant Messenger when/if the result arrives.

[1] HTTP Request => FindCustomer() =?> "FindCustomerReply" => yield to CustomerList.aspx => HTTP Response
[2] HTTP Request => FindCustomer() =?> Timeout!            => yield to YouWillGetMail.aspx => HTTP Response
                               T+n =?> "FindCustomerReply" => SMTP Mail
                                                           => IM Notification

So, in case [1] I need to correlate the reply with the request and continue processing on the original thread. In case [2], the original thread continues on a "default path" without an available reply and the reply is processed on (possibly two) independent threads and using two different notification channels.
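
The [1]/[2] decision can be sketched as a small correlation helper; this is my own illustration, not infrastructure from this post. The original thread waits for the reply up to a patience limit; when the wait times out, whichever handler eventually receives the reply takes the alternate channel instead.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class ReplyCorrelator
{
    // Replies keyed by a correlation id carried in the messages.
    private readonly Dictionary<Guid, object> replies = new Dictionary<Guid, object>();
    private readonly object gate = new object();

    // Called by the reply handler when an inbound message arrives.
    public void PostReply(Guid correlationId, object reply)
    {
        lock (gate)
        {
            replies[correlationId] = reply;
            Monitor.PulseAll(gate); // wake any waiting request thread
        }
    }

    // Called by the original request thread. Returns true if the reply
    // arrived within the timeout (case [1]); false means the caller
    // should render the "you will get mail" path (case [2]).
    public bool TryWait(Guid correlationId, TimeSpan timeout, out object reply)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        lock (gate)
        {
            while (!replies.TryGetValue(correlationId, out reply))
            {
                TimeSpan remaining = deadline - DateTime.UtcNow;
                if (remaining <= TimeSpan.Zero || !Monitor.Wait(gate, remaining))
                {
                    reply = null;
                    return false; // timed out: reply will be handled elsewhere
                }
            }
            replies.Remove(correlationId);
            return true; // reply claimed by the original thread
        }
    }
}
```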

A slightly different angle. Consider a workflow application environment in a bank, where users are assigned tasks and simply fetch the next thing from the to-do list (by clicking a link in an HTML-rendered list). The reply that results from "LookupAndDoNextTask" is a message that contains the job that the user is supposed to do.  

[1] HTTP Request => LookupAndDoNextTask() =?> Job: "Call Customer" => yield to CallCustomer.aspx => HTTP Response
[2] HTTP Request => LookupAndDoNextTask() =?> Job: "Review Credit Offer" => yield to ReviewCredit.aspx => HTTP Response
[3] HTTP Request => LookupAndDoNextTask() =?> Job: "Approve Mortgage" => yield to ApproveMortgage.aspx => HTTP Response
[4] HTTP Request => LookupAndDoNextTask() =?> No Job / Timeout => yield to Solitaire.aspx => HTTP Response

In all of these cases, calls to "FindCustomer()" and "LookupAndDoNextTask()" that are made from the code that deals with the incoming request will (at least in the theoretical model) never return to their caller, and the thread will continue to execute in a different context that is "TBD" at the time of the call. By the time the call stack is unwound and the initiating call (like FindCustomer) indeed returns, the request is therefore fully processed and the caller may not perform any further actions.

So the issue at hand is to make that fact clear in the programming model. In ASP.NET, there is a single construct called "Server.Transfer()" for that sort of continuation, but it's very specific to ASP.NET and requires that the caller knows where it wants to yield control to. In the case I have here, the caller knows that it is surrendering the thread to some other handler, but it doesn't know to whom, because this is dynamically determined by the underlying frameworks. All that's visible and should be visible in the code is a "normal" method call.

cs.FindCustomer(customerId) might therefore not be a good name, because it looks "too normal". And of course I don't have the powers to invent a new statement for the C# language like continue(cs.FindCustomer(customerId)) that would result in a continuation that simply doesn't return to the call location. Since I can't do that, there has to be a different way to flag it. Sure, I could put an attribute on the method, but Intellisense wouldn't show that, would it? So it seems the best way is to have a convention of prefixing the method name.

There were a bunch of ideas in the comments for method-name prefixes. Here is a selection:

  • cs.InitiateFindCustomer(customerId)
  • cs.YieldFindCustomer(customerId)
  • cs.YieldToFindCustomer(customerId)
  • cs.InjectFindCustomer(customerId)
  • cs.PlaceRequestFindCustomer(customerId)
  • cs.PostRequestFindCustomer(customerId)

I've got most of the underlying correlation and dispatch infrastructure sitting here, but finding a good programming model for that sort of behavior is quite difficult.

[Of course, this post won't make it on Microsoft Watch, eWeek or The Register]

Categories: Architecture | SOA | Technology | ASP.NET | CLR

newtelligence AG will be hosting an open workshop on service-oriented development, covering principles, architecture ideas and implementation guidance on October 13-15 in Düsseldorf, Germany.

The workshop will be held in English, will be hosted by my partner and “Mr. Methodologies” Achim Oellers and myself, and is limited to just 15 (!) attendees to assure an interactive environment that maximizes everyone’s benefit. The cap on the number of attendees also allows us to adjust the content to individual needs to some extent.

We will cover the “services philosophy” and theoretical foundations of service-compatible transaction techniques, scalability and federation patterns, autonomy and other important aspects. And once we’ve shared our “services mind-set”, we will take the participants on a very intense “guided tour” through (a lot of) very real and production-level quality code (including the Proseware example application that newtelligence built for Microsoft Corporation) that turns the theory into practice on the Windows platform and shows that there’s no need to wait for some shiny future technology to come out in two years’ time to benefit from services today.

Regular pricing for the event is €2500.00 (plus applicable taxes) and includes:

  • 3-day workshop in English from 9:00 – 18:00 (or later depending on topic/evening) 
  •  2 nights hotel stay (Oct 13th and 14th)
  • Group dinner with the experts on the first night.  The 2nd night is at your disposal to enjoy Düsseldorf’s fabulous Altstadt at your own leisure
  • Lunch (and snacks/drinks throughout the day)
  • Printed materials (in English), as appropriate
  • Post-Workshop CD containing all presentations and materials used/shown

For registration inquiries, information about the prerequisites, as well as for group and early-bird discount options, please contact Mr. Fons Habes via training@newtelligence.com. If the event is sold out at the time of your inquiry or if you are busy on this date, we will be happy to pre-register you for one of the upcoming event dates or arrange for an event at your site.

Categories: Architecture | SOA | newtelligence

Carl invited me for .NET Rocks on Thursday night. That is July 15th, 10 PM-Midnight Eastern Standard Time (U.S.) which is FOUR A.M. UNTIL SIX A.M. Central European Time (CET) on Friday morning. I am not sure whether my brain can properly operate at that time. The most fun thing would be to go out drinking Thursday night ;-)   I want to talk about (guess what) Services. Not Indigo, not WSE, not Enterprise Services, not SOAP, not XML. Services. Mindset first, tools later.

Categories: Architecture | SOA

July 12, 2004
@ 01:07 PM

I've had several epiphanies in the last 12 months or so. I don't know how it is for other people, but the way my thinking evolves is that I've got some inexpressible "thought clouds" going around in my head for months that I can't really get on paper or talk about in any coherent way. And then, at some point, there's some catalyst and "bang", it all comes together and suddenly those clouds start raining ideas and my thinking very rapidly goes through an actual paradigm shift.

The first important epiphany occurred when Arvindra gave me a compact explanation of his very pragmatic view on Agent Technology and Queueing Networks, which booted the FABRIQ effort. Once I saw what Arvindra had done in his previous projects and I put that together with my thinking about services, a lot of things clicked. The insight that formed from there was that RPC'ish request/response interactions are very restrictive exceptions in a much larger picture where one-way messages and much more complex message flow-patterns possibly involving an arbitrary number of parties are the norm.

The second struck me while on stage in Amsterdam during the "The Nerd, The Suit, and the Fortune Teller" play, as Pat and I were discussing Service Oriented User Interaction. (You need to understand that we had very limited time for preparation and hence we had a good outline, but the rest of the script essentially said "go with the flow", so most of it was pure improvisation theater). The insight that formed can (with all due respect) be shortened to "the user is just another service". Not only shall users drive the interaction by issuing messages (commands) to a system from which they expect one or more out of a set of possible replies; there should also be a way for systems to drive an interaction by issuing messages to users, expecting one or more out of a set of possible replies. There is no good reason why either of these two directions of driving the interaction should receive preferred treatment. There is no client and there is no server. There are just roles in interactions. At that moment, the 3-layer/3-tier model of building applications died a quick and painless death in my head. I think I have a new one, but the clouds are still raining ideas. Too early for details. Come back and ask in a few months.

Categories: Architecture | SOA

July 9, 2004
@ 09:10 AM

Jimmy Nilsson is really good at spotting flamebait.

Categories: Architecture

July 8, 2004
@ 12:48 PM

Do I do this because I want to or do I do this because I need to?

Categories: Architecture

In my comment view for the last post (comment #1), Piyush Pant writes about the confusion around different pipeline models and frameworks that are popping up all over the place and mentions Proseware, so I need to clarify some things:

I'll address the "too many frameworks" concern first: Proseware's explicit design goal and my job was to use the technologies ASP.NET Web Services, WSE 2.0, IIS, MSMQ, and Enterprise Services as purely as possible, and I intentionally did not introduce yet another framework for the runtime bits beyond a few utility classes used by the services as a common infrastructure (like a config-driven web service proxy factory, the queue listener, or the just-in-time activation proxy pooling). What my job was and what I reasonably succeeded at was to show that:

Writing Service Oriented Applications on today's Windows Server 2003 platform does not require yet another framework.

The framework'ish pieces that I had to add are simply addressing some deployment issues like creating accounts, setting ACLs or setting up databases that need to be done in a "real" app that isn't a toy. Such things are sometimes difficult to abstract on the level of what the .NET Framework can offer as a general-purpose platform or are simply not there yet. All of these extra classes reside in an isolated assembly that's only used by the installers.

The total number of utility classes that play a role of any importance at runtime is 5 (in words five) and none of them has more than three screen pages worth of actual code. Let me repeat:

Writing Service Oriented Applications on today's Windows Server 2003 platform does not require yet another framework.

I do have a dormant (newtelligence-owned) code branch sitting here that'd make a lot of things in Proseware easier and more elegant to develop and makes reconfiguring services more convenient, but it's a developer convenience and productivity framework. No pipelines, no other architecture, just a prettier shell around the exact Proseware architecture and technologies I chose.

To illustrate my point about the fact that we don't need another entirely new framework, I have here (MessageQueueWebRequest.cs.txt, MessageQueueWebResponse.cs.txt) an early 0.1 prototype copy of our MessageQueueWebRequest/-WebResponse class pair that supports sending WS messages through MSMQ. (That prototype only does very simple one-way messages; you can do a lot more with MSMQ).  

Take the code, put it in yours, create a private queue, take an arbitrary ASMX WebService proxy, call MessageQueueWebRequest.RegisterMSMQProtocol() when your app starts, instantiate the proxy, set the Url property of the proxy to msmq://mymachine/private$/myqueue, invoke the proxy and watch how a SOAP message materializes in the queue.
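
The msmq:// trick rides on .NET's pluggable protocol mechanism. The sketch below shows only that mechanism with a stub request class; MessageQueueWebRequest itself is this post's code, not a standard type, and FakeMsmqRequest is my own stand-in for illustration.

```csharp
using System;
using System.Net;

// Stub standing in for MessageQueueWebRequest, just to show how a
// custom URI scheme gets hooked into WebRequest.Create().
public class FakeMsmqRequest : WebRequest
{
    private readonly Uri uri;
    public FakeMsmqRequest(Uri uri) { this.uri = uri; }
    public override Uri RequestUri { get { return uri; } }
}

// The creator that WebRequest consults for the registered scheme; a
// real implementation would open the queue behind the URI here.
public class MsmqRequestCreator : IWebRequestCreate
{
    public WebRequest Create(Uri uri)
    {
        return new FakeMsmqRequest(uri);
    }
}
```

After WebRequest.RegisterPrefix("msmq", new MsmqRequestCreator()), any WebRequest.Create("msmq://...") call, including the one an ASMX proxy makes internally when you set its Url property, is routed to the registered creator.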

Next step: use a WSE proxy. Works too. I'll leave the receiver logic to your imagination, but that's not really much more than listening to the queue and throwing the message into a WSE 2.0 SoapMethod or throwing it as a raw HTTP request at an ASMX WebMethod or by using a SimpleWorkerRequest on a self-hosted ASP.NET AppDomain (just like WebMatrix's Cassini hosts that stuff).


On to "pipelines" in the same context: Pipelines are a very common design pattern and you can find hundreds of variations of them in many projects (likely dozens from MS) which all have some sort of a notion of a pipeline. It's just "pipeline", not Pipeline(tm) 2003 SP1.

User-extensible pipeline models are a nice idea, but I don't think they are very useful to have or consider for most services of the type that Proseware has (and that covers a lot of types).

Frankly, most things that are done with pipelines in generalized architectures that wrap around endpoints (in/out crosscutting pipelines) and that are not about "logging" (which is, IMHO, more useful if done explicitly and in-context) are already in the existing technology stack (Enterprise Services, WSE) or are really jobs for other services.

There is no need to invent another pipeline to process custom headers in ASMX, if you have SoapExtensions. There is no need to invent a new pipeline model to do WS-Security, if you can plug the WSE 2.0 pipeline into the ASMX SoapExtension pipeline already. There is no need to invent a new pipeline model to push a new transaction context on the stack, if you can hook the COM+ context pipeline into your call chain by using ES. There is no need to invent another pipeline for authorization, if you can hook arbitrary custom stuff into the ASP.NET Http Pipeline or the WSE 2.0 pipeline already has or simply use what the ES context pipeline gives you.

I just enumerated four (!) different pipeline models, and all of them are in the bits you already have on a shipping platform today, and as it happens, all of them compose really well with each other. The fact that I am writing this might show that most of us just use and configure these services without even thinking of them as a composite pipeline model.

"We don't need another Pipeline" (I want Tina Turner to sing that for me).

Of course there's other pipeline jobs, right? Mapping!

Well, mapping between schemas is something that goes against the notion of a well-defined contract of a service. Either you have a well-defined contract (or two or three), or you don't. If you have a well-defined contract and there's a sender that doesn't adhere to it, it's the job of another service to provide that sort of data negotiation, because that's a business-logic task in and of itself.

Umm ... ah! Validation!

That might be true if schema validation is enough, but validation of data is a business-logic-level task if things get more complex (like if you need to check a PO against your catalog and need to check whether that customer is actually entitled to a certain discount bracket). That's not a cross-cutting concern. That's a core job of the app.

Pipelines are for plumbers


Now, before I confuse everyone (and because Piyush mentioned it explicitly):

FABRIQ is a wholly different ballgame, because it is precisely a specialized architecture for dynamically distributable, queued (pull-model), one-way pipeline message processing and that does require a bit of a framework, because the platform doesn't readily support it.

We don't really have a notion of an endpoint in FABRIQ that is the default terminal for any message arriving at a node. We just let stuff asynchronously flow in one direction and across machines and handlers can choose to look at, modify, absorb or yield resultant messages into the pipeline as a result of what they do. In that model, the pipeline is the application. Very different story, very different sets of requirements, very different optimization potential and not really about services in the first place (although we stick to the tenets), but rather about distributing work dynamically and about doing so as fast as we can make it go.

Sorry, Piyush! All of that totally wasn't going against your valued comments, but you threw a lit match into a very dry haystack.


Categories: Architecture | SOA

Benjamin Mitchell wrote a better summary of my "Building Proseware Inc." session at TechEd Amsterdam than I ever could.

Because ... whenever the lights go on and the mike is open, I somehow automatically switch into an adrenalin-powered auto-pilot mode that luckily works really well and since my sessions take up so much energy and "focus on the moment", I often just don't remember all the things I said once the session is over and I am cooled down. That also explains why I almost never rehearse sessions (meaning: I never ever speak to the slides until I face an audience) except when I have to coordinate with other speakers. Yet, even though most of my sessions are really ad-hoc performances, whenever I repeat a session I usually remember whatever I said last time just at the very moment when the respective topic comes up, so there's an element of routine. It is really strange how that works. That's also why I am really a bad advisor on how to do sessions the right way, because that is a very risky approach. I just write slides that provide me with a list of topics and "illustration helpers" and whatever I say just "happens". 

About Proseware: All the written comments that people submitted after the session have been collected and are being read, and it's very well understood that you want to get your hands on the bits as soon as possible. One of my big takeaways from the project is that if you're Microsoft, releasing stuff that gives "how-to" guidance is (for more reasons than you can imagine) quite a bit more complicated than just putting bits up on a download site. It's being worked on. In the meantime, I'll blog a bit about the patterns I used whenever I can allocate a timeslice.

Categories: Architecture | SOA | TechEd Europe

Simple question: Please show me a case where inheritance and/or full data encapsulation makes sense for business/domain objects on the implementation level. 

I'll steal the low-hanging fruit: Address. Address is a great candidate when you look at an OOA model as you could model yourself to death having BaseAddress(BA) and BA<-StreetAddress(SA) and BA<-PostalAddress(PA) and SA<-US_StreetAddress and SA<-DE_StreetAddress and SA<-UK_StreetAddress and so forth.

When it comes to implementation, you'll end up refactoring the class into one thing: Address. There's probably an AddressType attribute and a Country field that indicates the formatting, and since implementing a full address validation component is way too much work, that feature gets cut anyway. Hence we end up with a multiline text field containing the properly formatted address, while fields like Street and PostOfficeBox (eventually normalized to AddressField), City, PostalCode, Country, and Region are kept separate really just to make searching easier and faster. The stuff that goes onto the letter envelope is really only the preformatted address text.

Maybe I am too much of a data (read: XML, messages, SQL) guy by now, but I just lost faith that objects are any good on the "business logic" abstraction level. The whole inheritance story is usually refactored away for very pragmatic reasons, and the encapsulation story isn't all that useful either. You simply can't pragmatically regard validation of data at the property get/set level as a useful general design pattern, because a type like Address is one type with interdependencies between its elements and not simply a container of values. The rules for Region depend on Country, and the rules for AddressField (or Street/PostOfficeBox) depend on AddressType. Since the object can't know what data you intend to supply to it at the property get/set level, it can't do meaningful validation on that level. Hence, you end up calling something like address.Validate(), and from there it's really a small step to separate code and data into a message and a service that deals with it, and call Validate(address). And that sort of service is the best way to support polymorphism over a scoped set of "classes", because it can potentially support "any" address schema and can yet concentrate and share all the validation logic (which is largely the same across whatever format you might choose) in a single place, instead of spreading it across an array of specialized classes that's much, much harder to maintain.
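
The cross-field dependency argument can be shown in a few lines. This is an illustrative sketch in Python (the field names and rules are hypothetical, loosely following the Address fields mentioned above): a Validate(address) function sees the whole message at once, which no per-property setter ever can:

```python
# Illustrative sketch: validation over the whole address "message",
# not per-property. Field names and rules are made up for illustration.

def validate_address(address):
    errors = []
    # The rule for Region depends on Country:
    if address.get("Country") == "US" and not address.get("Region"):
        errors.append("US addresses require a Region (state).")
    # The rule for the address line depends on AddressType:
    if address.get("AddressType") == "PostalAddress" and not address.get("PostOfficeBox"):
        errors.append("Postal addresses require a PostOfficeBox.")
    return errors

# A property setter for Region alone can't evaluate either rule, because
# it can't know whether Country or AddressType has been supplied yet.
assert validate_address({"Country": "US", "Region": "WA"}) == []
assert validate_address({"Country": "US"}) != []
```

Because the function takes the data as one message, it is trivially shared across any number of address formats, which is exactly the polymorphism-by-service point made above.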

What you end up with are elements and attributes (infoset) for the data that flows across, services that deal with the data that flows, and rows and columns that efficiently store data and let you retrieve it flexibly and quickly. Objects have lost their place in that picture for me (except on the abstract, conceptual analysis level, where they are useful for understanding a problem space).

While objects are fantastic for frameworks, I've absolutely unlearned why I would ever want them on the business logic level in practice. Reeducate me.

Categories: Architecture | SOA

We've built FABRIQ, we've built Proseware. We have written seminar series about Web Services Best Practices and Service Orientation for Microsoft Europe. I speak about services and aspects of services at conferences around the world. And at all events where I talk about Services, I keep hearing the same question: "Enough of the theory, how do I do it?"

Therefore we have announced a seminar/workshop around designing and building service-oriented systems that puts together all the things we've found out in the past years about how services can be built today, on today's Microsoft technology stack, and how your systems can be designed with migration to the next-generation Microsoft technology stack in mind. Together with our newtelligence Associates, we are offering this workshop for in-house delivery at client sites worldwide and are planning to announce dates and locations for central, "open for all" events soon.

If you are interested in inviting us for an event at your site, contact Bart DePetrillo, or write to sales@newtelligence.com. If you are interested in participating at a central seminar, Bart would like to hear about it (no obligations) so that we can select reasonable location(s) and date(s) that fit your needs.

Categories: Architecture | SOA | FABRIQ | Indigo | Web Services

July 4, 2004
@ 05:50 PM

Monday I'll start earnestly working on this year's "summer project". Last year's project yielded what you today know as dasBlog. This year's prototyping project will have to do with running aggregated RSS content through FABRIQ networks for analysis and enrichment, solidifying the newtelligence SOA framework (something you don't even know about yet and it's not Proseware) and architecting/building a fairly large-scale system for dynamically managing user-centric media for targeted and secure distribution to arbitrary presentation surfaces. (Yes, I know that's a foggy explanation). Will the result be free stuff? No. Not this time. Will you hear about what we learn on the road to Bumblebee? Absolutely.

Categories: Architecture | Other Stuff

June 30, 2004
@ 08:12 AM

I had a nice dinner last night with Don Box, Ingo Rammer, Christian Weyer, Christian Nagel, Benjamin Mitchell, Matt Tavis, and Juval Löwy. Don arrived a bit later and started by saying "look, I wanted to have dinner with all of you because we've decided to make some changes to Indigo and by now we've decided to simply bake WSE into Longhorn". Silence. Laughter. No, we didn't buy it, and Don couldn't manage to keep the story up for more than two sentences. Discussions were lively and went from the Windows Kernel to some high-level architecture topics, and one of the interesting takeaways was that Don elaborated a bit on the "Business Agents" idea he'd been talking about briefly in his CTS200 session. There's apparently a related project Boa (another serpent name along the family line of Viper, which was the original codename for MTS), including the business markup language BML (pronounced "Bimmel") that he's involved in, and he talked a bit about that, but of course I'd be killed if I gave out more details.

Hello Microsoft Watch readers and El Registraderos: go here for an update. 

Update #2: The nice thing about blogs is that I can update this entry all I want. Understand: This *is* a joke. Read the first few sentences. See, there's a little "April Fools" story right there. If that isn't enough, the codename "Viper" is a bit ancient right... like 1995/1996? Still no good? How stupid is Bimmel? And would I really break an NDA here?  If the press turns this into a story without asking whether there is any substance .... well....   This is my weblog. I don't work for Microsoft. Thank you for your understanding.

Update #3: If you are less interested in blown-out-of-proportion we-have-nothing-better-to-report summer-time "news" and more interested in the big thing before the next big thing (and -who knows- that next big thing might really be "Business Agents", after all) and are therefore contemplating whether Services and Service Oriented Architectures make sense to you, go here.

Categories: Architecture | TechEd Europe

June 25, 2004
@ 07:57 PM

Finally, finally, finally. It was a looong wait. Like many others, we were in a wait loop for WSE 2.0 for a long time, and that meant we got to do what we do today much, much later than we initially anticipated. So after being able to test on and adjust for the WSE 2.0 RTM bits for the last four weeks, we're now happy enough with our "1.0" that we're ready to share it:

Microsoft EMEA and newtelligence AG present: The FABRIQ. (http://workspaces.gotdotnet.com/fabriq)
(When you go there, make sure you get both the bits and the Hands-on Labs; you will need them).

Also, a few things to keep in mind before you go and get the bits:

  • This is a proof-of-concept project collaboratively created by Microsoft EMEA and newtelligence AG. We have tested intensively for quite a few sets of use-cases, but this is not a product. We are giving this to you because we think it's very useful architecturally, and most implementation aspects aren't too bad either, and we do expect you to play with it. We don't give it to you so that you install it in a production environment tomorrow or even the day after.
  • The support policy for FABRIQ is very simple: There is none. If you download this, you are absolutely and entirely on your own, legally speaking. We are keen to hear your feedback and are curious whether and for what you find this useful, but this is no product and therefore there's no support whatsoever. (If you find this so useful that you want customization, support, or need help to get this from near-production quality to production-quality, sales@newtelligence.com is a great place to write e-mail to)
  • This is "work in progress" and you are getting a version that is not considered finished. You will find artifacts in the code that are not (anymore or yet) used. You will find code branches that we not (anymore or yet) hit.  There are a few places, where we cut some corners in terms of implementation efficiency in order to get this out early. You will find that there is a bit of a disconnect between the specification documents that we have in the package vs. the documentation that you'll find and we could have done a better job cleaning it all up. We love this ourselves and will continue to polish it.
  • You need WSE 2.0 and the Enterprise Instrumentation Framework to play.
  • Contributions: We give you the code and you can use it and change it. For the first version and the next minor drops, we'll not have a public code repository that people can check things into immediately, because the beast turned out to be so complex that we need to stay in control for a little while. If we allowed "random" community contributions early, people who don't live inside the codebase could too easily break seemingly unrelated stuff. Therefore: If you want to change or add stuff, wrap up your changes along with a good reason why that's needed and send it here.
  • Discussions: Write what you like or hate or what you don't understand into the forums in the workspace or just blog about it and refer to this entry or relevant entries on my blog or Arvindra's blog once he's fully set up. We'll accept everybody into the workspace; just apply and you'll be granted access as soon as someone sees it.

Credit where credit is due: Very many thanks to the development team in Argentina, with Eugenio Pace, Adrian Nigro, Federico Winkel, and Juan Carlos Elichirigoity, who have worked very very hard turning my "written in two weeks in a hurry" prototype code into something that's actually useful.

Categories: Architecture | TechEd Europe | FABRIQ

June 24, 2004
@ 01:06 PM
In this post, I describe the FABRIQ concepts of "networks" and "nodes":
Categories: Architecture | TechEd Europe | FABRIQ

The most fundamental element in FABRIQ is a message handler and handlers are organized in pipelines to process messages. I explain the relationship here.
Categories: Architecture | TechEd Europe | FABRIQ

June 22, 2004
@ 10:59 AM

We have one regular session:

  • Architecture Overview Session (ARC405) with Arvindra Sehmi and myself: Wed, Jun 30 12:00 - 13:15 Room: 9b

along with a Hands-On-Lab and a Chalk-Talk 

  • Internals Chalk Talk (CHT019) with Arvindra Sehmi (I will try to make it there. Thursday is very busy): Thu, Jul 1 10:15 - 11:30 Room: U
  • Hands-On Lab (ARC-IL01) with newtelligence's Achim Oellers and Jörg Freiberger: Tue-Thu throughout the day, Room: O
Categories: Architecture | TechEd Europe | FABRIQ

June 22, 2004
@ 10:28 AM
For the impatient, this post shows two config snippets.
Categories: Architecture | TechEd Europe | FABRIQ

June 22, 2004
@ 09:53 AM

Before I can get into explaining how the FABRIQ works and how to configure it, I need to explain a bit of the terminology we use:

  • A network is the FABRIQ term that's roughly equivalent to an "application". A network consists of an arbitrary number of network-distributed nodes that run inside the scope of the network. The network creates a common namespace for all of these nodes. Networks are configured using a single XML configuration document that is submitted (or routed via another network) to all hosts that shall host the network's nodes.
  • A node is the FABRIQ term that is roughly equivalent to a "service" or "component". A node is the smallest addressable unit. Every node has a "relative node URI" that is composed of the network name and the node's own name into {network/node}. This relative node URI can be composed with absolute, transport-dependent URIs such as http://server/vdir/network/node or msmq://machine/queuename/network/node. Within a network, the runtime is also capable of resolving logical addresses of the form fabriq://network/node and automatically mapping them to physical addresses. At runtime, a node accepts messages and dispatches them into one of one or more action pipelines. Each node may be guarded by a set of WS-Policy assertions, including Kerberos and X.509 certificate authentication and authorization claims. A node may be hosted on a dedicated machine, on a well-defined set of machines, or on "any" machine within a cluster.
  • An action pipeline is a pipeline that is associated with an action identifier and is roughly equivalent to a "method". An action identifier is a URI as per WS-Addressing's definition of wsa:Action and is mapped to SOAPAction: whenever we go through HTTP. A node must host at least one action pipeline, with no limit on the number of action pipelines it can support. An action may declare a set of message schema-types that it understands, and those message definitions may be used for validating inbound messages. An action has one or more outbound message routes that are matched against the result message's action or destination. Multiple routes may match a message, which causes the message flow to fork. For each route there exist one or more prioritized routing destinations. If multiple destinations have the same priority, the engine will balance calls across those; otherwise the engine will use the ones with lower priority as backup routes. At the end of each action pipeline is a sender port that sends resulting messages out to their destinations, which may be other FABRIQ nodes or any other external endpoint that understands the respective one-way message being sent.
  • A pipeline is a composition of a sequence of handlers or nested pipelines. Pipelines can be nested to arbitrary depth. Pipelines are strictly unidirectional message processors that have no concept of a "response" on the same thread analogous to a return value (hence all actions are one-way only). A pipeline may or may not be based on a predefinable pipeline-type. Pipeline-types allow the definition of reusable pipelines that can be reused within the same network or (via import) in multiple networks.
  • A handler refers to a software component (a CLR class) implementing a set of interfaces that allow it to be composed into and hosted in a pipeline. Handlers should be designed to perform only very primitive operations that can then be composed into pipelines to implement specific functionality. Built-in handlers include a content-based routing handler and an XSLT transformation handler. Custom handlers may contain any type of logic. A handler receives messages and may consume them, evaluate and annotate them and yield any number of resulting messages. The definition of a handler embeds an XML fragment that allows the handler to configure itself. The actual reference to the CLR class implementing the handler is defined in a handler-type.
  • A handler-type associates a CLR class with a name that can be used to define handlers within a configuration file. It also allows the declaration of a code-base URL for the CLR class. This feature allows the installation of "virgin" FABRIQ runtimes in a cluster, with the runtimes auto-downloading all the required code for hosting a node from a central code store, and therefore dramatically eases deployment and dynamic reconfiguration of a FABRIQ cluster.
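
One piece of the terminology above that benefits from a worked example is the routing rule on action pipelines: destinations with the same priority are load-balanced, lower-priority destinations serve as backups. Here is an illustrative sketch in Python (FABRIQ itself is .NET and XML-configured; all names here are invented for illustration):

```python
import itertools

# Illustrative sketch: pick a routing destination from a prioritized list.
# Destinations sharing the best available priority are balanced round-robin;
# lower-priority entries are only used when all better ones are unavailable.

def pick_destination(destinations, counter, available):
    # destinations: list of (priority, uri); lower number = higher priority
    for priority, group in itertools.groupby(
            sorted(destinations), key=lambda d: d[0]):
        uris = [uri for _, uri in group if uri in available]
        if uris:
            # round-robin within the equal-priority group
            return uris[next(counter) % len(uris)]
    return None  # no destination reachable

dests = [(1, "fabriq://net/a"), (1, "fabriq://net/b"), (2, "fabriq://net/backup")]
counter = itertools.count()

# Both priority-1 nodes up: calls alternate between them.
first = pick_destination(dests, counter, {"fabriq://net/a", "fabriq://net/b"})
second = pick_destination(dests, counter, {"fabriq://net/a", "fabriq://net/b"})
# Primaries down: the priority-2 backup route is chosen.
backup = pick_destination(dests, counter, {"fabriq://net/backup"})
```

Forking simply means running this selection once per matching route, so a single result message can fan out to several destinations.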

In the next couple of postings I will map these terms to concrete config files.

The interesting bit about config is that FABRIQ's configuration mechanism uses the FABRIQ itself. FABRIQ has a predefined (extensible, configurable) network "fabriq" with a node "configuration" that currently defines a single action "configure". The pipeline for that action consists of a single handler (the FabriqConfigurationHandler), which expects and accepts the configuration files I'll describe over the next days as the body of a message. With that, the configuration mechanism can be secured with policy, or can be embedded into a larger network that does preprocessing or even performs automatic assembly of configuration, or that automatically distributes configuration from a single point across a large cluster of machines.

To be continued ...

Categories: Architecture | TechEd Europe | FABRIQ

June 22, 2004
@ 07:34 AM

Achim and I are currently in a series of very quick rev-cycles for the first public release of the Microsoft/newtelligence FABRIQ project that we did with and for Microsoft EMEA HQ and that was conceived, driven, and brilliantly managed by my architect colleague Arvindra Sehmi, who gave me the lead architect role for this project.

[Reminder/Disclaimer: this is not a product, but rather a pretty elaborate "how-to" architecture example that comes with an implementation. Hence it's not a supported Microsoft or newtelligence "framework" or an attempt at some general, definitive guidance on how to write services. FABRIQ is an optimized architecture for fast, one-way message processing within network-distributed nodes consisting of sequences of dynamically composed, primitive processing steps. This isn't even trying to get anywhere near the guidance aspirations of Shadowfax, let alone all the guidance we're getting from the Indigo team or even the parallel work I've been doing for MS by building Proseware.]

We've settled on build 1.0.4173 (yesterday) as the TechEd version, but we still found a last-minute issue where we weren't using WSE 2.0 correctly (not setting the SoapEnvelope.Context.Destination property for use with a bare WSE2 Pipeline in the presence of policy), and when I reassembled the distribution I didn't reset an option that I use for debugging on my machine, which caused installation hiccups over at Achim's machine. Achim commented on the hour-long bug hunt with "Ah, you gotta love software!".

There will be hands-on labs at TechEd Europe led by Achim and Jörg that let you play with what we (very much including our friends at Microsoft Argentina and Microsoft EMEA) have built. And even if you don't have a proper use for a one-way queuing network architecture, it actually turned into a fun thing to play with. 

I'll be starting to explain aspects of the spec over the upcoming days and will explain how the architecture works, how you configure it and what its potential uses are. Already posted is some relevant information about the great idea of an XmlReader-based message design (which I designed inspired by the Indigo PDC build) and our use of lightweight transactions.

I am in the boot phase for the next software project right now (proprietary work) and I have identified very many good uses for the FABRIQ model in there already (hint).

Once all parties involved are giving their "thumbs up", we'll also make the source code drop and the binaries available to the public (you) and from there we're looking forward to your input (and contributions?).

Categories: Architecture | TechEd Europe | Technology | FABRIQ

I am back home from San Diego now. About 3 more hours of jet-lag to work on. This will be a very busy two weeks until I make a little excursion to the Pakistan Developer Conference in Karachi and then have another week to do the final preparations for TechEd Europe.

One of the three really cool talks I'll do at TechEd Europe is called "Building Proseware" and explains the scenario, architecture, and core implementation techniques of Proseware, an industrial-strength, robust, service-oriented example application that newtelligence has designed and implemented for Microsoft over the past two months.

The second talk is one that I have been looking forward to for a long time: Rafal Lukawiecki and myself are going to co-present a session. And if that weren't enough: The moderator of our little on-stage banter about services is nobody else than Pat Helland.

And lastly, I'll likely sign-off on the first public version of the FABRIQ later this week (we had been waiting for WSE 2.0 to come out), which means that Arvindra Sehmi and myself can not only repeat our FABRIQ talk in Amsterdam but have shipping bits to show this time. There will even be a hands-on lab on FABRIQ led by newtelligence instructors Achim Oellers and Jörg Freiberger. The plan is to launch the bits before the show, so watch this space for "when and where".

Overall, and as much as I like meeting all my friends in the U.S. and appreciate the efforts of the TechEd team over there, I think that for the last 4 years TechEd Europe consistently has been and will be again the better of the two TechEd events from a developer perspective. In Europe, we have TechEd and IT Forum, whereby TechEd is more developer focused and IT Forum is for the operations side of the house. Hence, TechEd Europe can go and does go a lot deeper into developer topics than TechEd US.

There's a lot of work ahead so don't be surprised if the blog falls silent again until I unleash the information avalanche on Proseware and FABRIQ.

Categories: Architecture | SOA | Talks | TechEd Europe | TechEd US | FABRIQ

May 19, 2004
@ 07:56 AM

The four fundamental transaction principles are nicely grouped into the acronym "ACID" that's simple to remember, and so I was looking for something that's doing the same for the SOA tenets and that sort of represents what the service idea has done to the distributed platform wars:

  • Policy-Based Behavior Negotiation
  • Explicitness of Boundaries
  • Autonomy
  • Contract
Categories: Architecture | SOA

May 16, 2004
@ 11:58 AM

Ralf Westphal responded to this, and there are really just two sentences that I’d like to pick out of Ralf’s response, because they allow me to go quite a bit deeper into the data services idea and might help to further clarify what I understand as a service-oriented approach to data and resource management. Ralf says: “There is no necessity to put data access into a service and deploy it pretty far away from its clients. Sometimes it might make sense, sometimes it doesn’t.”

I like patterns that eliminate that sort of doubt and which allow one to say “data services always make sense”.

Co-locating data acquisition and storage with business rules inside a service makes absolute sense if all accessed data can be assumed to be co-located in the same store and has similar characteristics with regard to the timely accuracy the data must have. In all other cases, it’s very beneficial to move data access into a separate, autonomous data service, and as I’ll explain here, the design can be made so flexible that the data service consumer won’t even notice radical architectural changes to how data is stored. I will show three quite large scenarios to help illustrate what I mean: a federated warehouse system, a partitioned customer data storage system, and a master/replica catalog system.

The central question that I want to answer is: Why would you want to delegate data acquisition and storage to dedicated services? The short answer is: Because data doesn’t always live in a single place and not all data is alike.

Here’s the long answer:

The Warehouse

The Warehouse Inventory Service (WIS) holds data about all the goods/items that are stored in the warehouse. It’s a data service in the sense that it manages the records (quantity in stock, reorder levels, items on back order) for the individual goods and performs some simplistic accounting-like work to allocate pools of items to orders, but it doesn’t really contain any sophisticated business rules. The services implementing the supply order process and the order fulfillment process for customer orders implement such business rules – the warehouse service just keeps data records.

The public interface [“Explicit Boundary” SOA tenet] for this service is governed by one (or a set of) WSDL portType(s), which define(s) a set of actions and message exchanges that the service implements and understands [“Shared Contract” SOA tenet]. Complementary is a deployment-dependent policy definition for the service, which describes several assertions about the Security and QoS requirements the service makes [“Policy” SOA tenet].

The WIS controls its own, isolated store over which it has exclusive control and the only way that others can get at the content of that data store is through actions available on the public interface of the service [“Autonomy” SOA tenet].

Now let’s say the company running the system is a bit bigger, has a central website (of which replicas might be hosted in several locations), and has multiple warehouses from which items can be delivered. So now we are putting a total of four instances of WIS into our data centers at the warehouses in New Jersey, Houston, Atlanta, and Seattle. The services need to live there, because only the people on site can effectively manage the “shelf/database relationship”. So how does that impact the order fulfillment system that used to talk to the “simple” WIS? It doesn’t, because we can build a dispatcher service implementing the very same portType that accepts order information, looks at the order’s shipping address, and routes the allocation requests to the warehouse closest to the shipping destination. In fact, the formerly “dumb” WIS can now be outfitted with some more sophisticated rules that allow it to split or shift the allocation of items to orders across or between warehouses, to limit freight cost or to ensure the earliest possible delivery in case the preferred warehouse is out of stock for a certain item. Still, from the perspective of the service consumer, the WIS implementation is just a data service. All that additional complexity is hidden in the underlying “service tree”.
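
The dispatcher idea is simple enough to sketch. This is an illustrative Python stand-in (no real WIS code exists in this form; warehouse states and the routing rule are invented for the example): the dispatcher exposes the same "contract" as a plain WIS, but routes each allocation to the warehouse matching the shipping destination:

```python
# Illustrative sketch: a dispatcher implementing the same contract as the
# WIS it fronts. Warehouse locations and the routing rule are made up.

WAREHOUSES = {
    "New Jersey": "NJ", "Houston": "TX", "Atlanta": "GA", "Seattle": "WA",
}

def allocate_at(warehouse, order):
    # stands in for calling the real WIS instance at that site
    return f"allocated {order['item']} at {warehouse}"

def dispatch_allocation(order):
    # Same signature as a plain WIS allocation action: the consumer
    # cannot tell whether it talks to one warehouse or a dispatcher.
    for warehouse, state in WAREHOUSES.items():
        if order["ship_to_state"] == state:
            return allocate_at(warehouse, order)
    # hypothetical fallback when no warehouse matches the destination
    return allocate_at("Atlanta", order)
```

A real routing rule would use distance to the shipping address rather than exact state matches, but the structural point is the same: the consumer-facing contract never changes.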

While all the services implement the very same portType, their service policies may differ significantly. Authentication may require certificates for one warehouse and some other token for another warehouse. The connection to some warehouses might run through a typically rock-solid, reliable, direct leased line, while another is reached through a less-than-optimal Internet tunnel, which impacts the application-level demand for reliable messaging assurances. All these aspects are deployment-specific and hence are made an external, deployment-time choice. That’s why WS-Policy exists.

The Customer Data Storage

This scenario for the Customer Data Storage Service (CDS) starts as simple as the Warehouse Inventory scenario and with a single service. The design principles are the same.

Now let’s assume we’re running a quite sophisticated e-commerce site where customers can customize quite a few aspects of the site, can store and reload shopping carts, make personal annotations on items, and can review their own order history. Let’s also assume that we’re pretty aggressively tracking what they look at, what their search keywords are, and also what items they put into any shopping cart, so that we can show them a very personalized selection of goods that precisely matches their interest profile. Let’s say that, all in all, we need about 2 MB of storage for the cumulative profile/tracking data of each customer. And we happen to have 2 million customers. Even in the gigabyte age, ~4 million MB (4 TB) is quite a bit of data payload to manage in a read/write database that should be reasonably responsive.

So, the solution is to partition the customer data across an array of smaller (cheaper!) machines that each holds a bucket of customer records. With that we’re also eliminating the co-location assumption.

As in the warehouse case, we are putting a dispatcher service implementing the very same CDS portType on top of the partitioned data service array, and therefore hide the storage strategy re-architecture from the service consumers entirely. With this application-level partitioning strategy (and a set of auxiliary services to manage partitions that I am not discussing here), we could scale this up to 2 billion customers and still have an appropriate architecture. Mind that we can have any number of dispatcher instances as long as they implement the same rules for how to route requests to partitions. Strategies for this are a direct partition reference in the customer identifier or a locator service sitting on a customer/machine lookup dictionary.
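
Both routing strategies fit in a few lines. The sketch below is purely illustrative (the identifier format and partition names are hypothetical, not from any real CDS): strategy (a) embeds the partition reference directly in the customer identifier, strategy (b) consults a locator dictionary maintained elsewhere:

```python
# Illustrative sketch of the two partition-routing strategies mentioned
# above. Identifier formats and partition names are invented.

def route_by_embedded_partition(customer_id):
    # Strategy (a): e.g. "03-1223344" — the prefix names the partition
    # directly, so any dispatcher instance routes identically with no lookup.
    partition, _ = customer_id.split("-", 1)
    return f"cds-{partition}"

# Strategy (b): a locator dictionary, maintained by the auxiliary
# partition-management services that the text leaves out.
LOCATOR = {"C1223344": "cds-02"}

def route_by_locator(customer_id):
    return LOCATOR[customer_id]
```

Strategy (a) is cheaper per request but bakes the partition layout into identifiers; strategy (b) allows moving customers between partitions at the cost of a lookup.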

Now you might say “my database engine does this for me”. Yes, so-called “shared-nothing” clustering techniques have existed at the database level for a while now, but the following addition to the scenario mandates putting more logic into the dispatching and allocation service than – for instance – SQL Server’s “distributed partitioned views” are ready to deal with.

What I am adding to the picture is the European Union’s Data Privacy Directive. Very much simplified: under the EU directives and regulations, it is illegal to permanently store personal data of EU citizens outside EU territory, unless the storage operator and the legislation governing the operator comply with the respective “Safe Harbor” regulations spelled out in these EU rules.

So let’s say we’re a tiny little bit evil and want to treat EU data according to EU rules, but be more “relaxed” about data privacy for the rest of the world. Hence, we permanently store all EU customer data in a data center near Dublin, Ireland and the data for the rest of the world in a data center in Dallas, TX (not making any implications here).

In that case, we’re adding yet another service on top of the unaltered partitioning architecture that implements the same CDS contract and internally implements the appropriate data routing and service access rules. Those rules will most likely be based on some location code embedded in the customer identifier (“E1223344” vs. “U1223344”). Based on these rules, requests are dispatched to the right data center. To improve performance and avoid having data travel along the complete path repeatedly or in small chunks during an interactive session with the customer (customer is logged into the web site), the dispatcher service might choose to keep a temporary, non-permanent cache of customer data that is filled with a single request and allows quicker repeat access to customer data. Changes to the customer’s data that result from the interactive session can later be replicated out to the remote permanent storage.

Again, the service consumer doesn’t really need to know about these massive architectural changes in the underlying data services tree. It only talks to a service that understands a well-known contract.

The Catalog System

Same picture to boot with and the same rules: here we have a simple service fronting a catalog database. If you have millions of catalog items with customer reviews, pictures, audio and/or video clips, you might choose to partition this just like we did with the customer data.

If you have different catalogs depending on the markets you are selling into (for instance German-language books for Austria, Switzerland and Germany), you might want to partition by location just as in the warehouse scenario.

One thing that’s very special about catalog data is that most of it rarely, if ever, changes. Reviews are added, media might be added, but except for corrections, the title, author, ISBN and content summary for a book really don’t change as long as the book is kept in the catalog. Such data is essentially “insert once, change never”. It’s read-only for all practical purposes.

What’s wonderful about read-only data is that you can replicate it, cache it, move it close to the consumer and pre-index it. You’re expecting that a lot of people will search for items with “Christmas” in the item description come November? Instead of running a full text search every time, run that query once, save the result set in an extra table and have the stored procedure running the “QueryByItemDescription” activity simply return the entire table if it sees that keyword. Read-only data is optimization heaven.
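
The “run the hot query once” trick can be sketched like this (Python for brevity; the class and method names are invented stand-ins for the stored-procedure version described above). The expensive search runs once per hot keyword and the saved result set is returned verbatim afterwards:

```python
class CatalogQueryService:
    def __init__(self, items):
        self._items = items      # read-only catalog rows
        self._precomputed = {}   # keyword -> saved result set ("extra table")

    def _scan(self, keyword):
        # The expensive full-text-search stand-in.
        return [i for i in self._items
                if keyword.lower() in i["description"].lower()]

    def preindex(self, keyword):
        # Run the query once, e.g. for "Christmas" come November,
        # and save the entire result table.
        self._precomputed[keyword] = self._scan(keyword)

    def query_by_item_description(self, keyword):
        # Cheap path for pre-indexed keywords, full scan otherwise.
        if keyword in self._precomputed:
            return self._precomputed[keyword]
        return self._scan(keyword)
```

Because the data is read-only, the saved table can never go stale between propagation runs, which is what makes this safe.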

Also, for catalog data, timeliness is not a great concern. If a customer review or a correction isn’t immediately reflected on the presentation surface, but only 30 minutes or 3 hours after it has been added to the master catalog, it doesn’t do any harm as long as the individual adding such information is sufficiently informed of such a delay.

So what we can do with the catalog is to periodically (every few hours or even just twice a week) consolidate, pre-index and then propagate the master catalog data to distributed read-only replicas. The data services fronting the replicas will satisfy all read operations from the local store and will delegate all write operations directly (pass-through) to the master catalog service. They might choose to update their local replica to reflect those changes immediately, but that would bypass any editorial or validation rules that the master catalog service might enforce.
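
The replica/pass-through split can be sketched as follows (Python, with made-up names; the real services would share the catalog service contract). Reads come from the local copy, writes pass through to the master, and the replica only catches up on the next periodic propagation:

```python
class MasterCatalogService:
    """Owns the master catalog; editorial/validation rules would live here."""
    def __init__(self, data):
        self._data = dict(data)

    def write(self, item_id, data):
        self._data[item_id] = data

    def snapshot(self):
        # Consolidated, pre-indexed state handed out during propagation.
        return dict(self._data)


class ReplicaCatalogService:
    """Fronts a read-only replica of the master catalog."""
    def __init__(self, replica, master):
        self._replica = dict(replica)  # local read-only copy
        self._master = master

    def read(self, item_id):
        # All reads are satisfied from the local store.
        return self._replica[item_id]

    def write(self, item_id, data):
        # All writes are delegated (pass-through) to the master.
        self._master.write(item_id, data)

    def refresh(self):
        # Periodic propagation; until this runs, the replica is stale.
        self._replica = dict(self._master.snapshot())
```

The deliberate staleness between `write` and `refresh` is exactly the "30 minutes or 3 hours" window the text accepts for catalog data.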


So there you have it. What I’ve described here is the net effect of sticking to SOA rules.

·         Shared Contract: Any number of services can implement the same contract (although their concrete implementation, purpose and hence their type differ). Layering contract-compatible services with gradually increasing levels of abstraction and refining rules over existing services creates very clear and simple designs that help you scale and distribute data very well.

·         Explicit Boundaries: Forbidding foreign access or even knowledge about service internals allows radical changes inside and “underneath” services.

·         Autonomy allows for data partitioning and data access optimization and avoids “tight coupling in the backend”.

·         Policy: Separating out policy from the service/message contract allows flexible deployment of the compatible services across a variety of security and trust scenarios and also allows for dynamic adaptation to “far” or “near” communications paths by mandating certain QoS properties such as reliable messaging.


Service-Orientation is most useful if you don’t consider it as just another technique or tool, but embrace it as a paradigm. And very little of this thinking has to do with SOAP or XML. SOAP and XML are indeed just tools.

Categories: Architecture | SOA

[This might be more a “note to self” than anything else and might not be immediately clear. If this one goes over your head on the first pass – read it once more ;-)]

Fellow German RD Ralf Westphal is figuring out layers and data access. The “onion” he has in a recent article on his blog resembles the notation that Steve Swartz and I introduced for the Scalable Applications Tour 2003.  (See picture; get the full layers deck from Microsoft’s download site if you don’t have it already)

What Ralf describes with his “high level resource access” public interface encapsulation is in fact a “data service” as per our definition. To boot, we consider literally every unit in a program (function, class, module, service, application, system) as having three layers: the outermost layer is the publicly accessible interface, the inner layer is the hidden internal implementation and the innermost layer hides and abstracts services and resource access. The concrete implementation of this model depends on the type of unit you are dealing with. A class has public methods as public interface, protected/private methods as internal implementation and uses “new” or a factory indirection to construct references to its resource providers. A SQL database has stored procedures and views as public interface, tables and indexes as internal implementation and the resource access is the database engine itself. It goes much further than that, but I don’t want to get carried away here.

A data service is a layered unit that specializes in acquiring, storing, caching or otherwise dealing with data as appropriate to a certain scope of data items. By autonomy rules, data services do not only hide the data access methods, but also any of these characteristics. The service consumer can walk up to a data service and make a call to acquire some data and it is the data service’s responsibility to decide how that task is best fulfilled. Data might be returned from a cache, aggregated from a set of downstream services or directly acquired from a resource. Delegating resource access to autonomous services instead of “just” encapsulating it with a layer abstraction allows for several implementations of the same data service contract. One of the alternate implementations might live close to the master data copy, another might be sitting on a replica with remote update capability and yet another one may implement data partitioning across a cluster of storage services. Which variant of such a choice of data services is used for a specific environment then becomes a matter of the deployment-time wiring of the system.
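
A minimal sketch of that idea (Python; the contract and class names are invented for illustration): several data services implement the same contract, one close to the master copy and one layered in front as a cache, and which variant a consumer gets is decided by deployment-time wiring rather than by the consumer:

```python
from abc import ABC, abstractmethod

class CustomerDataService(ABC):
    """The shared contract every variant implements."""
    @abstractmethod
    def get_customer(self, customer_id): ...

class DirectDataService(CustomerDataService):
    """Variant that lives close to the master data copy."""
    def __init__(self, store):
        self._store = store

    def get_customer(self, customer_id):
        return self._store[customer_id]

class CachingDataService(CustomerDataService):
    """Variant layered over another contract-compatible service."""
    def __init__(self, inner):
        self._inner, self._cache = inner, {}

    def get_customer(self, customer_id):
        if customer_id not in self._cache:
            self._cache[customer_id] = self._inner.get_customer(customer_id)
        return self._cache[customer_id]

def wire_up(config, store):
    # The deployment-time decision; consumers only ever see the contract.
    svc = DirectDataService(store)
    return CachingDataService(svc) if config.get("cache") else svc
```

Swapping `wire_up`'s configuration changes the whole data acquisition strategy without any consumer noticing, which is the autonomy point made above.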

Data services are the resource access layer of the “onion” model on the next higher level of abstraction. The public interface consists of presentation services (which render external data presentations of all sorts, not only human interaction), the internal implementation is made up of business services that implement the core of the application and the resource access are said data services. On the next higher level of abstraction, presentation services may very well play the role of data services to other services. And so it all repeats.

Now … Ralf says he thinks that the abstraction model works wonderfully for pulling chunks of data from underlying layers, but he’s very concerned about streaming data and large data sets – and uses reporting as a concrete example.

Now, I consider data consolidation (which is what reporting is) an inherent function of the data store technology and hence I don’t agree with any part of the “read millions of records into Crystal Reports” story. A report rendering tool should get pre-consolidated, pre-calculated data and turn that into a funky document; it should not consolidate data itself. Also, Ralf’s proposed delivery of data to a reporting engine in chunks doesn’t avoid the likelihood that you’ll end up having to co-locate all received data in memory or on disk to actually run the consolidation and reporting job, in which case you end up where you started. But that’s not the point here.

Ralf says that for very large amounts of data or data streams, pull must change to push and the resource access layer must spoon-feed the business implementation (reporting service in his case) chunks of data at a time. Yes! Right on!

What Ralf leaves a bit in the fog is how the reporting engine learns of a new reporting job, where and how the results of the engine are delivered, and how he plans to deal with concurrency. Unfortunately, Ralf doesn’t mention context and how it is established, and he also doesn’t loop his solution back to the layering model he found. Also, the reporting service he’s describing doesn’t seem very flexible, as it cannot perform autonomous data acquisition but is absolutely dependent on being fed by the app – which might create an undesirable, tightly coupled dependency between the feeder and a concrete report target.

The reporting service shall be autonomous and must be able to do its own data acquisition. It must be able to “pull” in the sense that it must be able to proactively initiate requests to data providers. At the same time, Ralf is right that the request result should be pushed to the reporting service, especially if the result set is very large.

Is that a contradiction? Does that require a different architecture? I’d say that we can’t allow very large data sets to break the fundamental layering model, nor should their presence force us to rethink the overall architectural structure. What’s needed is simply a message/call exchange pattern between the reporting service and the underlying data service that is not request/response, but duplex, and which allows the callee to incrementally bubble up results to the caller. Duplex is the service-oriented equivalent of a callback interface, with the difference that it’s not based on a (marshaled) interface pointer but rather on a more abstract context or correlation identifier (which might or might not be a session cookie). The requestor invokes the data service and provides a “reply-to” endpoint reference referencing itself (wsa:ReplyTo/wsa:Address), containing a correlation cookie identifying the originator context (wsa:ReplyTo/wsa:ReferenceProperties), and identifying an implemented port-type (wsa:ReplyTo/wsa:PortType) for which the data service knows how to feed back results. The port-type definition is essential, because the data service might know quite a few different port-types it can feed data to – in the given case it might be a port-type defined and exposed by the reporting service. [WS-Addressing is Total Goodness™]. What’s noteworthy regarding the mapping of duplex communication to the presented layering model is that the request originates from within the resource access layer, but the results for the requests are always delivered at the public interface.

The second fundamental difference from callbacks is that the request and all replies are typically delivered as one-way messages and hence don’t block any resources (threads) on the respective caller’s end.

For chunked data delivery, the callee makes repeated calls/sends messages to the “reply-to” destination and sends an appropriate termination message or makes a termination call when the request has been fulfilled. For streaming data delivery, the callee opens a streaming channel to the “reply-to” destination (something like a plain socket, TCP/DIME or – in the future -- Indigo’s TCP streaming transport) and just pumps a very long, continuous message.
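
The chunked duplex exchange can be sketched like this (Python, with in-process calls standing in for one-way messages; none of these names are WS-Addressing APIs, they are made up for illustration). The requestor hands over a "reply-to" destination plus a correlation cookie, the data service pushes result chunks to it and finishes with a termination signal:

```python
class ReportingService:
    """Plays the requestor; exposes the port-type the data service feeds."""
    def __init__(self):
        self.chunks = {}   # correlation id -> received result chunks
        self.done = set()  # correlation ids whose requests are fulfilled

    def deliver_chunk(self, correlation_id, chunk):
        # Each delivery would be a one-way message in the real pattern.
        self.chunks.setdefault(correlation_id, []).append(chunk)

    def complete(self, correlation_id):
        # The termination message ending the duplex conversation.
        self.done.add(correlation_id)


class DataService:
    def __init__(self, rows):
        self._rows = rows

    def request_all(self, reply_to, correlation_id, chunk_size=2):
        # The callee incrementally bubbles results up to the caller's
        # "reply-to" destination, then signals termination.
        for i in range(0, len(self._rows), chunk_size):
            reply_to.deliver_chunk(correlation_id,
                                   self._rows[i:i + chunk_size])
        reply_to.complete(correlation_id)
```

Note that the reporting service initiated the pull (`request_all`), while the results arrive as pushes at its public interface, which is exactly the resolution of the pull-vs-push question above.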

Bottom line: Sometimes pull is good, sometimes push is good and duplex fits it all back into a consistent model.

Categories: Architecture

People often ask me what I’ve done before Bart, Achim and I started newtelligence together with Jörg. So where do we come from? Typically, we have given somewhat “foggy” answers to those kinds of questions, but Achim and I talked about it yesterday and started to ask ourselves why we do that.

In fact, Achim, Bart and I had been working together for a long time before we started newtelligence. We used to work for a banking software company called ABIT Software GmbH, which then merged with two other sibling companies by the same owners to form today’s ABIT AG. We’ve only reluctantly communicated that fact publicly, because the formation of our company didn’t really get much applause from our former employer – quite the contrary was true and hence we’ve been quite cautious.

For us it was always quite frustrating that ABIT was sitting on heaps of very cool technology that my colleagues and I developed over the years (including patented things) and never chose to capitalize on the technology itself. Here are some randomly selected milestones:

We had our own SOAP 0.9 stack running in 1999, which was part of a framework that had a working and fully transparent object-relational mapping system based on COM along with an abstract, XML-based UI description language (people call those things XUL or XAML nowadays).

In 1998 we forced (with some help from our customer’s wallet) IBM into a six-month avalanche of weekly patches for the database engine and client software that turned SQL/400 (the SQL processor for DB/400 on the AS/400) from a not-quite-perfect product into a rather acceptable SQL database.

In 1996 we fixed well over 500 bugs and added tons of features to Borland’s OWL for OS/2, with which we must have had a pretty unique framework setup where cross-platform Windows 3.x, Windows NT and OS/2 development actually worked on top of that shared class library.

In 1994 we already had what could be considered the precursor to a service-oriented architecture, with collaborating, (mostly) autonomous services. The framework underlying that architecture had an ODBC C++ class library well over six months before Microsoft came out with their first MFC wrapper for it, and an MVC model centered around the idea of “value holders” that we borrowed from Smalltalk, which spoke, amongst other things, a text-validation protocol that allowed a single “TextBox” control to be bound against arbitrary value holders that would handle all the text-input syntax rules as per their data type (or subtype). All of this was fully based on the nascent COM model, which was then still buried in three documentation pages of OLE 2.0. I didn’t care much about linking and embedding (although I wrote my own in-place editing client from scratch), but I cared a lot about IUnknown as soon as I got my hands on it in late 1993. And all applications (and services) built on that framework supported thorough OLE Automation with Visual Basic 3.0, to a degree that you could fill out any form and press any button programmatically – a functionality that was vital for the application’s workflow engine.

And of course, during all that time, we were actively involved in project and product development for huge financial applications with literally millions of lines of production code.

None of the technology work (except the final products) was ever shared or made available to anyone for licensing. We were at a solutions company that supported great visions internally, but never figured out that the technology would be of value by itself.

newtelligence AG exists because of that pain. Years back, we had already designed and implemented variations of many of the technologies that are now state of the art or (in the case of XAML) not even shipping yet. At the same time, we continue to develop our vision, and that’s how we stay on top of things. It’s not that we aren’t learning like crazy or going through constant paradigm shifts – we’re lucky that we can accumulate knowledge on top of the vast experience that we have and adjust and modernize our thinking. What’s different now is that we can share the essence of what we figure out with the world. That’s a fantastic thing if you’ve spent most of your professional life “hidden and locked away”, unable to share things with peers.

So every time you’ll see a “Flashback” title here on my blog, I’ll dig into my recollection and try to remember some of the architectural thinking we had back in those times. We’ve made some horrible mistakes and had some exuberant and not necessarily wise ideas (such as the belief that persistent object-identity and object-relational mapping are good things); but we also had quite a few really bright innovative ideas. The things that really bring you forward are the grand successes and the most devastating defeats. We’ve got plenty of those under our belt and even though some of these insights date back 10 years, they are surprisingly current and the “lessons learned” very much apply to the current technology stacks and architectural patterns.

So – if you’ve ever thought that we’re “all theory” authors and “sample app” developers – nothing is further from the truth. Also: although I fill an “architecture consultant” role more than anything else now, I probably write more code on a monthly basis than some full-time application developers – what finally surfaces in talks and workshops is usually just the tip of that iceberg, and often very much simplified to explain the essence of what we find.

Categories: Architecture | Flashbacks

I talked about transactions at several events in the last few weeks, and the sample that I use to illustrate that transactions are more than just a database technology is the little tile puzzle that I wrote a while back. For those interested who can't find it, here's the link again. The WorkSet class that is included in the puzzle is a fully functional, lightweight, in-memory two-phase-commit transaction manager that's free for you to use.

Categories: Architecture | Technology

May 14, 2004
@ 08:10 AM

Welcome back ;-)

On one of those flights last week I read a short article about Enterprise Services in a developer magazine (name withheld to protect the innocent). The “teaser” introduction of the article said: “Enterprise Services serviced components are scalable, because they are stateless.” That statement is of course an echo of the same claim found in many other articles about the topic and also in many write-ups about Web services. So, is it true? Unfortunately, it’s not.

public class AmIStateless
{
       public int MyMethod()
       {
              // do some work
              return 0;
       }
}
“Stateless” in the sense that it is being used in that article and many others describes the static structure of a class. Unfortunately, that does not help us much in figuring out how well instances of that class help us scale by limiting the amount of server resources they consume. More precisely: if you look at a component and find that it doesn’t have any fields to hold data across calls (see the code snippet) and furthermore does not hold any data across calls in some secondary store (such as a “session object”), the component can be thought of as being stateless with regard to its callers – but what about its relationship with the components and services that are called from it?

But before I continue: Why do we say that “stateless” scales well?

A component (or service) that does not hold any state across invocations has many benefits with regards to scalability. First, it lends itself well to load balancing. If you run the same component/service on a cluster of several machines and the client needs to make several calls, the client can walk up to any of the cluster machines in any order. That way, you can add any number of machines to the cluster and scale linearly. Second, components that don’t hold state across invocations can be discarded by the server at will or can be pooled and reused for any other client. This saves activation (construction) cost if you choose to pool and limits the amount of resources (memory, etc.) that instances consume on the server-end if you choose to discard components after each call. Pooling saves CPU cycles. Discarding saves memory and other resources. Both choices allow the server to serve more clients.
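
The pooling half of that argument can be sketched in a few lines (Python; `ComponentPool` and its methods are invented names, not a real Enterprise Services API). Because call-scoped instances hold no cross-call state, the server is free to hand a released instance to any other client, paying the construction cost only once:

```python
class ComponentPool:
    def __init__(self, factory):
        self._factory = factory  # how to construct a fresh component
        self._idle = []          # released instances, ready for any client
        self.activations = 0     # how many constructions we actually paid for

    def acquire(self):
        if self._idle:
            # Reuse: saves activation (construction) cost.
            return self._idle.pop()
        self.activations += 1
        return self._factory()

    def release(self, component):
        # Safe only because the component keeps no per-client state.
        self._idle.append(component)
```

Discarding instead of pooling would simply drop the `release` bookkeeping and let the instance be collected, trading CPU for memory as described above.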

However, looking at the “edge” of a service isn’t enough and that’s where the problem lies.

The AmIStateless service that I am illustrating here does not stand alone. And even though it doesn’t keep any instance state in fields as you can see from the code snippet, it is absolutely not stateless. In fact, it may be a horrible resource hog. When the client makes a call to a method of the service (or otherwise sends a message to it), the service does its work by employing the components X and Y. Y in turn delegates work to an external service named ILiveElsewhere. All of a sudden, the oh-so-stateless AmIStateless service might turn into a significant resource consumer and limit scalability.

First observation: While no state is held in fields, the service does hold state on the stack while it runs. All local variables that are kept on the call stack in the invoked service method, in X and in Y consume resources, and depending on what you do, that may not be little. Also, that memory will remain consumed until the next garbage collector run.

Second observation: If any of the secondary components takes a long time for processing (especially ILiveElsewhere), the service consumes and blocks a thread for a long time. Depending on how you invoke ILiveElsewhere you might indeed consume more than just the thread you run on.

Third observation:  If AmIStateless is the root of a transaction, you consume significant resources (locks) in all backend resource managers until the transaction completes – which may be much later than when the call returns. If you happen to run into an unfortunate situation, the transaction may take a significant time (minutes) to resolve.

Conclusion:  Since the whole purpose of what we usually do is data processing and we need to pass that data on between components, nothing is ever stateless while it runs. “Stateless” is a purely static view on code and only describes the immediate relationship between one provider and one consumer with regards to how much information is kept across calls. “Stateless” says nothing about what happens during a call.

Consequence: The scalability recipe isn’t to try to achieve static statelessness by avoiding holding state across calls. Using this as a pattern certainly helps the naïve, but the actual goal is to keep sessions (interaction sequence durations) as short as possible and thereby limit the resource consumption of a single activity. A component that holds state across calls, but for which the call sequence takes only a very short time or which does not block a lot of resources during the sequence, may turn out to aid scalability much more than a component that seems “stateless” when you look at it, but which takes a long time for processing or consumes a lot of resources while processing the call. One way to get there is to avoid accumulating state on call stacks. How? Stay tuned.
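
The arithmetic behind "short sessions beat static statelessness" is just Little's Law (concurrency = arrival rate × session duration), which the following back-of-the-envelope sketch makes explicit. The numbers are invented for illustration:

```python
def max_request_rate(worker_threads, ms_per_request):
    # Little's Law rearranged: sustainable throughput is the number of
    # threads a request can hold divided by how long each request holds one.
    return worker_threads * 1000.0 / ms_per_request

# A "stateless" service that blocks a thread for 2 s on a slow
# downstream call tops out at 50 requests/s on a 100-thread pool:
slow = max_request_rate(100, 2000)

# A service that holds state across calls but releases its thread
# after 50 ms sustains 2000 requests/s on the same pool:
fast = max_request_rate(100, 50)
```

Whether the class has instance fields never enters the formula; only the session duration and the resources held during it do.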

Categories: Architecture

May 3, 2004
@ 09:20 PM

The EMEA Architect Tour 2004 Videos from Finland are online and certainly one of the rare chances to see me speaking in a proper suit. And we speak about FABRIQ...

Categories: Architecture

I am writing a very, very, very big application at the moment and I am totally swamped in a 24/7 coding frenzy that’s going to continue for the next week or so, but here’s one little bit to think about and for which I came up with a solution. It’s actually a pretty scary problem.

Let’s say you have a transactional serviced component (or make that a transactional EJB) and you call an HTTP web service from it that forwards any information to another service. What happens if the transaction fails for any reason? You’ve just produced a phantom record. The web service on the other end should never have seen that information. In fact, that information doesn’t exist from the viewpoint of your rolled back local transaction. And of course, as of yet, there is no infrastructure in place that gives you interoperable transaction flow. And if that were the case, the other web service may not support it. What should you do? Panic?

There is help right in the platform (Enterprise Services that is). Your best friend for that sort of circumstance is System.EnterpriseServices.CompensatingResourceManager.

The use case here is to call another service to allocate some items from an inventory service. The call is strictly asynchronous, and the remote service will eventually turn around and call an action on my service (they have a “duplex” conversation using asynchronous calls going back and forth). Instead of calling the service from within my transactional method, I am deferring the call until the transaction is being resolved. Only when DTC is sure that the local transaction will go through is the web service call made. There is no way to guarantee that the remote call succeeds, but it does at least eliminate the very horrible side effects on overall system consistency caused by phantom calls. It is in fact quite impossible to implement “Prepare” correctly here, since the remote service may fail processing the (one-way) call on a different thread and hence I might never get a SOAP fault indicating failure. Because that’s so, and because I really don’t know what the other service does, I am not writing any specific recovery code in the “Commit” phase. Instead, my local state for the conversation indicates the current progress of the interaction between the two services and logs an expiration time. Once that expiration time has passed without a response from the remote service, a watchdog will pick up the state record, create a new message for the remote service and replay the call.

For synchronous call scenarios, you could implement (not shown here) a two-step call sequence to the remote service, which the remote service needs to support, of course. In “Prepare” (or in the “normal code”) you would pass the data to the remote service and hold a session state cookie. If that call succeeds, you vote “true”. In “Commit” you would issue a call to commit that data on the remote service for this session, on “Abort” (remember that the transaction may fail for any reason outside the scope of the web service call), you will call the remote service to cancel the action and discard the data of the session. What if the network connection fails between the “Prepare” phase call and the “Commit” phase call? That’s the tricky bit. You could log the call data and retry the “Commit” call at a later time or keep retrying for a while in the “Commit” phase (which will cause the transaction to hang). There’s no really good solution for that case, unless you have transaction flow. In any event, the remote service will have to default to an “Abort” once the session times out, which is easy to do if the data is kept in a volatile session store over there. It just “forgets” it.
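
The remote side of that two-step sequence can be sketched as follows (Python; `RemoteService`, `prepare`, `commit` and `cancel` are invented names for the protocol the remote service would need to support). Data is parked in a volatile session store under a cookie on "prepare", made durable only on "commit", and a session that times out defaults to abort by simply being forgotten:

```python
import itertools
import time

class RemoteService:
    def __init__(self, timeout_seconds=30.0, clock=time.monotonic):
        self._clock = clock
        self._timeout = timeout_seconds
        self._sessions = {}  # cookie -> (key, data, expiry); volatile store
        self._cookies = itertools.count(1)
        self.durable = {}    # committed, permanent state

    def prepare(self, key, data):
        # Park the data under a session cookie; nothing is durable yet.
        cookie = next(self._cookies)
        self._sessions[cookie] = (key, data, self._clock() + self._timeout)
        return cookie

    def commit(self, cookie):
        key, data, expiry = self._sessions.pop(cookie)
        if self._clock() > expiry:
            # Session expired: the data was already defaulted to abort.
            return False
        self.durable[key] = data
        return True

    def cancel(self, cookie):
        # Abort: discard the session data; it simply "forgets" it.
        self._sessions.pop(cookie, None)
```

The `clock` parameter is injectable only so the timeout behaviour can be demonstrated; the tricky window between the prepare and commit calls discussed above is exactly what this timeout papers over.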

However, all of this is much, much better than making naïve, simple web service calls that fan out intermediate data from within transactions. Fight the phantoms.

At the call location, write the call data to the CRM transaction log using the Clerk:

AllocatedItemsMessage aim = new AllocatedItemsMessage();
aim.allocatedAllocation = /* copy that data from elsewhere */;
Clerk clerk = new Clerk(typeof(SiteInventoryConfirmAllocationRM), "SiteInventoryConfirmAllocationRM", CompensatorOptions.AllPhases);
SiteInventoryConfirmAllocationRM.ConfirmAllocationLogRecord rec = new RhineBooks.ClusterInventoryService.SiteInventoryConfirmAllocationRM.ConfirmAllocationLogRecord();
rec.allocatedItemsMessage = aim;
clerk.WriteLogRecord( rec.XmlSerialize() );

Write a compensator that picks up the call data from the log and forwards it to the remote service. In the “Prepare” phase, the minimum work that can be done is to check whether the proxy can be constructed. You could also make sure that the call URL is valid and the server name resolves, and you could even try a GET on the service’s documentation page or call a “Ping” method the remote service may provide. That all serves to verify, as well as you can, that the “Commit” call has a good chance of succeeding:

using System.EnterpriseServices.CompensatingResourceManager;
using …

/// This class is a CRM compensator that will invoke the allocation confirmation
/// activity on the site inventory service if, and only if, the local transaction
/// enlisting it is succeeding. Using the technique is a workaround for the lack
/// of transactional I/O with HTTP web services. While the compensator cannot make
/// sure that the call will succeed, it can at least guarantee that we do not produce
/// phantom calls to external services.
public class SiteInventoryConfirmAllocationRM : Compensator
{
  private bool vote = true;

  public class ConfirmAllocationLogRecord
  {
    public SiteInventoryInquiries.AllocatedItemsMessage allocatedItemsMessage;

    internal string XmlSerialize()
    {
      StringWriter sw = new StringWriter();
      XmlSerializer xs = new XmlSerializer(typeof(ConfirmAllocationLogRecord));
      xs.Serialize(sw, this);
      return sw.ToString();
    }

    internal static ConfirmAllocationLogRecord XmlDeserialize(string s)
    {
      StringReader sr = new StringReader(s);
      XmlSerializer xs = new XmlSerializer(typeof(ConfirmAllocationLogRecord));
      return xs.Deserialize(sr) as ConfirmAllocationLogRecord;
    }
  }

  public override bool PrepareRecord(LogRecord rec)
  {
    try
    {
      // Verify as well as we can that the "Commit" call has a chance of
      // succeeding: constructing the proxy for the target warehouse must work.
      ConfirmAllocationLogRecord calr = ConfirmAllocationLogRecord.XmlDeserialize((string)rec.Record);
      SiteInventoryInquiriesWse sii = InventoryInquiriesInternal.GetSiteInventoryInquiries( calr.allocatedItemsMessage.allocatedAllocation.warehouseName );
      vote = sii != null;
      return false; // keep the log record for the commit phase
    }
    catch( Exception ex )
    {
      ExceptionManager.Publish( ex );
      vote = false;
      return true; // discard the record; EndPrepare will vote to abort
    }
  }

  public override bool EndPrepare()
  {
    return vote;
  }

  public override bool CommitRecord(LogRecord rec)
  {
    try
    {
      ConfirmAllocationLogRecord calr = ConfirmAllocationLogRecord.XmlDeserialize((string)rec.Record);
      SiteInventoryInquiriesWse sii = InventoryInquiriesInternal.GetSiteInventoryInquiries( calr.allocatedItemsMessage.allocatedAllocation.warehouseName );
      sii.ConfirmAllocation( calr.allocatedItemsMessage );
    }
    catch( Exception ex )
    {
      // No recovery here; the watchdog replays the call once the
      // conversation state expires.
      ExceptionManager.Publish( ex );
    }
    return true;
  }
}


April 5, 2004
@ 03:25 PM

Here's a life sign. I am buried under lots of work, of which pretty much all will see the light of day at TechEd. We're getting close to having a first public version of the FABRIQ project with Microsoft EMEA, and we're very busy here at newtelligence writing a huge SOA sample application using and combining all the good things of ASP.NET Web Services, WSE 2.0, Enterprise Services, MSMQ, SQL, and Remoting. The result is quite likely going to play some role at TechEd US and other TechEds this year. Between then and my last technical blog posting I've written several thousand lines of code again, and there are several thousand more to follow. Hence the silence. Once those two projects are done or close to being done, expect a flood of explanations.

Categories: Architecture | newtelligence

February 25, 2004
@ 02:05 PM

Of course, there is really no unanimously agreed-upon definition of what’s absolutely fundamental to SOA – or even what SOA really is. But I think there are four things that most people agree on, and I think there’s even a fifth:

To me, the first four core principles are:

·         Explicitness of boundaries [read: there’s stuff that is explicitly public and the rest isn’t],
·         Data exchange governed by an implementation independent message contract,
·         Compatibility of behaviors through negotiation of capabilities based on policies and
·         Service autonomy.

Number five is:

·         Locating and binding to a remote service is always indirect [read: the most important design pattern is the factory pattern]

I hear that there's quite a bit of amusement among the more senior Microsoft folks (and people like me) that there’s a lot of "reinventing COM" going on. It’s not that there’s a push in that direction; it just seems to happen. All of a sudden, folks are playing with (differently named) variants of monikers, class factories and all those things. Say what you will, the IClassFactory indirection is a great thing to have – one place to find a service, one place to configure a proxy, one place to sneak in a mapper/wrapper that makes the actual service you talk to look like another service.
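
That "one place" argument can be sketched in a few lines (Python; `ServiceFactory` and the proxy/wrapper names are invented for illustration, this is the factory pattern rather than IClassFactory itself). Consumers ask the factory by name and never construct a proxy directly, so the factory is the single spot where a wrapper can be snuck in:

```python
class InventoryProxy:
    """Stand-in for a configured service proxy."""
    def __init__(self, url):
        self.url = url

    def ping(self):
        return "pong"


class LoggingWrapper:
    """Makes the wrapped service look like the original, plus logging."""
    def __init__(self, inner):
        self.inner, self.calls = inner, []

    def ping(self):
        self.calls.append("ping")
        return self.inner.ping()


class ServiceFactory:
    def __init__(self):
        self._registry = {}  # service name -> callable producing an instance

    def register(self, name, creator):
        self._registry[name] = creator

    def create(self, name):
        # The one place to locate a service and configure its proxy.
        return self._registry[name]()
```

Re-registering a name with a wrapping creator changes what every consumer gets, without touching a single consumer, which is exactly the indirection being praised above.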

(Note that I don’t mention SOAP here. Must a service use SOAP? How about services that fall back to something without angle brackets because their respective policies indicate that they can?)

Categories: Architecture

February 15, 2004
@ 08:27 PM

I am currently writing the speaker notes for a service-oriented architecture workshop that Microsoft and newtelligence will run later this year. I was just working on the definitions of components and services and I think I found reasonably short and clear definitions for both:

One of the most loaded and least well-defined terms in programming is "component". Unfortunately, the same is true for "service". In particular, there is confusion about the terms "component" and "service" in the context of SOA.

The term component is a development and deployment concept and refers to some form of compiled code. A component might be a JVM or CLR class, a Java bean or a COM class; in short, a component is any form of a unit of potentially reusable code that can be accessed by name, deployed and activated and can be assembled to build applications. Components are typically implemented using object-oriented programming languages and components can be used to implement services.

A service is a deployment and runtime concept. A service is strictly not a unit of code; it is rather a boundary definition that might be valid for several different concrete implementations. The service definition is deployed along with the components that implement it. The communication to and from a service is governed by data contracts and service policies. From the outside, a service is considered an autonomous unit that is solely responsible for the resource it provides access to. Services are used to compose solutions that may or may not span multiple applications.

Let me repeat the one sentence that made me go “damn, I think now I finally have the topic cornered”:

A service is strictly not a unit of code; it is rather a boundary definition that might be valid for several different concrete implementations.

Categories: Architecture | Indigo

Slowly, slowly I am seeing some light at the end of the tunnel designing the FABRIQ. It’s a very challenging project and although I am having a lot of fun, it’s really much harder work than I initially thought.

Today I’d like to share some details with you on how I am employing the lightweight transaction coordinator “WorkSet” that Steve Swartz and I wrote during this year’s Scalable Applications tour inside the FABRIQ.

The obvious problem with one-way pipeline processing (and a problem with the composition of independent cross-cutting concerns in general) is that failure management is pretty difficult. Once one of the pipeline components fails, other components may already have done work that might not be valid if the processing fails further down the pipeline. The simplest example of that is, of course, logging. If you log a message as the first stage of a pipeline and a subsequent stage fails, do you want the log entry to remain where it is? The problem is: it depends. You might need to see the message before it is processed by stages further down the pipeline, but only once processing is complete do you know whether the entry should be flagged as success or failure, or whether it should be discarded altogether.

Before I go into details, I’ll clarify some of the terminology I am using:

·         A message handler is an object that typically implements a ProcessMessage() method and a property Next pointing to the handler that immediately follows it in a chain of handlers.

·         A pipeline hosts a chain of message handlers and has a “head” and a “tail” message handler which link the pipeline with that chain of handlers. The pipeline is a message handler itself, so that pipelines can be nested inside pipelines. The FabriqPipeline is a concrete implementation of such a pipeline that has, amongst other things, support for the mechanism described here.

·         A message is an object representing a SOAP message and has a collection of headers, a body (as an XmlReader) and a transient collection of message properties that are only valid as long as the message is in memory.

·         A work set is a lightweight, in-memory 2PC transaction that provides only the “atomicity” and “consistency” properties out of the well-known “ACID” transaction property set. “Durability” is not a goal here and “isolation” is only sort of guaranteed, because messages are not shared resources. If external resources are touched, isolation needs to be guaranteed by the enlisted workers. A worker is a lightweight resource manager that can enlist into a work set and provides Prepare/Commit/Abort entry points.
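To make the terminology concrete, here is a rough sketch of the contracts these terms imply. The actual FABRIQ signatures may well differ in detail, and the Message stand-in below is reduced to the bare minimum:

```csharp
using System.Collections;

// Minimal stand-in: a SOAP message with transient, in-memory-only properties.
// Headers and the XmlReader body are omitted in this sketch.
public class Message
{
    public IDictionary Properties = new Hashtable();
}

// A message handler processes a message and knows its successor in the chain.
public interface IMessageHandler
{
    IMessageHandler Next { get; set; }  // next handler in the chain
    bool Process(Message msg);          // called as the message flows through
}

// A worker is a lightweight resource manager enlisted into a work set.
public interface IWorker
{
    bool Prepare(bool vote);  // phase one of the 2PC protocol
    void Commit();            // phase two: make the tentative work final
    void Abort();             // phase two: undo the tentative work
}
```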

Whenever a new message arrives at a FabriqPipeline, a new work set is created that governs the fault management for processing the respective message. The work set is associated with the message by creating a “@WorkSet” property on the message that references the WorkSet object. The pipeline itself maintains no immediate reference to the work set – it is message-bound.

public class FabriqWorker : IWorker
{
   private Message msg;
   private FabriqMessageHandler handler;

   public FabriqWorker(Message msg, FabriqMessageHandler handler)
   {
      this.msg = msg;
      this.handler = handler;
   }

   bool IWorker.Prepare(bool vote)
   {
      return handler.Prepare(vote, msg);
   }

   void IWorker.Abort()
   {
      handler.Abort(msg);
   }

   void IWorker.Commit()
   {
      handler.Commit(msg);
   }
}


The FabriqPipeline does not enlist any workers into the work set directly. Instead, message handlers enlist their workers into the work set as the message flows through the pipeline. A “worker” is an implementation of the IWorker interface that can be enlisted into a work set as a participant. Because the pipeline instance, along with all message handler instances, shall be reusable and capable of processing several messages concurrently, the worker is not implemented on the handler itself. Instead, workers are implemented as a separate helper class (FabriqWorker). Instances of this worker class are enlisted into the message’s work set. The worker instance gets a reference to the message it deals with and to the handler which enlisted it into the work set; when the worker is called during the phases of the two-phase commit protocol, it calls the message handler’s implementation of Prepare/Abort/Commit.

This way, we can have one “all in one place” implementation of all message-handling on the message handler, but are keeping the transaction dependent state in a very lightweight object; therefore we can share the entire (likely complex) pipeline and handlers for many concurrent transactions, because none of the pipeline is made dependent on the message or transaction state.

public abstract class FabriqMessageHandler : IMessageHandler
{
   IMessageHandler next = null;

   public FabriqMessageHandler()
   {
   }

   public virtual IMessageHandler Next
   {
      get { return next; }
      set { next = value; }
   }

   bool IMessageHandler.Process(Message msg)
   {
      bool result = this.Preprocess(msg);
      WorkSet workSet = msg.Properties["@WorkSet"] as WorkSet;
      if (workSet != null)
      {
         workSet.Register(new FabriqWorker(msg, this));
      }
      return result;
   }

   protected bool Forward(Message msg)
   {
      if (next != null)
         return next.Process(msg);
      return false;
   }

   public virtual bool Preprocess(Message msg)
   {
      return false;
   }

   public abstract bool Prepare(bool vote, Message msg);

   public virtual void Commit(Message msg)
   {
   }

   public virtual void Abort(Message msg)
   {
   }
}



When a message flows into the pipeline, all a transactional message handler does when it gets called in ProcessMessage() is to enlist its worker and return. If the handler is not transactional, meaning that it must never fail (such handlers exist), it can ignore the whole work set story and simply forward the message to the Next handler. So, in fact, a transactional message handler will never forward the message in the (non-transactional) ProcessMessage() method.
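Picking up the logging example from above, a transactional logging participant might look roughly like this. This is an illustration, not actual FABRIQ code; LoggingWorker and the "@LogEntry" property name are made up, and the minimal Message class stands in for the real one:

```csharp
using System;
using System.Collections;

// Minimal stand-in so the example is self-contained.
public class Message
{
    public IDictionary Properties = new Hashtable();
}

// Hypothetical transactional logging participant: the log entry stays
// tentative until the work set decides the transaction outcome.
public class LoggingWorker
{
    private Message msg;

    public LoggingWorker(Message msg) { this.msg = msg; }

    public bool Prepare(bool vote)
    {
        // Tentative work only: remember what would be logged.
        msg.Properties["@LogEntry"] = "message seen at " + DateTime.UtcNow;
        return vote;
    }

    public void Commit()
    {
        // The whole pipeline succeeded: write the entry for real.
        Console.WriteLine(msg.Properties["@LogEntry"]);
    }

    public void Abort()
    {
        // Something downstream failed: the entry never happened.
        msg.Properties.Remove("@LogEntry");
    }
}
```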

One problem that the dependencies between message handlers create is that it may be impossible to forward a message to the next message handler in the chain before the message is processed; at least you can’t make a Prepare==true promise for the transaction outcome until you’ve done most work on the message and have verified that all resultant work will very likely succeed. Messages may even be transformed into new messages or split into multiple messages inside the pipeline, so that you can’t do anything meaningful until you are at least preparing.

The resulting contradiction is that a transaction participant cannot perform all work resulting from a message before it is asked to commit that work, while message handlers following in the sequence may not have received the resulting message until then and may not even be enlisted into the transaction.

To resolve this problem, the FABRIQ pipeline’s transaction management is governed by some special transaction handling rules that are more liberal than those of traditional transaction coordinators.

·         During the first (prepare) phase of the two-phase commit protocol, workers may still enlist into the transaction. This allows a message handler to forward messages to a not-yet-enlisted message handler during the prepare phase. The workers that a subsequent handler enlists, because the currently preparing message handler forwards one or more messages to it, are appended to the list of workers in the work set and asked to prepare their work once the current message handler is done preparing. We call this method a “rolling enlistment” during prepare.

·         Inside the pipeline, messages are considered to be transient data. Therefore, they may be manipulated and passed on during the Prepare phase, independent of the overall transaction outcome. The tail of the transaction controller pipeline (which is the outermost pipeline object) always enlists a worker into the transaction that will only forward messages to outside parties on Commit() and therefore takes care of hiding the transaction work to guarantee isolation.

·         Changes to any resources external to the message (so, anything that is not contained in message properties or message headers) must be guarded by the transaction workers. This means that all usual rules about guarding intermediate transaction state and transaction resources apply: The ability to make changes must be verified by tentative actions during Prepare() and changes may only be finally performed in Commit(). In case the external resources do not permit tentative actions, the Abort() method must take the necessary steps to undo actions performed during Prepare().
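The “rolling enlistment” rule can be sketched in a few lines. This is an illustration of the principle, not the actual WorkSet implementation; the vote-chaining and the CountingWorker helper are assumptions made for the example:

```csharp
using System;
using System.Collections.Generic;

public interface IWorker
{
    bool Prepare(bool vote);
    void Commit();
    void Abort();
}

// Sketch of a work set that permits enlistment while Prepare() is running.
public class WorkSet
{
    private readonly List<IWorker> workers = new List<IWorker>();

    public void Register(IWorker worker)
    {
        workers.Add(worker);  // legal even during the prepare phase
    }

    public bool Complete()
    {
        bool vote = true;
        // Index-based loop: workers appended during an earlier worker's
        // Prepare() land at the end of the list and are still picked up.
        for (int i = 0; i < workers.Count; i++)
        {
            vote = workers[i].Prepare(vote);
        }
        // Second phase: commit or abort everyone based on the outcome.
        foreach (IWorker w in workers)
        {
            if (vote) w.Commit(); else w.Abort();
        }
        return vote;
    }
}

// Tiny worker used for demonstration; it may enlist a further worker
// during Prepare(), mimicking a handler that forwards a message.
public class CountingWorker : IWorker
{
    private readonly WorkSet ws;
    private readonly IWorker spawnOnPrepare;
    public int Commits;

    public CountingWorker(WorkSet ws, IWorker spawnOnPrepare)
    {
        this.ws = ws;
        this.spawnOnPrepare = spawnOnPrepare;
    }

    public bool Prepare(bool vote)
    {
        if (spawnOnPrepare != null) ws.Register(spawnOnPrepare);  // rolling enlistment
        return vote;
    }

    public void Commit() { Commits++; }
    public void Abort() { }
}
```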

Whenever new messages get created during processing, the message properties (which hold the reference to the work set and, hence, to the current transaction) may be propagated into the newly created message, which causes the processing of these messages to be enlisted in the transaction; alternatively, a new work set (or none at all) can be created so that further processing of these messages is separate from the ongoing transaction. That’s what we do for failure messages.

During prepare, participants can log failure information to a message property called “@FaultInfo” that contains a collection of FaultInfo objects. If message processing fails, this information is logged and is, if possible, relayed to the message sender’s WS Addressing wsa:FaultTo, wsa:ReplyTo or wsa:From destination (in that order of preference) in a SOAP fault message.
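The destination preference could be expressed like this. A hedged sketch: the real code reads the WS-Addressing headers from the message, and the helper name is made up:

```csharp
using System;

// Pick the fault destination in the order of preference described above.
public static class FaultRouting
{
    public static string PickFaultDestination(string faultTo, string replyTo, string from)
    {
        if (faultTo != null && faultTo.Length > 0) return faultTo;  // wsa:FaultTo wins
        if (replyTo != null && replyTo.Length > 0) return replyTo;  // then wsa:ReplyTo
        return from;                                                // finally wsa:From
    }
}
```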

For integration with “real” transactions, the entire work set may act as a DTC resource manager. If that’s so, the 2PC management is done by DTC and the work set acts as an aggregating proxy for the workers toward DTC. It collects its vote from its own enlistments and forwards the Commit/Abort to its enlistments.

Categories: Architecture | FABRIQ

December 6, 2003
@ 06:27 PM

“Software architecture is a tough thing - a vast, interesting and largely unexplored subject area. And of course everyone has something to say about it!”

Go, get and read the first issue of the Microsoft EMEA Architect’s JOURNAL. You will be surprised in many ways.

(I wrote an article on blogging in general and dasBlog in particular for this issue. Here’s the shortcut.)


Categories: Architecture

December 3, 2003
@ 11:12 PM
Arvindra Sehmi, who is “Senior Architect” at Microsoft EMEA, is indeed one of the most brilliant architects I know, and he also happens to be the project manager and “owner” of the project I am working on as the lead architect at the moment (I’ve hinted at it here and here). He has finally allowed me to say a bit more about what we’re up to. The goal of this project, code-named “FABRIQ”, is to create a special-purpose, high-performance, service-oriented, one-way messaging infrastructure for queuing networks, agents and agile computing. It’s not a Microsoft product. It’s an architecture blueprint backed by code that we write so that customers don’t need to – at least that’s the plan. In case that doesn’t tell you anything, I’ll try to give you a little bit of an idea (It’s long, but it’s hopefully worth it) ....
Categories: Architecture

I see quite a few models for Service Oriented Architectures that employ pipelines with validating "gatekeeper" stages that verify whether inbound messages are valid according to an agreed contract. Validation on inbound messages is a reactive action resulting from distrust of the communication partner's ability to adhere to the contract. Validation on inbound messages shields a service from invalid input data, but seen from the perspective of the entire system, the action occurs too late.

What I see less often is a gatekeeper on outbound channels that verifies whether the currently executing local service adheres to the agreed communication contract. Validation on outbound messages is a proactive action taken in order to create trust with partners about the local service's ability to adhere to a contract. Furthermore, validation on outbound messages is quite often the last chance action before a well-known point of no return: the transaction boundary. If a service is faulty, for whatever reason, it needs to consistently fail and abort transactions instead of emitting incorrect messages that are in violation of the contract. If the service is faulty, it must consequently be assumed that compensating recovery strategies will not function properly and with the desired result.

Exception information that is generated on an inbound channel, especially in asynchronous one-way scenarios, vanishes into a log file at a location/organization that may not even own the sending service that's in violation of the contract. The only logical place to detect contract violations in order to isolate and efficiently eliminate problems is on the outbound, not on the inbound channel. Eliminating problems may mean to fix problems in the software, allow manual correction by an operator/clerk or an automatic rejection/rollback/retry of the operation yielding the incorrect result. None of these corrective actions can be done in a meaningful way by the message recipient. The recipient can shield itself, and that is and remains very important. However, it's just a desperate act of digging in once the last line of defense has already fallen.
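For illustration, an outbound gatekeeper can be as simple as streaming the serialized message through a validating XmlReader against the contract schema before it leaves the service. This sketch uses the standard .NET schema validation APIs; the OutboundGatekeeper helper name is made up:

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Validate a serialized message against the contract schema before
// sending it. Any validation error means: fail here, before the
// transaction boundary, rather than emit a contract-violating message.
public static class OutboundGatekeeper
{
    public static bool IsValid(string messageXml, string schemaXml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(schemaXml)));
        settings.ValidationType = ValidationType.Schema;

        bool valid = true;
        settings.ValidationEventHandler += delegate(object s, ValidationEventArgs e)
        {
            valid = false;  // a contract violation was detected
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(messageXml), settings))
        {
            while (reader.Read()) { }  // stream through, triggering validation
        }
        return valid;
    }
}
```

Wired into the tail of an outbound pipeline, a false result would abort the transaction instead of letting the faulty message escape.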

Categories: Architecture | IT Strategy

Javier Gonzalez sent me a mail today on my most recent SOA post and says that it resonates with his experience:

I just read your article about services and find it very interesting. I have been using OOP languages to build somewhat complex systems for the last 5 years and even if I have had some degree of success with them, I usually find myself facing those same problems u mention (why, for instance, do I have to throw an exception to a module that doesn't know how to deal with it?). Yes, objects in a well designed OOP systems are *supposed* to be loosely coupled, but then, is that really possible to completely achieve? So I do agree with u SOA might be a solution to some of my nightmares. Only one thing bothers me, and that is service implementation. Services, and most of all Web Services only care about interfaces, or better yet, contracts, but the functionality that those contracts provide have to be implemented in some way, right? Being as I am an "object fan" I would use an OO language, but I would like to hear your opinions on the subject. Also, there's something I call "service feasibility". Web Services and SOA in general do "sound" a very nice idea, but then, on real systems they tend to be sluggish, to say the least. They can put a network on its knees if the amount of information transmitted is only fair. SOAP is a very nice idea when it comes to interoperability, but the messages are *bloated* and the system's performance tend to suffer. -- I'd love to hear your opinions on this topics.

Here’s my reply to Javier:

Within a service, OOP stays as much of a good idea as it always was, because it gives us all the qualities of pre-built infrastructure reuse that we've learned to appreciate in recent years. I don't see much realistic potential for business logic or business object reuse, but OOP as a tool is well and alive.

Your point about services being sluggish has some truth to it, if you look at system components singularly. There is no doubt that a Porsche 911 is faster than a Ford Focus. However, if you look at a larger system as a whole, to stay in the picture let's take a bridge crossing a river at rush hour, the Focus and the 911 move at the same speed because of congestion -- a congestion that would occur even if everyone driving on that bridge were driving a 911. The primary goal is thus to make that bridge wider and not to give everyone a Porsche.

Maximizing throughput always tops optimizing raw performance. The idea of SOA in conjunction with autonomous computing networks decouples subsystems in a way that you get largely independent processing islands connected by one-way roads to which you can add arbitrary numbers of lanes (and arbitrary number of identical islands). So while an individual operation may indeed take a bit longer and the bandwidth requirements may be higher, the overall system can scale its capacity and throughput to infinity.

Still, for a quick reality check: Have you looked at what size packets IIOP or DCOM produce on the wire and at the number of network roundtrips they require for protocol negotiation? The scary thing about SOAP is that it is really very much in our face and relatively easy to comprehend. Thus people tend to pay more attention to it. If you compare common binary protocols to SOAP (considering a realistic mix of payloads), SOAP doesn't look all that horrible. Also, XML compresses really well, much better than binary data. All that being said, I know that the vendors (specifically Microsoft) are looking very closely at how to reduce the wire footprint of SOAP and I expect them to come around with proposals in the not too distant future.

Over in the comment view of that article, Stu Charlton raises some concerns and posts some questions. Here are some answers:

1) "No shared application state, everything must be passed through messages."  Every "service" oriented system I have ever witnessed has stated this as a goal, and eventually someone got sick of it and implemented a form of shared state. The GIT in COM, session variables in PL/SQL packages, ASP[.NET] Sessions, JSP HttpSession, common areas in CICS, Linda/JavaSpaces, Stateful Session Beans, Scratchpads / Blackboards, etc. Concern: No distributed computing paradigm has ever eliminated transient shared state, no matter how messy or unscalable it is.

Sessions are scoped to a conversation; what I mean is application-scoped state shared across sessions. Some of the examples you give are about session state, some are about application state. Session state can’t be avoided (although it can sometimes be piggybacked into the message flow) and is owned by a particular service. If you’ve started a conversation with a service, you need to go back to that service to continue the conversation. If the service itself is implemented using a local (load balance and/or failover) cluster that’s great, but you shouldn’t need to know about it. Application state that’s shared between multiple services provided by an application leads to co-location assumptions and is therefore bad.

2) "A customer record isn't uniquely identifiable in-memory and even not an addressable on-disk entity that's known throughout the system"  -- Question: This confuses me quite a bit. Are you advocating the abolishment of a primary key for a piece of shared data? If not, what do you mean by this: no notion of global object identity (fair), or something else?

I am saying that not all data can and should be treated alike. There is shared data whose realistic frequency of change is so low that it simply doesn’t deserve uniqueness (and be identified by a primary key in a central store). There is shared data for which a master copy exists, but of which many concurrent on-disk replicas and in-memory copies may safely float throughout the system as long as there is understanding about the temporal accuracy requirements as well as about the potential for concurrent modification. While there is always a theoretical potential for concurrent data modification, the reality of many systems is that records in many tables will practically never be concurrently accessed, because the information causing the change does not surface at two places at the same time. How many call center agents will realistically attempt to change a single customer’s address information at the same time? Lastly, there is data that should only be touched within a transaction and can and may only exist in a single place.

I am not abandoning the idea of a “primary key” or a unique customer number. I am saying that reflecting that uniqueness in in-memory state is rarely the right choice and rarely worth the hassle. Concurrent modification of data is rare, and there are techniques, such as the introduction of chronologies, to eliminate it in many cases. Even if you are booking into a financial account, you are just adding information to a uniquely identifiable set of data. You are not modifying the account itself; you add information to it. Counterexample: If you have an object that represents a physical device such as a printer, a sensor, a network switch or a manufacturing robot, in-memory identity immediately reflects the identity of the physical entity you are dealing with. These are cases where objects and object identity make sense. That direct correspondence rarely exists in business systems. Those deal with data about things, not the things themselves.

3) "In a services world, there are no objects, just data". – […] Anyway, I don't think anyone [sane] has advocated building fine-grained object model distributed systems for quite a few years. […] But the object oriented community has known that for quite some time, hence the "Facade" pattern, and the packaging/reuse principles from folks such as Robert C. Martin. Domain models may still exist in the implementation of the service, depending on the complexity of the service.

OOP is great for the inner implementation of a service (see above) and I am in line with you here. There are, however, plenty of people who still believe in object purity, and that’s why I am saying what I am saying.

4) "data record stored & retrieved from many different data sources within the same application out of a variety of motivations"  --- I assume all of these copies of data are read-only, with one service having responsibility for updates. I also assume you mean that some form of optimistic conflict checking would be involved to ensure no lost updates. Concern: Traditionally we have had serializable transaction isolation to protect us from concurrent anomalies. Will we still have this sort of isolation in the face of multiple cached copies across web services?

I think that absolute temporal accuracy is severely overrated and is more an engineering obsession than anything else. Amazon.com basically lies to the faces of millions of users each day by saying “only 2-4 items left in stock” or “Usually ships within 24 hours”. Can they give you to-the-second accurate information from their backend warehouse? Of course they don’t. They won’t even tell you when your stuff will ship once you’re through checkout and have given them your money. They’ll do so later – by email.

I also think that the risk of concurrent updates to records is – as outlined above – very low if you segment your data along the lines of the business use cases and not so much along the lines of what a DBA thinks is perfect form.

I’ll skip 5) and 6) (the answers are “Ok” and “If you want to see it that way”) and move on to
7) "Problematic assumptions regarding single databases vs. parallel databases for scalability" -- I'm not sure what the problem is here from an SOA perspective? Isn't this a physical data architecture issue, something encapsulated by your database's interface? As far as I know it's pretty transparent to me if Oracle decides to use a parallel query, unless I dig into the SQL plan. […]

“which may or may not be directly supported by your database system” is the half sentence to consider here as well. The Oracle cluster does it, SQL Server does it too, but there are other database systems out there, and there are also other ways of storing and accessing data than RDBMS.

8) "Strong contracts eliminate "illegal argument" errors" Question: What about semantic constraints? Or referential integrity constraints? XML Schemas are richer than IDL, but they still don't capture rich semantic constraints (i.e. "book a room in this hotel, ensuring there are no overlapping reservations" -- or "employee reporting relationships must be hierarchical"). […]

“Book a room in this hotel” is a message to the service. The requirements-motivated answer to this message is either “yes” or “no”. “No overlapping reservations” is a local concern of that service, and even “Sorry, we don’t know that hotel” is. The employee reporting relationships for a message relayed to an HR service can indeed be expressed by referential constraints in XSD; the validity of merging the message into the backend store is an internal concern of the service. The answer is “can do that” or “can’t do that”.

What you won’t get are failures like “the employee name has more than 80 characters and we don’t know how to deal with that”. Stronger contracts and automatic enforcement of these contracts reduce the number of stupid errors, side-effects and the combination of stupid errors and side effects to look for – at either endpoint.

9) "The vision of Web services as an integration tool of global scale exhibits these and other constraints, making it necessary to enable asynchronous behavior and parallel processing as a core principle of mainstream application design and don’t leave that as a specialty to the high-performance and super-computing space."  -- Concern: Distributed/concurrent/parallel computing is hard. I haven't seen much evidence that SOA/ web services makes this any easier. It makes contracts easier, and distributing data types easier. But it's up to the programming model (.NET, J2EE, or something else) to make the distributed/concurrent/parallel model easier. There are some signs of improvement here, but I'm skeptical there will be anything that breaks this stuff into the "mainstream" (I guess it depends on what one defines as mainstream)...

Oh, I wouldn’t be too sure about that. There are lots of things going on in that area that I know of but can’t talk about at present.

While SOA as a means of widespread systems integration is a solid idea, the dream of service-oriented "grid" computing isn't really economically viable unless the computation is very expensive. Co-locating processing & filtering as close as possible to the data source is still the key principle to an economic & performing system. (Jim Gray also has a recent paper on this on his website). Things like XQuery for integration and data federations (service oriented or not) still don't seem economically plausible until distributed query processors get a lot smarter and WAN costs go down.

Again, if the tools were up to speed, it would be economically feasible to do so. That’s going to be fixed. Even SOA-based grids sound much less like science fiction to me than they apparently do to you.

Categories: Architecture | IT Strategy

I am in a blogging mood today … Here are some thoughts around composite metadata. Sorry for the bold title ;)

* * *

Whenever I am asked what I consider the most important innovation of the CLR, I don’t hesitate to respond “extensible metadata” coming in the form of custom attributes. Everyone who has followed this blog for a while and looked at some of the source code I published knows that I am seriously in love with attributes. In fact, very few of the projects I write don’t include at least one class derived from Attribute and once you use the XmlSerializer, Enterprise Services or ASMX, there’s no way around using them.

In my keynote on contracts and metadata at the Norwegian Visual Studio .NET 2003 launch earlier this year, I used the sample that’s attached at the bottom of this article. It illustrates how contracts can be enforced by both schema validation and validation of object graphs based on the same set of constraints. In schema, the constraints are defined using metadata (restrictions) inside element or type definitions, and in classes, the very same restrictions can be applied using custom attributes, provided you have a sufficient set of attributes and the respective validation logic. In both cases, the data is run through a filter that’s driven by the metadata information. If either filter is used at the inbound and outbound channels of a service, contract enforcement is automatic and “contract trust” between services, as defined in my previous article, can be achieved. So far, so good.

In my example, the metadata instrumentation for a CLR type looks like this:

       public class addressType
       {
              // restriction attributes as in the sample; the attribute names
              // shown here are illustrative
              [StringPattern(@"\p{L}[\p{L}\p{P}0-9\s]*"), StringMaxLength(80)]
              public string City;
              public countryNameType Country;
              public countryCodeType CountryCode;
              [StringMaxLength(10)]
              public string PostalCode;
              [StringMaxLength(160)]
              public string AddressLine;
       }

… while the corresponding schema is a bit better factored and looks like this:

    <xsd:simpleType name="nameType">
        <xsd:restriction base="xsd:string">
            <xsd:pattern value="\p{L}[\p{L}\p{P}0-9\s]*" />
        </xsd:restriction>
    </xsd:simpleType>

    <xsd:complexType name="addressType">
        <xsd:sequence>
            <xsd:element name="City">
                <xsd:simpleType>
                    <xsd:restriction base="nameType">
                        <xsd:maxLength value="80" />
                    </xsd:restriction>
                </xsd:simpleType>
            </xsd:element>
            <xsd:element name="Country" type="countryNameType" />
            <xsd:element name="CountryCode" type="countryCodeType" />
            <xsd:element name="PostalCode">
                <xsd:simpleType>
                    <xsd:restriction base="xsd:string">
                        <xsd:maxLength value="10" />
                    </xsd:restriction>
                </xsd:simpleType>
            </xsd:element>
            <xsd:element name="AddressLine">
                <xsd:simpleType>
                    <xsd:restriction base="xsd:string">
                        <xsd:maxLength value="160" />
                    </xsd:restriction>
                </xsd:simpleType>
            </xsd:element>
        </xsd:sequence>
    </xsd:complexType>

The restrictions are expressed differently, but they are aspects of type in both cases and semantically identical. And both cases work and even the regular expressions are identical. All the sexiness of this example aside, there’s one thing that bugs me:

In XSD, I can create a new simple type by extending a base type with additional metadata like this

<xsd:simpleType name="nameType">
    <xsd:restriction base="xsd:string">
        <xsd:pattern value="\p{L}[\p{L}\p{P}0-9\s]*" />
    </xsd:restriction>
</xsd:simpleType>

which causes the metadata to be inherited by a subsequent element definition that, in turn, uses further metadata to augment the type definition with additional rules:

    <xsd:element name="City">
        <xsd:simpleType>
            <xsd:restriction base="nameType">
                <xsd:maxLength value="80" />
            </xsd:restriction>
        </xsd:simpleType>
    </xsd:element>

So, XSD knows how to do metadata inheritance on simple types. The basic storage type (xsd:string) isn’t changed by this augmentation; it’s just the validation rules that change, expressed by adding metadata to the type. The problem is that the CLR model isn’t directly compatible with this. You can’t derive from any of the simple types, and therefore you can’t project this schema directly onto a CLR type definition. Instead, I have to apply the metadata to every field or property, which is the equivalent of the XSD element declaration. The luxury of the <xsd:simpleType/> definition and inheritable metadata doesn’t exist. Or does it?

Well, using the following pattern, it indeed does. Almost.

Let’s forget for a moment that the nameType simple type definition above is a restriction of xsd:string and focus on what it really does for us: it encapsulates metadata. When we inherit it into the City element, an additional metadata item is added, resulting in a metadata composite of two rules – applied to the base type xsd:string.

So the rough equivalent of this, expressed in CLR terms, could look like this:

    // StringRestriction is a hypothetical constraint attribute; it carries
    // the pattern rule as metadata on the attribute itself.
    [StringRestriction(Pattern=@"\p{L}[\p{L}\p{P}0-9\s]*")]
    public class NameTypeStringAttribute : Attribute
    {  }

    public class addressType
    {
        [NameTypeString, StringRestriction(MaxLength=80)]
        public string City;
    }

Now we have an attribute NameTypeString(Attribute) that fulfills the same metadata containment function. The attribute has an attribute. In fact, we could even go further with this and introduce a dedicated “CityString” meta-type either by composition:


      // composite by composition: the metadata is attached as attributes
      // (StringRestriction again being the hypothetical constraint attribute)
      [NameTypeString, StringRestriction(MaxLength=80)]
      public class CityStringAttribute : Attribute
      {  }
 … or by inheritance:

       [StringRestriction(MaxLength=80)]
       public class CityStringAttribute : NameTypeStringAttribute
       {  }

Resulting in the simple field declaration:

    [CityString] public string City;

The declaration essentially tells us “stored as a string, following the contract rules as defined in the composite metadata of [CityString]”.

With that in place, one thing is still missing: how does the infrastructure tell that an attribute is indeed a composite, i.e. that the applicable set of metadata is the combination of the attribute itself and all attributes declared on it?

The answer is the following innocent looking marker interface:

    public interface ICompositeAttribute
    {  }

If that marker interface is found on an attribute, the attribute is considered a composite attribute and the infrastructure must (potentially recursively) consider attributes defined on this attribute in the same way as attributes that exist on the originally inspected element – for instance, a field.

    public class NameTypeStringAttribute : Attribute, ICompositeAttribute
    {   }

Why a marker interface and not just another attribute on the attribute? The answer is quite simple: Convenience. Using the marker interface, you can find composites simply with the following expression: *.GetCustomAttributes(typeof(ICompositeAttribute),true)

And why not use a base-class “CompositeAttribute”? Because that would be an unnecessary restriction for the composition of attributes. If only the marker interface is used, the composite can have any base attribute class, including those built into the system.
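The recursive search the infrastructure has to perform can be sketched roughly as follows. Only ICompositeAttribute is from the text; MaxLengthAttribute and the walker class are made-up names for illustration:

```csharp
using System;
using System.Collections;
using System.Reflection;

public interface ICompositeAttribute { }

// Made-up constraint attribute standing in for the sample's validation attributes.
public class MaxLengthAttribute : Attribute
{
    public int Length;
    public MaxLengthAttribute(int length) { Length = length; }
}

// A composite: it carries further metadata as attributes on itself.
[MaxLength(80)]
public class NameTypeStringAttribute : Attribute, ICompositeAttribute { }

public class AddressType
{
    [NameTypeString] public string City;
}

public static class MetadataWalker
{
    // Flattens the effective metadata of a member: the attributes on the
    // member itself plus, recursively, the attributes declared on every
    // composite attribute found along the way.
    public static ArrayList CollectAttributes(ICustomAttributeProvider provider)
    {
        ArrayList result = new ArrayList();
        // inherit=false: only directly declared attributes are considered here
        foreach (Attribute attribute in provider.GetCustomAttributes(false))
        {
            result.Add(attribute);
            if (attribute is ICompositeAttribute)
                result.AddRange(CollectAttributes(attribute.GetType()));
        }
        return result;
    }
}
```

Walking the City field this way yields both NameTypeStringAttribute and the MaxLength rule it carries, which is exactly the flattened rule set the validation filter needs.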

But wait, this is just one side of the composition story for attributes. There’s already a hint at an additional composition quality two short paragraphs up: *.GetCustomAttributes(typeof(ICompositeAttribute),true). The metadata search algorithm doesn’t only look for concrete attribute types; it also accepts interface types, which is what allows the above expression to work.

So what if an infrastructure like Enterprise Services didn’t use only concrete attributes, but also supported composable attributes, as illustrated here …

    public interface ITransactionAttribute
    {
        TransactionOption TransactionOption { get; }
    }

    public interface IObjectPoolingAttribute
    {
        int MinPoolSize { get; }
        int MaxPoolSize { get; }
    }


In that case, you would also be able to define composite attributes that standardize behavior for a certain class of ServicedComponents in your application that should all behave in a similar way, resulting in a declaration like this:

      public class StandardTransactionalPooledAttribute :
          Attribute, ITransactionAttribute, IObjectPoolingAttribute
      {
          // illustrative values; a real infrastructure would make these configurable
          public TransactionOption TransactionOption { get { return TransactionOption.Required; } }
          public int MinPoolSize { get { return 2; } }
          public int MaxPoolSize { get { return 16; } }
      }

      [StandardTransactionalPooled]
      public class MyComponent : ServicedComponent
      {  }

While it may seem like an “either/or” choice at first, both composition patterns illustrated here are useful: the one using ICompositeAttribute and the one based on the inherent composition qualities of interfaces. If you want to reuse a set of pre-built attributes, like the ones I am using to implement the constraints, the marker-interface solution is very cheap because the coding effort is minimal. If you are writing a larger infrastructure and want to give your users more control over what attributes do, allowing them to provide their own implementations, “interface-based attributes” may be the better choice.
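To sketch how an infrastructure could consume such interface-based attributes: the enum and the lookup helper below are illustrative stand-ins, not the actual System.EnterpriseServices types.

```csharp
using System;

// Illustrative stand-in for System.EnterpriseServices.TransactionOption.
public enum TransactionOption { Disabled, NotSupported, Supported, Required, RequiresNew }

public interface ITransactionAttribute
{
    TransactionOption TransactionOption { get; }
}

// A user-defined attribute; the infrastructure never needs to know this concrete type.
public class StandardTransactionalAttribute : Attribute, ITransactionAttribute
{
    public TransactionOption TransactionOption { get { return TransactionOption.Required; } }
}

[StandardTransactional]
public class MyComponent { }

public static class Infrastructure
{
    // The search is done by interface, not by concrete attribute type, so any
    // attribute class implementing ITransactionAttribute is found.
    public static TransactionOption GetTransactionOption(Type type)
    {
        object[] found = type.GetCustomAttributes(typeof(ITransactionAttribute), true);
        return found.Length > 0
            ? ((ITransactionAttribute)found[0]).TransactionOption
            : TransactionOption.Disabled;
    }
}
```

Because the lookup is keyed on the interface, users can swap in their own attribute implementations without the infrastructure changing at all.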

Download: MetadataTester.zip

Categories: Architecture | CLR

September 25, 2003
@ 03:27 PM
If you are a developer and don't live in the Netherlands (where, as is well known, SOA stands for "Sexueel Overdraagbare Aandoeningen" = "sexually transmitted diseases"), you may have heard by now that SOA stands for "service oriented architectures". In this article I am thinking aloud about what "services" means in SOA.
Categories: Architecture | IT Strategy

Philip Rieck makes a great point about the obsession with "revolutionary innovation" that quite a few people have. Little steps count too, he says, and there's good stuff in old things – and I absolutely agree. At TechEd in Barcelona I said in one of my talks that people should read fewer computer books published in 2003 and more of those published in 1973.

Categories: Architecture