April 2, 2008
@ 11:10 PM

Earlier today I hopefully gave a somewhat reasonable, simple answer to the question "What is a Claim?" Let's try the same with "Token":

In the WS-* security world, "Token" is really just a another name the security geniuses decided to use for "Handy package for all sorts of security stuff". The most popular type of token is the SAML (just say "samel") token. If the ladies and gentlemen designing and writing security platform infrastructure and frameworks are doing a good job you might want to know about the existence of such a thing, but otherwise be blissfully ignorant of all the gory details.

Tokens are meant to be a thing that you need to know about in much the same way you need to know about ... ummm... rebate coupons you can cut out of your local newspaper or all those funny books that you get in the mail. I have really no idea how the accounting works behind the scenes between the manufacturers and the stores, but it really doesn't interest me much, either. What matters to me is that we get $4 off that jumbo pack of diapers and we go through a lot of those these days with a 9 month old baby here at home. We cut out the coupon, present it at the store, four bucks saved. Works for me.

A token is the same kind of deal. You go to some (security) service, get a token, and present that token to some other service. The other service takes a good look at the token and figures whether it 'trusts' the token issuer and might then do some further inspection; if all is well you get four bucks off. Or you get to do the thing you want to do at the service. The latter is more likely, but I liked the idea for a moment.

Remember when I mentioned the surprising fact that people lie from time to time when I wrote about claims? Well, that's where tokens come in. The security stuff in a token is there to keep people honest and to make 'assertions' about claims. The security dudes and dudettes will say "Err, that's not the whole story", but for me it's good enough. It's actually pretty common (that'll be their objection) that there are tokens that don't carry any claims and where the security service effectively says "whoever brings this token is a fine person; they are ok to get in". It's like having a really close buddy relationship with the boss of the nightclub when you are having troubles with the monsters guarding the door. I'm getting a bit ahead of myself here, though.

In the post about claims I claimed that "I am authorized to approve corporate acquisitions with a transaction volume of up to $5Bln". That's a pretty obvious lie. If there was such a thing as a one-click shopping button for companies on some Microsoft Intranet site (there isn't, don't get any ideas) and I were to push it, I surely should not be authorized to execute the transaction. The imaginary "just one click and you own Xigg" button would surely have some sort of authorization mechanism on it.

I don't know what Xigg is assumed to be worth these days, but there is actually be a second authorization gate to check. I might indeed be authorized to do one-click shopping for corporate acquisitions, but even with my made-up $5Bln limit claim, Xigg may just be worth more that I'm claiming I'm authorized to approve. I digress.

How would the one-click-merger-approval service be secured? It would expect some sort of token that absolutely, positively asserts that my claim "I am authorized to approve corporate acquisitions with a transaction volume of up to $5Bln" is truthful and the one-click-merger-approval service would have to absolutely trust the security service that is making that assertion. The resulting token that I'm getting from the security service would contain the claim as an attribute of the assertion and that assertion would be signed and encrypted in mysterious (for me) yet very secure and interoperable ways, so that I can't tamper with it as much as I look at the token while having it in hands.

The service receiving the token is the only one able to crack the token (I'll get to that point in a later post) and look at its internals and the asserted attributes. So what if I were indeed authorized to spend a bit of Microsoft's reserves and I were trying to acquire Xigg at the touch of a button and, for some reason I wouldn't understand, the valuation were outside my acquisition limit? That's the service's job. It'd look at my claim, understand that I can't spend more than $5Bln and say "nope!" - and it would likely send email to SteveB under the covers. Trouble.

Bottom line: For a client application, a token is a collection of opaque (and mysterious) security stuff. The token may contain an assertion (saying "yep, that's actually true") about a claim or a set of claims that I am making. I shouldn't have to care about the further details unless I'm writing a service and I'm interested in some deeper inspection of the claims that have been asserted. I will get to that.

Before that, I notice that I talked quite a bit about some sort of "security service" here. Next post...


 
Categories: Architecture | SOA | CardSpace | WCF | Web Services

April 2, 2008
@ 01:20 PM

If you ask any search engine "What is a Claim?" and you mean the sort of claim used in the WS-* security space, you'll likely find an answer somewhere, but that answer is just as likely buried in a sea of complex terminology that is only really comprehensible if you have already wrapped your head around the details of the WS-* security model. I would have thought that by now there would be a simple and not too technical explanation of the concept that's easy to find on the Web, but I haven't really had success finding one. 

So "What is a Claim?" It's really simple.

A claim is just a simple statement like "I am Clemens Vasters", or "I am over 21 years of age", or "I am a Microsoft employee", or "I work in the Connected Systems Division", or "I am authorized to approve corporate acquisitions with a transaction volume of up to $5Bln". A claim set is just a bundle of such claims.

When I walk up to a service with some client program and want to do something on the service that requires authorization, the client program sends a claim set along with the request. For the client to know what claims to send along, the service lets it know about its requirements in its policy.

When a request comes in, this imaginary (U.S.) service looks at the request knowing "I'm a service for an online game  promoting alcoholic beverages!". It then it looks at the claim set, finds the "I am over 21 years of age" claim and thinks "Alright, I think we got that covered".

The service didn't really care who was trying to get at the service. And it shouldn't. To cover the liquor company's legal behind, they only need to know that you are over 21. They don't really need to know (and you probably don't want them to know) who is talking to them. From the client's perspective that's a good thing, because the client is now in a position to refuse giving out (m)any clues about the user's identity and only provide the exact data needed to pass the authorization gate. Mind that the claim isn't the date of birth for that exact reason. The claim just says "over 21".

Providing control over what claims are being sent to a service (I'm lumping websites, SOAP, and REST services all in the same bucket here) is one of the key reasons why Windows CardSpace exists, by the way. The service asks for a set of claims, you get to see what is being asked for, and it's ultimately your personal, interactive decision to provide or refuse to provide that information.

The only problem with relying on simple statements (claims) of that sort is that people lie. When you go to the Jack Daniel's website, you are asked to enter your date of birth before you can proceed. In reality, it's any date you like and an 10-year old kid is easily smart enough to figure that out.

All that complex security stuff is mostly there to keep people honest. Next time ...


 
Categories: Architecture | SOA | CardSpace | WCF | Web Services

Christian Weyer shows off the few lines of pretty straightforward WCF code & config he needed to figure out in order to set up a duplex conversation through BizTalk Services.


 
Categories: Architecture | SOA | BizTalk | WCF | Web Services | XML

Steve has a great analysis of what BizTalk Services means for Corzen and how he views it in the broader industry context.


 
Categories: Architecture | SOA | IT Strategy | Technology | BizTalk | WCF | Web Services

April 24, 2007
@ 08:28 PM

"ESB" (for "Enterprise Service Bus") is an acronym floating around in the SOA/BPM space for quite a while now. The notion is that you have a set of shared services in an enterprise that act as a shared foundation for discovering, connecting and federating services. That's a good thing and there's not much of a debate about the usefulness, except whether ESB is the actual term is being used to describe this service fabric or whether there's a concrete product with that name. Microsoft has, for instance, directory services, the UDDI registry, and our P2P resolution services that contribute to the discovery portion, we've got BizTalk Server as a scalable business process, integration and federation hub, we've got the Windows Communication Foundation for building service oriented applications and endpoints, we've got the Windows Workflow Foundation for building workflow-driven endpoint applications, and we have the Identity Platform with ILM/MIIS, ADFS, and CardSpace that provides the federated identity backplane.

Today, the division I work in (Connected Systems Division) has announced BizTalk Services, which John Shewchuk explains here and Dennis Pilarinos drills into here.

Two aspects that make the idea of a "service bus" generally very attractive are that the service bus enables identity federation and connectivity federation. This idea gets far more interesting and more broadly applicable when we remove the "Enterprise" constraint from ESB it and put "Internet" into its place, thus elevating it to an "Internet Services Bus", or ISB. If we look at the most popular Internet-dependent applications outside of the browser these days, like the many Instant Messaging apps, BitTorrent, Limewire, VoIP, Orb/Slingbox, Skype, Halo, Project Gotham Racing, and others, many of them depend on one or two key services must be provided for each of them: Identity Federation (or, in absence of that, a central identity service) and some sort of message relay in order to connect up two or more application instances that each sit behind firewalls - and at the very least some stable, shared rendezvous point or directory to seed P2P connections. The question "how does Messenger work?" has, from an high-level architecture perspective a simple answer: The Messenger "switchboard" acts as a message relay.

The problem gets really juicy when we look at the reality of what connecting such applications means and what an ISV (or you!) were to come up with the next cool thing on the Internet:

You'll soon find out that you will have to run a whole lot of server infrastructure and the routing of all of that traffic goes through your pipes. If your cool thing involves moving lots of large files around (let's say you'd want to build a photo sharing app like the very unfortunately deceased Microsoft Max) you'd suddenly find yourself running some significant sets of pipes (tubes?) into your basement even though your users are just passing data from one place to the next. That's a killer for lots of good ideas as this represents a significant entry barrier. Interesting stuff can get popular very, very fast these days and sometimes faster than you can say "Venture Capital".

Messenger runs such infrastructure. And the need for such infrastructure was indeed an (not entirely unexpected) important takeaway from the cited Max project. What looked just to be a very polished and cool client app to showcase all the Vista and NETFX 3.0 goodness was just the tip of a significant iceberg of (just as cool) server functionality that was running in a Microsoft data center to make the sharing experience as seamless and easy as it was. Once you want to do cool stuff that goes beyond the request/response browser thing, you easily end up running a data center. And people will quickly think that your application sucks if that data center doesn't "just work". And that translates into several "nines" in terms of availability in my book. And that'll cost you.

As cool as Flickr and YouTube are, I don't think of none of them or their brethren to be nearly as disruptive in terms of architectural paradigm shift and long-term technology impact as Napster, ICQ and Skype were as they appeared on the scene. YouTube is just a place with interesting content. ICQ changed the world of collaboration. Napster's and Skype's impact changed and is changing entire industries. The Internet is far more and has more potential than just having some shared, mashed-up places where lots of people go to consume, search and upload stuff. "Personal computing" where I'm in control of MY stuff and share between MY places from wherever I happen to be and NOT giving that data to someone else so that they can decorate my stuff with ads has a future. The pendulum will swing back. I want to be able to take a family picture with my digital camera and snap that into a digital picture frame at my dad's house at the push of a button without some "place" being in the middle of that. The picture frame just has to be able to stick its head out to a place where my camera can talk to it so that it can accept that picture and know that it's me who is sending it.

Another personal, and very concrete and real point in case: I am running, and I've written about that before, a custom-built (software/hardware) combo of two machines (one in Germany, one here in the US) that provide me and my family with full Windows Media Center embedded access to live and recorded TV along with electronic program guide data for 45+ German TV channels, Sports Pay-TV included. The work of getting the connectivity right (dynamic DNS, port mappings, firewall holes), dealing with the bandwidth constraints and shielding this against unwanted access were ridiculously complicated. This solution and IP telephony and video conferencing (over Messenger, Skype) are shrinking the distance to home to what's effectively just the inconvenience of the time difference of 9 hours and that we don't see family and friends in person all that often. Otherwise we're completely "plugged in" on what's going on at home and in Germany in general. That's an immediate and huge improvement of the quality of living for us, is enabled by the Internet, and has very little to do with "the Web", let alone "Web 2.0" - except that my Program Guide app for Media Center happens to be an AJAX app today. Using BizTalk Services would throw out a whole lot of complexity that I had to deal with myself, especially on the access control/identity and connectivity and discoverability fronts. Of course, as I've done it the hard way and it's working to a degree that my wife is very happy with it as it stands (which is the customer satisfaction metric that matters here), I'm not making changes for technology's sake until I'm attacking the next revision of this or I'll wait for one of the alternative and improving solutions (Orb is on a good path) to catch up with what I have.

But I digress. Just as much as the services that were just announced (and the ones that are lined up to follow) are a potential enabler for new Napster/ICQ/Skype type consumer space applications from innovative companies who don't have the capacity or expertise to run their own data center, they are also and just as importantly the "Small and Medium Enterprise Service Bus".

If you are an ISV catering shrink-wrapped business solutions to SMEs whose network infrastructure may be as simple as a DSL line (with dynamic IP) that goes into a (wireless) hub and is as locked down as it possibly can be by the local networking company that services them, we can do as much as we want as an industry in trying to make inter-company B2B work and expand it to SMEs; your customers just aren't playing in that game if they can't get over these basic connectivity hurdles.

Your app, that lives behind the firewall shield and NAT and a dynamic IP, doesn't have a stable, public place where it can publish its endpoints and you have no way to federate identity (and access control) unless you are doing some pretty invasive surgery on their network setup or you end up building and running run a bunch of infrastructure on-site or for them. And that's the same problem as the mentioned consumer apps have. Even more so, if you look at the list of "coming soon" services, you'll find that problems like relaying events or coordinating work with workflows are very suitable for many common use-cases in SME business applications once you imagine expanding their scope to inter-company collaboration.

So where's "Megacorp Enterprises" in that play? First of all, Megacorp isn't an island. Every Megacorp depends on lots of SME suppliers and retailers (or their equivalents in the respective lingo of the verticals). Plugging all of them directly into Megacorp's "ESB" often isn't feasible for lots of reasons and increasingly less so if the SME had a second or third (imagine that!) customer and/or supplier. 

Second, Megacorp isn't a uniform big entity. The count of "enterprise applications" running inside of Megacorp is measured in thousands rather than dozens. We're often inclined to think of SAP or Siebel when we think of enterprise applications, but the vast majority are much simpler and more scoped than that. It's not entirely ridiculous to think that some of those applications runs (gasp!) under someone's desk or in a cabinet in an extra room of a department. And it's also not entirely ridiculous to think that these applications are so vertical and special that their integration into the "ESB" gets continuously overridden by someone else's higher priorities and yet, the respective business department needs a very practical way to connect with partners now and be "connectable" even though it sits deeply inside the network thicket of Megacorp. While it is likely on every CIO's goal sheet to contain that sort of IT anarchy, it's a reality that needs answers in order to keep the business bring in the money.

Third, Megacorp needs to work with Gigacorp. To make it interesting, let's assume that Megacorp and Gigacorp don't like each other much and trust each other even less. They even compete. Yet, they've got to work on a standard and hence they need to collaborate. It turns out that this scenario is almost entirely the same as the "Panic! Our departments take IT in their own hands!" scenario described above. At most, Megacorp wants to give Gigacorp a rendezvous and identity federation point on neutral ground. So instead of letting Gigacorp on their ESB, they both hook their apps and their identity infrastructures into the ISB and let the ISB be the mediator in that play.

Bottom line: There are very many solution scenarios, of which I mentioned just a few, where "I" is a much more suitable scope than "E". Sometimes the appropriate scope is just "I", sometimes the appropriate scope is just "E". They key to achieve the agility that SOA strategies commonly promise is the ability to do the "E to I" scale-up whenever you need it in order to enable broader communication. If you need to elevate one or a set services from your ESB to Internet scope, you have the option to go and do so as appropriate and integrated with your identity infrastructure. And since this all strictly WS-* standards based, your "E" might actually be "whatever you happen to run today". BizTalk Services is the "I".

Or, in other words, this is a pretty big deal.


 
Categories: Architecture | SOA | IT Strategy | Microsoft | MSDN | BizTalk | WCF | Web Services

Recently, a gentleman from Switzerland wrote me an email after attending the “WinFX Tour” presentations in Zurich. He is a business consultant advising corporations on the IT strategy and an IT industry veteran with his first programming work dating as long back as 1962. He was quite interested in the Workflow part of my presentation, but wrote me that he thinks that those abstraction efforts go the wrong way. He sees the fundamental gap between business and IT widening and sees very little hope for the two sides to ever find a way to communicate effectively with each other. In his view, IT isn’t truly interested in the reality of business. He wrote me a very long email with several statements and questions, which I won’t quote – the (very long) reply below should give you enough context: 

Your main concern is, in my words, about the disconnect between the reality of the business vs. the snapshot of a perceived business reality that is translated into a software system. I say “perceived” because the capturing of the actual business reality is done by analysts who are on the fence between being business experts and IT experts and even though they would ideally be geniuses in both worlds to do that translation, they often are coming down on one side of that fence in terms of their core competencies.  

The only way to close that gap is to pull people off that fence onto the business side and enable them to capture the reality of the business and the way the business processes flow with tools that fit their needs and don’t demand that they are programmers or even have the sense of abstraction that a software designer or process analyst possesses. Our industry is only starting to understand what is required to achieve this and we are certainly thinking hard about these problems. 

You state that the Business/IT gap cannot be bridged. I do not fully agree with that assessment. I think what you are observing is a particular effect of software architecture and implementation as it exists today. You are truly an “industry veteran” you can certainly see much clearer how software has evolved since you got into the trade in 1962 than I can as a relative youngling. However, my (humble) observation is that the fundamental concepts of business software design haven’t changes all that much since then. A business application is a scoped set of siloed functionality built for a set of predefined purposes and whether the user interacts with the system through batch jobs, green screens, web sites or whether the system is made up of 5000 identical fat client applications with identical logic that talk to a central database is merely an implementation detail. The tradition of (interactive) business software is very much that we’ve got a system with some sort of menu screen or other form of selecting the task you want to perform with the system and any number of forms/screens/dialogs with which you can interact with the system. The reason for your observation of IT conveniently neglecting at least 20% of what they are told to do is not only caused by them not understanding the business, but also caused by them being not in control of the monsters they create because the scope grows too big, everything is tightly coupled to everything else, and a lot of functionalities are crammed together in ”multi-purpose” user interfaces that often make changes or adjustments mutually exclusive and hence “impossible”. 

The actual business process is often external to the software or if the software has workflow guidance, the workflows are not taking all the “offline” activities into account and the process becomes lossy. The way that most applications are built today is that they present the user with a grab-bag of tools and procedures and leaves it up to them to navigate through it. Even worse, business processes are often changed to fit the constraints of software – and not the other way around. Testimony for this is that the organizational structure of many companies is hardly recognizable once the SAP/PeopleSoft/Oracle/etc. ERP consultants have left the building. 

So what’s changing with the SOA/BPM “hype” as you call it? Or I shall better say: What’s the opportunity?  

First, we’re working hard to get to a point where we can build interoperable, autonomous pieces of software with well-defined interfaces where we don’t have to spend 80% of our time and money trying to make those pieces communicate with each other. Before the industry consensus around XML, SOAP and the WS-* specifications this most fundamental requirement for breaking up the solution silos simple didn’t exist (despite previous efforts like DCE or CORBA). I am not saying that we have completely arrived at that point, but we’ve got a better foundation than ever to build composable, loosely coupled, distributed systems that can interact irrespective of their implementation specifics. 

Second, we are starting to see growing insight on the IT side for the need of a common understanding of the idea of “services”. Any employee, department, division or vendor assumes various roles within an organization and renders a certain portfolio of services towards the organization. The notion of service-oriented architecture speaks about writing software that fits into the organization instead of fitting the organization to the software. Just like an employee assumes roles, software assumes roles as well. From an architecture perspective, people and software are peers collaborating on the same business task. The former are just a lot more flexible than the latter. 

Business processes (or bureaucracy), whether formalized or ad-hoc, are driven by the flow of information. If the information flow is not central to the execution of the process you have friction and efficiency suffers. With a paper-bound “offline” process where all information flows in a (paper-) file folder and on forms carried by courier from department to department there’s arguably a better and more complete information flow than in an environment where information is scattered around dozens of different computer systems where the flow of information and the flow of tasks are disconnected and only knitted together by people shuffling mice around on their desks. 

The opportunity is right at that point. With the technology we have available and a common notion of “services”, the line between the implementation of deterministic work (done by programs) and non-deterministic work (done by humans) begins to blur. It is fairly easy today to write chat-bots, mail-bots, or speech-enabled services that allow rich computer/human interaction within their respective scope. It is likewise fairly easy to expose application to application services that allow exchanging rich data across system and platform boundaries using Web Services. The concrete form of how a service interacts with a peer depends on who the peer is. If you have an address book lookup service, it may have all of these capabilities at once. With that, services can be integrated into real-life ad-hoc scenarios singly or in combination because they are built to satisfy specific roles and their capabilities are exposed for (re)use in arbitrary contexts.  

When you have a service that’s specialized on creating complex sales offers and you are a salesman on the road an sit in a taxi, it’s absolutely fathomable to build a solution that allows you to call in, identify yourself towards a voice service and ask for an baseline offer for 500 units of A and 300 units of B for a specific customer and have the result, with all applicable rebates and possible delivery time frames, including shipping cost and considering the schedule of the container freighter ship from China, pushed onto your hotel fax or by email or SMS/MMS. However, the question of how realistic it is to build that service purely depends on how easy it is to make all the necessary backend systems talk to each other and wire up all the roles into a process that can jointly accomplish the task. And it also depends on how well any necessary human intervention into that process can be integrated into the respective flow. Assuming that the resulting offer would cross a certain threshold in terms of the total order amount, the offer might require approval by a manager – the manager renders a “decision service” towards the process and might so by responding to an email that is sent to him/her by a program and will be evaluated by a program.  

Ideally, you could teach a service (through mail, speech or chat) the workflow ad-hoc just as you would tell an administrative assistant a sequence of activities. “Get A and B, do C, let Mike take a look at it and send it to me”. The information can be parsed, mapped to activities, the activities can be wired up to a one-off workflow and the workflow can be launched. The crux is that you need to have those individual capabilities and activities catalogued and available in a fashion that allows composition. And the above example of the approval manager goes to show that people’s roles and capabilities must be part of that same catalogue. 

We (Microsoft) are already shipping and will ship even more building blocks not only for creating such services and workflows, but we also have an increasingly complete federated identity and access control infrastructure that allows to realize all that decentralized interaction in a secure fashion. From a purely technical perspective, the above scenario is not utopia. We have every single component in place to let customers build this, voice recognition included. However, mind that I am not saying “very easy”. 

“BPM” tools such as the Windows Workflow Foundation or BizTalk Orchestration are all about putting the process into the center. Services are all about creating easy-to-integrate, loosely coupled, business-aligned pieces of functionality that can be used and reused in as many contexts as a role in a business can be used in different contexts. A workflow may just be one step that you just have in your head as you are interacting with the address lookup service or it may be a more formalized workflow with several steps and intertwined people-based and program-based activities. The realization here is that while individual roles in an organization are relatively sticky, the way that the organization acts across those roles and tunes the rules for the roles is very dynamic. BPM tools are precisely about shaping and reshaping the flow and rules quickly and deploying those instantly into the business environment. 

The design of the Windows Workflow Foundation also recognizes that business processes, especially those that run for days and weeks rather than seconds or minutes, are never set in stone. If you’ve got a (very) long running master workflow that were, for instance, tracking an insurance policy and applies rebates and handles claims as time progresses and suddenly the client comes around and sues the insurance company, that policy certainly no longer belongs into the same bucket as the other 100000s of policies that are being tracked. So for such unforeseeable circumstances, the foundation makes it possible to jump right in, assess the status and redirect or reshape the respective flow on a case-by-case basis and if the new action that you add into the flow in order to terminate it is merely that you hand off the entire case with all of its status to your legal department’s “dropoff service”. 

You write about SO/A and BPM as an “effort to extend the borders of IT”. From my perspective it’s rather the attempt by IT to humbly fit itself into the ever changing dynamic nature of business. On an industry level we have started to realize that we need to get away from the thinking that applications are silos and that nobody should factually care about whether he/she interacts with a “CRM” or “ERP” system or needs to navigate across a dozen of intranet websites to get a job done. The whole notion of “we build a loan application handling program” is indeed misguided. That’s the disconnect. 

But how alien is the notion of building a library of services that are not wired up into a thing that you can install and instantly run as a program? We hire people into organizations who have a broad education if which we only tap a fraction at any given point in time, but we can trust that we can instantly tap some other capability as the process changes. Which CEO/CFO/CIO will commit to a project that “educates” a software system to have a broad spectrum of capabilities that may or may not be used in the circumstances of “now” but which may become a pressing necessity as you need to quickly adapt to a change in the business? How strange of an idea is it that you might produce a software package that consists of hundreds of roles and thousands of activities but none of them are connected in any way, because that’s the job of the space that’s intentionally left blank and undefined to host the customized business process? How does the buyer justify the expense? What can the vendor charge? Would you continually service and update a “dormant” software-based capability to the latest policies, laws and regulations so that it can be used whenever the need arises? 

All that comes back to a completely different communication breakdown: How does IT explain that sort of perspective to the business stakeholders? The great challenge is not in the bits, it’s in the heads.


 
Categories: Architecture | SOA

Inside the big house....

Back in December of last year and about two weeks before I publicly announced that I will be working from Microsoft, I started a nine-part series on REST/POX* programming with Indigo WCF. (1, 2, 3, 4, 5, 6, 7, 8, 9). Since then, the WCF object model has seen quite a few feature and usability improvements across the board and those are significant enough to justify that I rewrite the entire series to get it up to the February CTP level and I will keep updating it through Vista/WinFX Beta2 and as we are marching towards our RTM. We've got a few changes/extensions in our production pipeline to make the REST/POX story for WCF v1 stronger and I will track those changes with yet another re-release of this series.

Except in one or two occasions, I haven't re-posted a reworked story on my blog. This here is quite a bit different, because of it sheer size and the things I learned in the process of writing it and developing the code along the way. So even though it is relatively new, it's already due for an end-to-end overhaul to represent my current thinking. It's also different, because I am starting to cross-post content to http://blogs.msdn.com/clemensv with this post; however http://friends.newtelligence.net/clemensv remains my primary blog since that runs my engine ;-)

Listening

The "current thinking" is of course very much influenced by now working for the team that builds WCF instead of being a customer looking at things from the outside. That changes the perspective quite a bit. One great insight I gained is how non-dogmatic and customer-oriented our team is. When I started the concrete REST/POX work with WCF back in last September (on the customer side still working with newtelligence), the extensions to the HTTP transport that enabled this work were just showing up in the public builds and they were sometimes referred to as the "Tim/Aaaron feature". Tim Ewald and Aaron Skonnard had beat the drums for having simple XML (non-SOAP) support in WCF so loudly that the team investigated the options and figured that some minimal changes to the HTTP transport would enable most of these scenarios**. Based on that feature, I wrote the set of dispatcher extensions that I've been presenting in the V1 of this series and newtellivision as the applied example did not only turn out to be a big hit as a demo, it also was one of many motivations to give the REST/POX scenario even deeper consideration within the team.

REST/POX is a scenario we think about as a first-class scenario alongside SOAP-based messaging - we are working with the ASP.NET Atlas team to integrate WCF with their AJAX story and we continue to tweak the core WCF product to enable those scenarios in a more straightforward fashion. Proof for that is that my talk (PPT here) at the MIX06 conference in Las Vegas two weeks ago was entirely dedicated to the non-SOAP scenarios.

What does that say about SOAP? Nothing. There are two parallel worlds of application-level network communication that live in peaceful co-existence:

  • Simple point-to-point, request/response scenarios with limited security requirements and no need for "enterprise features" along the lines of reliable messaging and transaction integration.
  • Rich messaging scenarios with support for message routing, reliable delivery, discoverable metadata, out-of-band data, transactions, one-way and duplex, etcetc.

The Faceless Web

The first scenario is the web as we know it. Almost. HTTP is an incredibly rich application protocol once you dig into RFC2616 and look at the methods in detail and consider response codes beyond 200 and 404. HTTP is strong because it is well-defined, widely supported and designed to scale, HTTP is weak because it is effectively constrained to request/response, there is no story for server-to-client notifications and it abstracts away the inherent reliability of the transmission-control protocol (TCP). These pros and cons lists are not exhaustive.

What REST/POX does is to elevate the web model above the "you give me text/html or */* and I give you application/x-www-form-urlencoded" interaction model. Whether the server punts up markup in the form of text/html or text/xml or some other angle-bracket dialect or some raw binary isn't too interesting. What's changing the way applications are built and what is really creating the foundation for, say, AJAX is that the path back to the server is increasingly XML'ised. PUT and POST with a content-type of text/xml is significantly different from application/x-www-form-urlencoded. What we are observing is the emancipation of HTTP from HTML to a degree that the "HT" in HTTP is becoming a misnomer. Something like IXTP ("Interlinked XML Transport Protocol" - I just made that up) would be a better fit by now.

The astonishing bit in this is that there has been been no fundamental technology change that has been driving this. The only thing I can identify is that browsers other than IE are now supporting XMLHTTP and therefore created the critical mass for broad adoption. REST/POX rips the face off the web and enables a separation of data and presentation in a way that mashups become easily possible and we're driving towards a point where the browser cache becomes more of an application repository than merely a place that holds cacheable collateral. When developing the newtellivision application I have spent quite a bit of time on tuning the caching behavior in a way that HTML and script are pulled from the server only when necessary and as static resources and all actual interaction with the backend services happens through XMLHTTP and in REST/POX style. newtellivision is not really a hypertext website, it's more like a smart client application that is delivered through the web technology stack.

Distributed Enterprise Computing

All that said, the significant investments in SOAP and WS-* that were made my Microsoft and industry partners such as Sun, IBM, Tibco and BEA have their primary justification in the parallel universe of highly interoperable, feature-rich intra and inter-application communication as well as in enterprise messaging. Even though there was a two-way split right through through the industry in the 1990s with one side adopting the Distributed Computing Environment (DCE) and the other side driving the Common Object Request Broker Architecture (CORBA), both of these camps made great advances towards rich, interoperable (within their boundaries) enterprise communication infrastructures. All of that got effectively killed by the web gold-rush starting in 1994/1995 as the focus (and investment) in the industry turned to HTML/HTTP and to building infrastructures that supported the web in the first place and everything else as a secondary consideration. The direct consequence of the resulting (even if big) technology islands hat sit underneath the web and the neglect of inter-application communication needs was that inter-application communication has slowly grown to become one of the greatest industry problems and cost factors. Contributing to that is that the average yearly number of corporate mergers and acquisitions has tripled compared to 10-15 years ago (even though the trend has slowed in recent years) and the information technology dependency of today's corporations has grown to become one of the deciding if not the deciding competitive factor for an ever increasing number of industries.

What we (the industry as a whole) are doing now and for the last few years is that we're working towards getting to a point where we're both writing the next chapter of the story of the web and we're fixing the distributed computing story at the same time by bringing them both onto a commonly agreed platform. The underpinning of that is XML; REST/POX is the simplest implementation. SOAP and the WS-* standards elevate that model up to the distributed enterprise computing realm.

If you compare the core properties of SOAP+WS-Adressing and the Internet Protocol (IP) in an interpretative fashion side-by-side and then also compare the Transmission Control Protocol (TCP) to WS-ReliableMessaging it may become quite clear to you what a fundamental abstraction above the networking stacks and concrete technology coupling the WS-* specification family has become. Every specification in the long list of WS-* specs is about converging and unifying formerly proprietary approaches to messaging, security, transactions, metadata, management, business process management and other aspects of distributed computing into this common platform.

Convergence

The beauty of that model is that it is an implementation superset of the web. SOAP is the out-of-band metadata container for these abstractions. The key feature of SOAP is SOAP:Header, which provides a standardized facility to relay the required metadata alongside payloads. If you are willing to constrain out-of-band metadata to one transport or application protocol, you don't need SOAP.

There is really very little difference between SOAP and REST/POX in terms of the information model. SOAP carries headers and HTTP carries headers. In HTTP they are bolted to the protocol layer and in SOAP they are tunneled through whatever carries the envelope. [In that sense, SOAP is calculated abuse of HTTP as a transport protocol for the purpose of abstraction.] You can map WS-Addressing headers from and to HTTP headers.

The SOAP/WS-* model is richer, more flexible and more complex. The SOAP/WS-* set of specifications is about infrastructure protocols. HTTP is an application protocol and therefore it is naturally more constrained - but has inherently defined qualities and features that require an explicit protocol implementation in the SOAP/WS-* world; one example is the inherent CRUD (create, read, update, delete) support in HTTP that is matched by the explicitly composed-on-top WS-Transfer protocol in SOAP/WS-*

The common platform is XML. You can scale down from SOAP/WS-* to REST/POX by putting the naked payload on the wire and rely on HTTP for your metadata, error and status information if that suits your needs. You can scale up from REST/POX to SOAP/WS-* by encapsulating payloads and leverage the WS-* infrastructure for all the flexibility and features it brings to the table. [It is fairly straightforward to go from HTTP to SOAP/WS-*, and it is harder to go the other way. That's why I say "superset".]

Doing the right thing for a given scenario is precisely what are enabling in WCF. There is a place for REST/POX for building the surface of the mashed and faceless web and there is a place for SOAP for building the backbone of it - and some may choose to mix and match these worlds. There are many scenarios and architectural models that suit them. What we want is

One Way To Program

* REST=REpresentational State Transfer; POX="Plain-Old XML" or "simple XML"


 
Categories: Architecture | SOA | MIX06 | Technology | Web Services

To (O/R) map or not to map.

The monthly discussion about the benefits and dangers of O/R mapping is making rounds on one of the mailing lists that I am signed up to. One big problem in this space - from my experience of discussing this through with a lot of people over and over – is that O/R mapping is one of those things where the sheer wish for an elegant solution to the data/object schism obscures most of the rational argumentation. If an O/R mapper provides a nice programming or tooling experience, developers (and architects) are often willing to accept performance hits and a less-than-optimal tight coupling to the data model, because they are lured by the aesthetics of the abstraction.

Another argument I keep hearing is that O/R mapping yields a significant productivity boost. However, if that were the case and if using O/R mapping would shorten the average development cost in a departmental development project by – say – a quarter or more, O/R mapping would likely have taken over the world by now. It hasn't. And it's not that the idea is new. It’s been around for well more than a decade.

To me, O/R mapping is one of the unfortunate consequences of trying to apply OOP principles to anything and everything. For "distributed objects", we’re fixing that with the service orientation idea and the consequential constraints when we talk about the network edge of applications. It turns out that the many of the same principles apply to the database edge as well. The list below is just for giving you the idea. I could write a whole article about this and I wish I had the time:

  • Boundaries are explicit => Database access is explicit
  • Services avoid coupling (autonomy) => Database schema and in-process data representation are disjoint and mapped explicitly
  • Share schema not code => Query/Sproc result sets and Sproc inputs form data access schema (aliased result sets provide a degree of separation from phys. schema)

In short, I think the dream of transparent O/R mapping is the same dream that fueled the development of fully transparent distributed objects in the early days of DSOM, CORBA and (D)COM when we all thought that'd just work and were neglecting the related issues of coupling, security, bandwidth, etc.

Meanwhile, we’ve learned the hard way that even though the idea was fantastic, it was rather naïve to apply local development principles to distributed systems. The same goes for database programming. Data is the most important thing in the vast majority of applications. Every class of data items (table) surround special considerations: read-only, read/write, insert-only; update frequency, currency and replicability; access authorization; business relevance; caching strategies; etcetc. 

Proper data management is the key to great architecture. Ignoring this and abstracting data access and data management away just to have a convenient programming model is … problematic.

And in closing: Many of the proponents of O/R mapping that I run into (and that is a generalization and I am not trying to offend anyone – just an observation) are folks who don't know SQL and RDBMS technology in any reasonable depth and/or often have no interest in doing so. It may be worth exploring how tooling can better help the SQL-challenged instead of obscuring all data access deep down in some framework and make all data look like a bunch of local objects. If you have ideas, shoot. Comment section is open for business.


 
Categories: Architecture | SOA

See Part 1

Before we can do anything about deadlocks or deal with similar troubles, we first need to be able to tell that we indeed have a deadlock situation. Finding this out is a matter of knowing the respective error codes that your database gives you and a mechanism to bubble that information up to some code that will handle the situation. So before we can think about and write the handling logic for failed/failing but safely repeatable transactions, we need to build a few little things. The first thing we’ll need is an exception class that will wrap the original exception indicating the reason for the transaction failure. The new exception class’s identity will later serve to filter out exceptions in a “catch” statement and take the appropriate actions.

using System;
using System.Runtime.Serialization;

namespace newtelligence.EnterpriseTools.Data
{
   [Serializable]
   public class RepeatableOperationException : Exception
   {
       public RepeatableOperationException():base()
       {
       }

       public RepeatableOperationException(Exception innerException)
           :base(null,innerException)
       {
       }

       public RepeatableOperationException(string message, Exception innerException)
           :base(message,innerException)
       {
       }

       public RepeatableOperationException(string message):base(message)
       {
       }

        public RepeatableOperationException(
          SerializationInfo serializationInfo,
          StreamingContext streamingContext)
            :base(serializationInfo,streamingContext)
        {
        }

        public override void GetObjectData(
           System.Runtime.Serialization.SerializationInfo info,
           System.Runtime.Serialization.StreamingContext context)
        {
            base.GetObjectData (info, context);
        }
   }
}

Having an exception wrapper with the desired semantics, we know need to be able to figure out when to replace the original exception with this wrapper and re-throw it up on the call stack. The idea is that whenever you execute a database operation – or, more generally, any operation that might be repeatable on failure – you will catch the resulting exception and run it through a factory, which will analyze the exception and wrap it with the RepeatableOperationException if the issue at hand can be resolved by re-running the transaction. The (still a little naïve) code below illustrates how to such a factory in the application code. Later we will flesh out the catch block a little more, since we will lose the original call stack if we end up re-throwing the original exception like shown here:

Try
{
   dbConnection.Open();
   sprocUpdateAndQueryStuff.Parameters["@StuffArgument"].Value = argument;
   result = this.GetResultFromReader( sprocUpdateAndQueryStuff.ExecuteReader() );
}
catch( Exception exception )
{
   throw RepeatableOperationExceptionMapper.MapException( exception );                           
}
finally
{
   dbConnection.Close();
}

The factory class itself is rather simple in structure, but a bit tricky to put together, because you have to know the right error codes for all resource managers you will ever run into. In the example below I put in what I believe to be the appropriate codes for SQL Server and Oracle (corrections are welcome) and left the ODBC and OLE DB factories (for which would have to inspect the driver type and the respective driver-specific error codes) blank. The factory will check out the exception data type and delegate mapping to a private method that is specialized for a specific managed provider.

using System;
using System.Data.SqlClient;
using System.Data.OleDb;
using System.Data.Odbc;
using System.Data.OracleClient;

namespace newtelligence.EnterpriseTools.Data
{
   public class RepeatableOperationExceptionMapper
   {
        /// <summary>
        /// Maps the exception to a Repeatable exception, if the error code
        /// indicates that the transaction is repeatable.
        /// </summary>
        /// <param name="sqlException"></param>
        /// <returns></returns>
        private static Exception MapSqlException( SqlException sqlException )
        {
            switch ( sqlException.Number )
            {
                case -2: /* Client Timeout */
                case 701: /* Out of Memory */
                case 1204: /* Lock Issue */
                case 1205: /* Deadlock Victim */
                case 1222: /* Lock Request Timeout */
                case 8645: /* Timeout waiting for memory resource */
                case 8651: /* Low memory condition */
                    return new RepeatableOperationException(sqlException);
                default:
                    return sqlException;
            }
        }

        private static Exception MapOleDbException( OleDbException oledbException )
        {
            switch ( oledbException.ErrorCode )
            {
                default:
                    return oledbException;
            }
        }

        private static Exception MapOdbcException( OdbcException odbcException )
        {
            return odbcException;           
        }

        private static Exception MapOracleException( OracleException oracleException )
        {
            switch ( oracleException.Code )
            {
                case 104:  /* ORA-00104: Deadlock detected; all public servers blocked waiting for resources */
                case 1013: /* ORA-01013: User requested cancel of current operation */
                case 2087: /* ORA-02087: Object locked by another process in same transaction */
                case 60:   /* ORA-00060: Deadlock detected while waiting for resource */
                    return new RepeatableOperationException( oracleException );
                default:
                    return oracleException;
            }
        }

        public static Exception MapException( Exception exception )
        {
            if ( exception is SqlException )
            {
                return MapSqlException( exception as SqlException );
            }
            else if ( exception is OleDbException )
            {
                return MapOleDbException( exception as OleDbException );
            }
            else if (exception is OdbcException )
            {
                return MapOdbcException( exception as OdbcException );
            }
            else if (exception is OracleException )
            {
                return MapOracleException( exception as OracleException );
            }
            else
            {
                return exception;
            }
        }
   }
}

With that little framework of two classes, we can now selectively throw exceptions that convey whether a failed/failing transaction is worth repeating. Next step: How do we do actually run such repeats and make sure we neither lose data nor make the user unhappy in the process? Stay tuned.


 
Categories: Architecture | SOA | Enterprise Services | MSMQ

Deadlocks and other locking conflicts that cause transactional database operations to fail are things that puzzle many application developers. Sure, proper database design and careful implementation of database access (and appropriate support by the database engine) should take care of that problem, but it cannot do so in all cases. Sometimes, especially under stress and other situations with high lock contention, a database just has not much of a choice but picking at least one of the transactions competing for the same locks as the victim in resolving the deadlock situation and then aborts the chosen transaction. Generally speaking, transactions that abort and roll back are a good thing, because this behavior guarantees data integrity. In the end, we use transaction technology for those cases where data integrity is at risk. What’s interesting is that even though transactions are a technology that is explicitly about things going wrong, the strategy for dealing with failing transaction is often not much more than to bubble the problem up to the user and say “We apologize for the inconvenience. Please press OK”.

The appropriate strategy for handling a deadlock or some other recoverable reason for a transaction abort on the application level is to back out of the entire operation and to retry the transaction. Retrying is a gamble that the next time the transaction runs, it won’t run into the same deadlock situation again or that it will at least come out victorious when the database picks its victims. Eventually, it’ll work. Even if it takes a few attempts. That’s the idea. It’s quite simple.

What is not really all that simple is the implementation. Whenever you are using transactions, you must make your code aware that such “good errors” may occur at any time. Wrapping your transactional ODBC/OLEDB/ADO/ADO.NET code or calls to transactional Enterprise Services or COM+ components with a try/catch block, writing errors to log-files and showing message boxes to users just isn’t the right thing to do. The right thing is to simply do the same batch of work again and until it succeeds.

The problem that some developers seem to have with “just retry” is that it’s not so clear what should be retried. It’s a problem of finding and defining the proper transaction scope. Especially when user interaction is in the picture, things easily get very confusing. If a user has filled in a form on a web page or some dialog window and all of his/her input is complete and correct, should the user be bothered with a message that the update transaction failed due to a locking issue? Certainly not. Should the user know when the transaction fails because the database is currently unavailable? Maybe, but not necessarily. Should the user be made aware that the application he/she is using is for some sudden reason incompatible with the database schema of the backend database? Maybe, but what does Joe in the sales department do with that valuable piece of information?

If stuff fails, should we just forget about Joe’s input and tell him to come back when the system is happier to serve him? So, in other words, do we have Joe retry the job? That’s easy to program, but that sort of strategy doesn’t really make Joe happy, does it?

So what’s the right thing to do? One part of the solution is a proper separation between the things the user (or a program) does and the things that the transaction does. This will give us two layers and “a job” that can be handed down from the presentation layer down to the “transaction layer”. Once this separation is in place, we can come up with a mechanism that will run those jobs in transactions and will automate how and when transactions are to be retried. Transactional MSMQ queues turn out to be a brilliant tool to make this very easy to implement. More tomorrow. Stay tuned.


 
Categories: Architecture | SOA | Enterprise Services | MSMQ

I am presently doing some intense research on services, service patterns, message exchange patterns and many other issues related to services (No surprise there). However, I can't do that without external help and since many people are reading my blog, I can just as well start asking around right here:

I would like to get in touch with companies (preferrably insurances and banks) who afford a corporate history department. The ambitious goal I have is to reconstruct a few banking or insurance or purchasing business processes of ca. 1955-1965. I have come to believe that there is a lot, a lot to be learned there that will be very useful to what we're all doing. The deal is that if you share, I share whatever I have as soon as I have it. My contact address is clemensv@newtelligence.com


 
Categories: SOA

I get emails like that very frequently. I have some news.

Short story: Microsoft is still willing and working to publish the application that I presented at TechEd Europe (see Benjamin's report) and they keep telling me that it will come out. Apparently there is a lot of consensus building to be done to get a big sample application out of the door. So there's nothing to be found on msdn, yet.

Little known secret: There are 15 lucky indivduals who have already received (hand-delivered) the Proseware code as a technical preview under a non-disclosure agreement. Because we (newtelligence) designed and wrote the sample application, we have permission to distribute the complete sample to participants of our SOA workshops and seminars.

So if you want to get yours hands on it, all you need to do is to send mail to training@newtelligence.com to sign up for one of the public events [Next published date is Dec 1-3, and the event is held in German, unless we get swamped with international inquiries] or you send email to the same address asking for an on-site workshop delivery. At this time, we (and MS) bind the code sample to workshop attendance so that you really understand why the application was built like it's built and that you fully understand the guidance that the application implicitly and explicitly carries (and doesn't carry).


 
Categories: SOA | Web Services

I feel like I have been "out of business" for a really long time and like I really got nothing done in the past 3 months, even though that's objectively not true. I guess that's "conference & travel withdrawal", because I had tone and tons of bigger events in the first half of the year and 3 smaller events since TechEd Amsterdam in July. On the upside, I am pretty relaxed and have certainly reduced my stress-related health risks ;-)

So with winter and its short days coming up, the other half of my life living a 1/3 around the planet until next spring, I can and am going to spend some serious time on a bunch of things:

On the new programming stuff front:
     Catch up on what has been going on in Indigo in recent months, dig deeper into "everything Whidbey", figure out the CLR aspects of SQL 2005 and familiarize myself with VS Team System.

On the existing programming stuff front:
      Consolidate my "e:\development\*"  directory on my harddrive and pull together all my samples and utilities for Enterprise Services, ASP.NET Web Services and other enterprise-development technologies and create a production-quality library from of them for us and our customers to use. Also, because the Indigo team is doing quite a bit of COM/COM+ replumbing recently in order to have that prohgraming model ride on Indigo, I have some hope that I can now file bugs/wishes against COM+ that might have a chance of being addressed. If that happens and a particular showstopper is getting out of the way, I will reopen this project here and will, at the very least, release it as a toy.

On the architectural stuff front:
         Refine our SOA Workshop material, do quite a bit of additional work on the FABRIQ, evolve the Proseware architecture model, and get some pending projects done. In addition to our own SOA workshops (the next English-language workshop is held December 1-3, 2004 in Düsseldorf), there will be a series of invite-only Microsoft events on Service Orientation throughout Europe this fall/winter, and I am very happy that I will be speaking -- mostly on architecture topics -- at the Microsoft Eastern Mediterranean Developer Conference in Amman/Jordan in November and several other locations in the Middle East early next year. 

And even though I hate the effort around writing books, I am seriously considering to write a book about "Services" in the next months. There's a lot of stuff here on the blog that should really be consolidated into a coherent story and there are lots and lots of considerations and motiviations for decisons I made for FABRIQ and Proseware and other services-related work that I should probably write down in one place. One goal of the book would be to write a pragmatic guide on how to design and build services using currently shipping (!) technologies that does focus on how to get stuff done and not on how to craft new, exotic SOAP headers, how to do WSDL trickery, or do other "cool" but not necessarily practical things. So don't expect a 1200 page monster. 

In addition to the "how to" part, I would also like to incorporate and consolidate other architect's good (and bad) practical design and implementation experiences, and write about adoption accelerators and barriers, and some other aspects that are important to get the service idea past the CFO. That's a great pain point for many people thinking about services today. If you would be interested in contributing experiences (named or unnamed), I certainly would like to know about it.

And I also think about a German-to-English translation and a significant (English) update to my German-language Enterprise Services book.....

[And to preempt the question: No, I don't have a publisher for either project, yet.]


 
Categories: Architecture | SOA | Blog | IT Strategy | newtelligence | Other Stuff | Talks

September 7, 2004
@ 04:09 AM

A little while ago a new build of Indigo found its way onto my desk. On thing that’s interesting about this particular interim milestone (don’t hope for juicy details here on the blog) is that it doesn’t support WSDL. No. Calm down. It’s not what you think. It just happens that the WSDL support wasn’t included in this particular version; it’ll be there, no worries.

Such interim binary drops that the Microsoft product groups give out to very early adopters are really only meant for the bravest of the brave with too much time on their hands. Therefore, the documentation is not entirely in synch with the feature set – and that’s a stretch. (Hey, I am not complaining.) So until someone told me “yeah, there’s no WSDL or Metadata exchange worth talking about in this build” I tried to find how contract works in this build and of course I couldn’t find it. Well, not really true. It turns out that contract works just like with ASP.NET 1.0 or ASP.NET 1.1 Web Services. At runtime, WSDL usually doesn’t play any role, at all. Unless you use some funky dynamic binding logic, WSDL is just a design-time metadata container and that’s the basis for generating CLR metadata and code. Although there is an implementation aspect to it when you generate proxies or server skeletons, the most important job of WSDL.EXE is to perform a conversion of the WSDL rendering of the message contract into a format that the ASMX infrastructure can readily understand. That format happens to be classes with methods and attribute annotations. Here is a client and a server I just typed up with Notepad:

Contract.asmx

Client.cs

<% @WebService class="MyService" language="C#"%>

using System.Web.Services;
using System.Web.Services.Protocols;

[WebService(Namespace="http://tempuri.org")]
public class MyService
{
    [WebMethod]
    public string Hello( string Test )
    {
       return "Hello "+Test;
    }
}

using System;
using System.Web.Services;
using System.Web.Services.Protocols;

[WebServiceBinding(Namespace="http://tempuri.org")]
public class MyService : SoapHttpClientProtocol
{
    [SoapDocumentMethod]
    public string Hello( string Test )
    {
       return (string)Invoke("Hello", new object[]{Test})[0];
    }
}

public class MainApp
{
   static void Main()
   {
       MyService m = new MyService();
       m.Url = "http://localhost/contract.asmx";
       Console.WriteLine("Result: "+ m.Hello("Test"));
   }
}

Drop the contract.asmx into x:\inetpub\wwwroot, compile the client with “csc client.cs” and run it. No WSDL ever changed hands, no “Add Web Reference”, it just works. Ok, ok, it’s the year ‘04 now and there’s really no magic there anymore. Now here’s a tiny bit of refactoring:

Contract.asmx

Client.cs

<% @WebService class="MyService" language="C#"%>

using System.Web.Services;
using System.Web.Services.Protocols;


public interface IMyService
{
    string Hello( string Test );
}


[WebService(Namespace="http://tempuri.org")]
public class MyService : IMyService
{
    [WebMethod]
    public string Hello( string Test )
    {
       return "Hello "+Test;
    }
}

using System;
using System.Web.Services;
using System.Web.Services.Protocols;


public interface IMyService
{
    string Hello( string Test );
}


[WebServiceBinding(Namespace="http://tempuri.org")]
public class MyService : SoapHttpClientProtocol, IMyService
{
    [SoapDocumentMethod]
    public string Hello( string Test )
    {
       return (string)Invoke("Hello", new object[]{Test})[0];
    }
}

public class MainApp
{
   static void Main()
   {
       MyService m = new MyService();
       m.Url = "http://localhost:8080/contract.asmx";
       Console.WriteLine("Result: "+ m.Hello("Test"));
   }
}

Extracting the interface from server and proxy makes it very, very clear that we’re dealing with the very same message contract. Mind that we only have source-code level contract equivalence between client and server here. The compiled code on either side yields distinct IMyService types and that’s supposed to be that way. In the case I am illustrating here, the language C# serves as the metadata language (and the clipboard is the mechanism) for sharing contract between client and server (or endpoints).

There are two things I find (mildly) annoying about ASP.Net Web Services 1.x and they also become quite apparent in this example: #1 the server side and the client side have a different minimum set of required attributes and #2 ASP.NET’s ASMX support doesn’t look at inherited method-level attributes. The following does not even compile:

Contract.asmx

Client.cs

<% @WebService class="MyService" language="C#"%>

using System.Web.Services;
using System.Web.Services.Protocols;

[WebServiceBinding(Namespace="http://tempuri.org")]
[WebService(Namespace="http://tempuri.org")]
public interface IMyService
{
    [SoapDocumentMethod][WebService]
    string Hello( string Test );
}

public class MyService : IMyService
{
    public string Hello( string Test )
    {
       return "Hello "+Test;
    }
}

using System;
using System.Web.Services;
using System.Web.Services.Protocols;

[WebServiceBinding(Namespace="http://tempuri.org")]
[WebService(Namespace="http://tempuri.org")]
public interface IMyService
{
    [SoapDocumentMethod][WebService]   
    string Hello( string Test );
}

public class MyService : SoapHttpClientProtocol, IMyService
{
    public string Hello( string Test )
    {
       return (string)Invoke("Hello", new object[]{Test})[0];
    }
}

public class MainApp
{
   static void Main()
   {
       MyService m = new MyService();
       m.Url = "http://localhost:8080/contract.asmx";
       Console.WriteLine("Result: "+ m.Hello("Test"));
   }
}

And, frankly, it’s probably not bad that it doesn’t compile, because the half of the attributes on the resulting interface declaration are useless on either side.

Indigo pulls both sides together into the notion of a service contract that’s good for either side (I am using a simple variant of the “publicly known” notation here)

[ServiceContract]
public interface IMyService
{
    [ServiceMethod]
    string Hello( string Test );
}

If you want to see it that way, this is Indigo’s IDL (Interface Definition Language). There’s an equivalence transformation from and to WSDL, if you want to share the contract with folks who program on other platforms or who use another programming language. There are just no angle brackets here and it’s much easier to read, too.

So if you see code that I wrote and you find (even today) seemingly unnecessary interface declarations that are implemented on a single web service class within a project and nowhere else and are also not referenced from anywhere – they might not be unnecessary as they seem. It’s simply IDL!

More confusingly, you might find those declarations replicated! in source code! in another service! Yes, because services are designed to be evolved independent of each other. So while the target service is already in V4, the consumer may still be on the level of V2. Moving up to the V4 interface version is a conscious choice by the developer of the service consumer – and at that point in time he/she imports the most recent contract. Whether that’s copy/paste of a C#/VB/C++ declaration or clicking “Update” on some menu in Visual Studio that turns WSDL into proxy code is not very important; it’s just a matter of tool preference.

Using explicit interface declarations with Web Services is strictly a “contract first” model. It’s just not using WSDL, that’s all.


 
Categories: SOA

(1) Policy-negotiated behavior, (2) Explicitness of Boundaries, (3) Autonomy and (4) Contract/Schema Exchange are the proclaimed tenets of service orientation. As I am getting along with the design for the services infrastructure we're working on, I find that one of the four towers the others in importance and really doesn't really fit well with them: Autonomy.

P, E and CE say things about the edge of a service. P speaks about how a service negotiates its edge behavior, guarantees and expectations with others. E speaks about how talking to another service is and must be different from talking to anything within your own service. CE speaks about how it is possible to establish a common understanding about messages and data while allowing independent evolution of internals by not sharing programming-level type constructs.

P is about creating and using techniques for metadata exchange and metadata controlled dynamic configuration of service-edges, E is about creating and using techniques for shaping service-edge communication paths and tools in a way that services don't get into each others underwear, and CE is about techniques for defining, exchanging, and implementing the flow of data across service-edges so that services can deal with each other's "stuff".

P, E and CE are guiding tenets for "Web Services" and the stack that evolves around SOAP. These are about "edge stuff". Proper tooling around SOAP, WSDL, XML Schema, WS-Policy (think ASMX, WSE, Axis, or Indigo) makes or will make it relatively easy for any programmer to do "the right thing" about these tenets.

Autonomy, on the other hand, is rather independent from the edge of a service. It describes a fundamental approach to architecture. An "autonomous digital entity" is "alive", makes independent decisions, hides its internals, and is in full control of the data it owns. An autonomous digital entity is not fully dependent on stuff happening on its edge or on inbound commands or requests. Instead, an autonomous service may routinely wake up from a self-chosen cryostasis and check whether certain data items that it is taking care of are becoming due for some actions, or it may decide that it is time to switch on the lights and lower the windows blinds to fends off burglars while its owner is on vacation. 

Autonomy is actually quite difficult to (teach and) achieve and much more a matter of discipline than a matter of tooling. If you have two "Web services" that sit on top of the very same data store and frequently touch what's supposed to be each others private (data) parts, each of them may very well fulfill the P, E, CE tenets, but they are not autonomous. If you try to scale out and host such a pair or group of Web services in separate locations and on top of separate stores, you end up with a very complicated Siamese-twins-separation surgery.  

That gets me to very clearly separate the two stories: Web Services <> Services.

A service is an autonomous digital entity that may or may not accept commands, inquiries (and data) from the outside world. If it chooses to accept such input, it can choose to do so through one or more web service edges that fulfill the P,E,CE tenets and/or may choose to accept totally different forms of input. If it communicates with the outside world from within, it may or may choose to do so via web services. A service is a program.

A web service is an implementation of a set of edge definitions (policies and contracts and channels) that fronts a service and allows services to communicate with each other. Two services may choose to communicate using multiple web services, if they wish to do so. A web service is a communication tool.

With that, I'll cite myself from three paragraphs earlier: If you have two "Web services" that sit on top of the very same data store [...] they are not autonomous. If you try to scale out [...] you end up with a very complicated Siamese-twins-separation surgery."  ... and correct myself: If you have two "Web services", they don't sit on the data store, they front a service. The service is the smallest deployable unit. The service provides the A, the web services bring P, E and CE to the table. A web service may, indeed, just be a message router, security gateway, translator or other intermediary that handles messages strictly on the wire-level and dispatches them to yet another web service fronting an actual service that does the work prescribed by the message. 

All of this, of course, causes substantial confusion about the duplicate use of the word service. The above it terribly difficult to read. I would wish it was still possible to kill the (inapproprate) term "web service" and just call it "edge", or "XML Edge", or (for the marketing people) "SOAP Edge Technology", or maybe "Service Edge XML", although I don't think the resulting acronym would go over well in the U.S.


 
Categories: SOA

August 6, 2004
@ 07:40 AM

I don't blog much in summer. That's mostly because I am either enjoying some time off or I am busy figuring out "new stuff".

So here's a bit of a hint what currently keeps me busy. If you read this in an RSS aggregator, you better come to the website for this explanation to make sense.

This page here is composed from several elements. There are navigation elements on the left, including a calendar, a categories tree and an archive link list that are built based on index information of the content store. The rest of the page, header and footer elements aside, contains the articles, which are composed onto the page based on a set of rules and there's some content aggregation going on to produce, for instance, the comment counter. Each of these jobs takes a little time and they are worked on sequentially, while the data is acquired from the backend, the web-site (rendering) thread sits idle.

Likewise, imagine you have an intranet web portal that's customizable and gives the user all sorts of individual information like the items on his/her task list, the unread mails, today's weather at his/her location, a list of current projects and their status, etc.  All of these are little visual components on the page that are individually requested and each data item takes some time (even if not much) to acquire. Likely more than here on this dasBlog site. And all the information comes from several, distributed services with the portal page providing the visual aggregation. Again, usually all these things are worked on sequentially. If you have a dozen of those elements on a page and it takes unusually long to get one of them, you'll still sit there and wait for the whole dozen. If the browser times out on you during the wait, you won't get anything, even if 11 out of 12 information items could be acquired.

One aspect of what I am working having all those 12 things done at the same time and allow the rendering thread to do a little bit of work whenever one of the items is ready and to allow the page to proceed whenever it loses its patience with one or more of those jobs. So all of the data acquisition work happens in parallel rather than in sequence and the results can be collected and processed in random order and as they are ready. What's really exciting about this from an SOA perspective is that I am killing request/response in the process. The model sits entirely on one-way messaging. No return values, not output parameters anywhere in sight.

In case you wondered why it is so silent around here ... that's why.


 
Categories: SOA

I was a little off when I compared my problem here to a tail call. Gordon Weakliem corrected me with the term "continuation".

The fact that the post got 28 comments shows that this seems to be an interesting problem and, naming aside, it is indeed a tricky thing to implement in a framework when the programming language you use (C# in my case) doesn't support the construct. What's specifically tricky about the concrete case that I have is that I don't know where I am yielding control to at the time when I make the respective call.

I'll recap. Assume there is the following call

CustomerService cs = new CustomerService();
cs.FindCustomer(customerId);

FindCustomer is a call that will not return any result as a return value. Instead, the invoked service comes back into the caller's program at some completely different place such this:

[WebMethod]
public void
FindCustomerReply(Customer[] result)
{
   ...
}

So what we have here is a "duplex" conversation. The result of an operation initiated by an outbound message (call) is received, some time later, through an inbound message (call), but not on the same thread and not on the same "object". You could say that this is a callback, but that's not precisely what it is, because a "callback" usually happens while the initiating call (as above FindCustomer) has not yet returned back to its scope or at least while the initiating object (or an object passed by some sort of reference) is still alive. Here, instead, processing of the FindCustomer call may take a while and the initiating thread and the initiating object may be long gone when the answer is ready.

Now, the additional issue I have is that at the time when the FindCustomer call is made, it is not known what "FindCustomerReply" message handler it going to be processing the result and it is really not know what's happening next. The decision about what happens next and which handler is chosen is dependent on several factors, including the time that it takes to receive the result. If the FindCustomer is called from a web-page and the service providing FindCustomer drops a result at the caller's doorstep within 2-3 seconds [1], the FindCustomerReply handler can go and hijack the initial call's thread (and HTTP context) and render a page showing the result. If the reply takes longer, the web-page (the caller) may lose its patience [2] and choose to continue by rendering a page that says "We are sending the result to your email account." and the message handler with not throw HTML into an HTTP response on an open socket, but rather render it to an email and send it via SMTP and maybe even alert the user through his/her Instant Messenger when/if the result arrives.

[1] HTTP Request => FindCustomer() =?> "FindCustomerReply" => yield to CustomerList.aspx => HTTP Response
[2] HTTP Request => FindCustomer() =?> Timeout!            => yield to YouWillGetMail.aspx => HTTP Response
                               T+n =?> "FindCustomerReply" => SMTP Mail
                                                           => IM Notification

So, in case [1] I need to correlate the reply with the request and continue processing on the original thread. In case [2], the original thread continues on a "default path" without an available reply and the reply is processed on (possibly two) independent threads and using two different notification channels.

A slightly different angle. Consider a workflow application environment in a bank, where users are assigned tasks and simply fetch the next thing from the to-do list (by clicking a link in an HTML-rendered list). The reply that results from "LookupAndDoNextTask" is a message that contains the job that the user is supposed to do.  

[1] HTTP Request => LookupAndDoNextTask() =?> Job: "Call Customer" => yield to CallCustomer.aspx => HTTP Response
[2] HTTP Request => LookupAndDoNextTask() =?> Job: "Review Credit Offer" => yield to ReviewCredit.aspx => HTTP Response
[3] HTTP Request => LookupAndDoNextTask() =?> Job: "Approve Mortgage" => yield to ApproveMortgage.aspx => HTTP Response
[4] HTTP Request => LookupAndDoNextTask() =?> No Job / Timeout => yield to Solitaire.aspx => HTTP Response

In all of these cases, calls to "FindCustomer()" and "LookupAndDoTask()" that are made from the code that deals with the incoming request will (at least in the theoretical model) never return to their caller and the thread will continue to execute in a different context that is "TBD" at the time of the call. By the time the call stack is unwound and the initiating call (like FindCustomer) indeed returns, the request is therefore fully processed and the caller may not perform any further actions. 

So the issue at hand is to make that fact clear in the programming model. In ASP.NET, there is a single construct called "Server.Transfer()" for that sort of continuation, but it's very specific to ASP.NET and requires that the caller knows where you want to yield control to. In the case I have here, the caller knows that it is surrendering the thread to some other handler, but it doesn't know to to whom, because this is dynamically determined by the underlying frameworks. All that's visible and should be visible in the code is a "normal" method call.

cs.FindCustomer(customerId) might therefore not be a good name, because it looks "too normal". And of course I don't have the powers to invent a new statement for the C# language like continue(cs.FindCustomer(customerId)) that would result in a continuation that simply doesn't return to the call location. Since I can't do that, there has to be a different way to flag it. Sure, I could put an attribute on the method, but Intellisense wouldn't show that, would it? So it seems the best way is to have a convention of prefixing the method name.

There were a bunch of ideas in the comments for method-name prefixes. Here is a selection:

  • cs.InitiateFindCustomer(customerId)
  • cs.YieldFindCustomer(customerId)
  • cs.YieldToFindCustomer(customerId)
  • cs.InjectFindCustomer(customerId)
  • cs.PlaceRequestFindCustomer(customerId)
  • cs.PostRequestFindCustomer(customerId)

I've got most of the underlying correlation and dispatch infrastructure sitting here, but finding a good programming model for that sort of behavior is quite difficult.

[Of course, this post won't make it on Microsoft Watch, eWeek or The Register]


 
Categories: Architecture | SOA | Technology | ASP.NET | CLR

July 19, 2004
@ 12:07 AM

The recording of last week's .NET Rocks show on which I explained my view on the "services mindset" (at 4AM in the morning) is now available for download from franklins.net


 
Categories: SOA

newtelligence AG will be hosting an open workshop on service-oriented development, covering principles, architecture ideas and implementation guidance on October 13-15 in Düsseldorf, Germany.

The workshop will be held in English, will be hosted by my partner and “Mr. Methodologies” Achim Oellers and myself, and is limited to just 15 (!) attendees to assure an interactive environment that maximizes everyone’s benefit. The cap on the number of attendees also allows us to adjust the content to individual needs to some extent.

We will cover the “services philosophy” and theoretical foundations of service-compatible transaction techniques, scalability and federation patterns, autonomy and other important aspects. And once we’ve shared our “services mind-set”, we will take the participants on a very intense “guided tour” through (a lot of) very real and production-level quality code (including the Proseware example application that newtelligence built for Microsoft Corporation) that turns the theory to practice on the Windows platform and shows that there’s no need to wait for some shiny future technology to come out in 2 year’s time to benefit from services today.

Regular pricing for the event is €2500.00 (plus applicable taxes) and includes:

  • 3-day workshop in English from 9:00 – 18:00 (or later depending on topic/evening) 
  •  2 nights hotel stay (Oct 13th and 14th)
  • Group dinner with the experts on the first night.  The 2nd night is at your disposal to enjoy Düsseldorf’s fabulous Altstadt at your own leisure
  • Lunch (and snacks/drinks throughout the day)
  • Printed materials (in English), as appropriate
  • Post-Workshop CD containing all presentations and materials used/shown

For registration inquiries, information about the prerequisites, as well as for group and early-bird discount options, please contact Mr. Fons Habes via training@newtelligence.com. If the event is sold out at the time of your inquiry or if you are busy on this date, we will be happy to pre-register you for one of the upcoming event dates or arrange for an event at your site.


 
Categories: Architecture | SOA | newtelligence

July 15, 2004
@ 04:17 AM

We have a bit of a wording problem. With what I am current building we have a bit (not precisely) a notion of "tail calls". Here's an example:

public void LookupMessage(int messageId)
{
  
MessageStoreService messageStore = new MessageStoreService();
  
messageStore.LookupMessage(messageId);
}

The call to LookupMessage() doesn't return anything as a return value or through output parameters. Instead, the resulting reply message surfaces moments later at a totally different place within the same application. At the same time, the object with the method you see here, surrenders all control to the (anonymous) receiver of the reply. It's a tiny bit like Server.Transfer() in ASP.NET.

So the naming problem is that neither of "GetMessage()", "LookupMessage()", "RequestMessage()" sounds right and they all look odd if there's no request/response pattern. The current favorite is to prefix all such methods with "Yield" so that we'd have "YieldLookupMessage()". Or "LookupMessageAndYield()"? Or something else?

Update: Also consider this

public void LookupCustomer(int customerId)
{
   CustomerService cs = new
CustomerService();
  
cs.FindCustomer(customerId);
}


 
Categories: SOA

Carl invited me for .NET Rocks on Thursday night. That is July 15th, 10 PM-Midnight Eastern Standard Time (U.S.) which is FOUR A.M. UNTIL SIX A.M. Central European Time (CET) on Friday morning. I am not sure whether my brain can properly operate at that time. The most fun thing would be to go out drinking Thursday night ;-)   I want to talk about (guess what) Services. Not Indigo, not WSE, not Enterprise Services, not SOAP, not XML. Services. Mindset first, tools later.


 
Categories: Architecture | SOA

July 12, 2004
@ 06:07 AM

I've had several epiphanies in the 12 months or so. I don't know how it is for other people, but the way my thinking evolves is that I've got some inexpressible "thought clouds" going around in my head for months that I can't really get on paper or talk about in any coherent way. And then, at some point, there's some catalyst and "bang", it all comes together and suddenly those clouds start raining ideas and my thinking very rapidly goes through an actual paradigm shift.

The first important epiphany occurred when Arvindra gave me a compact explanation of his very pragmatic view on Agent Technology and Queueing Networks, which booted the FABRIQ effort. Once I saw what Arvindra had done in his previous projects and I put that together with my thinking about services, a lot of things clicked. The insight that formed from there was that RPC'ish request/response interactions are very restrictive exceptions in a much larger picture where one-way messages and much more complex message flow-patterns possibly involving an arbitrary number of parties are the norm.

The second struck me while on stage in Amsterdam and during the "The Nerd, The Suit, and the Fortune Teller" play as Pat and myself were discussing Service Oriented User Interaction. (You need to understand that we had very limited time for preparation and hence we had a good outline, but the rest of the script essentially said "go with the flow" and so most of it was pure improvisation theater). The insight that formed can (with all due respect) be shortened "the user is just another service". Not only users shall drive the interaction by issuing messages (commands) to a systems for which they expect one or more out of a set of possible replies, but there should also be a way how systems can be drive an interaction by issuing messages to users expecting one or more out of a set of possible replies. There is no good reason why any of these two directions of driving the interaction should receive preferred treatment. There is no client and there is no server. There are just roles in interactions. That moment, the 3-layer/3-tier model of building applications died a quick and painless death in my head. I think I have a new one, but the clouds are still raining ideas. Too early for details. Come back and ask in a few months.


 
Categories: Architecture | SOA

In my comment view for the last post (comment #1), Piyush Pant writes about the confusion around different pipeline models and frameworks that are popping up all over the place and mentions Proseware, so I need to clarify some things:

I'll address the "too many frameworks" concern first: Proseware's explicit design goal and my job was to use the technologies ASP.NET Web Services, WSE 2.0, IIS, MSMQ, and Enterprise Services as pure as possible and I did intentionally not introduce yet another framework for the runtime bits beyond a few utility classes used by the services as a common infrastructure (like a config-driven web service proxy factory, the queue listener, or the just-in-time activation proxy pooling). What my job was and what I reasonably succeeded at was to show that:

Writing Service Oriented Applications on today's Windows Server 2003 platform does not require yet another framework.

The framework'ish pieces that I had to add are simply addressing some deployment issues like creating accounts, setting ACLs or setting up databases, that need to be done in a "real" app hat isn't a toy. Such things are sometimes difficult to abstract on the level of what the .NET Framework can offer as a general-purpose platform or are simply not there yet. All of these extra classes reside in an isolated assembly that's only used by the installers.

The total number of utility classes that play a role of any importance at runtime is 5 (in words five) and none of them has more than three screen pages worth of actual code. Let me repeat:

Writing Service Oriented Applications on today's Windows Server 2003 platform does not require yet another framework.

I do have a dormant (newtelligence-owned) code branch sitting here that'd make a lot of things in Proseware easier and more elegant to develop and makes reconfiguring services more convenient, but it's a developer convenience and productivity framework. No pipelines, no other architecture, just a prettier shell around the exact Proseware architecture and technologies I chose.

To illustrate my point about the fact that we don't need another entirely new framework, I have here (MessageQueueWebRequest.cs.txt, MessageQueueWebResponse.cs.txt) an early 0.1 prototype copy of our MessageQueueWebRequest/-WebResponse class pair that supports sending WS messages through MSMQ. (That prototype only does very simple one-way messages; you can do a lot more with MSMQ).  

Take the code, put it in yours, create a private queue, take an arbitrary ASMX WebService proxy, call MessageQueueWebRequest.RegisterMSMQProtocol() when your app starts, instantiate the proxy, set the Url property of the proxy to msmq://mymachine/private$/myqueue, invoke the proxy and watch how a SOAP message materializes in the queue.

Next step: use a WSE proxy. Works too. I'll leave the receiver logic to your imagination, but that's not really much more than listening to the queue and throwing the message into a WSE 2.0 SoapMethod or throwing it as a raw HTTP request at an ASMX WebMethod or by using a SimpleWorkerRequest on a self-hosted ASP.NET AppDomain (just like WebMatrix's Cassini hosts that stuff).

 

On to "pipelines" in the same context: Pipelines are a very common design pattern and you can find hundreds of variations of them in many projects (likely dozens from MS) which all have some sort of a notion of a pipeline. It's just "pipeline", not Pipeline(tm) 2003 SP1.

User-extensible pipeline models are a nice idea, but I don't think they are very useful to have or consider for most services of the type that Proseware has (and that covers a lot of types).

Frankly, most things that are done with pipelines in generalized architectures that wrap around endpoints (in/out crosscutting pipelines) and that are not about "logging" (which is, IMHO, more useful if done explicitly and in-context) are already in the existing technology stack (Enterprise Services, WSE) or are really jobs for other services.

There is no need to invent another pipeline to process custom headers in ASMX, if you have SoapExtensions. There is no need to invent a new pipeline model to do WS-Security, if you can plug the WSE 2.0 pipeline into the ASMX SoapExtension pipeline already. There is no need to invent a new pipeline model to push a new transaction context on the stack, if you can hook the COM+ context pipeline into your call chain by using ES. There is no need to invent another pipeline for authorization, if you can hook arbitrary custom stuff into the ASP.NET Http Pipeline or the WSE 2.0 pipeline already has or simply use what the ES context pipeline gives you.

I just enumerated four (!) different pipeline models and all of them are in the bits you already have on a shipping platform today and as it happens, all of them compose really well with each other. The fact that I am writing this might show that most of us just use and configure their services without even thinking of them as a composite pipeline model.

"We don't need another Pipeline" (I want Tina Turner to sing that for me).

Of course there's other pipeline jobs, right? Mapping!

Well, mapping between schemas is something that goes against the notion of a well-defined contract of a service. Either you have a well-defined contract or two or three or you don't. If you have a well-defined contract and there's a sender that doesn't adhere to it, it's the job of another service to provide that sort of data negotiation, because that's a business-logic task in and by itself.

Umm ... ah! Validation!

That might be true if schema validation is enough, but validation of data is a business logic level task if things get more complex (like if you need to check a PO against your catalog and need to check whether that customer is actually entitled to get a certain discount bracket). That's not a cross-cutting concern. That's a core job of the app.

Pipelines are for plumbers

 

Now, before I confuse everyone (and because Piyush mentioned it explicitly):

FABRIQ is a wholly different ballgame, because it is precisely a specialized architecture for dynamically distributable, queued (pull-model), one-way pipeline message processing and that does require a bit of a framework, because the platform doesn't readily support it.

We don't really have a notion of an endpoint in FABRIQ that is the default terminal for any message arriving at a node. We just let stuff asynchronously flow in one direction and across machines and handlers can choose to look at, modify, absorb or yield resultant messages into the pipeline as a result of what they do. In that model, the pipeline is the application. Very different story, very different sets of requirements, very different optimization potential and not really about services in the first place (although we stick to the tenets), but rather about distributing work dynamically and about doing so as fast as we can make it go.

Sorry, Piyush! All of that totally wasn't going against your valued comments, but you threw a lit match into a very dry haystack.

 


 
Categories: Architecture | SOA

Benjamin Mitchell wrote a better summary of my "Building Proseware Inc." session at TechEd Amsterdam than I ever could.

Because ... whenever the lights go on and the mike is open, I somehow automatically switch into an adrenalin-powered auto-pilot mode that luckily works really well and since my sessions take up so much energy and "focus on the moment", I often just don't remember all the things I said once the session is over and I am cooled down. That also explains why I almost never rehearse sessions (meaning: I never ever speak to the slides until I face an audience) except when I have to coordinate with other speakers. Yet, even though most of my sessions are really ad-hoc performances, whenever I repeat a session I usually remember whatever I said last time just at the very moment when the respective topic comes up, so there's an element of routine. It is really strange how that works. That's also why I am really a bad advisor on how to do sessions the right way, because that is a very risky approach. I just write slides that provide me with a list of topics and "illustration helpers" and whatever I say just "happens". 

About Proseware: All the written comments that people submitted after the session have been collected and are being read and it's very well understood that you want to get your hands on the bits as soon as possible. One of my big takeaways from the project is that if you're Microsoft, releasing stuff that is about giving "how-to" guidance is (for more reasons you can imagine) quite a bit more complicated than just putting bits up on a download site. It's being worked on. In the meantime, I'll blog a bit about the patterns I used whenever I can allocate a timeslice.


 
Categories: Architecture | SOA | TechEd Europe

Simple question: Please show me a case where inheritance and/or full data encapsulation makes sense for business/domain objects on the implementation level. 

I'll steal the low-hanging fruit: Address. Address is a great candidate when you look at an OOA model as you could model yourself to death having BaseAddress(BA) and BA<-StreetAddress(SA) and BA<-PostalAddress(PA) and SA<-US_StreetAddress and SA<-DE_StreetAddress and SA<-UK_StreetAddress and so forth.

When it comes to implementation, you'll end up refactoring the class into on thing: Address. There's probably an AddressType attribute and there's a Country field that indicates the formatting and since implementing a full address validation component is way too much work that feature gets cut anyway and hence we end up with a multiline text field with the properly formatted address and stuff like Street and PostOfficeBox (eventually normalized to AddressField), City, PostalCode, Country and Region is kept separate really just to make searching easier and faster. The stuff that goes onto the letter envelope is really only the preformatted address text.

Maybe I am too much of a data (read: XML, Messages, SQL) guy by now, but I just lost faith that objects are any good on the "business logic" abstraction level. The whole inheritance story is usually refactored away for very pragmatic reasons and the encapsulation story isn't all that useful either. You simply can't pragmatically regard data validation of data on a property get/set level as a useful general design pattern, because a type like Address is one type with interdependency between its elements and not simply a container for types. The rules for Region depend on Country and the rules for AddressField (or Street/PostOfficeBox) depend on AddressType. Since the object can't know your intent of what data you want to supply to it on a property get/set level, it can't do meaningful validation on that level. Hence, you end up calling something like address.Validate() and from there it's really a small step to separate out code and data into a message and a service that deals with it and call Validate(address). And that sort of service is the best way to support polymorphism over a scoped set of "classes" because it can potentially support "any" address schema and can yet concentrate and share all the validation logic (which is largely the same across whatever format you might choose) in a single place and not spread it across an array of specialized classes that's much, much harder to maintain.

What you end up with are elements and attributes (infoset) for the data that flows across, services that deal with the data that flows, and rows and columns that efficiently store data and let you retrieve it flexibly and quickly. Objects lost (except on the abstract and conceptional analysis level where they are useful to understand a problem space) their place in that picture for me.

While objects are fantastic for frameworks, I've absolutely unlearned why I would ever want them on the business logic level in practice. Reeducate me.


 
Categories: Architecture | SOA

We've built FABRIQ, we've built Proseware. We have written seminar series about Web Services Best Practices and Service Orientation for Microsoft Europe. I speak about services and aspects of services at conferences around the world. And at all events where I talk about Services, I keep hearing the same question: "Enough of the theory, how do I do it?"

Therefore we have announced a seminar/workshop around designing and building service oriented systems that puts together all the things we've found out in the past years about how services can be built today and on today's Microsoft technology stack and how your systems can be designed for with migration to the next generation Microsoft technlogy stack in mind. Together with our newtelligence Associates, we are offering this workshop for in-house delivery at client sites world-wide and are planning to announce dates and locations for central, "open for all" events soon.

If you are interested in inviting us for an event at your site, contact Bart DePetrillo, or write to sales@newtelligence.com. If you are interested in participating at a central seminar, Bart would like to hear about it (no obligations) so that we can select reasonable location(s) and date(s) that fit your needs.


 
Categories: Architecture | SOA | FABRIQ | Indigo | Web Services

June 4, 2004
@ 02:15 AM

Autonomy means that a service is alive.

Here are my sub-tenets:

  • It has its very own, independent view on data. That may or may not result in fully owning its own data store (I think it should, but that's all a matter of scale and use case), but it certainly shall never share its own view on a shared store with others. The service's public interface(s) provide(s) the only way to manipulate its view on data.
  • It controls its own lifetime. It can do periodical tasks, spin its own threads and should not be forced to shut down because its hosting process model thinks it's idle for the sole reason that it hasn't seen inbound traffic for a while.
  • It has its own identity and carries a security responsibility. It identifies itself with a service-unique principal against other services and through of its own authorization rules it takes the responsibility upon itself that no user gains illegitimate access to backend data or services. It identifies and takes responsibility for those that invoke it, but never assumes their identity.

The PEACE tenets for SO are a composite set. Autonomy is architecturally the most far reaching of the SO tenets and it is much more about the inside and fundamental behavior of a service than about its edge.


 
Categories: SOA

I am back home from San Diego now. About 3 more hours of jet-lag to work on. This will be a very busy two weeks until I make a little excursion to the Pakistan Developer Conference in Karachi and then have another week to do the final preparations for TechEd Europe.

One of the three realy cool talks I'll do at TechEd Europe is called "Building Proseware" and explains the the scenario, architecture, and core implementation techniques of Proseware, an industrial-strength, robust, service-oriented example application that newtelligence has designed and implemented for Microsoft over the past 2 months.

The second talk is one that I have been looking forward to for a long time: Rafal Lukawiecki and myself are going to co-present a session. And if that weren't enough: The moderator of our little on-stage banter about services is nobody else than Pat Helland.

And lastly, I'll likely sign-off on the first public version of the FABRIQ later this week (we had been waiting for WSE 2.0 to come out), which means that Arvindra Sehmi and myself can not only repeat our FABRIQ talk in Amsterdam but have shipping bits to show this time. There will even be a hands-on lab on FABRIQ led by newtelligence instructors Achim Oellers and Jörg Freiberger. The plan is to launch the bits before the show, so watch this space for "when and where".

Overall, and as much as I like meeting all my friends in the U.S. and appreciate the efforts of the TechEd team over there, I think that for the last 4 years TechEd Europe consistently has been and will be again the better of the two TechEd events from a developer perspective. In Europe, we have TechEd and IT Forum, whereby TechEd is more developer focused and IT Forum is for the operations side of the house. Hence, TechEd Europe can go and does go a lot deeper into developer topics than TechEd US.

There's a lot of work ahead so don't be surprised if the blog falls silent again until I unleash the information avalanche on Proseware and FABRIQ.


 
Categories: Architecture | SOA | Talks | TechEd Europe | TechEd US | FABRIQ

All the wonderful loose coupling on the service boundary doesn't help you the least bit, if you tightly couple a set of services on a common store. The temptation is just too big that some developer will go and make a database join across the "data domains" of services and cause a co-location dependency of data and schema dependencies between services. If you share data stores, you break the autonomy rule and you simply don't have a service.

Separating out data stores means at least that every service has it's own "tablespace" or "database" and that in-store joins between those stores are absolutely forbidden. If you have a service managing customers and a service managing invoices, the invoice service must go through the service front for anything that has to do with customer data.

If you want to do reporting across data owned by several services, you must have a reporting service that pulls the data through service interfaces, consolidates it and creates the reports from there.

Will this all be a bit slower than coupling in the store? Sure. It will make your architecture infinitely more agile, though and allows you to implement a lot of clustering scalability patterns. In that way, autonomy is not about making everything a Porsche 911; it's about making the roads wider so that nobody (including the Porsche) ends up in a traffic jam all the time. It's also about paving roads that not only let you from A to B in one stretch, but also have something useful called "exits" that let you get off or on that road at any other place between those two points.

If you decide to throw out you own customer service and replace it with a wrapper around Siebel, your invoice service will never learn about that change. If the invoice service were reaching over into co-located tables owned by the (former) customer service, you'd have a lot of work to do to untangle things. You don't need to do that untangling and all that complication. As an architect you should keep things separate from the start and make it insanely difficult for developers to break those rules. Having different databases and, better yet, to scatter them over several machines at least at development time makes it hard enough to keep the discipline.


 
Categories: SOA

May 19, 2004
@ 12:56 AM

The four fundamental transaction principles are nicely grouped into the acronym "ACID" that's simple to remember, and so I was looking for something that's doing the same for the SOA tenets and that sort of represents what the service idea has done to the distributed platform wars:

  • Policy-Based Behavior Negotiation
  • Explicitness of Boundaries
  • Autonomy
  • Contract
    Exchange

 
Categories: Architecture | SOA

May 16, 2004
@ 04:58 AM

Ralf Westphal responded to this and there are really just two sentences that I’d like to pick out from Ralf’s response because that allows me to go a quite a bit deeper into the data services idea and might help to further clarify what I understand as a service oriented approach to data and resource management. Ralf says: There is no necessity to put data access into a service and deploy it pretty far away from its clients. Sometimes is might make sense, sometimes it doesn’t.

I like patterns that eliminate that sort of doubt and which allow one to say “data services always make sense”.

Co-locating data acquisition and storage with business rules inside a service makes absolute sense if all accessed data can be assumed to be co-located on the same store and has similar characteristics with regards to the timely accuracy the data must have. In all other cases, it’s very beneficial to move data access into a separate, autonomous data service and as I’ll explain here, the design can be made so flexible that the data service consumer won’t even notice radical architectural changes to how data is stored. I will show three quite large scenarios to help illustrating what I mean: A federated warehouse system, a partitioned customer data storage system and a master/replica catalog system.

The central question that I want to answer is: Why would you want delegate data acquisition and storage to dedicated services? The short answer is: Because data doesn’t always live in a single place and not all data is alike.

Here the long answer:

The Warehouse

The Warehouse Inventory Service (WIS) holds data about all the goods/items that are stored in warehouse. It’s a data service in the sense that it manages the records (quantity in stock, reorder levels, items on back order) for the individual goods, performs some simplistic accounting-like work to allocate pools of items to orders, but it doesn’t really contain any sophisticated business rules. The services implementing the supply order process and the order fulfillment process for customer orders implement such business rules – the warehouse service just keeps data records.

The public interface [“Explicit Boundary” SOA tenet] for this service is governed by one (or a set of) WSDL portType(s), which define(s) a set of actions and message exchanges that the service implements and understands [“Shared Contract” SOA tenet]. Complementary is a deployment-dependent policy definition for the service, which describes several assertions about the Security and QoS requirements the service makes [“Policy” SOA tenet].

The WIS controls its own, isolated store over which it has exclusive control and the only way that others can get at the content of that data store is through actions available on the public interface of the service [“Autonomy” SOA tenet].

Now let’s say the company running the system is a bit bigger and has a central website (of which replicas might be hosted in several locations) and has multiple warehouses from where items can be delivered. So now, we are putting a total of four instances of WIS into our data centers at the warehouses in New Jersey, Houston, Atlanta and Seattle. The services need to live there, because only the people on site can effectively manage the “shelf/database relationship”. So how does that impact the order fulfillment system that used to talk to the “simple” WIS? It doesn’t, because we can build a dispatcher service implementing the very same portType that accepts order information, looks at the order’s shipping address and routes the allocation requests to the warehouse closest to the shipping destination. In fact now, the formerly “dumb” WIS can be outfitted with some more sophisticated rules that allow to split or to shift the allocation of items to orders across or between warehouses to limit freight cost or ensure the earliest possible delivery in case the preferred warehouse is out of stock for a certain item. Still, from the perspective of the service consumer, the WIS implementation is still just a data service. All that additional complexity is hidden in the underlying “service tree”.

While all the services implement the very same portType, their service policies may differ significantly. Authentication may require certificates for one warehouse and some other token for another warehouse. The connection to some warehouses might be done through a typically rock-solid reliable direct leased line, while another is reached through a less-than-optimal Internet tunnel, which impacts the application-level demand for the reliable messaging assurances. All these aspects are deployment specific and hence made an external deployment-time choice. That’s why WS-Policy exists.

The Customer Data Storage

This scenario for the Customer Data Storage Service (CDS) starts as simple as the Warehouse Inventory scenario and with a single service. The design principles are the same.

Now let’s assume we’re running a quite sophisticated e-commerce site where customers can customize quite a few aspects of the site, can store and reload shopping carts, make personal annotations on items, and can review their own order history. Let’s also assume that we’re pretty aggressively tracking what they look at, what their search keywords are and also what items they put into any shopping cart so that we can show them a very personalized selection of goods that precisely matches their interest profile. Let’s say that all-in-all, we need to have storage space of about 2Mbytes for the cumulative profile/tracking data of each customer. And we happen to have 2 million customers. Even in the Gigabyte age, ~4mln Mbytes (4TB) is quite a bit of data payload to manage in a read/write access database that should be reasonably responsive.

So, the solution is to partition the customer data across an array of smaller (cheaper!) machines that each holds a bucket of customer records. With that we’re also eliminating the co-location assumption.

As in the warehouse case, we are putting a dispatcher service implementing the very same CDS portType on top of the partitioned data service array and therefore hide the storage strategy re-architecture from the service consumers entirely. With this application-level partitioning strategy (and a set of auxiliary service to manage partitions that I am not discussing here), we could scale this up to 2 billion customers and still have an appropriate architecture. Mind that we can have any number of dispatcher instances as long as they implement the same rules for how to route requests to partitions. Strategies for this are a direct partition reference in the customer identifier or a locator service sitting on a customer/machine lookup dictionary.

Now you might say “my database engine does this for me”. Yes, so-called “shared-nothing” clustering techniques do exist on the database level for a while now, but the following addition to the scenario mandates putting more logic into the dispatching and allocation service than – for instance – SQL Server’s “distributed partitioned views” are ready to deal with.

What I am adding to the picture is the European Union’s Data Privacy Directive. Very simplified, the EU directives and regulations it is illegal to permanently store personal data of EU citizens outside EU territory, unless the storage operator and the legislation governing the operator complies with the respective “Safe Harbor” regulations spelled out in these EU rules.

So let’s say we’re a tiny little bit evil and want to treat EU data according to EU rules, but be more “relaxed” about data privacy for the rest of the world. Hence, we permanently store all EU customer data in a data center near Dublin, Ireland and the data for the rest of the world in a data center in Dallas, TX (not making any implications here).

In that case, we’re adding yet another service on top of the unaltered partitioning architecture that implements the same CDS contract and which internally implements the appropriate data routing and service access rules. Those rules which will most likely be based on some location code embedded in the customer identifier (“E1223344” vs. “U1223344”). Based on these rules, requests are dispatched to the right data center. To improve performance and avoid having to data travel along the complete path repeatedly or in small chunks during an interactive session with the customer (customer is logged into the web site), the dispatcher service might choose to have a temporary, non-permanent cache for customer data that is filled with a single request and allows quicker and repeat access to customer data. Changes to the customer’s data that result from the interactive session can later be replicated out to the remote permanent storage.

Again, the service consumer doesn’t really need to know about these massive architectural changes in the underlying data services tree. It only talks to a service that understands a well-known contract.

The Catalog System

Same picture to boot with and the same rules: Here we have a simple service fronting a catalog database. If you have millions of catalog items with customer reviews, pictures, audio and/or video clips, you might chose to partition this just like we did with the customer data.

If you have different catalogs depending on the markets you are selling into (for instance German-language books for Austria, Switzerland and Germany), you might want to partition by location just as in the warehouse scenario.

One thing that’s very special about catalog data is that very much of it rarely ever changes. Reviews are added, media might be added, but except for corrections, the title, author, ISBN number and content summary for a book really doesn’t ever change as long as the book is kept in the catalog. Such data is essentially “insert once, change never”. It’s read-only for all practical purposes.

What’s wonderful about read-only data is that you can replicate it, cache it, move it close to the consumer and pre-index it. You’re expecting that a lot of people will search for items with “Christmas” in the item description come November? Instead of running a full text search every time, run that query once, save the result set in an extra table and have the stored procedure running the “QueryByItemDescription” activity simply return the entire table if it sees that keyword. Read-only data is optimization heaven.

Also, for catalog data, timeliness is not a great concern. If a customer review or a correction isn’t immediately reflected on the presentation surface, but only 30 minutes or 3 hours after is has been added to the master catalog, it doesn’t do any harm as long as the individual adding such information is sufficiently informed of such a delay.

So what we can do to with the catalog is to periodically (every few hours or even just twice a week) consolidate, pre-index and then propagate the master catalog data to distributed read-only replicas. The data services fronting the replicas will satisfy all read operations from the local store and will delegate all write operations directly (passthrough) to the master catalog service. They might choose to update their local replica to reflect those changes immediately, but that would preclude editorial or validation rules that might be enforced by the master catalog service.

 

So there you have it. What I’ve described here is the net effect of sticking to SOA rules.

·         Shared Contract: Any number of services can implement the same contract (although the concrete implementation, purpose and hence their type differ). Layering contract-compatible services with gradually increasing levels of abstractions and refining rules over existing services creates very clear and simple designs that help you scale and distribute data very well

·         Explicit Boundaries: Forbidding foreign access or even knowledge about service internals allows radical changes inside and “underneath” services.

·         Autonomy allows for data partitioning and data access optimization and avoids “tight coupling in the backend”.

·         Policy: Separating out policy from the service/message contract allows flexible deployment of the compatible services across a variety of security and trust scenarios and also allows for dynamic adaptation to “far” or “near” communications paths by mandating certain QoS properties such as reliable messaging.

 

Service-Orientation is most useful if you don’t consider it as just another technique or tool, but embrace it as a paradigm. And very little of this thinking has to do with SOAP or XML. SOAP and XML are indeed just tools.


 
Categories: Architecture | SOA