Transactions. I spent a good deal of the weekend reading two dozen research papers (CiteSeer is a great launch pad for digging into that space) on agreement, consensus, trust, and various forms of blocking and non-blocking atomic commitment. All of that was, of course, motivated by the desperate search for a solution for the Web services space that preserves the simplicity of the 2-phase commit programming model. Making stuff compensation-based is just a small step for a technology framework person, but it's a giant leap for someone who has to design compensation into the application logic.

Some special problems for Web services as we see them developing:

  • How to establish trust between parties? Think about the implications for dynamic service discovery and invocation using UDDI. Think about the fact that ACID transactions, unlike other services, have a direct impact on the behavior of an entire system due to isolation rules and therefore locking requirements. Think about the potential for creating damage by simply spoofing votes on transaction outcome and think about the potential for DDoS attacks by deliberate blocking.
  • How does proximity affect trust in this context? Is a transaction participant that belongs to my own company, and whose implementation I fully control, but that runs halfway around the planet as trustworthy as the machine next door? After all, a man-in-the-middle attack that aims at blocking only needs to intercept and drop all further traffic between participants.
  • How to deal with connectionless, multi-hop, asynchronous messages? Think about the fact that even these types of message exchanges may require ACID rules to be fully enforced, even if the message exchange isn't synchronous (in the sense of RPC). For optimization reasons, a transactional message conversation may go from Düsseldorf to Dubai, from Dubai to Singapore, from Singapore to Los Angeles and from Los Angeles back to Düsseldorf, so it is routed once around the planet rather than communicated in a star-shaped pattern, in order to beat the limits of E=mc^2. (That is one of the reasons why I like things like WS-Routing and WS-Security's capability to variably encrypt select portions of messages.)

That's a lot of problems already, and it's just the tip of the iceberg. I've got some scribbles that address a couple of these issues, and one of the key workarounds is to introduce deadline rules under which transactions expire even if participants are in a "prepared" state. However, to limit blocking efficiently, this brings up another hard problem: trustworthy and precise (<50ms) time synchronization between all parties. Tough stuff.
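
To make the deadline idea a bit more concrete, here is a minimal Python sketch of how a prepared participant might treat an expiry deadline when clocks are only synchronized to within some bound. Everything in it is an assumption for illustration and not taken from any specification or from my scribbles: the `PreparedParticipant` class, the `on_commit` and `on_tick` methods, and the `max_skew` parameter are hypothetical names, and time is passed in explicitly as seconds on the agreed clock.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


@dataclass
class PreparedParticipant:
    """Hypothetical participant that has already voted yes (prepared)."""

    deadline: float          # absolute expiry time, in seconds on the agreed clock
    max_skew: float = 0.050  # assumed bound on this clock's error (50 ms)
    state: State = State.PREPARED

    def on_commit(self, now: float) -> State:
        """Accept a commit decision only while the deadline is clearly ahead."""
        if self.state is State.PREPARED:
            if now + self.max_skew < self.deadline:
                self.state = State.COMMITTED
            else:
                # Too close to (or past) the deadline to be sure others still hold on.
                self.state = State.ABORTED
        return self.state

    def on_tick(self, now: float) -> State:
        """Give up and release locks once the deadline has clearly passed."""
        if self.state is State.PREPARED and now - self.max_skew >= self.deadline:
            self.state = State.ABORTED
        return self.state


if __name__ == "__main__":
    p = PreparedParticipant(deadline=100.0)
    print(p.on_tick(100.02))    # State.PREPARED (inside the uncertainty window)
    print(p.on_tick(100.06))    # State.ABORTED
    q = PreparedParticipant(deadline=100.0)
    print(q.on_commit(99.90))   # State.COMMITTED
```

The window of `max_skew` on either side of the deadline is time during which a participant can neither safely commit nor safely give up, which is why the precision of the synchronization matters so much. Note that this sketch only bounds how long a participant stays blocked; it does nothing about a commit decision that reaches one participant in time and another too late.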

Updated: