It's 2008. Where's my flying car?
 Thursday, May 27, 2004

Only this week here at TechEd did it become really apparent to me how many people read the things I write here. I've had dozens of "strangers" walking up to me saying "Clemens, I read your blog. Thank you for the things you write." It's great to meet the real people behind the numbers (I get an insane amount of hits each day for what is effectively a personal opinion outlet) and it's absolutely fantastic to hear people tell me that I am helping them to do their job better. So what I wanted to say is ... "Thank you for stopping by every once in a while and for helping me to do my job well."

Thursday, May 27, 2004 8:43:18 AM (Pacific Daylight Time, UTC-07:00)
Blog | Other Stuff
 Wednesday, May 26, 2004

All the wonderful loose coupling on the service boundary doesn't help you the least bit if you tightly couple a set of services on a common store. The temptation is just too great that some developer will go and make a database join across the "data domains" of services, creating a co-location dependency for the data and schema dependencies between services. If you share data stores, you break the autonomy rule and you simply don't have a service.

Separating out data stores means at least that every service has its own "tablespace" or "database" and that in-store joins between those stores are absolutely forbidden. If you have a service managing customers and a service managing invoices, the invoice service must go through the service front for anything that has to do with customer data.

If you want to do reporting across data owned by several services, you must have a reporting service that pulls the data through service interfaces, consolidates it and creates the reports from there.
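As a rough illustration of that rule, here is a minimal sketch in Python. Everything in it is hypothetical: the `Customer` and `Invoice` records, the two service contracts and the in-memory stand-ins are made up for this example, and a real system would call remote services instead. The point is only that the reporting service consolidates data through the service fronts and never joins across the stores.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str


@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    customer_id: str
    total: float


class CustomerService(Protocol):
    """Hypothetical contract of the customer service."""

    def get_customer(self, customer_id: str) -> Customer: ...


class InvoiceService(Protocol):
    """Hypothetical contract of the invoice service."""

    def list_invoices(self) -> list[Invoice]: ...


class InMemoryCustomerService:
    """Stand-in implementation; the store is private to this service."""

    def __init__(self, customers: list[Customer]) -> None:
        self._store = {c.customer_id: c for c in customers}

    def get_customer(self, customer_id: str) -> Customer:
        return self._store[customer_id]


class InMemoryInvoiceService:
    """Stand-in implementation; the store is private to this service."""

    def __init__(self, invoices: list[Invoice]) -> None:
        self._store = list(invoices)

    def list_invoices(self) -> list[Invoice]:
        return list(self._store)


class ReportingService:
    """Consolidates data pulled through service interfaces only."""

    def __init__(self, customers: CustomerService, invoices: InvoiceService) -> None:
        self._customers = customers
        self._invoices = invoices

    def revenue_by_customer(self) -> dict[str, float]:
        totals: dict[str, float] = {}
        for invoice in self._invoices.list_invoices():
            # The "join" happens here, in the reporting service, not in a database.
            name = self._customers.get_customer(invoice.customer_id).name
            totals[name] = totals.get(name, 0.0) + invoice.total
        return totals
```

If the customer service is later swapped for a wrapper around another system, only the object passed in as `customers` changes; `ReportingService` stays untouched.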

Will this all be a bit slower than coupling in the store? Sure. It will make your architecture infinitely more agile, though, and it allows you to implement a lot of clustering scalability patterns. In that way, autonomy is not about making everything a Porsche 911; it's about making the roads wider so that nobody (including the Porsche) ends up in a traffic jam all the time. It's also about paving roads that not only let you get from A to B in one stretch, but also have something useful called "exits" that let you get off or on that road at any other place between those two points.

If you decide to throw out your own customer service and replace it with a wrapper around Siebel, your invoice service will never learn about that change. If the invoice service were reaching over into co-located tables owned by the (former) customer service, you'd have a lot of work to do to untangle things. You don't need that untangling and all that complication. As an architect you should keep things separate from the start and make it insanely difficult for developers to break those rules. Having different databases and, better yet, scattering them over several machines, at least at development time, makes breaking the rules hard enough to keep the discipline.

Wednesday, May 26, 2004 9:57:02 AM (Pacific Daylight Time, UTC-07:00)
SOA

Omar has already posted the announced fix for version 1.6 and has updated all the downloadable files. Go here to get the updated versions. If you run 1.6, get the hotfix, otherwise just get one of the full archives. We should now be stable again. Thanks to Omar and Erv Walter for providing the fix and the repacking so quickly (while I am busy in San Diego at TechEd).
Wednesday, May 26, 2004 12:21:46 AM (Pacific Daylight Time, UTC-07:00)
dasBlog
 Tuesday, May 25, 2004

We'll have a fixpack for dasBlog 1.6 within the next two days that will roll back a few internal changes that had been made to improve performance, but unfortunately caused significant instability. The code is already checked into our tree and we're going to have the fix packaged up for download very soon. If you don't have 1.6 installed yet, wait until we have the fix. Within a week we are going to replace the 1.6 version available from the Gotdotnet workspace with a version that incorporates the fix.

Tuesday, May 25, 2004 4:26:26 PM (Pacific Daylight Time, UTC-07:00)
dasBlog
 Monday, May 24, 2004

A short background reading link list for my CTS404 session at TechEd that I'll do in Room 10 this afternoon (Monday, May 24) at 5:00pm at TechEd San Diego:

Stateless?!
(About the uselessness of the static "statelessness" of a component as an indicator for its scalability)

Dealing with distributed transaction anomalies caused by web service calls from within transactions
(I'll show an updated version of that approach) 

Just in time activation proxy pooling
(Client side "connection pooling" for Enterprise Services)

I am looking forward to the session, because it's another one that challenges established beliefs (such as "stateless"="scalable").

Monday, May 24, 2004 7:13:28 AM (Pacific Daylight Time, UTC-07:00)
TechEd US
 Thursday, May 20, 2004

Now that we're getting close to the dasBlog engine's 1st birthday, I'd like to know how people use it. I am seeing quite a few blogs out there that run the software, but it's just as interesting to know how the engine is used in corporate intranets and whether you use it as a tool to help coordinate projects, share knowledge about certain topics or .... how would I know?

If you use dasBlog, it'd be great if you could share with me how you use it, how you like it, and what you don't like. If you've warped the engine into something totally different or if you have some really cool design but it lives hidden inside the corporate firewall, I would appreciate getting a screenshot (blur out the secrets). None of the information will be published unless you allow me to do that.

I am also interested to know whether and how you've used snippets from the blog code for your own projects and/or products. Knowing what pieces are valuable to you would allow me to extract them and put them into a separate "goodies" library down the road.

Thursday, May 20, 2004 6:39:53 AM (Pacific Daylight Time, UTC-07:00)
newtelligence | dasBlog
 Wednesday, May 19, 2004

Scott Hanselman ran into a critical bug in dasBlog 1.6 that has to do with the new caching logic that the folks in the GDN workspace came up with (I didn't do it, I didn't do it!). We've both sent email to those who know about this issue and will see who will look at it and when. Apparently Scott posted something using an external tool while a couple of other things were happening in parallel, and that got the caching mechanism confused. If you get inexplicable errors and all you see is the "error page", go ahead and delete the files entryCache.xml, categoryCache.xml, and blogdata.xml; then open and save (touch) web.config. That should get the blog back on its feet.
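If you would rather script that recovery than do it by hand, a small sketch in Python might look like the following. The three file names come from the steps above; the function name and both path parameters are assumptions for illustration, since where the cache files and web.config live depends on your installation.

```python
from pathlib import Path

# File names as listed in the recovery steps above.
CACHE_FILES = ("entryCache.xml", "categoryCache.xml", "blogdata.xml")


def reset_dasblog_cache(content_dir: Path, web_config: Path) -> list[str]:
    """Delete the cache files found in content_dir, then touch web_config.

    content_dir and web_config are assumptions: pass whatever locations
    your own installation uses. Returns the names of the files removed.
    """
    if not web_config.is_file():
        raise FileNotFoundError(f"web.config not found at {web_config}")

    removed = []
    for name in CACHE_FILES:
        target = content_dir / name
        if target.exists():
            target.unlink()
            removed.append(name)

    # Updating the modification time stands in for "open and save".
    web_config.touch(exist_ok=True)
    return removed
```

Back up the content directory before running anything like this against a live blog.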

If you are on version 1.5 or earlier, stick to it while this is being checked. If you are on 1.6, have some tea or a lightly alcoholic beverage and don't panic.

Wednesday, May 19, 2004 11:16:19 PM (Pacific Daylight Time, UTC-07:00)
dasBlog

It's rare that I give "must have" tool recommendations, but here is one: If you do any regular expressions work with the .NET Framework, go and get Roy Osherove's Regulator. Roy consolidated a lot of the best things from various free regex tools and added his own wizardry into a pretty cool "RegEx IDE".

Wednesday, May 19, 2004 2:51:48 PM (Pacific Daylight Time, UTC-07:00)
Technology

The four fundamental transaction principles are nicely grouped into the acronym "ACID" that's simple to remember, and so I was looking for something that's doing the same for the SOA tenets and that sort of represents what the service idea has done to the distributed platform wars:

  • Policy-Based Behavior Negotiation
  • Explicitness of Boundaries
  • Autonomy
  • Contract
    Exchange
Wednesday, May 19, 2004 12:56:50 AM (Pacific Daylight Time, UTC-07:00)
Architecture | SOA
 Monday, May 17, 2004

This here reminds me of the box that's quietly humming in my home office and serves as my domain controller, firewall, RAS and DSL gateway. I upgraded the machine (a rather old 400 MHz Compaq) to Windows Server 2003 the day before I flew to TechEd Malaysia last year (August 23rd, 2003). I configured it to auto-update from Windows Update and reboot at 3:00AM in case updates have been applied.

Guess what: I got back home from that trip (which included 4 days touring the Angkor temples in Cambodia and another 10 days hanging out at the beach on Thailand's Ko Samui island) and realized that I forgot the Administrator password. Tried to get in to no avail. I've got rebuilding the box on my task list, but there's no rush. I haven't really touched or switched off the machine ever since. It keeps patching itself every once in a while and otherwise simply does its job.

Monday, May 17, 2004 1:21:22 PM (Pacific Daylight Time, UTC-07:00)
Technology

Want to win an XBox? You run dasBlog? Michael Earls shows you how.

[I just knew that the <%newtelligence.aspnetcontrol("TechEdBloggersFeed.ascx")%> macro would eventually be good for something ;-) ]

Monday, May 17, 2004 12:24:31 PM (Pacific Daylight Time, UTC-07:00)
TechEd US

I am not a “smart client” programmer and probably not even a smart client programmer and this trick has probably been around for ages, but …

For someone who’s been doing WPARAM and LPARAM acrobatics for years and still vividly recalls what (not necessarily good) you can do with WM_NCPAINT and WM_NCMOUSEMOVE (all that before I discovered the blessings of the server-side), it’s pretty annoying that Windows Forms doesn’t bubble events – mouse events specifically. It is actually hard to believe that that wouldn’t work. But I’ve read somewhere that bubbling events is “new in Whidbey”, so it is probably not my ignorance. Anyways … include the following snippet in your form (add MouseDown, MouseUp, … variants at your leisure), bind the respective events of all labels, panels and all the other “dead stuff” to this very handler (yes, all the controls share that handler) and that’ll have the events bubble up to your form in case you need them. I am just implementing custom resizing and repositioning for some user controls in a little tool and that’s how I got trapped into this. Voilà. Keep it.

 

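// Requires: using System.Drawing; using System.Windows.Forms;
// Shared MouseMove handler for all child controls; re-raises the event on the form.
protected void BubbleMouseMove(object sender, System.Windows.Forms.MouseEventArgs e)
{
      // Translate the child's client coordinates to screen, then to this form's client coordinates.
      Point pt = this.PointToClient(((Control)sender).PointToScreen(new Point(e.X,e.Y)));
      MouseEventArgs me = new MouseEventArgs(e.Button,e.Clicks,pt.X,pt.Y,e.Delta);
      // Control.OnMouseMove takes only the event args and raises the form's MouseMove event.
      OnMouseMove(me);
}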

Monday, May 17, 2004 12:13:42 PM (Pacific Daylight Time, UTC-07:00)
Technology

Omar Shahine, who took the role of the "Program Manager" for dasBlog 1.6, added a new macro feature (I am actually not really sure who added it; someone correct me if I am wrong; at least Omar OK'd the feature) that totally rocks and puts us on par with MovableType in terms of easy access to older entries:

<%newtelligence.drawarchivemonths()%> 

The macro creates a list of links for all months that have blog entries and if you look at my site (not at the RSS feed), you'll see it on the left-hand side of the page just under the "What's News" section. Thanks! Now I can find my old stuff again. ;-)

If you haven't seen it already: Omar's comments about the 1.6 drop and links to release notes and binaries/source are on his blog.

Monday, May 17, 2004 2:27:21 AM (Pacific Daylight Time, UTC-07:00)
Blog | dasBlog
 Sunday, May 16, 2004

One of the reasons why I run Windows Server 2003 on my notebook is that "Services without Components" (managed incarnation is System.EnterpriseServices.ServiceDomain) didn't work on XP. If you just touch the ServiceConfig or ServiceDomain classes on XP, you get rewarded with a PlatformNotSupportedException, because the unmanaged implementation of that feature was present, but not quite-as-perfect-as-it-should-be on XP. That will soon be history. Windows XP SP2 and the COM+ 1.5 Rollup Package 6 will fix that and will bring COM+ 1.5 pretty much on par with Windows Server 2003.

Sunday, May 16, 2004 11:55:11 AM (Pacific Daylight Time, UTC-07:00)
Enterprise Services

Ralf Westphal responded to this and there are really just two sentences that I’d like to pick out from Ralf’s response because that allows me to go quite a bit deeper into the data services idea and might help to further clarify what I understand as a service oriented approach to data and resource management. Ralf says: “There is no necessity to put data access into a service and deploy it pretty far away from its clients. Sometimes it might make sense, sometimes it doesn’t.”

I like patterns that eliminate that sort of doubt and which allow one to say “data services always make sense”.

Co-locating data acquisition and storage with business rules inside a service makes absolute sense if all accessed data can be assumed to be co-located on the same store and has similar characteristics with regard to the timely accuracy the data must have. In all other cases, it’s very beneficial to move data access into a separate, autonomous data service. As I’ll explain here, the design can be made so flexible that the data service consumer won’t even notice radical architectural changes to how data is stored. I will show three quite large scenarios to help illustrate what I mean: a federated warehouse system, a partitioned customer data storage system and a master/replica catalog system.

The central question that I want to answer is: Why would you want to delegate data acquisition and storage to dedicated services? The short answer is: Because data doesn’t always live in a single place and not all data is alike.

Here is the long answer:

The Warehouse

The Warehouse Inventory Service (WIS) holds data about all the goods/items that are stored in a warehouse. It’s a data service in the sense that it manages the records (quantity in stock, reorder levels, items on back order) for the individual goods and performs some simplistic accounting-like work to allocate pools of items to orders, but it doesn’t really contain any sophisticated business rules. The services implementing the supply order process and the order fulfillment process for customer orders implement such business rules – the warehouse service just keeps data records.

The public interface [“Explicit Boundary” SOA tenet] for this service is governed by one (or a set of) WSDL portType(s), which define(s) a set of actions and message exchanges that the service implements and understands [“Shared Contract” SOA tenet]. Complementary is a deployment-dependent policy definition for the service, which describes several assertions about the Security and QoS requirements the service makes [“Policy” SOA tenet].

The WIS controls its own, isolated store over which it has exclusive control and the only way that others can get at the content of that data store is through actions available on the public interface of the service [“Autonomy” SOA tenet].

Now let’s say the company running the system is a bit bigger, has a central website (of which replicas might be hosted in several locations) and has multiple warehouses from which items can be delivered. So now we are putting a total of four instances of WIS into our data centers at the warehouses in New Jersey, Houston, Atlanta and Seattle. The services need to live there, because only the people on site can effectively manage the “shelf/database relationship”. So how does that impact the order fulfillment system that used to talk to the “simple” WIS? It doesn’t, because we can build a dispatcher service implementing the very same portType that accepts order information, looks at the order’s shipping address and routes the allocation requests to the warehouse closest to the shipping destination. In fact, the formerly “dumb” WIS can now be outfitted with some more sophisticated rules that allow splitting or shifting the allocation of items to orders across or between warehouses to limit freight cost or to ensure the earliest possible delivery in case the preferred warehouse is out of stock for a certain item. From the perspective of the service consumer, the WIS implementation is still just a data service. All that additional complexity is hidden in the underlying “service tree”.
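To make the dispatcher idea a little more concrete, here is a minimal sketch in Python. It is not a WSDL portType; the `WarehouseInventory` protocol merely stands in for one, and all names, the state-based routing table and the in-memory stock are assumptions made up for this example. It only shows the "shift to another warehouse when the preferred one is out of stock" rule, not order splitting.

```python
from typing import Protocol


class OutOfStock(Exception):
    """Raised when no warehouse can satisfy an allocation request."""


class WarehouseInventory(Protocol):
    """Stand-in for the WIS contract that every node in the tree implements."""

    def allocate(self, item: str, quantity: int, ship_to_state: str) -> str:
        """Reserve stock and return the name of the warehouse that ships it."""
        ...


class SingleWarehouse:
    """A leaf WIS with exclusive control over its own stock records."""

    def __init__(self, name: str, stock: dict[str, int]) -> None:
        self.name = name
        self._stock = dict(stock)

    def allocate(self, item: str, quantity: int, ship_to_state: str) -> str:
        available = self._stock.get(item, 0)
        if available < quantity:
            raise OutOfStock(f"{self.name} cannot supply {quantity} x {item}")
        self._stock[item] = available - quantity
        return self.name


class WarehouseDispatcher:
    """Implements the same contract and routes to the closest warehouse."""

    def __init__(
        self,
        routes: dict[str, list[WarehouseInventory]],
        default: list[WarehouseInventory],
    ) -> None:
        # routes maps a shipping state to warehouses ordered by proximity
        # (a hypothetical lookup table; real routing rules would be richer).
        self._routes = routes
        self._default = default

    def allocate(self, item: str, quantity: int, ship_to_state: str) -> str:
        for warehouse in self._routes.get(ship_to_state, self._default):
            try:
                return warehouse.allocate(item, quantity, ship_to_state)
            except OutOfStock:
                continue  # shift the allocation to the next-closest warehouse
        raise OutOfStock(f"no warehouse can supply {quantity} x {item}")
```

Because the dispatcher satisfies the same protocol as a leaf, a consumer written against `WarehouseInventory` cannot tell whether it talks to one warehouse or to a whole tree of them, and dispatchers can be stacked on top of other dispatchers.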

While all the services implement the very same portType, their service policies may differ significantly. Authentication may require certificates for one warehouse and some other token for another warehouse. The connection to some warehouses might be done through a typically rock-solid reliable direct leased line, while another is reached through a less-than-optimal Internet tunnel, which impacts the application-level demand for the reliable messaging assurances. All these aspects are deployment specific and hence made an external deployment-time choice. That’s why WS-Policy exists.

The Customer Data Storage

This scenario for the Customer Data Storage Service (CDS) starts as simple as the Warehouse Inventory scenario and with a single service. The design principles are the same.

Now let’s assume we’re running a quite sophisticated e-commerce site where customers can customize quite a few aspects of the site, can store and reload shopping carts, make personal annotations on items, and can review their own order history. Let’s also assume that we’re pretty aggressively tracking what they look at, what their search keywords are and also what items they put into any shopping cart so that we can show them a very personalized selection of goods that precisely matches their interest profile. Let’s say that all-in-all, we need to have storage space of about 2 MB for the cumulative profile/tracking data of each customer. And we happen to have 2 million customers. Even in the Gigabyte age, roughly 4 million MB (4 TB) is quite a bit of data payload to manage in a read/write access database that should be reasonably responsive.

So, the solution is to partition the customer data across an array of smaller (cheaper!) machines that each holds a bucket of customer records. With that we’re also eliminating the co-location assumption.

As in the warehouse case, we are putting a dispatcher service implementing the very same CDS portType on top of the partitioned data service array and therefore hide the storage strategy re-architecture from the service consumers entirely. With this application-level partitioning strategy (and a set of auxiliary services to manage partitions that I am not discussing here), we could scale this up to 2 billion customers and still have an appropriate architecture. Note that we can have any number of dispatcher instances as long as they implement the same rules for how to route requests to partitions. Strategies for this are a direct partition reference in the customer identifier or a locator service sitting on a customer/machine lookup dictionary.
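Both routing strategies fit in a few lines. The following Python sketch is an illustration under stated assumptions: the `CustomerDataStore` protocol stands in for the CDS contract, the "partition number, dash, customer number" identifier format is invented for this example, and the locator is a plain dictionary rather than a real locator service.

```python
from typing import Callable, Protocol


class CustomerDataStore(Protocol):
    """Stand-in for the CDS contract shared by partitions and dispatchers."""

    def load(self, customer_id: str) -> dict: ...

    def save(self, customer_id: str, profile: dict) -> None: ...


class PartitionStore:
    """One bucket of customer records on one (small, cheap) machine."""

    def __init__(self) -> None:
        self.records: dict[str, dict] = {}

    def load(self, customer_id: str) -> dict:
        return dict(self.records[customer_id])

    def save(self, customer_id: str, profile: dict) -> None:
        self.records[customer_id] = dict(profile)


def partition_from_id(customer_id: str) -> int:
    """Strategy 1: the identifier carries the partition, e.g. '03-1223344'."""
    return int(customer_id.split("-", 1)[0])


def make_locator(directory: dict[str, int]) -> Callable[[str], int]:
    """Strategy 2: a locator backed by a customer/partition dictionary."""

    def locate(customer_id: str) -> int:
        return directory[customer_id]

    return locate


class PartitionDispatcher:
    """Same contract as a single store; forwards to the right partition."""

    def __init__(
        self,
        partitions: list[CustomerDataStore],
        locate: Callable[[str], int],
    ) -> None:
        self._partitions = partitions
        self._locate = locate

    def load(self, customer_id: str) -> dict:
        return self._partitions[self._locate(customer_id)].load(customer_id)

    def save(self, customer_id: str, profile: dict) -> None:
        self._partitions[self._locate(customer_id)].save(customer_id, profile)
```

Any number of `PartitionDispatcher` instances can run side by side as long as they are given the same `locate` rule, which is exactly the condition stated above.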

Now you might say “my database engine does this for me”. Yes, so-called “shared-nothing” clustering techniques have existed on the database level for a while now, but the following addition to the scenario mandates putting more logic into the dispatching and allocation service than – for instance – SQL Server’s “distributed partitioned views” are ready to deal with.

What I am adding to the picture is the European Union’s Data Privacy Directive. Very simplified: under the EU directives and regulations, it is illegal to permanently store personal data of EU citizens outside EU territory, unless the storage operator and the legislation governing the operator comply with the respective “Safe Harbor” regulations spelled out in these EU rules.

So let’s say we’re a tiny little bit evil and want to treat EU data according to EU rules, but be more “relaxed” about data privacy for the rest of the world. Hence, we permanently store all EU customer data in a data center near Dublin, Ireland and the data for the rest of the world in a data center in Dallas, TX (not making any implications here).

In that case, we’re adding yet another service on top of the unaltered partitioning architecture that implements the same CDS contract and which internally implements the appropriate data routing and service access rules. Those rules will most likely be based on some location code embedded in the customer identifier (“E1223344” vs. “U1223344”). Based on these rules, requests are dispatched to the right data center. To improve performance and avoid having the data travel along the complete path repeatedly or in small chunks during an interactive session with the customer (customer is logged into the web site), the dispatcher service might choose to have a temporary, non-permanent cache for customer data that is filled with a single request and allows quicker and repeat access to customer data. Changes to the customer’s data that result from the interactive session can later be replicated out to the remote permanent storage.
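Here is a minimal Python sketch of such a location-aware dispatcher with a session cache. The "E" and "U" prefixes are taken from the identifiers above; everything else (class names, the in-memory stores standing in for the Dublin and Dallas data centers, the `end_session` hook) is hypothetical and only meant to show the routing and the write-back idea.

```python
class InMemoryStore:
    """Stand-in for a remote, permanent customer data service."""

    def __init__(self, records: dict[str, dict]) -> None:
        self.records = {k: dict(v) for k, v in records.items()}
        self.load_calls = 0

    def load(self, customer_id: str) -> dict:
        self.load_calls += 1
        return dict(self.records[customer_id])

    def save(self, customer_id: str, profile: dict) -> None:
        self.records[customer_id] = dict(profile)


class RegionalCustomerDispatcher:
    """Routes by the location code in the identifier; caches per session."""

    def __init__(self, eu_store: InMemoryStore, rest_of_world_store: InMemoryStore) -> None:
        self._stores = {"E": eu_store, "U": rest_of_world_store}
        self._session_cache: dict[str, dict] = {}  # temporary, never persisted
        self._dirty: set[str] = set()

    def _store_for(self, customer_id: str) -> InMemoryStore:
        try:
            return self._stores[customer_id[:1]]
        except KeyError:
            raise ValueError(f"unknown location code in {customer_id!r}") from None

    def load(self, customer_id: str) -> dict:
        if customer_id not in self._session_cache:
            # One request fills the cache; repeat reads stay local.
            self._session_cache[customer_id] = self._store_for(customer_id).load(customer_id)
        return dict(self._session_cache[customer_id])

    def save(self, customer_id: str, profile: dict) -> None:
        self._store_for(customer_id)  # validate the location code early
        self._session_cache[customer_id] = dict(profile)
        self._dirty.add(customer_id)

    def end_session(self, customer_id: str) -> None:
        """Replicate session changes to permanent storage and drop the cache."""
        if customer_id in self._dirty:
            self._store_for(customer_id).save(customer_id, self._session_cache[customer_id])
            self._dirty.discard(customer_id)
        self._session_cache.pop(customer_id, None)
```

Whether a write-back at the end of the session is acceptable, or whether some changes must be written through immediately, is a design decision this sketch does not try to answer.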

Again, the service consumer doesn’t really need to know about these massive architectural changes in the underlying data services tree. It only talks to a service that understands a well-known contract.

The Catalog System

Same picture to start with and the same rules: Here we have a simple service fronting a catalog database. If you have millions of catalog items with customer reviews, pictures, audio and/or video clips, you might choose to partition this just like we did with the customer data.

If you have different catalogs depending on the markets you are selling into (for instance German-language books for Austria, Switzerland and Germany), you might want to partition by location just as in the warehouse scenario.

One thing that’s very special about catalog data is that very much of it rarely ever changes. Reviews are added, media might be added, but except for corrections, the title, author, ISBN number and content summary for a book really don’t ever change as long as the book is kept in the catalog. Such data is essentially “insert once, change never”. It’s read-only for all practical purposes.

What’s wonderful about read-only data is that you can replicate it, cache it, move it close to the consumer and pre-index it. You’re expecting that a lot of people will search for items with “Christmas” in the item description come November? Instead of running a full text search every time, run that query once, save the result set in an extra table and have the stored procedure running the “QueryByItemDescription” activity simply return the entire table if it sees that keyword. Read-only data is optimization heaven.
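The "run the query once" trick looks roughly like this. Note that this is an in-memory Python sketch, not the stored procedure and extra table described above; the class name, the item layout and the naive substring match standing in for full text search are all assumptions made for illustration.

```python
class CatalogQueryService:
    """Answers description searches, with precomputed results for hot keywords."""

    def __init__(self, items: list[dict], hot_keywords: set[str]) -> None:
        self._items = items
        self.full_scans = 0  # counts how often the expensive search actually runs
        # Run each expected query once, up front, and keep the result set.
        self._precomputed = {kw.lower(): self._scan(kw) for kw in hot_keywords}

    def _scan(self, keyword: str) -> list[dict]:
        """The 'expensive' path: a naive stand-in for a full text search."""
        self.full_scans += 1
        needle = keyword.lower()
        return [item for item in self._items if needle in item["description"].lower()]

    def query_by_item_description(self, keyword: str) -> list[dict]:
        hit = self._precomputed.get(keyword.lower())
        if hit is not None:
            return list(hit)  # served from the saved result set
        return self._scan(keyword)
```

This only works because the underlying data is effectively read-only; the saved result set would have to be rebuilt whenever the catalog is republished.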

Also, for catalog data, timeliness is not a great concern. If a customer review or a correction isn’t immediately reflected on the presentation surface, but only 30 minutes or 3 hours after it has been added to the master catalog, it doesn’t do any harm as long as the individual adding such information is sufficiently informed of such a delay.

So what we can do with the catalog is to periodically (every few hours or even just twice a week) consolidate, pre-index and then propagate the master catalog data to distributed read-only replicas. The data services fronting the replicas will satisfy all read operations from the local store and will delegate all write operations directly (passthrough) to the master catalog service. They might choose to update their local replica to reflect those changes immediately, but that would preclude editorial or validation rules that might be enforced by the master catalog service.
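A minimal Python sketch of that master/replica split follows. The class and method names are hypothetical, the stores are plain dictionaries, and `refresh` stands in for the periodic consolidate-and-propagate job; consolidation and pre-indexing are left out.

```python
from typing import Optional


class MasterCatalog:
    """The master catalog service; the only place where writes land."""

    def __init__(self) -> None:
        self._items: dict[str, dict] = {}

    def get(self, item_id: str) -> Optional[dict]:
        record = self._items.get(item_id)
        return dict(record) if record is not None else None

    def put(self, item_id: str, record: dict) -> None:
        # Editorial or validation rules would be enforced here.
        self._items[item_id] = dict(record)

    def snapshot(self) -> dict[str, dict]:
        return {item_id: dict(record) for item_id, record in self._items.items()}


class ReplicaCatalog:
    """Same operations as the master; reads are local, writes pass through."""

    def __init__(self, master: MasterCatalog) -> None:
        self._master = master
        self._local = master.snapshot()

    def get(self, item_id: str) -> Optional[dict]:
        record = self._local.get(item_id)
        return dict(record) if record is not None else None

    def put(self, item_id: str, record: dict) -> None:
        # Passthrough: the local replica is deliberately not updated here.
        self._master.put(item_id, record)

    def refresh(self) -> None:
        """Called by the periodic propagation job."""
        self._local = self._master.snapshot()
```

A write through a replica becomes visible there only after the next `refresh`, which is the delay the person adding the information needs to be told about.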

 

So there you have it. What I’ve described here is the net effect of sticking to SOA rules.

  • Shared Contract: Any number of services can implement the same contract (although the concrete implementation, purpose and hence their type differ). Layering contract-compatible services with gradually increasing levels of abstraction and refining rules over existing services creates very clear and simple designs that help you scale and distribute data very well.

  • Explicit Boundaries: Forbidding foreign access to, or even knowledge about, service internals allows radical changes inside and “underneath” services.

  • Autonomy: Allows for data partitioning and data access optimization and avoids “tight coupling in the backend”.

  • Policy: Separating out policy from the service/message contract allows flexible deployment of the compatible services across a variety of security and trust scenarios and also allows for dynamic adaptation to “far” or “near” communications paths by mandating certain QoS properties such as reliable messaging.

 

Service-Orientation is most useful if you don’t consider it as just another technique or tool, but embrace it as a paradigm. And very little of this thinking has to do with SOAP or XML. SOAP and XML are indeed just tools.

Sunday, May 16, 2004 4:58:07 AM (Pacific Daylight Time, UTC-07:00)
Architecture | SOA

I didn't spend much time on anything except writing, coding, travel, speaking and being at geek parties in the past weeks. Hence, I am sure I am the last one to notice, but I find it absolutely revolutionary that the Microsoft Visual C++ 2003 command line compiler (Microsoft C/C++ Version 13.1) is now a freebie.

Sunday, May 16, 2004 12:46:49 AM (Pacific Daylight Time, UTC-07:00)
Technology | CLR
Stuff
About the author/Disclaimer

The contents of this site are my own personal opinions and do not represent my employer's view in any way. In addition, my thoughts and opinions often change, and as a weblog is intended to provide a semi-permanent point-in-time snapshot, you should not consider out-of-date posts to reflect my current thoughts and opinions.

© Copyright 2008
Clemens Vasters