March 30, 2007
@ 03:44 PM

One of the "niche" features in WCF that deserves a lot more attention than it is getting is our P2P support. From the developer's perspective, the NetPeerTcpBinding looks mostly like any other binding. The main difference between P2P applications and "normal" client/server apps is, of course, that they are serverless. Hence, P2P apps are commonly based on message exchanges where every peer node in a mesh talks to everyone else in a broadcast fashion, and that model favors (but doesn't require) symmetric duplex contracts.*

When I say that it works mostly like any other binding, I really only mean the developer experience. The NetPeerTcpBinding packs so much network intelligence under its hood that it boggles the mind. The P2P technology underneath will figure out the optimal layout for a peer mesh and propagate messages through the mesh in an optimal fashion, using members of the mesh as routers as appropriate. You can hook in filters to control the message propagation, you can control the hop counts, there are detection mechanisms for when a party gets split off the mesh and reconnects, and there are various ways to secure your meshes. And you basically get all that stuff for free if you just pick that binding and configure it.

The Peer Channel team has a blog, too. Links to samples:

(a) Basic NetPeerTcpBinding samples - Uses the PNRP resolver mode
(b) Scenario samples:
       (i) Chat - Demonstrates Chat using the non-PNRP custom resolver
       (ii) Custom Resolver - Demonstrates how to write your own Custom Resolver service and client.


* A symmetric duplex contract defines itself as the callback contract:
[ServiceContract(CallbackContract = typeof(IChat))]
public interface IChat
{
  ...
}
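
To make the footnote a little more concrete, here is a minimal sketch of a peer joining a mesh with a symmetric duplex contract. The operation name Say, the ChatPeer class, the mesh address net.p2p://sampleMesh/chat and the decision to switch security off are assumptions for illustration only; they are not taken from the SDK samples.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.PeerResolvers;

// Symmetric duplex: the callback contract is the contract itself,
// so every peer sends and receives the same operations.
[ServiceContract(CallbackContract = typeof(IChat))]
public interface IChat
{
    [OperationContract(IsOneWay = true)]
    void Say(string sender, string text);
}

// Handles messages that arrive from the mesh (hypothetical class name).
public class ChatPeer : IChat
{
    public void Say(string sender, string text)
    {
        Console.WriteLine("{0}: {1}", sender, text);
    }
}

public static class PeerChatSketch
{
    public static NetPeerTcpBinding CreateBinding()
    {
        NetPeerTcpBinding binding = new NetPeerTcpBinding();
        binding.Resolver.Mode = PeerResolverMode.Pnrp; // resolve mesh members via PNRP
        binding.Security.Mode = SecurityMode.None;     // assumption: demo only, no security
        return binding;
    }

    public static void Main()
    {
        // The same object type that sends is also the callback target.
        InstanceContext context = new InstanceContext(new ChatPeer());
        DuplexChannelFactory<IChat> factory = new DuplexChannelFactory<IChat>(
            context,
            CreateBinding(),
            new EndpointAddress("net.p2p://sampleMesh/chat")); // hypothetical mesh name

        IChat channel = factory.CreateChannel();
        ((IClientChannel)channel).Open();

        channel.Say(Environment.UserName, "hello mesh");

        Console.WriteLine("Press ENTER to leave the mesh");
        Console.ReadLine();

        ((IClientChannel)channel).Close();
        factory.Close();
    }
}
```

Run the same program on two machines that can reach each other through PNRP and each should see what the other one says.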

Categories: WCF

There are a lot of blog entries that I'd write if they weren't already written. Stupid statement. No, really. One of the great qualities of the documentation that we built for WCF and WF and CardSpace is that it's completely legible and understandable :)

Since there's just a lot of stuff in the SDK docs and one easily gets lost in the forest, I'll point out a few of the conceptual docs and/or samples and may add a comment or two here and there. For the first one that I selfishly point out, the only actual commentary is that I wrote that piece ;)

Go read about Message Inspectors and how to implement client- and/or server-side schema-based validation in WCF, complete with the ability to refer to the validation schemas by config. Adventure-seekers might be interested in poking around in that code and replacing the schema validation and the schemas with XSLTs and transforms. That would create some interesting follow-up challenges for synthesizing the ContractDescription that projects out the correct pre-transformation representation for WSDL, but I guess that'd be part of the fun.
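
For a rough idea of the shape of such an inspector, here is a stripped-down sketch of server-side validation. It is not the SDK sample: the class name SchemaValidationInspector is made up, it only validates the first body element of requests, and it leaves out the config plumbing and the behavior that would attach it to the dispatch runtime.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Dispatcher;
using System.Xml;
using System.Xml.Schema;

// Hypothetical inspector: validates the request body against a schema set.
public class SchemaValidationInspector : IDispatchMessageInspector
{
    private readonly XmlSchemaSet schemas;

    public SchemaValidationInspector(XmlSchemaSet schemas)
    {
        this.schemas = schemas;
    }

    public object AfterReceiveRequest(ref Message request,
                                      IClientChannel channel,
                                      InstanceContext instanceContext)
    {
        // A message body can be read only once, so buffer the message,
        // hand a fresh copy to the dispatcher and validate another copy.
        MessageBuffer buffer = request.CreateBufferedCopy(int.MaxValue);
        request = buffer.CreateMessage();
        Validate(buffer.CreateMessage());
        return null; // no correlation state needed
    }

    public void BeforeSendReply(ref Message reply, object correlationState)
    {
        // Replies are not validated in this sketch.
    }

    private void Validate(Message copy)
    {
        if (copy.IsEmpty || copy.IsFault)
        {
            return;
        }

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;

        try
        {
            // ReadSubtree confines the reader to the first body element.
            using (XmlReader body = copy.GetReaderAtBodyContents().ReadSubtree())
            using (XmlReader validating = XmlReader.Create(body, settings))
            {
                while (validating.Read()) { }
            }
        }
        catch (XmlSchemaValidationException e)
        {
            // Without a ValidationEventHandler, schema errors surface as exceptions.
            throw new FaultException("Request failed schema validation: " + e.Message);
        }
    }
}
```

The buffering step is the part that is easy to get wrong: reading the body for validation consumes the message, so the dispatcher has to be given a fresh copy.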

Categories: WCF

March 29, 2007
@ 08:02 AM

A bad sign for how much I’m coding these days is that I had an HDD crash three weeks ago and only today restored Visual Studio to fully working condition with all my tools and stuff. I’ve decided that this has to change, otherwise I’ll get really rusty.

Picking up the thread from “Professor Indigo” Nicholas Allen, I’ve built a little program that illustrates an alternate handling strategy for poisonous messages that WCF throws into the poison queue on Vista and Longhorn Server if you ask it to (ReceiveErrorHandling.Move). The one we’re showing in the docs is implementing a local resolution strategy that’s being fired within the service when the service ends up faulting; that’s the strategy for ReceiveErrorHandling.Fault and works for MSMQ 3.0. The strategy I’m showing here requires our latest OS wave.

When a message arrives at a WCF endpoint through a queue, WCF will – if the queue is transactional – open a transaction and de-queue the message. It will then try to dispatch it to the target service and operation. Assuming the dispatch works, the operation gets invoked and – might – tank. If it does, an exception is raised, thrown back into the WCF stack and the transaction aborts. Happily, WCF grabs the next message from the queue – which happens to be the one that just caused the failure due to the rollback – and the operation – might – tank again.
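
To see why a failing message keeps coming back, here is a toy model of that loop in plain C#. It has nothing to do with WCF's real implementation: the in-memory queue, the Pump method and the maxAttempts threshold are all invented for illustration. "Rollback" is modeled by simply leaving the message at the head of the queue.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: an in-memory stand-in for a transactional queue.
public static class RetryLoopModel
{
    // Returns the total number of times the operation was invoked.
    public static int Pump(Queue<string> queue,
                           List<string> poisonQueue,
                           Action<string> operation,
                           int maxAttempts)
    {
        int totalCalls = 0;
        int attemptsForHead = 0;

        while (queue.Count > 0)
        {
            string message = queue.Peek(); // "transaction" open, message still queued
            totalCalls++;
            attemptsForHead++;
            try
            {
                operation(message);
                queue.Dequeue();           // commit: the message is gone for good
                attemptsForHead = 0;
            }
            catch (Exception)
            {
                // Rollback: the message stays at the head and is picked up again,
                // unless patience is exhausted and it moves to the poison queue.
                if (attemptsForHead >= maxAttempts)
                {
                    poisonQueue.Add(queue.Dequeue());
                    attemptsForHead = 0;
                }
            }
        }
        return totalCalls;
    }

    public static void Main()
    {
        Queue<string> queue = new Queue<string>(new string[] { "good", "bad" });
        List<string> poison = new List<string>();

        int calls = Pump(queue, poison, message =>
        {
            if (message == "bad") throw new InvalidOperationException("tanked");
        }, 3);

        // prints "operation calls: 4, poisoned: 1"
        Console.WriteLine("operation calls: {0}, poisoned: {1}", calls, poison.Count);
    }
}
```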

Now, the reasons why the operation might fail are as numerous as the combinations of program statements that you could put there. Anything could happen: the program is completely broken, the input data causes the app to go down that branch that nobody ever cared to test (or apparently not enough), the backend database is permanently offline, the machine is having an extremely bad hardware day, power fails, you name it.

So what if the application just keeps choking and throwing on that particular message? With either of the aforementioned error handling modes, WCF is going to take the message out of the loop when its patience with the patient is exhausted. With the ReceiveErrorHandling.Fault option, WCF will raise an error event that can be caught and processed with a handler. When you use ReceiveErrorHandling.Move things are a bit more flexible, because the message causing all that trouble now sits in a queue again.
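
For the ReceiveErrorHandling.Fault route, the handler is an IErrorHandler that looks for MsmqPoisonMessageException. The sketch below shows roughly what that looks like; the class names PoisonErrorHandler and PoisonErrorBehaviorAttribute are my own, and what you actually do with the lookup id (move the message, alert someone) is left out.

```csharp
using System;
using System.Collections.ObjectModel;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;

// Hypothetical handler: reacts to the poison-message error WCF raises.
public class PoisonErrorHandler : IErrorHandler
{
    public void ProvideFault(Exception error, MessageVersion version, ref Message fault)
    {
        // One-way queued operations have nobody to send a fault to.
    }

    public bool HandleError(Exception error)
    {
        MsmqPoisonMessageException poison = error as MsmqPoisonMessageException;
        if (poison == null)
        {
            return false; // not ours, let other handlers look at it
        }

        // The lookup id identifies the offending message in the queue.
        Console.WriteLine("Poison message, lookup id = {0}", poison.MessageLookupId);
        return true;
    }
}

// Hypothetical attribute that installs the handler on every channel dispatcher.
public sealed class PoisonErrorBehaviorAttribute : Attribute, IServiceBehavior
{
    public void AddBindingParameters(ServiceDescription description,
                                     ServiceHostBase host,
                                     Collection<ServiceEndpoint> endpoints,
                                     BindingParameterCollection parameters)
    {
    }

    public void ApplyDispatchBehavior(ServiceDescription description, ServiceHostBase host)
    {
        foreach (ChannelDispatcherBase dispatcherBase in host.ChannelDispatchers)
        {
            ChannelDispatcher dispatcher = dispatcherBase as ChannelDispatcher;
            if (dispatcher != null)
            {
                dispatcher.ErrorHandlers.Add(new PoisonErrorHandler());
            }
        }
    }

    public void Validate(ServiceDescription description, ServiceHostBase host)
    {
    }
}
```

Decorating the service class with [PoisonErrorBehavior] would hook the handler in; the rest of this post takes the other route and lets WCF move the message instead.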

The headache-causing problem with poison messages is that you really, really need to do something about them. From the sender’s perspective, the message has been delivered and it puts its trust into the receiver to do the right thing. “Here’s that $1,000,000 purchase order! I’m done, go party!”. If the receiving service goes into the bug-induced loop of recurring death, you’ve got two problems: You have a nasty bug that’s probably difficult to repro since it happens under stress, and you’ve got a $1,000,000 purchase order unhappily sitting in a dark hole. Guess what your great-grand-boss’ boss cares more about.

The second, technically slightly more headache-causing problem with poison messages (if that’s possible to imagine) is that they just sit there with all the gold and diamonds that they might represent, but they are effectively just a bunch of (if you’re lucky) XML goo. Telling a system operator to go and check the poison message queues or to surface their contents to him/her and look what’s going on there is probably not a winning strategy.

So what to do? Your high-throughput automated-processing solution that does the regular business behind the queue has left the building for lunch. That much is clear. How do you hook in some alternate processing path that does at least surface the problem to an operator or “information worker”– or even a call center agent pool – in a legible and intelligible fashion so that a human can look at the problem and try finding a fix? In the end, we’ve got the best processing unit for non-deterministic and unexpected events sitting  between our shoulders, one would hope. How about writing a slightly less automated service alternative that’s easy to adjust and try to get the issue surfaced to someone or just try multiple things [Did someone just say “Workflow”?] – and hook that straight up to where all the bad stuff lands: the poison queue.

Here’s the code. I just coded that up for illustrative purposes and hence there’s absolutely room for improvement. I’m going to put the project files up on wcf.netfx3.com and will update this post with the link. We’ll start with the boilerplate stuff and the “regular” service:

using System;
using System.Collections.Generic;
using System.Text;
using System.ServiceModel.Channels;
using System.ServiceModel;
using System.Runtime.Serialization;
using System.ServiceModel.Description;
using System.Workflow.Runtime;
using ServerErrorHandlingWorkflow;
using ServerData;

namespace Server
{
    [ServiceContract(Namespace=Program.ServiceNamespaceURI)]
    interface IApplicationContract
    {
        [OperationContract(IsOneWay=true)]
        void SubmitData(ApplicationData data);
    }

    [ServiceBehavior(TransactionAutoCompleteOnSessionClose=true,
                     ReleaseServiceInstanceOnTransactionComplete=true)]
    class ApplicationService : IApplicationContract
    {
        [OperationBehavior(TransactionAutoComplete=true,TransactionScopeRequired=true),
         System.Diagnostics.DebuggerStepThrough]
        public void SubmitData(ApplicationData data)
        {
            throw new Exception("The method or operation is not implemented.");
        }
    }

Not much excitement here except that the throw statement in SubmitData will always cause the service to tank. In real life, the path to that particular place where the service consistently finds its way into a trouble-spot is more convoluted and may involve a few thousand lines, but this is a good approximation for what happens when you hit a poison message. Stuff keeps failing.

The next snippet is our alternate service. Instead of boldly trying to do complex processing, it simply punts the message data to a Workflow. That’s assuming that the message isn’t completely messed up to begin with and can indeed be de-serialized. To mitigate that scenario we could also use a one-way universal contract and be even more careful. The key difference between this and the “regular” service is that the alternate service turns off the WCF address filter check. We’ll get back to that. 


    [ServiceBehavior(AddressFilterMode = AddressFilterMode.Any)]
    class ApplicationErrorService : IApplicationContract
    {
        public void SubmitData(ApplicationData data)
        {
            Dictionary<string,object> workflowArgs = new Dictionary<string,object>();
            workflowArgs.Add("ApplicationData",data);
            WorkflowInstance workflowInstance =
                Program.WorkflowRuntime.CreateWorkflow(
                          typeof(ErrorHandlingWorkflow),
                          workflowArgs);
            workflowInstance.Start();
        }
    }
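
The "one-way universal contract" mentioned above would look roughly like the sketch below: a single one-way operation with Action = "*" that takes the raw Message, so nothing has to de-serialize into ApplicationData. The names IUniversalOneWay and RawPoisonMessageService are hypothetical, and what the operation does with the message is just a placeholder.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;

// Matches any incoming action and hands over the untyped message.
[ServiceContract]
public interface IUniversalOneWay
{
    [OperationContract(IsOneWay = true, Action = "*")]
    void Process(Message message);
}

// Hypothetical handler that accepts whatever lands in the poison queue.
[ServiceBehavior(AddressFilterMode = AddressFilterMode.Any)]
public class RawPoisonMessageService : IUniversalOneWay
{
    public void Process(Message message)
    {
        // No typed de-serialization happens here, so even a message whose
        // body no longer matches the data contract can be surfaced.
        Console.WriteLine("Poison message with action: {0}", message.Headers.Action);
        Console.WriteLine(message.ToString());
    }
}
```

The trade-off is that you get XML goo instead of a typed object, so this is more of a last line of defense than a replacement for the typed handler.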

So now we’ve got the fully automated middle-of-the-road default service and our “what do we do next” alternate service. Let’s hook them up.

    class Program
    {
        public const string ServiceNamespaceURI =
            "http://samples.microsoft.com/2007/03/WCF/PoisonHandling/Service";

        public static WorkflowRuntime WorkflowRuntime = new WorkflowRuntime();

        static void Main(string[] args)
        {
            string msmqQueueName = Properties.Settings.Default.QueueName;
            string msmqPoisonQueueName = msmqQueueName+";poison";
            string netMsmqQueueName =
                "net.msmq://" + msmqQueueName.Replace('\\', '/').Replace("$","");
            string netMsmqPoisonQueueName = netMsmqQueueName+";poison";

            if (!System.Messaging.MessageQueue.Exists(msmqQueueName))
            {
                System.Messaging.MessageQueue.Create(msmqQueueName, true);
            }

First – and for this little demo only – we set up a local queue and do a little stringsmithing to turn the MSMQ-format queue name stored in app.config into the net.msmq URI format. Next …

            ServiceHost applicationServiceHost = new ServiceHost(typeof(ApplicationService));
            NetMsmqBinding queueBinding = new NetMsmqBinding(NetMsmqSecurityMode.None);
            queueBinding.ReceiveErrorHandling = ReceiveErrorHandling.Move;
            queueBinding.ReceiveRetryCount = 1;
            queueBinding.RetryCycleDelay = TimeSpan.FromSeconds(1);
            applicationServiceHost.AddServiceEndpoint(typeof(IApplicationContract),
                                                      queueBinding,
                                                      netMsmqQueueName);

Now we’ve bound the “regular” application service to the queue. I’m setting the binding parameters (look them up at your leisure) in a way that we’re failing very fast here. By default, the RetryCycleDelay is set to 30 minutes, which means that WCF is giving you a reasonable chance to fix temporary issues while stuff hangs out in the retry queue. Now for the poison handler service:

      
           
            ServiceHost poisonHandlerServiceHost = new ServiceHost(typeof(ApplicationErrorService));
            NetMsmqBinding poisonBinding = new NetMsmqBinding(NetMsmqSecurityMode.None);
            poisonBinding.ReceiveErrorHandling = ReceiveErrorHandling.Drop;
            poisonHandlerServiceHost.AddServiceEndpoint(typeof(IApplicationContract),
                                                        poisonBinding,
                                                        netMsmqPoisonQueueName);

Looks almost the same, hmm? The trick here is that we’re pointing this one to the poison queue into which the regular service drops all the stuff that it can’t deal with. Otherwise it’s (almost) just a normal service. The key difference between the ApplicationErrorService service and its sibling is that the poison-message handler service implementation is decorated with [ServiceBehavior(AddressFilterMode = AddressFilterMode.Any)]. Since the original message was sent to a different (the original) queue and we’re now looking at a sub-queue that has a different name and therefore a different WS-Addressing:To identity, WCF would normally refuse to process that message. With this behavior setting we can tell WCF to ignore that and have the service treat the message as if it landed at the right place – which is what we want.

And now for the unspectacular run-it and drop-a-message-into-queue finale:

            applicationServiceHost.Open();
            poisonHandlerServiceHost.Open();

            Console.WriteLine("Application running");

            ChannelFactory<IApplicationContract> client =
                new ChannelFactory<IApplicationContract>(queueBinding,
                                                         netMsmqQueueName);
            IApplicationContract channel = client.CreateChannel();

            ApplicationData data = new ApplicationData();
            data.FirstName = "Clemens";
            data.LastName = "Vasters";
            channel.SubmitData(data);
            ((IClientChannel)channel).Close();

            Console.WriteLine("Press ENTER to exit");
            Console.ReadLine();
        }
    }
}
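
As a side note on the stringsmithing at the top of Main: pulled out into a helper it is easier to see what goes in and what comes out. The QueueNames class and the sample path myhost\private$\orders are made up for illustration; the replace logic is the same as in the program above.

```csharp
using System;

// Hypothetical helper that isolates the queue-name conversion from Main.
public static class QueueNames
{
    public const string PoisonSuffix = ";poison";

    // Turns an MSMQ path name into the net.msmq URI form: backslashes
    // become slashes and the "$" of "private$" is dropped.
    public static string ToNetMsmqUri(string msmqPathName)
    {
        return "net.msmq://" + msmqPathName.Replace('\\', '/').Replace("$", "");
    }

    public static void Main()
    {
        string path = @"myhost\private$\orders"; // assumed sample value

        string uri = ToNetMsmqUri(path);
        Console.WriteLine(uri);                // prints "net.msmq://myhost/private/orders"
        Console.WriteLine(uri + PoisonSuffix); // prints "net.msmq://myhost/private/orders;poison"
    }
}
```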

The Workflow that’s hooked up to the poison handler in my particular sample project does nothing big. It’s got a property that is initialized with the data item and just has a code activity that spits out the message to the console. It could send an email, page an operator through messenger, etc. Whatever works.
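
Since the project files aren't linked here, this is a code-only approximation of what such a workflow could look like. It is a sketch, not the workflow from the sample project: the ApplicationData class below is a stand-in for the real ServerData type (only FirstName and LastName are known from the code above), and the activity name and console text are invented.

```csharp
using System;
using System.Workflow.Activities;

// Stand-in for the sample's ServerData.ApplicationData type (assumption).
public class ApplicationData
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public sealed class ErrorHandlingWorkflow : SequentialWorkflowActivity
{
    private ApplicationData applicationData;

    // Set by the runtime from the "ApplicationData" entry in the
    // dictionary that is passed to CreateWorkflow.
    public ApplicationData ApplicationData
    {
        get { return applicationData; }
        set { applicationData = value; }
    }

    public ErrorHandlingWorkflow()
    {
        // Same pattern the workflow designer generates in InitializeComponent.
        CanModifyActivities = true;

        CodeActivity report = new CodeActivity();
        report.Name = "reportPoisonMessage";
        report.ExecuteCode += new EventHandler(ReportPoisonMessage);
        Activities.Add(report);

        Name = "ErrorHandlingWorkflow";
        CanModifyActivities = false;
    }

    private void ReportPoisonMessage(object sender, EventArgs e)
    {
        // Placeholder for mailing or paging an operator.
        Console.WriteLine("Poison message for {0} {1} needs attention",
                          applicationData.FirstName,
                          applicationData.LastName);
    }
}
```

Swapping the code activity for something that sends mail or creates a work item is where the workflow designer starts to earn its keep.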

Categories: MSMQ | WCF

I see an increasing number of research efforts going on to get people’s heads around the blogosphere and how to figure out what's relevant and what's not. 4-5 years back it was quite easy to do so, because there were so few of “us bloggers” and you could read pretty much all blogs that mattered in your area of interest within an hour of your day. Now all of that has grown so much out of proportion that noise and signal blur into a “wodge of stuff” that’s hard to get through or judge. So people start resorting to bots and lots of statistics to do analysis, and my intuition tells me that while that may yield interesting data, a bot can’t really capture the signal amplitude. By that I mean relevance and authority.

I think I’m observing several types of blogs that deserve different attention and weight. Interestingly enough, that isn’t necessarily captured by discoverable metadata such as inbound links or trackbacks or pingbacks. The types I can come up with are the following, and it’d be great if you could give me your opinion on whether that resonates with you and whether you have good examples for the individual types. I am giving some examples realizing that some blogs have N+1 of these characteristics. The crosscutting concern here is comments. I am not sure how to think about those yet. Also, this list is not at all scientific; it’s just a (my) perspective.

“The Authority”
The blog has been around forever and the author has built up so much credibility and following that “everyone interested” is subscribed to the feed. Since that’s so, people are at most giving “Look at that” links and there is no widespread debate because the blog entries are indisputably good and accurate data; most people just consume the feed.

“The Troublemaker”
The blog has been around for a while and the author has built up enough credibility for people to care. The author intentionally takes extreme positions to spark debate, and that works: people are linking and voicing opinions. Lots of people are lurking, lots of links if the position is particularly outrageous.

“The Collaborator”
The blog has been around for a while and the author has built up enough credibility for people to care. The author has a reputation for being interested in broad collaboration, raises interesting challenges and asks broad questions that spark constructive debate.

“The Linkblogger”
The blog has been around for a while and the author has built a reputation for being a good observer of what’s going on in blogland. Lots of people are relying on the editorial skill to cut through the noise and are mostly consuming. Inbound links become rare over time, because the blog eventually becomes a utility.

“The Magazine”
The blog has been around for a while and the author has built a reputation for being good at figuring out what’s going on in the industry and is essentially a news outlet. Lots of incoming links due to novelty factor.

“The Blip in the Noise”
The blog is sitting on one of the big blog properties (such as weblog.asp.net) and shows up on people’s radar mostly through the consolidated feed. Inbound links may flare up on an interesting post, but otherwise the main blog is just a lonely place. If there are enough blips, people may end up subscribing to the actual blog feed.

“The Googleable Answer”
This is the blog that ranks #1 to #5 with the answer to something that thousands are having a problem with. Google for 0x800123123 or some HRESULT and you find this person. The author is proud of this post because (s)he "is the answer", not support.microsoft.com. (Look for "dllhost.exe.config" ...)

 “The Shooting Star”
The blog is relatively new or has been ignored, but the author has done an astonishing stunt that ended up on Slashdot or digg (etc). Tons of links. Server tanks. People subscribe and lurk for a while, and if the author can follow through, the blog will end up somewhere in one of the categories above, or otherwise in the category below.

“I want to blog”
The blog has no general relevance whatsoever. Nobody is particularly interested. Sadly, that's the majority.

Another observation is that blog volume doesn’t directly correlate with relevance. Someone can be silent for 3 months and still have huge amplitude, while some blogs by people who post every day may not matter at all in the big picture.

(Thanks to Scott Hanselman for the "Googleable Answer" contribution) 

Categories: Blog

March 10, 2007
@ 03:37 PM

I read that Google runs buses. Headline material in the New York Times. Impressive. Ehh. Here at Microsoft, buses are run by the King County Metro Transit system and we all get a free Flexpass. And our shuttle system is constrained to connections within and between the campus locations in and around Redmond. Of course that's just cost effective, logical and boring and therefore not newsworthy, I guess.  

Categories:

March 8, 2007
@ 08:39 AM

COM Is Love.

Disagree? Stop reading.

Agree? Still feel it? Well, I just learned that there's a very unique way you can show your love for COM. Own it!

Own it? Yes, I'm completely not kidding. We've got an open position for a Program Manager to own COM+, DCOM, RPC, the WCF/COM Integration, System.EnterpriseServices and all the future goodness that we're going to stick into Longhorn Server and future versions of Windows to keep COM going and make it increasingly integrated with all the goodness that we're working on for the future of distributed systems in the years to come. COM dead? Pfft. 

If you are interested and have difficulties figuring out how to work the job web page (that is the preferred way, however) send me a mail with your resume to clemensv at microsoft.com. And mind that my email address just serves as a proxy here so be as serious as you would be about applying for a job with someone whose blog you don't read... 

 

Categories: