Composite Metadata (this will change your life)

I am in a blogging mood today … Here are some thoughts around composite metadata. Sorry for the bold title ;)

* * *

Whenever I am asked what I consider the most important innovation of the CLR, I don’t hesitate to respond “extensible metadata” coming in the form of custom attributes. Everyone who has followed this blog for a while and looked at some of the source code I published knows that I am seriously in love with attributes. In fact, very few of the projects I write don’t include at least one class derived from Attribute and once you use the XmlSerializer, Enterprise Services or ASMX, there’s no way around using them.

In my keynote on contracts and metadata at the Norwegian Visual Studio .NET 2003 launch earlier this year, I used the sample that’s attached at the bottom of this article. It illustrates how contracts can be enforced both by schema validation and by validation of object graphs, based on the same set of constraints. In schema, the constraints are defined using metadata (restrictions) inside element or type definitions; in classes, the very same restrictions can be applied using custom attributes, provided you have a sufficient set of attributes and the respective validation logic. In both cases, the data is run through a filter that’s driven by the metadata. If either filter is used at the inbound and outbound channels of a service, contract enforcement is automatic and “contract trust” between services, as defined in my previous article, can be achieved. So far, so good.

In my example, the metadata instrumentation for a CLR type looks like this:

    [System.Xml.Serialization.XmlTypeAttribute(
        Namespace="urn:schemas-newtelligence-com:transactionsamples:customerdata:v1")]
    public class addressType
    {
        [Match(@"\p{L}[\p{L}\p{P}0-9\s]*"), MaxLength(80)]
        public string City;
        public countryNameType Country;
        public countryCodeType CountryCode;
        [MaxLength(10)]
        public string PostalCode;
        [MaxLength(160)]
        public string AddressLine;
    }

… while the corresponding schema is a bit better factored and looks like this:

    <xsd:simpleType name="nameType">
        <xsd:restriction base="xsd:string">
            <xsd:pattern value="\p{L}[\p{L}\p{P}0-9\s]*" />
        </xsd:restriction>
    </xsd:simpleType>
    <xsd:complexType name="addressType">
        <xsd:sequence>
            <xsd:element name="City">
                <xsd:simpleType>
                    <xsd:restriction base="nameType">
                        <xsd:maxLength value="80" />
                    </xsd:restriction>
                </xsd:simpleType>
            </xsd:element>
            <xsd:element name="Country" type="countryNameType" />
            <xsd:element name="CountryCode" type="countryCodeType" />
            <xsd:element name="PostalCode">
                <xsd:simpleType>
                    <xsd:restriction base="xsd:string">
                        <xsd:maxLength value="10" />
                    </xsd:restriction>
                </xsd:simpleType>
            </xsd:element>
            <xsd:element name="AddressLine">
                <xsd:simpleType>
                    <xsd:restriction base="xsd:string">
                        <xsd:maxLength value="160" />
                    </xsd:restriction>
                </xsd:simpleType>
            </xsd:element>
        </xsd:sequence>
    </xsd:complexType>
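On the CLR side, the attributes and the filter that reads them could be sketched roughly as follows. This is not the code from the attached sample: ConstraintAttribute, FieldValidator and SampleAddress are names assumed for illustration, and the Match and MaxLength implementations are only a guess at what such attributes minimally need to do.

```csharp
using System;
using System.Reflection;
using System.Text.RegularExpressions;

// Hypothetical common base class: every constraint can check a single value.
public abstract class ConstraintAttribute : Attribute
{
    public abstract bool IsValid(object value);
}

[AttributeUsage(AttributeTargets.Field | AttributeTargets.Class)]
public class MatchAttribute : ConstraintAttribute
{
    private readonly Regex regex;

    // xsd:pattern always matches the whole value, so the expression is anchored here.
    public MatchAttribute(string pattern)
    {
        regex = new Regex(@"\A(?:" + pattern + @")\z");
    }

    // Null is treated as "nothing to check" in this sketch.
    public override bool IsValid(object value)
    {
        string s = value as string;
        return s == null || regex.IsMatch(s);
    }
}

[AttributeUsage(AttributeTargets.Field | AttributeTargets.Class)]
public class MaxLengthAttribute : ConstraintAttribute
{
    private readonly int max;

    public MaxLengthAttribute(int max) { this.max = max; }

    public override bool IsValid(object value)
    {
        string s = value as string;
        return s == null || s.Length <= max;
    }
}

// Hypothetical type that mirrors the City field of addressType.
public class SampleAddress
{
    [Match(@"\p{L}[\p{L}\p{P}0-9\s]*"), MaxLength(80)]
    public string City;
}

public static class FieldValidator
{
    // Returns true when every constraint on every public instance field holds.
    public static bool Validate(object instance)
    {
        BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;
        foreach (FieldInfo field in instance.GetType().GetFields(flags))
        {
            object value = field.GetValue(instance);
            foreach (ConstraintAttribute rule in
                     field.GetCustomAttributes(typeof(ConstraintAttribute), true))
            {
                if (!rule.IsValid(value)) return false;
            }
        }
        return true;
    }
}
```

Calling FieldValidator.Validate on an object before it leaves or after it enters a service would then play the same role as running the XML through a validating reader with the schema above.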

The restrictions are expressed differently, but in both cases they are aspects of the type and semantically identical. Both variants work, and even the regular expressions are the same. All the sexiness of this example aside, there’s one thing that bugs me:

In XSD, I can create a new simple type by extending a base type with additional metadata like this:

    <xsd:simpleType name="nameType">
        <xsd:restriction base="xsd:string">
            <xsd:pattern value="\p{L}[\p{L}\p{P}0-9\s]*" />
        </xsd:restriction>
    </xsd:simpleType>

This causes the metadata to be inherited by the subsequent element definition, which in turn augments the type definition with further metadata rules:

    <xsd:element name="City">
        <xsd:simpleType>
            <xsd:restriction base="nameType">
                <xsd:maxLength value="80" />
            </xsd:restriction>
        </xsd:simpleType>
    </xsd:element>

So, XSD knows how to do metadata inheritance on simple types. The basic storage type (xsd:string) isn’t changed by this augmentation; only the validation rules change, expressed by adding metadata to the type. The problem is that the CLR model isn’t directly compatible with this. You can’t derive from any of the simple types (System.String, for instance, is sealed), and therefore you can’t project this schema directly onto a CLR type definition. Consequently, I have to apply the metadata to every field or property, which is the equivalent of XSD’s element declaration. The luxury of the <xsd:simpleType/> definition and inheritable metadata doesn’t exist. Or does it?

Well, using the following pattern, it indeed does. Almost.

Let’s forget for a moment that the nameType simple type definition above is a restriction of xsd:string, and focus on what it really does for us: it encapsulates metadata. When we inherit that into the City element, an additional metadata item is added, resulting in a metadata composite of two rules, applied to the base type xsd:string.

So the rough equivalent of this, expressed in CLR terms, could look like this:

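    // Field | Class, so that this attribute can also be placed on other attribute classes.
    [AttributeUsage(AttributeTargets.Field | AttributeTargets.Class)]
    [Match(@"\p{L}[\p{L}\p{P}0-9\s]*")]
    public class NameTypeStringAttribute : Attribute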
    {
    }

    [System.Xml.Serialization.XmlTypeAttribute(
       Namespace="urn:schemas-newtelligence-com:transactionsamples:customerdata:v1")]
    public class addressType
    {
        [NameTypeString,MaxLength(80)]
        public string City;

        …
    }

Now we have an attribute NameTypeString(Attribute) that fulfills the same metadata containment function. The attribute has an attribute. In fact, we could even go further with this and introduce a dedicated “CityString” meta-type either by composition:

    // Note: this only compiles if NameTypeStringAttribute permits AttributeTargets.Class.
    [AttributeUsage(AttributeTargets.Field)]
    [NameTypeString, MaxLength(80)]
    public class CityStringAttribute : Attribute
    {
    }

… or by inheritance

    [AttributeUsage(AttributeTargets.Field)]
    [MaxLength(80)]
    public class CityStringAttribute : NameTypeStringAttribute
    {
    }

Either way, the result is the simple field declaration:

    [CityString] public string City;

The declaration essentially tells us “stored as a string, following the contract rules as defined in the composite metadata of [CityString]”.

Having that, there is one thing that’s still missing. How does the infrastructure tell whether an attribute is indeed a composite, so that the applicable set of metadata is the combination of the attributes declared on that attribute’s class and the attributes found next to it on the inspected element?

The answer is the following innocent-looking marker interface:

    public interface ICompositeAttribute
    {
    }

If that marker interface is found on an attribute, the attribute is considered a composite attribute and the infrastructure must (potentially recursively) consider attributes defined on this attribute in the same way as attributes that exist on the originally inspected element – for instance, a field.

    [AttributeUsage(AttributeTargets.Field | AttributeTargets.Class)]
    [Match(@"\p{L}[\p{L}\p{P}0-9\s]*")]
    public class NameTypeStringAttribute : Attribute, ICompositeAttribute
    {
    }
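A minimal sketch of how an infrastructure could resolve such composites is shown here. CompositeMetadata, GetEffectiveAttributes and SampleAddress are hypothetical names, and the Match and MaxLength classes are data-only stand-ins for the constraint attributes so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public interface ICompositeAttribute { }

// Data-only stand-ins for the constraint attributes (assumed shape).
[AttributeUsage(AttributeTargets.Field | AttributeTargets.Class)]
public class MatchAttribute : Attribute
{
    public readonly string Pattern;
    public MatchAttribute(string pattern) { Pattern = pattern; }
}

[AttributeUsage(AttributeTargets.Field | AttributeTargets.Class)]
public class MaxLengthAttribute : Attribute
{
    public readonly int Length;
    public MaxLengthAttribute(int length) { Length = length; }
}

[AttributeUsage(AttributeTargets.Field | AttributeTargets.Class)]
[Match(@"\p{L}[\p{L}\p{P}0-9\s]*")]
public class NameTypeStringAttribute : Attribute, ICompositeAttribute { }

[AttributeUsage(AttributeTargets.Field)]
[NameTypeString, MaxLength(80)]
public class CityStringAttribute : Attribute, ICompositeAttribute { }

public class SampleAddress
{
    [CityString] public string City;
}

public static class CompositeMetadata
{
    // Returns all attributes of the wanted type that apply to the member,
    // including those reachable through composite attributes.
    public static List<Attribute> GetEffectiveAttributes(MemberInfo member, Type wanted)
    {
        List<Attribute> result = new List<Attribute>();
        Collect(member, wanted, result, new HashSet<Type>());
        return result;
    }

    private static void Collect(MemberInfo member, Type wanted,
                                List<Attribute> result, HashSet<Type> visited)
    {
        foreach (Attribute attribute in member.GetCustomAttributes(true))
        {
            if (wanted.IsInstanceOfType(attribute)) result.Add(attribute);

            // A composite is expanded by inspecting its own class (a Type is a
            // MemberInfo, too). Each composite type is expanded only once, which
            // also guards against attributes that decorate each other in a cycle.
            Type type = attribute.GetType();
            if (attribute is ICompositeAttribute && visited.Add(type))
            {
                Collect(type, wanted, result, visited);
            }
        }
    }
}
```

Asking for typeof(MaxLengthAttribute) on the City field walks from [CityString] to its class and finds the MaxLength(80) rule there; asking for typeof(MatchAttribute) goes one level deeper through [NameTypeString].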

Why a marker interface and not just another attribute on the attribute? The answer is quite simple: Convenience. Using the marker interface, you can find composites simply with the following expression: *.GetCustomAttributes(typeof(ICompositeAttribute),true)

And why not use a base-class “CompositeAttribute”? Because that would be an unnecessary restriction for the composition of attributes. If only the marker interface is used, the composite can have any base attribute class, including those built into the system.

But wait, this is just one side of the composition story for attributes. There’s already a hint at an additional composition quality two short paragraphs up: *.GetCustomAttributes(typeof(ICompositeAttribute),true). The metadata search doesn’t only match concrete attribute types; it also matches interfaces that the attribute classes implement, which is what allows the above expression to work.

So how would it be if an infrastructure like Enterprise Services did not only use concrete attributes, but also supported composable attributes as illustrated here …

    // Interface members are implicitly public and take no access modifier.
    public interface ITransactionAttribute
    {
        TransactionOption TransactionOption
        {
            get;
        }
    }

    public interface IObjectPoolingAttribute
    {
        int MinPoolSize
        {
            get;
        }

        int MaxPoolSize
        {
            get;
        }
    }

In that case, you would also be able to define composite attributes that define standardized behavior for a certain class of ServicedComponents that you have in your application and that should all behave in a similar way, resulting in a declaration like this:

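    public class StandardTransactionalPooledAttribute :
        Attribute, ITransactionAttribute, IObjectPoolingAttribute
    {
        // The interface members have to be implemented; the values are illustrative only.
        public TransactionOption TransactionOption { get { return TransactionOption.Required; } }
        public int MinPoolSize { get { return 1; } }
        public int MaxPoolSize { get { return 10; } }
    }

    [StandardTransactionalPooled]
    public class MyComponent : ServicedComponent
    {
    }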
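To be clear, Enterprise Services does not work this way; the following is only a sketch of how a hosting infrastructure could look up behavior by interface instead of by concrete attribute class. HostingInfrastructure and FindBehavior are hypothetical names, the TransactionOption enum is a local stand-in for System.EnterpriseServices.TransactionOption so that the block compiles on its own, MyComponent leaves out the ServicedComponent base class for the same reason, and the pool sizes are arbitrary.

```csharp
using System;

// Local stand-in for System.EnterpriseServices.TransactionOption.
public enum TransactionOption { Disabled, NotSupported, Supported, Required, RequiresNew }

public interface ITransactionAttribute
{
    TransactionOption TransactionOption { get; }
}

public interface IObjectPoolingAttribute
{
    int MinPoolSize { get; }
    int MaxPoolSize { get; }
}

[AttributeUsage(AttributeTargets.Class)]
public class StandardTransactionalPooledAttribute :
    Attribute, ITransactionAttribute, IObjectPoolingAttribute
{
    // Illustrative values only.
    public TransactionOption TransactionOption { get { return TransactionOption.Required; } }
    public int MinPoolSize { get { return 1; } }
    public int MaxPoolSize { get { return 10; } }
}

[StandardTransactionalPooled]
public class MyComponent { }

public static class HostingInfrastructure
{
    // Returns the first attribute on the type that implements the given
    // interface, or null if the type carries no such attribute.
    public static T FindBehavior<T>(Type componentType) where T : class
    {
        // The instance method on Type accepts an interface as the filter type.
        object[] found = componentType.GetCustomAttributes(typeof(T), true);
        return found.Length > 0 ? (T)found[0] : null;
    }
}
```

The infrastructure would call FindBehavior<ITransactionAttribute>(typeof(MyComponent)) and FindBehavior<IObjectPoolingAttribute>(typeof(MyComponent)) independently, without caring that a single attribute class answers both questions.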

While it seems to be an “either/or” thing at first, both composition patterns illustrated here are useful: the one using ICompositeAttribute and the one that’s based entirely on the inherent composition qualities of interfaces. If you want to reuse a set of pre-built attributes, like the ones I am using to implement the constraints, the marker interface solution is very cheap because the coding effort is minimal. If you are writing a larger infrastructure and want to give your users more control over what attributes do, and allow them to provide their own implementations, “interface-based attributes” may be a better choice.

Download: MetadataTester.zip

Clemens Vasters