About Mapping

2 minutes read

There's an interesting comment on Ted Neward's blog which is related to what I've been saying here:

Ian Fairman wrote:
This whole approach of munging your own shipping request schema to the parcel company's schema reminds me of how some people would tout entity beans as reusable simply by using object-relational mapping tools to map them to 3rd party database schemas. Unfortunately the "magic" in these mapping tools can be quite easily fooled by extra fields in either the bean or the database table - what do you do with them? And that is probably the simplest problem you'd meet. This mapping magic is a pipe dream. The real solution is to define standard schemas. Getting everyone to agree on these schemas may be a pain in the arse but at least it will work when you've done it.

The problem with the standardization and agreement dream is that it is indeed much less realistic than doing all the mapping work. What we need are stable schemas that we can rely on and map from and to, but not necessarily standardized ones. Until you get people from all state-owned postal services around the world and all the private courier and parcel services to agree on a set of schemas to use in unmodified form (or all banks, or all whatever-you-choose-industry), XML has likely already been replaced by the next thing. 

Message standards are hard to agree on. If you look at EDIFACT or ANSI X.12, you'll find that neither really defines something like a schema. Both standards define "dictionaries" of segments which you can choose from when you assemble/trim message definitions. An EDIFACT ORDERS message is never caught in the wild as specified in the EDIFACT dictionaries. In fact, it's already a messy manual task to go from one user's ORDERS to another user's ORDERS, because the precise implementations of ORDERS may differ wildly.

Even worse ... what if industry A agrees on a standard format for exchanging purchase orders and industry B does the same for themselves and now a company from industry A wants to purchase something from a company in industry B. Poof!

The moment the first company came up with a graphical, easy-to-understand editor for XML Schema, the whole standardization hope was gone out the window already. Everyone can define schema and if they can, they will. Now with stuff like InfoPath, my Dad can even define schema. It's not going to be pretty, but it is schema.

If one out of 500 companies doesn't support a "standard schema" or even augments it with their own idea of certain things and that one company happens to be a big fish, forget standardization. You will have to understand what they mean and you will have to map between schemas. Schema standardization won't work on a grand scale and it will just not happen.

At the same time, "generic mapping" won't work, unless we take metadata much further than we do now and start to formalize semantics. The "semantic web" efforts go into that direction, but there's plenty of work to be done. In the absence of standardization, mapping between schemas is a growing necessity and while it seems like an ugly job -- it is a job. It's a manual job. There's no magic here. And it surely will create jobs. It's simply a necessary new field in programming and it's begging for better tools.

Updated:

Leave a Comment