SOA & Data integrity
Clemens Vasters made a post about sharing data stores across services. He says the temptation is just too big: sooner or later some developer will write a database join across the “data domains” of services, which forces the data to be co-located and creates schema dependencies between services.
You could say I have little to no experience with SOA, but I’m very interested in the subject and have read my fair share of articles on it. It’s kinda like when .NET was released and you found (kinda) a whole new way of developing applications; this is like a whole new way of designing applications, with a bunch of new programming methods along the way.
But the more I read about SOA and discuss the topic with colleagues, the more questions I have. And the problem is, most of them stay unanswered. At work, I’m implementing a (small) application the SOA way for the first time.
The topic Clemens blogged about is not that difficult. The first thing developers say when they first hear about SOA is that you will get performance issues, because you can’t run really large joins over tables but have to fetch all the data through each individual service. Of course this depends on how the services are defined, but that’s another topic. I asked Clemens about deleting and his simple answer was: don’t delete, buy more disks. Thanks for the answer. 🙁
Anyway, in the comments, Ray Jezek has some questions as well. One argument for SOA is that you can replace a service with another one. Clemens gives the example of a customer service being replaced by Siebel. And although it had crossed my mind, Ray raises questions about the identifying key of the customer. What if we defined an int and Siebel uses GUIDs? It’s kind of an extreme situation, but what about an integer versus a string, where the string can also contain A-Z? I ask you, what if we have both services running side by side and we want to take out our own service and use Siebel? Siebel already has our customers, so we can’t generate them with our own keys. Now we have to build a mapper service in which we map our previous key to the Siebel key. That’s probably doable.
But there’s more. All the time, I hear that I have to use a unique business identifier as the key. For example, use CustomerID and not some internal id. With internal ids, we’d never be able to ‘just’ transfer our customers into Siebel and would always need a mapper. But what if we come across the case where we have customer IDs like [00223] and Siebel uses [82A02PPQ]? Or even worse, uses {17e97ffa-d478-4c17-87de-a075d826fe1f}? Then all of a sudden, our users are flooded with totally different patterns in their CustomerIDs.
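To make the mapper idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class name `CustomerKeyMapper`, its methods, and the in-memory dictionaries standing in for whatever store a real mapper service would use. It has nothing to do with Siebel’s actual API; it only shows a two-way lookup between our own customer keys and the keys of a replacement system, using the example IDs from above.

```python
from typing import Dict, Optional


class CustomerKeyMapper:
    """Hypothetical two-way lookup between our keys and an external system's keys."""

    def __init__(self) -> None:
        # In-memory dictionaries stand in for a real, persistent mapping store.
        self._local_to_external: Dict[str, str] = {}
        self._external_to_local: Dict[str, str] = {}

    def link(self, local_key: str, external_key: str) -> None:
        """Record that both keys identify the same customer."""
        # Refuse to silently re-point a key that is already linked elsewhere.
        known_external = self._local_to_external.get(local_key)
        if known_external is not None and known_external != external_key:
            raise ValueError(f"{local_key} is already linked to {known_external}")
        known_local = self._external_to_local.get(external_key)
        if known_local is not None and known_local != local_key:
            raise ValueError(f"{external_key} is already linked to {known_local}")
        self._local_to_external[local_key] = external_key
        self._external_to_local[external_key] = local_key

    def to_external(self, local_key: str) -> Optional[str]:
        """Return the external key, or None if the customer was never linked."""
        return self._local_to_external.get(local_key)

    def to_local(self, external_key: str) -> Optional[str]:
        """Return our own key, or None if the customer was never linked."""
        return self._external_to_local.get(external_key)


mapper = CustomerKeyMapper()
mapper.link("00223", "82A02PPQ")
print(mapper.to_external("00223"))  # the key the replacement system knows
print(mapper.to_local("82A02PPQ"))  # the key our users are used to
```

Keeping the keys as opaque strings means the mapper does not care whether the other side uses integers, alphanumeric codes or GUIDs. It does not solve the question of which pattern the users get to see, though.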
This is getting quite a long blog post, but the question Ray raises is: why aren’t there any examples and/or implementations on these subjects? Has everybody only been thinking about SOA but never implemented it yet? Very hard to believe. Also hard to believe is that nobody has been thinking about these problems. What is also hard to believe, but in fact true, is that there’s not much information on these topics. Maybe there’s a really big gap here that Microsoft might fill with some Patterns & Practices or Architecture site. Or maybe you know of good sites and/or implementations.
Interesting observations, Dennis. Clemens certainly has a point when he states that schema dependencies between services are bad practice. For me this is an indication that Clemens and many others in the field spend way too much time abstracting the SOA principles for the public instead of bringing these “revolutionary” concepts into practice. I’ve been working on a material traceability system for over a year now, where we split functionality into self-contained containers. We therefore have a material service storing material definitions and material instances. Alongside it we have an equipment service which stores equipment definitions, configurations, and instances. The material tracking service (a small part of the whole) ties the two together: the tracking service relates material check-ins/outs to equipment. In the real world this would mean I check out material (3 parts, for instance) from the warehouse, where the warehouse is actually an equipment instance. The material is then transported by a conveyor (equipment) and is checked in at the production center, where it eventually becomes 1 part (the 3 parts being assembled together in some other process step).
This use case would ask the material service for the part definition, which could be an interface over some fancy ERP system. To determine whether the parts are in stock, it then makes a request to the equipment service, which interfaces with a piece of MES business logic, retrieving the actual stock for us. It then relates the material handling to the actual equipment being used and sends the information to a message queue (we use MSMQ for our message-based integration approach). The message only contains the material and equipment IDs, which are GUIDs in our case, and the “actuals”: start datetime, end datetime, amount, operator/system, etc. The material tracking service archives this information.
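As a rough illustration of such a message, here is a small Python sketch. The class name `MaterialHandlingMessage`, the field names and the JSON wire format are all assumptions made for the example, not the format used in the actual system, and the hand-off to MSMQ is deliberately left out. The point is only that the message carries references (GUIDs) and actuals, not copies of the material or equipment data.

```python
import json
import uuid
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MaterialHandlingMessage:
    """Hypothetical message body: two references plus the 'actuals'."""

    material_id: uuid.UUID   # owned by the material service
    equipment_id: uuid.UUID  # owned by the equipment service
    start: datetime
    end: datetime
    amount: float
    operator: str            # operator or system that handled the material

    def to_json(self) -> str:
        # JSON is an assumed wire format, chosen only to keep the sketch short.
        return json.dumps({
            "material_id": str(self.material_id),
            "equipment_id": str(self.equipment_id),
            "start": self.start.isoformat(),
            "end": self.end.isoformat(),
            "amount": self.amount,
            "operator": self.operator,
        })

    @classmethod
    def from_json(cls, payload: str) -> "MaterialHandlingMessage":
        data = json.loads(payload)
        return cls(
            material_id=uuid.UUID(data["material_id"]),
            equipment_id=uuid.UUID(data["equipment_id"]),
            start=datetime.fromisoformat(data["start"]),
            end=datetime.fromisoformat(data["end"]),
            amount=float(data["amount"]),
            operator=data["operator"],
        )


message = MaterialHandlingMessage(
    material_id=uuid.uuid4(),
    equipment_id=uuid.uuid4(),
    start=datetime(2005, 1, 10, 8, 0, 0),
    end=datetime(2005, 1, 10, 8, 15, 0),
    amount=3.0,
    operator="conveyor-system",
)
payload = message.to_json()  # this string is what would be put on the queue
restored = MaterialHandlingMessage.from_json(payload)
print(restored == message)
```

The receiving side only needs to understand the IDs and the actuals; anything else about a part or a piece of equipment has to be requested from the service that owns it.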
If you look closely you will notice that all services are isolated, self-contained containers, each storing its share of the data and exchanging this data upon request. The explicit interfaces abstract the clients completely from the underlying data stores. The services described are definitely schema dependent, though. The material tracking service depends on the material/equipment schema. Although the material service is “replaceable”, the replacement should speak the same language! SOA doesn’t only force another point of view on system architecture, it also forces architects to agree upon the dependencies between the explicit boundaries of their services. We therefore need flexible business standards like BatchML and B2MML, where a common set of schemas is agreed upon and the standard decides which part of the schema, and thus of the data, is encapsulated within a service boundary.
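The “replaceable, but it must speak the same language” point can be sketched in a few lines of Python. All names here are hypothetical: `PartDefinition` stands in for an agreed-upon schema (in practice something like a B2MML type, which this sketch does not attempt to reproduce), and the record layout used by `ErpMaterialService` is made up.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class PartDefinition:
    """The agreed-upon shape every material service must return (hypothetical fields)."""

    part_id: str
    description: str
    unit: str


class MaterialService(ABC):
    """Explicit interface: clients depend on this, never on a data store."""

    @abstractmethod
    def get_part_definition(self, part_id: str) -> PartDefinition:
        ...


class InHouseMaterialService(MaterialService):
    def __init__(self) -> None:
        # Our own store already uses the agreed schema.
        self._parts = {"P-1": PartDefinition("P-1", "Bracket", "pcs")}

    def get_part_definition(self, part_id: str) -> PartDefinition:
        return self._parts[part_id]


class ErpMaterialService(MaterialService):
    """Replacement service: translates a made-up ERP record into the agreed schema."""

    def __init__(self) -> None:
        self._erp_rows = {"P-1": {"ITEM_NO": "P-1", "TEXT": "Bracket", "UOM": "pcs"}}

    def get_part_definition(self, part_id: str) -> PartDefinition:
        row = self._erp_rows[part_id]
        return PartDefinition(
            part_id=row["ITEM_NO"], description=row["TEXT"], unit=row["UOM"]
        )


def describe_part(service: MaterialService, part_id: str) -> str:
    # The tracking side only knows the interface and the shared schema.
    part = service.get_part_definition(part_id)
    return f"{part.part_id}: {part.description} ({part.unit})"


print(describe_part(InHouseMaterialService(), "P-1"))
print(describe_part(ErpMaterialService(), "P-1"))
```

Swapping one implementation for the other leaves `describe_part` untouched, but only because both sides agreed on `PartDefinition` up front. That agreement is exactly the schema dependency that remains.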
And you are correct, there are no examples which explain real-world scenarios like the one briefly described above. I also think such examples would confront the SOA evangelists, who are too busy hyping SOA at the moment, with SOA’s shortcomings.