
There Is No 'Customer': Why Canonical Data Models Fail


I was reading an excellent post by my colleague Liam on golden path thinking for data architecture — centring on things like “make sure you spend time defining canonical models”. Sensible, especially in the age of AI.

Before I read the post, I had told him he was wrong. He is mostly right, and the post is very amusing, but he’s wrong in one philosophical assumption: that canon is static. And to prove him wrong I invoked the spirit of the most litigious company in human history: Disney.

Imagine a distant galaxy, in the far-off past, where battles are fought between the stars. Before 2012, the series I’m alluding to had a rich body of canon. After 2012, that canon was scrapped and new canon was introduced. If I claimed today that Mara Jade was the wife of Lucio Horizonstrider, it would be treated as heresy. Under the previous canon it was totally fine. And no doubt in the same galaxy, but in a more recent past, the canon will shift again.

Controversy over. Please don’t sue me. Back to the point: canonical models share canon’s fragility. They look stable until somebody decides they aren’t.

So if canon is unreliable, what isn’t?

In February, I’d been reading about Palantir and what they sell, which is a custom ontology for each customer. Around the same time I came across VW’s MQB, a shared platform that underpins many of their cars: the same chassis, but different bodies for different markets and different trim options for different price points.

The two ideas collided: Palantir’s ontologies are domain-rich and bespoke; VW’s platform is domain-poor and universal. The question I had to answer was what should sit at the bespoke layer and what should sit underneath it.

What I have been working on is a foundation built on ten primitives I believe every organisation already uses, whether they know it or not:

  • Party (the actor)
  • Agreement (the relationship container)
  • Policy (the rule set)
  • Obligation (the thing owed)
  • Event (the thing that happened)
  • Claim (the assertion of a right)
  • Asset (the thing of value)
  • Transaction (the value exchange)
  • Document (the evidence)
  • Location (the spatial anchor)

Together they form the foundational platform upon which context is layered. Customer, as an example, is contextualised by the domain. It means different things to different people because the context and the domain are different. This is why canonical models rarely work.

Surely the primitives are canonical?

These aren’t canonical models. Canonical models try to fix meaning at the semantic layer — what a Customer is. Primitives sit below that, at the structural layer. A Customer is a Party in an Agreement, governed by Policy, generating Obligations and Events. The argument over “what is a Customer” disappears, because Customer doesn’t exist.
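
To make that concrete, here is a minimal sketch in Python. Every class, field and role name in it is my own illustrative assumption, not a reference implementation of the primitives. It shows "customer" as nothing more than a domain-specific query over Party and Agreement records, so two domains can hold different definitions without touching the layer underneath.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Party:
    name: str


@dataclass(frozen=True)
class Policy:
    name: str


@dataclass
class Agreement:
    kind: str                   # label supplied by the domain (hypothetical values)
    parties: dict[str, Party]   # role name -> Party playing that role
    policy: Policy              # the rule set governing this agreement
    obligations: list[str] = field(default_factory=list)  # things owed
    events: list[str] = field(default_factory=list)       # things that happened


def parties_in_role(agreements, kind, role):
    """Return the names of parties holding `role` in agreements of `kind`."""
    return {
        a.parties[role].name
        for a in agreements
        if a.kind == kind and role in a.parties
    }


acme = Party("Acme Ltd")
bolt = Party("Bolt GmbH")

agreements = [
    Agreement("sale", {"buyer": acme, "seller": bolt}, Policy("net-30 terms")),
    Agreement("support", {"subscriber": bolt, "provider": acme}, Policy("gold SLA")),
]

# There is no Customer class. Each domain states what "customer" means to it.
sales_customers = parties_in_role(agreements, "sale", "buyer")
support_customers = parties_in_role(agreements, "support", "subscriber")
```

In this toy data the sales domain sees Acme as its customer while the support domain sees Bolt, and neither definition required a change to Party or Agreement.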

Canon and canonical models are both contested. People fight about them because they live at the layer where context matters. Primitives don’t get fought over — they’re not where the fight happens.

But primitives on their own are useless. Party-Agreement-Policy-Obligation tells you nothing about whether to approve the trade or send the invoice. The primitive can’t derive value without the domain. And the domain can’t derive meaning without the primitive.
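
A rough sketch of that split, again in Python and again with invented names: the Obligation record below is pure structure, and the invoicing rule (including its 30-day window) is a hypothetical piece of billing-domain logic layered on top. Neither is much use without the other.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Obligation:
    debtor: str      # the party that owes
    creditor: str    # the party that is owed
    amount: float
    due: date
    settled: bool = False


def should_send_invoice(obligation, today, lead_days=30):
    """Hypothetical billing rule: invoice unsettled obligations due within lead_days."""
    days_left = (obligation.due - today).days
    return not obligation.settled and 0 <= days_left <= lead_days


# The primitive records what is owed; it takes the domain rule to reach a decision.
owed = Obligation("Acme Ltd", "Bolt GmbH", 1200.0, date(2025, 3, 31))
decision_early = should_send_invoice(owed, today=date(2025, 1, 1))   # 89 days before due
decision_due = should_send_invoice(owed, today=date(2025, 3, 10))    # 21 days before due
```

A trading desk would hang a completely different rule off the same Obligation shape, which is the point: the structure is shared, the decision is not.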

Should you abandon all hope of agreeing canonical models?

No — but accept that they will change. Accept that you won’t have one that spans an entire universe, but you can get it to work within a domain if you build it from a solid, universal foundation.
