
Pride and Primitives


It is a truth universally acknowledged, that a single dataset in possession of good volume, must be in want of understanding. However little known the feelings or views of such a dataset may be, this truth is so well fixed in the minds of the consulting industry, that it is considered the rightful property of some ontology or other.

And while Jane Austen has never gone out of fashion, modelling (ontological, semantic, data) has recently come back into fashion. Partly because AI needs structure to be useful. Partly because everyone’s tired of building the same thing from scratch every eighteen months. And partly because the industry is collectively arriving at a suspicion that the people who said “just shove it in a lake and figure it out later” might not have been entirely right.

So let’s talk about ontologies. What they are, why they matter, and what happens when you take the idea seriously enough to ask whether there’s a universal foundation underneath all of them.

Sense and Semantics

An ontology, in the data sense, is a formal model of what exists in your domain and how those things relate to each other. Not a schema. Not an ERD. A statement about what kinds of things your organisation believes are real and how they connect.

A schema says “this table has these columns.” An ontology says “a Customer is a Party who plays a Role in the context of an Agreement, and that Agreement creates Obligations, and when those Obligations are breached, Claims arise.” The schema tells you where the data lives. The ontology tells you what the data means.

Most organisations have ontologies. They just don’t know it. Every time someone draws a box labelled “Customer” on a whiteboard and draws an arrow to a box labelled “Order,” they’re doing ontological modelling. They’re doing it badly — their first impressions of what “Customer” means are almost invariably wrong — but they’re doing it.

The question isn’t whether you need an ontology. You already have one. The question is whether it’s any good, or whether it merely has the appearance of good design without any of the substance.

Three Suitors at the Door

Three things have converged to make ontological thinking not just useful but urgent. Each comes with its own proposal, and for once, all three are worth accepting.

First, AI agents need stable structure. You can’t point a language model at a data lake and expect coherent answers. You can, technically, but you’ll get confidently wrong ones — all the eloquence of a Wickham with none of the reliability. Agents need a governed, semantically consistent surface to query against. An ontology provides that surface.

Second, regulatory pressure is compounding. GDPR, HIPAA, MiFID II, DORA, the EU AI Act — every one of these operates on universal concepts: data subjects, processing agreements, obligations, breach events. If your data model is bespoke per department, your compliance surface multiplies with every regulation. If your model is ontological, compliance becomes compositional: constraints applied to shared types, not reimplemented per silo.
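To make "compositional" concrete, here is a minimal sketch in which each constraint is registered once against a shared type and then applies to every department's records of that type. The record layout, the department names, and the two rules are placeholders invented for illustration; they are not drawn from any of the regulations named above.

```python
from typing import Callable

# Constraints are registered once per shared type, not once per department.
# Both rules below are illustrative placeholders, not real regulatory rules.
CONSTRAINTS: dict[str, list[tuple[str, Callable[[dict], bool]]]] = {
    "Agreement": [
        ("has at least one party", lambda r: len(r.get("parties", [])) > 0),
    ],
    "Obligation": [
        ("has a due date", lambda r: r.get("due") is not None),
    ],
}


def violations(record: dict) -> list[str]:
    """Return the names of the constraints this record fails."""
    checks = CONSTRAINTS.get(record["type"], [])
    return [name for name, check in checks if not check(record)]


# Hypothetical records from two departments that share the same types.
claims_agreement = {"type": "Agreement", "department": "claims", "parties": ["party-1"]}
lending_agreement = {"type": "Agreement", "department": "lending", "parties": []}
lending_obligation = {"type": "Obligation", "department": "lending", "due": None}

for record in (claims_agreement, lending_agreement, lending_obligation):
    print(record["department"], record["type"], violations(record))
```

Adding a regulation in this sketch means adding entries to one registry. Adding a department means adding nothing at all, provided its records use the shared types.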

Third, and most prosaically: everyone is tired of the “single customer view” project that never finishes. I’ve lost count of how many organisations I’ve seen spend eighteen months and a seven-figure budget trying to reconcile customer records across systems, only to end up with a golden record that nobody trusts. The reason it doesn’t work is structural. They’re trying to reconcile the wrong thing. But I’ll come back to that.

Ten Thousand a Year (and Ten Primitives)

Here’s the thesis, stated plainly: most industries share an ontological foundation. The entities that appear in every regulated, commercial, or institutional domain — the actors, the agreements between them, the obligations those agreements create, the events that trigger claims, the transactions that settle them — are structurally identical across industries. What changes is the vocabulary, not the form.

I didn’t design these primitives. I discovered them. Six unrelated industries — insurance, healthcare, banking, retail, consumer packaged goods, data governance — converged on the same structural patterns independently, over centuries, through accumulated institutional practice. Insurance didn’t copy healthcare. Retail didn’t copy banking. They each arrived at the same shapes because the underlying institutional problems are the same. One might almost call it a convergence of sensibility.

There are ten of them: Party (the actor), Agreement (the relationship container), Policy (the rule set), Obligation (the thing owed), Event (the thing that happened), Claim (the assertion of a right), Asset (the thing of value), Transaction (the value exchange), Document (the evidence), and Location (the spatial anchor).
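Written down as code, the list is short enough to fit in a single enumeration. This is only a sketch: the class name `Primitive` is my own choice, and the values are the glosses given above.

```python
from enum import Enum


class Primitive(Enum):
    """The ten primitives, each paired with the gloss used in the text."""

    PARTY = "the actor"
    AGREEMENT = "the relationship container"
    POLICY = "the rule set"
    OBLIGATION = "the thing owed"
    EVENT = "the thing that happened"
    CLAIM = "the assertion of a right"
    ASSET = "the thing of value"
    TRANSACTION = "the value exchange"
    DOCUMENT = "the evidence"
    LOCATION = "the spatial anchor"


for primitive in Primitive:
    print(f"{primitive.name.title():<12} {primitive.value}")
```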

Together, they constitute a platform. Not a straitjacket — a platform. The kind of thing the Volkswagen Group builds when it wants the Golf, the A3, the Octavia, and the Leon to share a chassis while looking and feeling completely different. The primitives are the chassis. Domain-specific vocabulary — what insurance calls a “policy” versus what data governance calls a “policy” — is the body. Your organisation’s particular workflows and business rules are the trim. The chassis is engineered once. Everything above it varies.

And the single most important structural insight? There is no customer. “Customer” is not an entity. It’s a role that a Party plays in the context of an Agreement. Decompose it and you get three independently valuable things: who is it (the Party), what are they doing (the Role — Purchaser, Loyalty Member, Account Holder), and under what terms (the Agreement that created the relationship). A single Party might play half a dozen roles — rather like a certain someone at a ball who is at once a gentleman, a landlord, a neighbour, and a suitor, depending entirely on which relationship you’re examining.
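The decomposition can be sketched as three small types. The class names, field names, and sample data below are illustrative assumptions; only the Party, Role, Agreement split and the three example roles come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    """Who is it."""
    party_id: str
    name: str


@dataclass(frozen=True)
class Agreement:
    """Under what terms."""
    agreement_id: str
    kind: str  # hypothetical label, e.g. "loyalty programme"


@dataclass(frozen=True)
class RoleAssignment:
    """What the Party is doing, always in the context of an Agreement."""
    party: Party
    role: str
    agreement: Agreement


def roles_of(party: Party, assignments: list[RoleAssignment]) -> list[str]:
    """List the distinct roles one Party plays, in alphabetical order."""
    return sorted({a.role for a in assignments if a.party == party})


person = Party("party-1", "A. Example")
assignments = [
    RoleAssignment(person, "Purchaser", Agreement("agr-1", "sales order")),
    RoleAssignment(person, "Loyalty Member", Agreement("agr-2", "loyalty programme")),
    RoleAssignment(person, "Account Holder", Agreement("agr-3", "store account")),
]
print(roles_of(person, assignments))
```

Note that there is no `Customer` class anywhere in the sketch. "Customer" would be, at most, a convenient name for some subset of role strings.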

That decomposition is why the single customer view project fails: you can’t reconcile “customers” across systems because each system captured a different relationship. Resolve the Party first, and the roles sort themselves out.
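A rough sketch of "resolve the Party first" follows. It assumes a resolution step has already mapped each system's local identifier to a shared party id; the `party_of` table, the system names, and the records are all hypothetical, and the genuinely hard part (producing that table) is deliberately out of scope.

```python
from collections import defaultdict

# Hypothetical records: each system knows a person under its own local id
# and captures only the relationship it cares about.
records = [
    {"system": "ecommerce", "local_id": "C-1009", "role": "Purchaser", "agreement": "sales order"},
    {"system": "loyalty", "local_id": "M-77", "role": "Loyalty Member", "agreement": "loyalty programme"},
    {"system": "billing", "local_id": "AC-5", "role": "Account Holder", "agreement": "store account"},
]

# Assumed output of a separate party-resolution step (not shown here).
party_of = {
    ("ecommerce", "C-1009"): "party-1",
    ("loyalty", "M-77"): "party-1",
    ("billing", "AC-5"): "party-2",
}


def roles_by_party(records, party_of):
    """Group each system's (role, agreement) pair under the resolved Party."""
    grouped = defaultdict(list)
    for record in records:
        party_id = party_of[(record["system"], record["local_id"])]
        grouped[party_id].append((record["role"], record["agreement"]))
    return dict(grouped)


for party_id, roles in roles_by_party(records, party_of).items():
    print(party_id, roles)
```

Nothing is merged or overwritten. Each system's record survives as a role under an agreement, so there is no golden record for anyone to distrust.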

Advantageous Connections

Build the primitives once, and three expensive problems get dramatically cheaper.

Ontological modelling — the months of discovery workshops, the “what do we mean by customer” arguments, the entity-relationship diagrams that nobody maintains — collapses from invention to configuration. The 60–70% of structure that is universal comes free. The bespoke work shrinks to domain colouring and system integration. What once required a prolonged and expensive courtship now requires merely an introduction.

AI agents become transferable. An agent trained against the Party → Agreement → Obligation → Claim chain works in insurance and healthcare and banking. The domain colouring provides the specifics. The structural pattern — the API surface — is shared.
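As a sketch of what a shared structural surface might look like, the traversal below walks the Party -> Agreement -> Obligation -> Claim chain without ever inspecting the domain labels. The class layout and the insurance and healthcare sample data are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    description: str


@dataclass
class Obligation:
    description: str
    claims: list[Claim] = field(default_factory=list)


@dataclass
class Agreement:
    label: str  # domain vocabulary lives in labels like this one
    obligations: list[Obligation] = field(default_factory=list)


@dataclass
class Party:
    name: str
    agreements: list[Agreement] = field(default_factory=list)


def claims_for(party: Party) -> list[str]:
    """Walk Party -> Agreement -> Obligation -> Claim, ignoring domain labels."""
    return [
        claim.description
        for agreement in party.agreements
        for obligation in agreement.obligations
        for claim in obligation.claims
    ]


# Hypothetical data from two domains, expressed in the same shapes.
driver = Party("Insured driver", [
    Agreement("motor policy", [
        Obligation("pay for covered damage", [Claim("collision damage")]),
    ]),
])
patient = Party("Patient", [
    Agreement("health plan", [
        Obligation("cover approved treatment", [Claim("physiotherapy sessions")]),
    ]),
])

print(claims_for(driver))
print(claims_for(patient))
```

The same `claims_for` function serves both parties. Only the strings inside the structure differ.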

And cross-divisional analytics become structurally possible. A conglomerate with insurance, banking, and healthcare divisions currently has three incompatible ontologies — three great houses that do not visit. With shared primitives, the query “show me all Parties with an active role across more than one division” is answerable. Without them, it’s a six-month data integration project.
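Once role assignments share a resolved party id, that query takes only a few lines. A minimal sketch, assuming a flat list of role assignments with hypothetical field names and sample data:

```python
from collections import defaultdict

# Hypothetical role assignments, already keyed by a shared party id.
role_assignments = [
    {"party_id": "party-1", "division": "insurance", "role": "Policyholder", "active": True},
    {"party_id": "party-1", "division": "banking", "role": "Account Holder", "active": True},
    {"party_id": "party-2", "division": "healthcare", "role": "Patient", "active": True},
    {"party_id": "party-2", "division": "banking", "role": "Account Holder", "active": False},
]


def parties_active_in_multiple_divisions(assignments):
    """Return ids of Parties with an active role in more than one division."""
    divisions = defaultdict(set)
    for assignment in assignments:
        if assignment["active"]:
            divisions[assignment["party_id"]].add(assignment["division"])
    return sorted(pid for pid, seen in divisions.items() if len(seen) > 1)


print(parties_active_in_multiple_divisions(role_assignments))
```

Here `party-2` is excluded because its banking role is inactive, which leaves it active in only one division.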

In Which the Author Displays His Ignorance

I should be upfront about what the primitives don’t do, because it makes the argument stronger.

They create a risk of false fluency. Someone who learns the ten primitives might believe they understand insurance, or healthcare, or CPG, when they understand only the skeleton. It would be a bit like reading a summary of all Austen’s plots and believing you understand the novels. The domain colouring — the reserves and subrogation in insurance, the diagnosis codes and prior authorisations in healthcare, the trade terms and proof-of-performance in CPG — is where the hard problems live. Where the real expertise resides. Where the decades of institutional knowledge are concentrated. The primitives don’t replace that expertise. They provide the foundation upon which it’s expressed.

The primitives tell you that a Claim exists. The domain colouring tells you how to adjudicate it. Skip the colouring and you have an elegant diagram and a useless system. Skip the primitives and you have a useful system that can never talk to the one next door.
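The division of labour can be sketched as a shared Claim type with adjudication logic plugged in per domain. Everything here is illustrative: the `colouring` decorator is a name I made up, and the threshold rule is a placeholder that stands in for real insurance expertise rather than representing it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Claim:
    """The primitive: it says that a claim exists, not how to decide it."""
    domain: str
    amount: float


# Domain colouring: one adjudication function per domain.
ADJUDICATORS: dict[str, Callable[[Claim], str]] = {}


def colouring(domain: str):
    """Register an adjudication function for one domain."""
    def register(fn):
        ADJUDICATORS[domain] = fn
        return fn
    return register


@colouring("insurance")
def adjudicate_insurance(claim: Claim) -> str:
    # Placeholder rule; real adjudication is where the expertise lives.
    return "refer to adjuster" if claim.amount > 1000 else "approve"


def adjudicate(claim: Claim) -> str:
    """Dispatch to the domain's colouring, or fail loudly if none exists."""
    try:
        handler = ADJUDICATORS[claim.domain]
    except KeyError:
        raise NotImplementedError(f"no colouring for domain {claim.domain!r}") from None
    return handler(claim)


print(adjudicate(Claim("insurance", 500.0)))
print(adjudicate(Claim("insurance", 5000.0)))
```

A healthcare Claim fits the type perfectly well, yet `adjudicate` raises `NotImplementedError` for it until someone supplies the colouring: the elegant diagram without the useful system.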

I have been a good deal happier than I deserve to find that the primitives hold across six industries. I am not so foolish as to claim they’ll hold across all of them without further testing.

Reader, We Coloured It.

Ontologies are back because the problems they solve — semantic fragmentation, duplicated effort, AI fragility, regulatory complexity — have become too expensive to keep ignoring. The universal primitives are an attempt to take that seriously: to compound institutional knowledge rather than reinvent it, to build the chassis once rather than re-engineering it per project, per domain, per department.

Ten primitives validated across six industries. The domain colouring provides the specificity. And if the whole thing sounds too good to be true — well. I did warn you about first impressions.
