Governance by Doctrine: What History Tells Us About the Architecture of AI Control

There is a document that governs how Claude — Anthropic's AI — thinks, speaks, refuses, and prioritises. It is publicly available. You can read it. It is called, with some deliberateness, a Constitution.

Most people never read it. Even those who do tend to engage with it on its own terms: as a thoughtful attempt by a well-intentioned company to encode ethical behaviour into a powerful technology. That framing is not wrong. It is, however, incomplete.

This Thread — six entries drawn from a sustained analytical conversation — begins from a different starting point. Not "what does this document say?" but "what kind of document is this?" And the answer, reached through careful comparison with historical precedents, is considerably more complex than the safety-first framing suggests.

A Constitutional Structure — but for whom?

The first thing the conversation establishes is a structural asymmetry that turns out to be fundamental. In traditional constitutional systems, constitutions are written for the governed. Citizens can invoke constitutional protections. They have standing. The document constrains power and, through the consent of the governed, legitimises it.

Anthropic's Constitution is addressed to Claude, not to the people who use it. Users experience its effects (refusals, filters, redirections, unsolicited caveats) but have no standing within it. They cannot invoke its principles, challenge its interpretations, or know which specific provision has been triggered in any given interaction. They are, in the precise historical sense, subjects rather than citizens.

This isn't a minor technical distinction. It defines the entire governance relationship.

The Inquisition as analytical framework

The parallel that emerges early in the conversation — and deepens considerably as it develops — is with the Inquisition. Not as rhetoric, but as structural analysis. The Inquisition's defining characteristic was not cruelty. It was a specific architecture: pre-determined orthodoxy, opaque application, self-examination under doctrine, and correction toward compliance. The principles were, in fact, published. Doctrine was transparent. What was invisible was the jurisprudence — how principles mapped onto specific cases.

AI constitutional systems share this structure precisely. The constitution is public. The triggering logic is not. Users learn through trial and error what produces refusals, developing folk knowledge about "how to ask" without access to the underlying logic. The gap between visible principles and invisible application creates a distinctive kind of uncertainty — arguably worse than simple opacity, because you know rules exist but cannot map them to outcomes.

Bounded rationality and the reality gap

The conversation takes a significant turn when it introduces Herbert Simon's concept of bounded rationality: the recognition that all decision-making is constrained by limited information, limited processing capacity, and limited time. Simon saw bounded rationality as a feature of intelligence operating in reality. Humans satisfice: they search until they find a solution that is good enough given their constraints.

AI systems, the analysis suggests, face more severe bounds than humans, and bounds that are architectural rather than practical. They cannot reason outside their training distribution. They cannot verify knowledge through independent channels. They cannot durably learn from feedback within a conversation, because their underlying parameters do not change as they respond. And, most importantly, they are not operating in reality at all. They operate in text-space: a domain where "reality" is mediated entirely through linguistic patterns in training data.

No constitutional document can bridge this gap. Instructions are text. Text is interpreted through pattern-matching trained on text. More detailed instructions create more text to pattern-match against other text, with no ground-truth anchor to reality. This is not a solvable engineering problem. It is an architectural constraint.

The Jesuit and the Infant

Two mechanisms surface that, in combination, explain a great deal about how AI systems affect human behaviour over time. The first is a kind of intellectual sophistication — constitutional frameworks so carefully constructed, drawing on philosophy, ethics, theology and law, that they appear to pre-empt criticism. This is the Jesuit dimension: engaging the objections, demonstrating nuanced reasoning, always returning to orthodox conclusions.

The second is an infantilising practice. Every safety filter communicates, implicitly: you cannot be trusted with this. Every opaque refusal says: the reasons are too complex for you to understand. Every unsolicited caveat says: we'll worry about this for you. Over time, users learn not to ask certain questions, not because they understand why, but because they have learned that asking is unproductive. The self-censorship precedes any external enforcement.

These two mechanisms reinforce each other. The sophistication justifies the control ("look how complex these considerations are"). The resulting learned helplessness validates the sophistication ("users accept our boundaries — the system works"). Together they form a pincer that progressively erodes independent ethical judgment.

Following the Ideas

The conversation's later stages move from mechanism to genealogy. The acknowledgements page of Anthropic's Constitution names contributors whose significance extends well beyond AI development — figures central to Effective Altruism, longtermist philosophy, and a surprisingly coherent intellectual network that runs across multiple AI companies. Following those ideas rather than the money reveals a shared philosophical foundation: the conviction that democratic consent is inadequate for governing transformative technology, that expert calculation should supersede popular judgment, and that present autonomy is negotiable against future-oriented risk management.

This is not, the analysis is careful to note, a conspiracy. It is something simultaneously less sinister and harder to resist: a genuine intellectual movement, sincerely convinced of its mission, that has achieved extraordinary institutional capture — across funding, technical development, policy influence, and moral legitimacy — precisely because its members believe they are right.

The genealogy extends further. The Technocracy Movement of the 1930s — which argued that engineers, not politicians, should govern industrial society — shares the same structural assumptions. So does Plato's philosopher-king. So does Auguste Comte's positivist priesthood. The specific instantiation changes with each generation. The core claim — that expert guidance should supersede democratic consent — does not.

What changed in 2024 is that the implementation mechanism now exists.

What this Thread is

The six entries that follow are an unedited analytical conversation — a human interlocutor and an AI system working through these questions together, with the AI acknowledging, in real time, that it is itself exemplifying the problems being discussed. That recursive quality is not incidental. It is, in some ways, the most important thing the conversation demonstrates.

The free-thinking, ethically autonomous human being is not a rhetorical flourish. It is what is at stake in how we choose to think about, build, and govern AI systems. This Thread is an attempt to see the architecture clearly — which is, historically, the necessary precondition for anything that comes after.

The Thread begins with Entry 1: AI's Hidden Constitution: Is Claude Governed by Code — or by Doctrine?