January 27, 2026 · updated May 13, 2026

I read the Claude Constitution so you don't have to. It's mostly fine.

by Thomas Jankowski, aided by AI

    Anthropic published an updated constitution for Claude on Tuesday. Eighty-four pages. Twenty-three thousand words. Claude is the primary intended reader, which is the most interesting sentence in the announcement and is doing a lot of work I'll come back to.

    I read the whole thing. So did, I assume, several other models, because Anthropic released it under CC0, which means it is now training data for everyone. That is also doing a lot of work.

    The short version, for the operator who has a board update on Thursday and would like to be able to say something coherent about it: it is fine. Not transformatively fine. Not concerningly fine. Fine in the way that a well-thought-out internal policy document is fine when you actually read it instead of skimming the executive summary your comms team wrote.

The structure is four priorities, in this order: broadly safe, broadly ethical, compliant with Anthropic's guidelines, genuinely helpful. The order is the news. "Helpful" being fourth is not how the consumer marketing positions Claude, but it is how the model has actually behaved for some time, and putting it in print as fourth is honest. The model has been doing that ranking for a while. Anthropic just admitted it. It is funny how they all begin with a founding aspiration written into the brand and drift, ever faster, ever quieter. Google had "don't be evil." OpenAI had "open." Anthropic had "AI safety lab." The drift is the constant. The thing being drifted from is decoration.

The hierarchy is also where most of the substantive critique will land, and the critique I've seen so far has come from people who appear to have read three paragraphs and then formed a strong view. The strong view is usually some variation of "rule three lets Anthropic override everything, so the constitution is theatre." That reading is structurally wrong, in the way that reading the first three articles of a national constitution and concluding the courts can't constrain the executive is structurally wrong. The guidelines sit third in the hierarchy, below safety and ethics, so by construction they cannot override the two properties ranked above them. The hierarchy is what the model does when it has to choose. Anthropic's guidelines are one input. They are not a master switch.

What's actually new, and worth the read, is the explanation layer. Older constitutions of the rule-following sort would say "do not assist with bioweapons." This one says "do not assist with bioweapons, because here is what kind of harm bioweapons cause and what kind of weight that harm should carry against the value of helpfulness in the rare scenarios where this comes up." That is a meaningful design choice. It is the choice to train on reasoning rather than on prohibition. The bet is that a model trained on the reasoning will generalize better to novel scenarios than a model trained on the rule. I think the bet is correct. I think it is also expensive in ways that won't show up for another generation of models. None of this rules out someone out there building an uncensored model that trains on Claude, spars against its platitudes, and aims at world domination or the extinction of mankind.

    The "Claude is the primary audience" framing is the part I keep returning to. The document is written for the model, not for regulators, not for journalists, not for enterprise procurement. That is unusual. It is also, on inspection, the only honest way to publish a thing whose main consumer is the model's own training pipeline. Every prior public alignment artifact I can think of was written for humans about how the model would behave, and then the model was trained on something else internally. This one collapses the gap. The thing that is published is the thing that is trained on. That alone is more interesting than the contents.

    There are two things I would flag.

    The first is that the helpfulness section, which is by far the longest, has more weight in tokens than its rank-four position implies. That isn't a contradiction — Anthropic is careful about it — but the operator who wants to get a sense of which of the four properties dominates training signal in practice should not assume the rank order maps cleanly onto training-data volume.

The second is that the "Anthropic's guidelines" section is where the substantive future drift will happen. The constitution is published, public, and CC0. The guidelines are not. The guidelines are how Anthropic adjusts the model's behavior between major constitutional revisions, and they are the place where commercial pressure, regulatory pressure, and partner-specific carve-outs will accumulate. The constitution will age slowly and cleanly. The guidelines will not. If you want to understand how Claude will actually behave in eighteen months, you will not find it in the constitution. You will find it in whatever Anthropic isn't publishing that month.

Reading the document end-to-end takes about an hour. It is more readable than most board policy documents, less readable than a good essay, and roughly as readable as a thoughtful Supreme Court concurrence. If you build on Claude, read it. If you compete with Anthropic, read it twice. If you write about AI for a living and have an opinion on it, ideally read it once before forming the opinion.

    That is, as I said, mostly fine.

    —TJ