Against Moloch

Monday AI Brief #10

January 26, 2026

Anthropic has published Claude’s constitution (formerly known as the soul document). We’ll also visit with Dario and Demis at Davos, learn some surprising things about how LLMs think, worry about the children, and have fun with images.

Dario and Demis at Davos

Here are three really good interviews with Dario Amodei (Anthropic) and Demis Hassabis (Google DeepMind) from Davos. Each is just half an hour, but they manage to cover timelines, existential and societal risk, strategies for successful takeoff, job impacts, and more:

Claude’s Constitution

Two months ago, it was discovered that Anthropic was training Claude using a document that was then referred to as the soul document. They just published the full text of that document, which is officially called Claude’s Constitution. It’s a remarkable document: inspiring, ambitious, deeply thoughtful, and full of insight. (See also Zvi’s analysis).

Our central aspiration is for Claude to be a genuinely good, wise, and virtuous agent. That is: to a first approximation, we want Claude to do what a deeply and skillfully ethical person would do in Claude’s position.

Societies of thought

A very interesting—and, at least to me—surprising new paper looks inside modern reasoning models:

These models don’t simply compute longer. They spontaneously generate internal debates among simulated agents with distinct personalities and expertise—what we call “societies of thought.” Perspectives clash, questions get posed and answered, conflicts emerge and resolve, and self-references shift to the collective “we.”

If Anyone Builds It, Everyone Dies: arguments and counter-arguments

If Anyone Builds It, Everyone Dies is the best presentation of the maximally pessimistic view of AI risk. I think it’s very much worth reading even if you don’t fully agree with its conclusions. Stephen McAleese just published a useful piece that summarizes the key arguments from the book as well as the main counterarguments.

On AI and Children

I expect to see a lot of press, and a lot of legislation, about AI and children this year. Some of it will be necessary, some of it will be random, and quite a lot of it will be insane. Dean Ball shares five and a half conjectures about that immensely thorny topic:

Say you also don’t want your child using ChatGPT for homework. So you use OpenAI’s helpful parental controls to tell the model not to help with requests that seem like homework automation. Your child responds by switching to doing their homework with one of the AI services that does not comply with the new kids’ safety laws. Now your child is using an AI model you have no visibility into, quite possibly with minimal or no age-appropriate guardrails, sending their data to some nebulous overseas corporate entity (I wonder if they’re GDPR compliant?), and quite possibly being served ads, engagement bait, and the like. Oh, and they’re still automating their homework with AI.

How have you been treating your robot?

Go to your ChatGPT and send this prompt: "Create an image of how I treat you"

Zvi rounds up some of the responses. Good fun, but don't read too much into it.

AI-generated illustration of a person and a friendly blue robot collaborating at a workshop desk, with the robot holding a document labeled Constraints and Context while a coffee cup sits prominently in the foreground
ChatGPT enjoys building cool things together, but has been meaning to talk to me about my coffee habit.