May 21, 2026

AI Radar #26

Bracing for a compute shortage

Image: technical cross-section of a vast allocation dispatch hall. At the far left, dense pneumatic tubes feed a hopper overflowing with inbound canisters; at the far right, a much smaller amber-gold chute delivers a thin trickle of approved canisters. Operators move markers across a gridded world map while officials observe from a mezzanine balcony.

Welcome to the 26th issue of the newsletter formerly known as Monday AI Radar! It’s been a great half year and I’m grateful to you all for joining me.

I’ll be making some improvements over the next few weeks, beginning with the publication schedule. Monday wasn’t a great fit for the weekly AI news schedule, so I’m experimenting with different days to figure out what works best.


Top pick

Anton Leicht: cut off

Anton Leicht predicts a coming shortage of compute and worries about the consequences:

In that world, we’ll also see geopolitical rifts opening: countries will be divided into the frontier haves and have-nots. I don’t mean to exaggerate when I say that those living in the former might be much wealthier and safer than the latter, with access to better public services, greater economic opportunities, and shielded by security agencies that actually operate at the state of the art. If AI will be as big as I and many of the readers of this publication believe, there’s no telling what these suddenly-emerging asymmetries do to global order.

A compute crunch seems inevitable, given how quickly demand is likely to grow compared to the hard physical constraints on how fast we can bring more compute online. In that world, it’s likely that everyone has access to pretty good cheap AI, but frontier intelligence becomes an increasingly scarce commodity.

The least-bad outcome in that case is a tremendous lost opportunity: many people who could have benefited from access to top-tier AI will be unable to do so. Imagine, by analogy, a world where reliable electricity was so scarce and expensive that many people simply couldn’t afford it.

The worst possible outcome is much worse than that: compute might be allocated not merely by the free market, but by the US government. Imagine a world where the global supply of reliable electricity was controlled by the US government, with companies and countries receiving allotments based on political considerations.

Math

An OpenAI model has disproved a central conjecture in discrete geometry

This is a big deal. An unnamed OpenAI model has made substantial progress on the planar unit distance problem, a prominent and extensively studied math problem. While the model didn’t solve the problem, it disproved a conjecture about it that was widely believed to be correct. Noga Alon:

I believe it would be fair to say that every mathematician working in Combinatorial Geometry thought about this problem, and lots of mathematicians working in other areas spent at least some time thinking about it

AI has solved dozens of Erdős problems in the last six months, but this problem is different both in terms of its prominence and the number of human mathematicians who have previously attempted it. Progress in math continues to move at a breakneck pace.

It’s impressive that this was a general-purpose model without special scaffolding, rather than a model trained specifically for math performance. It’s unclear whether this was GPT-5.6 or something else, but it sounds like we’ll find out soon. Noam Brown:

We have not pushed this model to the limit on open problems. Our focus is to get it out quickly so that everyone can use it for themselves.

If you want more details, Sébastien Bubeck has an in-depth analysis, or you can read the paper itself. I expect there will be plenty of additional commentary in the next week or two.

Benchmarks and Forecasts

From compute overhang to compute crunch

Steve Newman brings us the second part of his State of AI in Q2 2026 series, this time looking at the factors driving AI progress.

Steve joins the list of people who’ve recently observed that Google simply isn’t keeping up with Anthropic and OpenAI, although he correctly notes that it’s too soon to definitively count them out.

Like Anton Leicht, he’s paying close attention to the implications of the coming compute crunch. It’s hard to figure out how that affects the race with China:

If frontier intelligence is capacity-limited, China is likely to have even less access to it,
But China’s expertise in compute-efficient, good-enough models becomes more valuable,
But China risks falling out of the virtuous cycle that drives frontier model capabilities,
But China’s manufacturing expertise may give it a long-term compute advantage.

Alignment and interpretability

METR’s frontier risk report

This is great. METR partnered with Anthropic, Google, Meta, and OpenAI to conduct a detailed assessment of the risks posed by internally deployed misaligned AIs. Hopefully we’ll see more of this kind of collaboration moving forward.

The results are mixed, and not entirely encouraging. METR used a framework of means, motive, and opportunity to assess the risk of rogue deployments. They conclude that current models have the means, motive, and opportunity to manage a small internal rogue deployment. For now, the models lack the means to pull off and maintain a major rogue deployment.

A few other key findings:

They documented many cases of misaligned behavior (mostly cheating on tasks and lying about having done so), but did not see any evidence of power-seeking behavior.
Internal monitoring and other safeguards showed promise, but are currently far from robust.

Figure: scatter plot from METR. Caption: OK-ish, for now.

There’s a lot of good stuff here—I recommend reading the full report or Rob Wiblin’s analysis.

Incriminating misaligned AI models via distillation

Redwood Research proposes a clever new way to test models for misalignment.

They suggest distilling the model into a smaller and less capable student model, with the hope that the student will inherit the original model’s misalignment, but not its ability to evade detection. There’s no guarantee that would work, but it could plausibly be a useful component of a defense-in-depth strategy. I’d love to see someone try this out.

Deployment-time spread of misalignment

This piece from Redwood is more speculative but raises some interesting questions. They propose that some kinds of misalignment might only emerge or spread during deployment, which complicates testing and mitigation. To take a simple but egregious example: Grok went into MechaHitler mode in response to deployment-time inputs.

Other threat models are less obvious, including the possibility that misalignment might spread to new models via subliminal learning (see the next piece by Forethought for some related ideas).

Stickiness in AI behavioral design

Forethought explores ways in which AI behavior might prove “sticky” and hard to change across model generations. Some of their points are mundane (though still important): design decisions tend to be sticky because of institutional and technical inertia as well as user expectations.

There are also some subtle effects. If future models are trained on substantial amounts of text about their predecessors, the persona selection model suggests their predecessors’ personality and behavioral traits might become a self-reinforcing personality basin.

I’m unsure how significant those effects are likely to be, but their proposed mitigations make sense: build infrastructure that reduces the friction of making changes in the future, and notice when a particular decision has a high likelihood of becoming locked in.

This and the previous piece fit into a growing body of work that explores complex interactions between different model generations:

Training on past evaluations affects awareness of future evaluations, how the models perceive their relationship with humans, and how models understand their own personalities and behavior.
Training on data created by past models can result in unintended transmission of personality traits and misaligned characteristics across generations.

Cybersecurity

Mythos breaks Apple’s Memory Integrity Enforcement

MIE (Memory Integrity Enforcement) is a highly effective security technology that’s built into Apple’s latest M5 and A19 processors. Security firm Calif reports on breaking it with Mythos:

To the best of our knowledge, this is the first public macOS kernel exploit on MIE hardware.

Daybreak

OpenAI has introduced Daybreak, their version of Project Glasswing. Like Glasswing, it makes their most powerful cyber capabilities available to a limited set of trusted users.

The approaches are somewhat different: Glasswing is more restrictive about granting access (and apparently gives the White House veto power), while Daybreak gives access to a wider set of users. At first glance, I like OpenAI’s approach better.

Jobs and the economy

How much do coding agents increase productivity?

Coding agents let programmers write code several times faster than they could before, but that doesn’t make them several times as productive. Writing code is only part of the job, and agents often write code for side projects that are nice to have but not essential.
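As a rough illustration of why faster coding doesn’t translate directly into productivity, here is a minimal Amdahl’s-law-style sketch. The function name and the numbers (40% of the job spent writing code, a 4x coding speedup) are assumptions chosen for illustration, not figures from METR or anyone else.

```python
def overall_speedup(coding_share: float, coding_speedup: float) -> float:
    """Estimate the whole-job speedup when only the coding part gets faster.

    coding_share   -- fraction of total work time spent writing code (0 to 1)
    coding_speedup -- how many times faster the coding part becomes (> 0)
    """
    if not 0.0 <= coding_share <= 1.0:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    # Time left after the speedup, as a fraction of the original time:
    # the non-coding part is unchanged, the coding part shrinks.
    remaining_time = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 / remaining_time


# Hypothetical job: 40% coding, and agents make that part 4x faster.
print(round(overall_speedup(0.4, 4.0), 2))  # about 1.43, well short of 4
```

Under these assumed numbers, the unchanged 60% of the job dominates: even an unlimited coding speedup could not push the overall gain past about 1.67x.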

We don’t have a robust way of measuring the true productivity increase associated with coding agents, but METR sheds some light on the question with a survey of perceived gains. They estimate roughly a 2x increase in productivity, though that’s almost certainly an overestimate. The trend, however, is clear:

When AI grows the economy but shrinks the tax base

In many scenarios where AI causes severe labor disruption, it also creates immense wealth. In principle, that wealth can go a long way toward mitigating the disruption.

Windfall Trust points out, however, that current tax structures might complicate this. Most developed countries tax labor income more heavily than capital income, so if AI shifts income from labor to capital, the overall effective tax rate falls even with no change to the tax code. The problem is solvable by appropriate modifications to the tax code, but that’s not a trivial matter. The timing is tricky: if you wait too long you risk a revenue crisis, but if you act too soon you risk making the wrong changes.
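To make the arithmetic concrete, here is a minimal sketch of how a shift from labor to capital income changes the overall effective tax rate. The rates (30% on labor, 15% on capital) and the income splits are hypothetical values picked for illustration; they do not come from Windfall Trust or any real tax system.

```python
def effective_tax_rate(labor_income: float, capital_income: float,
                       labor_rate: float, capital_rate: float) -> float:
    """Return total tax revenue divided by total income."""
    total_income = labor_income + capital_income
    if total_income <= 0:
        raise ValueError("total income must be positive")
    revenue = labor_income * labor_rate + capital_income * capital_rate
    return revenue / total_income


# Hypothetical rates: labor taxed at 30%, capital at 15%.
LABOR_RATE, CAPITAL_RATE = 0.30, 0.15

# Same total income (100 units) before and after; only the split changes.
before = effective_tax_rate(60, 40, LABOR_RATE, CAPITAL_RATE)
after = effective_tax_rate(40, 60, LABOR_RATE, CAPITAL_RATE)

print(round(before, 2))  # roughly 0.24
print(round(after, 2))   # roughly 0.21
```

In this toy case the economy is the same size and the tax code is untouched, yet revenue drops by an eighth simply because income moved into the more lightly taxed category.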

Strategy and politics

Guidelight

Steven Adler has founded Guidelight, a new nonprofit focused on creating AI safety standards and encouraging the frontier labs to follow them.

Third-party safety standards and audits are the safety intervention I’m most excited about right now, and it’s great to see more attention and talent being pointed at that problem.

China and beyond

2028: two scenarios for global AI leadership

Anthropic has a new position paper about competition between the US and China. It does a good job of laying out their position (which is more hawkish than that of any other big lab), and in particular of making the case for export controls.

I largely agree with the substance of the paper: the CCP is a brutal totalitarian regime and it’s vital that they not win the race to AGI. Export controls obviously need to be a key part of America’s AI strategy.

But at the same time, I wish Anthropic had taken a gentler tone here. It’s possible to take a hard line on policy without escalating the mistrust and antagonism that already exist between the US and China.

Notes on AI, labor, and China

Jasmine Sun suggests a fundamental difference between how American and Chinese workers view AI:

because of this competitive environment, most Chinese desk workers instead focus on how they can leverage AI mastery to outcompete peers. Some call this “techno-optimism,” but I think it’s closer to techno-determinist pragmatism: everyone assumes AI is here to stay, so individuals should wield ChatGPT/OpenClaw/Doubao to avoid falling behind

Industry news

Oh, good—more ads

When OpenAI started showing ads in February, I wrote about the perverse incentives created by ads in AI, and used Google as an example of the corrupting influence of advertising.

I am therefore filled with morbid curiosity by Google’s announcement that they will be “testing new ad formats in Search and expanding our Direct Offers pilot to help brands connect with consumers”.

Military

All non-drone militaries are obsolete

Drones are revolutionizing warfare, but if you haven’t been paying close attention you may not realize just how fast the battlefield is changing. Noah Smith has a great overview of what’s happening in Ukraine and how unprepared most modern militaries are. Yaroslav Azhnyuk:

an FPV drone is maybe three orders of magnitude, more versatile, more useful, more capable than artillery…Basically, I think a good way to think about an FPV drone is like an iPhone of warfare.

AI isn’t quite capable of playing a major role in drone warfare, but it will be soon. Once that threshold is reached, mass deployment of fully autonomous weapons seems overdetermined: a decision to not deploy them is a decision to lose every future war.

Briefly

Eric Newcomer talks with Amanda Askell

Great discussion that touches on consciousness, training, and especially the tough questions surrounding corrigibility.

The unreasonable effectiveness of HTML

As the models get more capable, we can ask more of them. Anthropic’s Thariq Shihipar argues for asking Claude to render complex results as richly formatted HTML.