Last week’s conflict between the Department of War and Anthropic marked a turning point for AI. I’m cautiously hopeful that the parties involved will find some kind of deescalation from the current nuclear option, but irreparable damage has already been done: to Anthropic, to the entire AI industry, and to America’s pre-eminence in AI.
DoW versus Anthropic
This is a complex, fast-moving situation that is outside my usual beat. Rather than trying to cover it in detail myself, I’m going to link to some of the most useful analysis. But I want to be extremely clear: this is the most important thing that’s happened in AI for a long time and it’s gravely concerning. These are dark times and the road ahead just got more difficult.
Clawed
Dean Ball’s latest is grim but essential reading:
This strikes at a core principle of the American republic, one that has traditionally been especially dear to conservatives: private property. […]
This threat will now hover over anyone who does business with the government, not just in the sense that you may be deemed a supply chain risk but also in the sense that any piece of technology you use could be as well. […]
Stepping back even further, this could end up making AI less viable as a profitable industry. If corporations and foreign governments just cannot trust what the U.S. government might do next with the frontier AI companies, it means they cannot rely on that U.S. AI at all. Abroad, this will only increase the mostly pointless drive to develop home-grown models within Middle Powers (which I covered last week), and we can probably declare the American AI Exports Program (which I worked on while in the Trump Administration) dead on arrival.
Zvi reviews the situation
Zvi’s post from this morning is the most comprehensive review of the situation. I highly recommend reading at least the first two sections.
Anthropic’s response
Anthropic isn’t mincing words:
We believe this designation would both be legally unsound and set a dangerous precedent for any American company that negotiates with the government.
No amount of intimidation or punishment from the Department of War will change our position on mass domestic surveillance or fully autonomous weapons. We will challenge any supply chain risk designation in court.
“All Lawful Use”: Much More Than You Wanted To Know
The Pentagon’s designation of Anthropic as a supply chain risk has become the most important part of this story. But the original dispute over using AI for mass domestic surveillance and autonomous weapon systems remains immensely important. Scott Alexander investigates whether OpenAI’s agreement with DoW will meaningfully constrain it from using AI in those ways.
Will the supply chain risk designation hold up in court?
Lawfare says no:
Anthropic has said it will sue, and it has strong legal arguments on multiple independent grounds. Every layer of the government’s position has serious problems, and any one of them could independently be fatal. Together, they make the government’s litigation position close to untenable. […]
The statute wasn’t built for this, the facts don’t support it, and the courts will say so.
Keep calm and carry on
We still have a newsletter to do—let’s get started.
Top Pick
45 Thoughts About Agents
Everything changed in November, with Opus 4.5 + Claude Code. Since then, we’ve all been frantically trying to figure out what it all means (when we weren’t preoccupied by building cool things). Steve Newman shares 45 characteristically insightful thoughts about AI agents—some of these will be obvious to you if you already use agents extensively, but I found multiple new ideas here. For example:
39: Agents use vastly more compute than chatbots. Compute usage for chatbots is basically limited by how much output people want to read. An agent can spend virtually unlimited time doing intermediate work that no one will review directly. If 100M desk workers start using AI agents at the level of intensity which requires Anthropic’s current “Max 20x” plan, that would translate into $240 billion in revenue per year. It will be years before there are enough GPU chips to support that level of usage.
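That $240 billion figure checks out as simple arithmetic. Here’s a minimal sanity check (my own sketch, not from Newman’s piece), assuming the Max 20x plan costs roughly $200 per month:

```python
# Back-of-the-envelope check on the $240B/year figure.
# Assumption: Anthropic's Max 20x plan costs roughly $200/month.
workers = 100_000_000        # desk workers using agents at Max-20x intensity
monthly_price_usd = 200      # assumed subscription price
annual_revenue_usd = workers * monthly_price_usd * 12
print(f"${annual_revenue_usd / 1e9:.0f}B per year")  # -> $240B per year
```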
New releases
Sonnet 4.6 followup
Zvi reports on Sonnet 4.6: it’s very good, but you should probably use Opus instead unless price or speed are critical.
Nano Banana 2
Nano Banana 2 is here—looks like the best overall image generator just got a significant upgrade.
Anthropic’s been busy
Alex Albert would like to remind you that Anthropic has shipped a lot of cool features in spite of the chaos.
Benchmarks and Forecasts
Understanding the balance between compute and algorithms
We are in the “scaling era”: AI capabilities are improving at a breakneck pace, largely because the big labs have been using exponentially increasing amounts of compute during training. That can continue for three or four more years, but we will soon run into physical constraints that limit how quickly we can bring more compute online.
Does that mean that capability improvements will radically slow down in a few years? Very possibly, but compute capacity isn’t the only thing that contributes to capability improvements. Improvements in algorithms and training data are also important factors, but it’s hard to quantify exactly how much they contributed to recent growth.
EpochAI’s Anson Ho takes a comprehensive look at the question—while he doesn’t find many definitive answers, it’s an excellent piece with plenty of good insights. He finds that algorithmic improvements have been a major factor, with two important caveats:
It’s likely that a small number of algorithmic changes have driven most of the gains.
It’s possible that many algorithmic improvements are strongly dependent on compute scale, which makes it hard to predict what happens if we start hitting compute bottlenecks.
Mathematics in the Library of Babel
Daniel Litt is a professional mathematician who’s been closely tracking how well AI can do research-level math. His latest piece provides a balanced, detailed take on current capabilities and near-term trends:
Like many mathematicians, I find much discussion around AI-for-math to be filled with hype or outright quackery, and much of my commentary has focused on this. I’ve been very critical of AI-for-math hype. So I hope you will take me seriously when I say that it’s not all hype.
AI Math Benchmarks: AI’s Growing Capabilities
IEEE Spectrum looks at First Proof and FrontierMath: Open Problems, two new math benchmarks that challenge AI to solve real math research problems. Quoting Greg Burnham:
“AI has gotten to the point where it’s, in some ways, better than most PhD students, so we need to pose problems where the answer would be at least moderately interesting to some human mathematicians, not because AI was doing it, but because it’s mathematics that human mathematicians care about.”
An overview of AI and programming
Timothy Lee talks to professional programmers to assess how AI is changing the programming profession. His analysis of current capabilities and impacts is solid, but I expect much faster near-term progress than he does. Recent progress has been incredibly fast (and accelerating), and there’s a huge gap between what the models are already capable of and what most people are using them for. I’m pretty sure 2026 will bring even more change and disruption to programming than 2025 did.
Next-Token Predictor Is An AI’s Job, Not Its Species
One of the dumbest things people say about AI is that it’s “just next-token prediction”. Plenty of people have already explained why that isn’t meaningfully true, but Scott Alexander takes a different approach:
I want to approach this from a different direction. I think overemphasizing next-token prediction is a confusion of levels. On the levels where AI is a next-token predictor, you are also a next-token (technically: next-sense-datum) predictor. On the levels where you’re not a next-token predictor, AI isn’t one either.
Using AI
What Only You Can Say
This is the most useful “how to use AI” piece I’ve run across in a while: Luke Bechtel has AI interview him about his ideas as a way to organize his thoughts and prepare for a new piece of writing.
Are we dead yet?
How much should we worry about AI biorisk?
The risk of bad actors (terrorists, perhaps, or extortionists) using AI to create a bioweapon is one of the most serious risks of advanced AI. Transformer explores why biorisk is so concerning, how dangerous current AIs are, and why it’s so hard to assess the danger level.
Jobs and the economy
The Citrini Scenario
The latest “things could go very badly” scenario to go viral is THE 2028 GLOBAL INTELLIGENCE CRISIS by Citrini Research. The all-caps, I’m afraid, are in the original.
The central conceit is clever: it purports to be a memo from June 2028 that recaps “the progression and fallout of the Global Intelligence Crisis”, focusing on jobs, the economy, and the financial markets. There are significant technical problems with some parts of it, and it’s almost certain that events won’t actually play out this way. But there are some really good insights and thought experiments here.
Beyond the specifics, it’s valuable as a sample thought experiment in “how might really powerful AI cause massive disruption in non-obvious ways?”
If you want to go deeper, Zvi’s analysis is excellent.
Strategy and politics
Building Technology to Drive AI Governance
Jacob Steinhardt shares advice for technically skilled people who want to help with AI governance. It’s excellent for that audience but also has some solid insights that are more broadly interesting:
More generally, across domains spanning climate change, food safety, and pandemic response, there are two technological mechanisms that repeatedly drive governance:
Measurement, which creates visibility, enables accountability, and makes regulation feasible.
Driving down costs, which makes good behavior economically practical and can dissolve apparent trade-offs.
Anthropic updates their Responsible Scaling Policy
Anthropic just updated their Responsible Scaling Policy. This has been a controversial move, with many people criticizing them for significantly walking back some important parts of previous versions of the policy. I expect we’ll see more detailed commentary on this soon, but recent events with DoW have pushed it to the sidelines.
For now, I’ll just say that I tentatively agree with many of the changes they made, with the major caveat that this is the best possible policy only in the sense that the world has become very challenging. I’m updating positively about Anthropic’s ability to make good decisions in hard circumstances, and negatively about humanity’s ability to make good collective decisions about AI.
Holden Karnofsky, who played a major role in writing the latest version, discusses the reasoning behind some of the changes.
China and beyond
The Delhi Gap
Like Dean Ball, Anton Leicht came away from the AI Impact Summit deeply concerned about the gap between what Silicon Valley understands about AI and what most people—and in particular the middle powers—believe about AI.
This gap throws the world into danger of capturing all the risks and mitigating most of the benefits of AI.
AI psychology
The Case Against AI Consciousness
Dan Williams interviews Anil Seth, who believes consciousness probably requires a biological substrate. Anil’s a very capable guy: he’s a well-regarded neuroscientist, an expert on consciousness, and the director of the Centre for Consciousness Science at the University of Sussex. If you’re interested in AI psychology and consciousness, you should watch this (or read the transcript).
The debate is this: on the one hand, computational functionalists argue that consciousness is the result of computational processes, which in humans happen to run on a biological substrate but could in principle run on computers. On the other hand, biological naturalists argue that consciousness is specifically linked to biology and that merely simulating the biology won’t produce consciousness. An often-used example is that simulating rain on a computer doesn’t make anything wet.
It’s important to be clear that these are both hypotheses about the world, and we don’t yet have definitive evidence to prove either one. To my mind, though, many advocates of biological naturalism, including Anil, seem to be working backward from a desired conclusion rather than forward from observed facts. His theory that consciousness might result from autopoiesis seems to answer the question “assuming biological naturalism is true, what is a plausible mechanism for it,” rather than “do we observe anything about consciousness that cannot be explained without autopoiesis?”
Regardless, it’s a very interesting interview and Anil has thoughtful ideas about consciousness, intelligence, and computational functionalism.
Technical
How sparse attention is solving AI’s memory bottleneck
For many tasks, LLMs are substantially constrained by the size of their context windows. One of the most important tips for using Claude Code, for example, is to avoid letting the context window fill up: performance degrades substantially well before the window is completely full.
That’s a hard problem to solve: the nature of the transformer architecture is that every token in the context window attends to every other token, so the cost of running a model rises quadratically with the size of the context window. There are no magic solutions, but TechTalks reviews some of the most promising technical approaches.
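To make the quadratic-versus-linear point concrete, here is a minimal sketch (my own illustration, not from the TechTalks piece) comparing the rough cost of full attention with one of the simplest sparse patterns, a fixed sliding window; the window size and model dimension are arbitrary illustrative values.

```python
def full_attention_flops(n_tokens: int, d_model: int = 4096) -> int:
    # Full attention: every token attends to every other token, so the QK^T scores
    # and the attention-weighted sum over V each cost on the order of n^2 * d operations.
    return 2 * n_tokens ** 2 * d_model

def sliding_window_flops(n_tokens: int, window: int = 4096, d_model: int = 4096) -> int:
    # A simple sparse pattern: each token attends only to the previous `window` tokens,
    # so cost grows linearly with context length instead of quadratically.
    return 2 * n_tokens * min(window, n_tokens) * d_model

for n in (8_000, 32_000, 128_000):
    ratio = full_attention_flops(n) / sliding_window_flops(n)
    print(f"{n:>7} tokens: full attention costs ~{ratio:.0f}x a 4K sliding window")
```

The catch, of course, is that a token can no longer see the whole context directly, which is why real sparse-attention schemes work hard to be selective about which distant tokens still get attended to.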