A distillation of Ajeya Cotra and Arvind Narayanan on the speed of AI progress

by TheManxLoiner
22nd Jul 2025

Introduction

To help improve my own world models around AI, I am trying to understand and distill different worldviews. One worldview I am trying to understand is ‘AI as a normal technology’, by Arvind Narayanan and Sayash Kapoor. As a stepping stone to distilling that 15,000-word beast, I am first distilling a follow-up discussion between Ajeya Cotra and Arvind Narayanan on the particular question of how quickly AI will progress and diffuse through society.

I found it surprisingly difficult to compress the key points of the discussion, so I have structured this post as follows:

  • Summary of Arvind’s key beliefs. My attempt at isolating Arvind’s key beliefs for this discussion. However, I strongly recommend reading, or at least skimming, the remainder of the article for nuance and context.
  • What would change Arvind’s mind? A large part of the conversation was Ajeya formulating concrete observations or experiments that would separate their world views.
  • Miscellaneous points of interest. On several occasions, Arvind presents interesting arguments, but they often involve multiple interrelated sub-claims that cannot be neatly disentangled.

A central reason I have struggled to distill things is that (I think) Arvind follows contextualizing norms, whereas I am trying to compress things into decoupled statements (in the sense of Decoupling vs Contextualizing Norms).  For this reason, I quote significant portions of what is said, because the precise phrasing matters to accurately represent Arvind’s views.

Summary of Arvind’s key beliefs

This is my attempt at isolating Arvind’s key beliefs from the discussion. As stated in the introduction, I recommend reading/skimming the rest of the article for more nuance and context.

  • The impact of AI on society (positive or negative) will be gradual and continuous, happening over decades, rather than a few years.
    • Arvind believes that the majority of practical use cases of AI cannot be developed without real-world data or real-world testing. For example, to produce a system that can plan a wedding, you need tons of appropriate data and/or a lot of real-world testing.
    • Arvind believes that things may appear discontinuous if we do not have appropriate monitoring and research that ensures society/relevant stakeholders are continuously updated on both the capabilities and the diffusion of AI.
  • Safety and risk reduction largely follows from this gradual and continuous deployment of AI. Society will have time to adjust and develop appropriate defences/interventions.
  • Arvind is hesitant to support pausing AI development or banning frontier open-weights models; he would do so only under specific circumstances that he views as unlikely to happen. Given the continuous nature of AI development, a pause would be a blunt tool whose costs would outweigh its benefits.

(I actually believe the biggest crux is that Arvind thinks super-intelligence is not a meaningful concept and there is not significant room above human intelligence in most real-world tasks, e.g., forecasting or persuasion. However, this is not relevant for the discussion between Arvind and Ajeya. I plan to describe this more in a distillation of AI as a normal technology.)

What would change Arvind’s mind?

A large fraction of the discussion is Ajeya brainstorming observations or experiments that would separate their world views. They basically boil down to things developing quickly, without needing a lot of domain-specific data or real-world testing.

Generalizing from games or simulations to open-ended tasks

Ajeya: The current assumption is that, to be production-ready for applications like travel booking, you need significant training on that specific application. But what if continued AI capability development leads to better transfer learning from domains where data is easily collected to domains where it’s harder to collect — for instance, if meta-learning becomes highly effective with most training done on games, code, and other simulated domains?

Arvind: If we can achieve significant generalization from games to more open-ended tasks, or if we can build truly convincing simulations, that would be very much a point against the speed limit thesis.

A reliable agent, but with high inference costs

Ajeya: I'm curious if your view would change if a small engineering team could create an agent with the reliability needed for something like shopping or planning a wedding, but it's not commercially viable because it's expensive and takes too long on individual actions, needing to triple-check everything.

Arvind: That would be super convincing. I don't think cost barriers will remain significant for long.

An RCT showing that AI can guide high schoolers through making smallpox

Ajeya: Let's say we did an RCT and discovered that random high school students could be talked through exactly how to make smallpox, with AI helping order DNA fragments and bypass KYC monitoring. If that happened in 2025, would you support pausing AI development until we could harden those systems and verify the AI now fails at this task?

[...]

Arvind: This would have to be in a world where open models aren't competitive with the frontier, right? Because otherwise it wouldn’t matter. But yes, if those preconditions hold — if we think pausing would actually affect attackers' access to these AI capabilities, and if the RCT evidence is sufficiently compelling — then I could see some version of a pause being warranted.
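
(An aside in my own voice: one way to cash out "the RCT evidence is sufficiently compelling" is the measured uplift, i.e. the gap in task success rates between AI-assisted and control participants, together with its uncertainty. Below is a minimal sketch of that calculation; the counts are invented purely for illustration.)

```python
# Hypothetical sketch of quantifying uplift in a two-arm RCT: the difference in
# success proportions with a normal-approximation 95% confidence interval.
# The example counts are made up for illustration, not real data.

from math import sqrt

def uplift_with_ci(successes_ai: int, n_ai: int,
                   successes_ctrl: int, n_ctrl: int,
                   z: float = 1.96) -> tuple[float, float, float]:
    """Return (uplift, CI lower bound, CI upper bound)."""
    p_ai = successes_ai / n_ai
    p_ctrl = successes_ctrl / n_ctrl
    diff = p_ai - p_ctrl
    se = sqrt(p_ai * (1 - p_ai) / n_ai + p_ctrl * (1 - p_ctrl) / n_ctrl)
    return diff, diff - z * se, diff + z * se

# e.g. 9 of 40 AI-assisted participants complete the task vs. 1 of 40 controls
print(uplift_with_ci(9, 40, 1, 40))  # ~0.20 uplift, CI roughly (0.06, 0.34)
```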

Broad adoption of AI within AI companies

Ajeya: Let's say we magically had deep transparency into AI companies and how they're using their systems internally, and we start seeing AI systems rapidly being given deference in really broad domains, reaching team lead level, handling procurement decisions, moving around significant money. Would that change your view on how suddenly the impacts might hit the rest of the world?

Arvind: That would be really strong evidence that would substantially change my views on a lot of what we’ve talked about.

Companies prioritizing accelerating AI R&D over end-user products

Ajeya: What I worry about is a world where companies are directing AI development primarily toward accelerating AI R&D and hardware R&D. They try to make enough money to keep going, but won't bother creating a great personal assistant AI agent because it’s hard to do right now but would become much easier after this explosive capabilities progress is complete.

Arvind: That's a fair concern, though I personally think it's unlikely because I believe much of the learning has to happen through deployment. But I very much understand the concern, and I'd support transparency interventions that would let us know if this is happening.

 

Developing real-world capabilities mostly in a lab

Ajeya: Do you have particular experiments that would be informative about whether transfer can go pretty far, or whether you can avoid extensive real-world learning?

Arvind: The most convincing set of experiments would involve developing any real-world capability purely (or mostly) in a lab — whether self-driving or wedding planning or drafting an effective legal complaint by talking to the client.

 

Creation of effective general-purpose assistants in 2025 or 2026

Ajeya: If in 2025 or 2026 there was a fairly general-purpose personal assistant that worked out of the box — could send emails, book flights, and worked well enough that you'd want to use it — would that shift your thinking about how quickly this technology will impact the real world?

Arvind: That would definitely be a big shift. [...] we've learned about the capability-reliability gap, prompt injection, context issues, and cost barriers. If all those half-dozen barriers could be overcome in a one-to-two year period, even for one application, I'd want to deeply understand how they changed so quickly. It would significantly change my evaluation.

 

Learning a new language with a minimal phrasebook

Ajeya: I've been looking for good benchmarks or ecological measurements of meta-learning and sample-efficient learning — basically any reproducible measurement. But I've come up short because it's quite hard to confirm the model doesn't already know something and is actually learning it. Do you have any suggestions?

Arvind: I've seen some attempts in different domains. For instance, testing if a model can learn a new language given just a phrasebook, knowing the language wasn't in the training data. That would be pretty strong evidence if the model could learn it as well as if it had extensive training data.
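
(Another aside in my own voice: to make "reproducible measurement" concrete, here is a minimal sketch of what such a phrasebook evaluation could look like. Everything in it is hypothetical: `query_model` stands in for whatever model call you use, and the phrasebook and test sentences would need to come from a language verifiably absent from the training data.)

```python
# Minimal sketch (not from the discussion) of a phrasebook-style
# sample-efficiency eval: the model's only in-context supervision is the
# phrasebook itself, and we score its translations of held-out sentences.

from typing import Callable

def build_prompt(phrasebook: list[tuple[str, str]], source_sentence: str) -> str:
    """Assemble a prompt whose only supervision is the phrasebook entries."""
    entries = "\n".join(f"{src} -> {tgt}" for src, tgt in phrasebook)
    return (
        "Phrasebook for a language not in your training data:\n"
        f"{entries}\n\n"
        f"Translate into English: {source_sentence}\n"
        "Translation:"
    )

def phrasebook_accuracy(
    query_model: Callable[[str], str],   # hypothetical model call, supplied by the user
    phrasebook: list[tuple[str, str]],
    test_items: list[tuple[str, str]],   # (source sentence, reference English translation)
) -> float:
    """Exact-match accuracy; a real eval would use a softer metric (e.g. chrF)."""
    correct = 0
    for source, reference in test_items:
        prediction = query_model(build_prompt(phrasebook, source))
        correct += prediction.strip().lower() == reference.strip().lower()
    return correct / len(test_items)
```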

 

Miscellaneous

On several occasions, Arvind presents interesting arguments, but they often involve multiple interrelated sub-claims that cannot be neatly disentangled. I have copied large chunks of the discussion, with sub-headers indicating the key discussion points.

Assume models can escape/self-reproduce, safety via continuous development and diffusion 

Arvind: I think we should assume every model will be capable of escape and self-reproduction. Safety shouldn't rely on that being difficult.

Ajeya: Do you think current models are capable of that, or is this just a conservative assumption we should be making?

Arvind: It's partly a conservative assumption, but it also relates to resilience versus fragility. I think many proposed safety interventions actually increase fragility. They try to make sure the world doesn’t get into some dangerous state, but they do it in such a way that if the measure ever fails, it will happen discontinuously rather than continuously, meaning we won't have built up an “immune system” against smaller versions of the problem. If you have weak models proliferating, you can develop defenses that scale gradually as models get stronger. But if the first time we face proliferation is with a super-strong model, that's a much tougher situation.

Ajeya: I think I see two implicit assumptions I'd want to examine here.

First, on the object level, you seem to believe that the defender-attacker balance will work out in favor of defense, at least if we iteratively build up defenses over time as we encounter stronger and stronger versions of the problem (using increasingly stronger AIs for better and better defense). One important reason I'm unsure about this assumption is that if AI systems are systematically misaligned and collectively extremely powerful, they may coordinate with one another to undermine human control, so we may not be able to straightforwardly rely on some AIs keeping other AIs in check.

Then, on the meta level, it also seems like you believe that if you’re wrong about this, there will be some clear warning sign before it’s too late. Is that right? 

Arvind: Yes, I have those assumptions. And if we don’t have an early warning sign, the most likely reason is that we weren’t doing enough of the right kinds of measurement.

Can AIs be CEOs, premature to think about 100% AI-run companies, slow vs fast takeoff

(The opening paragraph below is a good example of how Arvind follows contextualizing norms, which makes it hard for me to understand what his position even is. Either he is saying something obviously false, or there is some difference in our background assumptions.)

Arvind: Many of these capabilities that get discussed — I'm not even convinced they're theoretically possible. Running a successful company is a classic example: the whole thing is about having an edge over others trying to run a company. If one copy of an AI is good at it, how can it have any advantage over everyone else trying to do the same thing? I'm unclear what we even mean by the capability to run a company successfully — it's not just about technical capability, it's about relative position in the world.

[…]

Ajeya: [Human CEOs] might not be competitive unless they defer high-level strategy to AI, such that humans are CEOs on paper but must let their AI make all decisions because every other company is doing the same. Is that a world you see us heading toward? I think I've seen you express skepticism earlier about reaching that level of deference.

Arvind: I think we're definitely headed for that world. I'm just not sure it's a safety risk. [...] For the foreseeable future, what it means to “run a company” will keep changing rapidly, just as it has with the internet. I don't see a discontinuity where AI suddenly becomes superhuman at running companies and brings unpredictable, cataclysmic impacts. As we offload more to AI, we'll see economically transformative effects and enter a drastically different world. To be clear, I think this will happen gradually over decades rather than a singular point in time. At that stage, we can think differently about AI safety. It feels premature to think about what happens when companies are completely AI-run.

Ajeya: I don't see it as premature because I think there's a good chance the transition to this world happens in a short few years without enough time for a robust policy response, and — because it’s happening within AI companies — people in the outside world may feel the change more suddenly. 

Arvind: Yup, this seems like a key point of disagreement! Slow takeoff is core to my thinking, as is the gap between capability and adoption — no matter what happens inside AI companies, I predict that the impact on the rest of the world will be gradual.

Biorisk, intervening in whole bio supply chain, costs and benefits of restricting open models

Ajeya: Okay, let’s consider the example of biorisk. Let's say we did an RCT and discovered that random high school students could be talked through exactly how to make smallpox, with AI helping order DNA fragments and bypass KYC monitoring. If that happened in 2025, would you support pausing AI development until we could harden those systems and verify the AI now fails at this task?

Arvind: Well, my hope would be that we don't jump from our current state of complacency directly to that point. We should have testing in place to measure how close we're getting, so we can respond more gradually. While this is a low-confidence statement, I think the preferred policy response would focus on controlling the other bottlenecks that are more easily manageable — things like screening materials needed for synthesis and improving authentication/KYC — rather than pausing AI development, which seems like one of the least effective ways to mitigate this risk.

Ajeya: But let's say we're unlucky — the first time we do the RCT, we discover AIs are more powerful than we thought and can already help high schoolers make smallpox. Even if our ultimate solution is securing those supply chain holes you mentioned, what should we do about AI development in the meantime? Just continue as normal?

Arvind: Well, this would have to be in a world where open models aren't competitive with the frontier, right? Because otherwise it wouldn’t matter. But yes, if those preconditions hold — if we think pausing would actually affect attackers' access to these AI capabilities, and if the RCT evidence is sufficiently compelling — then I could see some version of a pause being warranted.

Ajeya:  So maybe in 2027 we will do this RCT, and we will get this result, and we will want to be able to stop models from proliferating. And then we might think — I wish that in 2025, we had done things to restrict open source models beyond a certain capability level. This particular question is very confusing to me because I think open source models have huge benefits in terms of people understanding where capability levels are in a way that AI companies can't gate or control and in letting us do a whole bunch of safety research on those models. But this is exactly the kind of thing I would like to hear you speak to — do you think it's valuable to give ourselves that lever? And how should we think about if or when to make that choice? 

Arvind: There's certainly some value in having that lever, but one key question is: what's the cost? On utilitarian grounds alone, I’m not sure it's justified to restrict open models now because of future risks. To justify that kind of preemptive action, we'd need much more evidence gathering. Do we know that the kind of purely cognitive assistance that models can provide is the bottleneck to the threats we’re worried about? And how do other defenses compare to restricting open models in terms of cost and effectiveness? But more saliently, I don't think a cost-benefit approach gives us the full picture. The asymmetry between freedom-reducing interventions and other interventions like funding more research is enormous. Governments would rapidly lose legitimacy if they attempt what many view as heavy-handed interventions to minimize speculative future risks with unquantified probabilities.

Continuous vs discontinuous development, how it’s partly our choice, how it impacts policy feasibility, frontier lab transparency as a form of continuity

Ajeya: I think the biggest difference between our worldviews is how quickly and with how little warning we think these risks might emerge. I want to ask — why do you think the progression will be continuous enough that we will get plenty of warning?

Arvind: Partly I do think the progression will be continuous by default. But partly I think that's a result of the choices we make — if we structure our research properly, we can make it continuous. And third, if we abandon the continuity hypothesis, I think we're in a very bad place regarding policy. We end up with an argument that's structurally similar — I'm not saying substantively similar — to saying “aliens might land here tomorrow without warning, so we should take costly preparatory measures.”

If those calling for intervention can't propose some continuous measure we can observe, something tied to the real world rather than abstract notions of capability, I feel that's making a policy argument that's a bridge too far. I need to think more about how to make this more concrete, but that's where my intuition is right now.

Ajeya: Here's one proposal for a concrete measurement — we probably wouldn't actually get this, but let's say we magically had deep transparency into AI companies and how they're using their systems internally. We're observing their internal uplift RCTs on productivity improvements for research engineers, sales reps, everyone. We're seeing logs and surveys about how AI systems are being used. And we start seeing AI systems rapidly being given deference in really broad domains, reaching team lead level, handling procurement decisions, moving around significant money. If we had that crystal ball into the AI companies and saw this level of adoption, would that change your view on how suddenly the impacts might hit the rest of the world?

Arvind: That would be really strong evidence that would substantially change my views on a lot of what we’ve talked about. But I'd say what you're proposing is actually a way to achieve continuity, and I strongly support it. This intervention, while it does reduce company freedom, is much weaker than stopping open source model proliferation. If we can't achieve this lighter intervention, why are we advocating for the stronger one?

 

Frontier lab strategy, one brilliant insight vs training on billions of data points per task

Ajeya: I think the eventual smoothing [of jagged capabilities] might not be gradual — it might happen all at once because large AI companies see that as the grand prize. They're driving toward an AI system that's truly general and flexible, able to make novel scientific discoveries and invent new technologies — things you couldn't possibly train it on because humanity hasn't produced the data. I think that focus on the grand prize explains their relative lack of effort on products — they're putting in just enough to keep investors excited for the next round. It's not developing something from nothing in a bunker, but it's also not just incrementally improving products. They're doing minimum viable products while pursuing AGI and artificial superintelligence.

It's primarily about company motivation, but I can also see potential technical paths — and I'm sure they're exploring many more than I can see. It might involve building these currently unreliable agents, adding robust error checking, training them to notice and correct their own errors, and then using RL across as many domains as possible. They're hoping that lower-hanging fruit domains with lots of RL training will transfer well to harder domains — maybe 10 million reps on various video games means you only need 10,000 data points of long-horizon real-world data to be a lawyer or ML engineer instead of 10 million. That's what they seem to be attempting, and it seems like they could succeed.

Arvind: That's interesting, thank you.

Ajeya: What's your read on the companies' strategies?

Arvind: I agree with you — I've seen some executives at these companies explicitly state that strategy. I just have a different take on what constitutes their “minimum” effort — I think they've been forced, perhaps reluctantly, to put much more effort into product development than they'd hoped.

Ajeya: Yeah, back in 2015 when OpenAI was incorporated, they probably thought it might work more like inventing the nuclear bomb — one big insight they could develop and scale up. We're definitely not there. There's a spectrum from “invent one brilliant algorithm in a basement somewhere” all the way to “gather billions of data points for each specific job.” I want us to be way on the data-heavy end — I think that would be much better for safety and resilience because the key harms would first emerge in smaller forms, and we would have many chances to iterate against them (especially with tools powered by previous-generation AIs). We're not all the way there, but right now, it seems plausible we could end up pretty close to the “one-shot” end of the spectrum.

Concluding thoughts

My overall position is that a fast take-off is plausible, and that is reason enough to act. A big reason is that we have previously seen big jumps in capability, e.g. from GPT-2 to GPT-3, and Arvind does not present any compelling reasons why more scale, unhobbling, or minor breakthroughs won’t cause more big jumps in capability. For example, somebody could feasibly discover some memory-scaffolding that allows AI systems to have robust and useful long-term memory.

Second, I am glad I did this in-depth reading of the article. I usually skim-read, and my original takeaway from skimming this conversation was that there was no obvious disagreement between Arvind and Ajeya: they seem to agree on what trends are worth tracking and what experiments would be worth doing. Now, after writing this distillation, I have some (rudimentary) model of Ajeya’s and Arvind’s beliefs, which I can call upon when thinking about future trends or evaluating new research.