Raymond Douglas

LESSWRONG
LW

Raymond Douglas — LessWrong

I went down a rabbit hole on inference-from-goal-models a few years ago (albeit not coalitional ones) -- some slightly scattered thoughts below, which I'm happy to elaborate on if useful.

A great toy model is decision transformers: basically, you can make a decent "agent" by taking a predictive model over a world that contains agents (like Atari rollouts), conditioning on some 'goal' output (like the player eventually winning), and sampling what actions you'd predict to see from a given agent. Some things which pop out of this:
- There's no utility function or even reward function
  - You can't even necessarily query the probability that the goal will be reached
- There's no updating or learning -- the beliefs are
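
To make the conditioning idea above concrete, here is a minimal sketch. It is not an actual decision transformer: it swaps the learned sequence model for a lookup table of counts, and the rollout data, state names, and function names (`fit_counts`, `sample_action`) are all hypothetical. It only illustrates the core move of predicting which action you'd expect to see given that the goal is eventually reached, with no reward or utility function anywhere.

```python
import random
from collections import Counter, defaultdict

# Hypothetical logged rollouts from a world that contains agents:
# (state, action, whether the player eventually won).
ROLLOUTS = [
    ("left_of_ball", "move_right", True),
    ("left_of_ball", "move_right", True),
    ("left_of_ball", "move_left", False),
    ("left_of_ball", "stay", False),
    ("right_of_ball", "move_left", True),
    ("right_of_ball", "move_left", True),
    ("right_of_ball", "stay", True),
    ("right_of_ball", "move_right", False),
]


def fit_counts(rollouts):
    """Purely predictive 'model': count actions seen for each (state, outcome)."""
    counts = defaultdict(Counter)
    for state, action, won in rollouts:
        counts[(state, won)][action] += 1
    return counts


def sample_action(counts, state, goal=True, rng=random):
    """Sample from the empirical p(action | state, outcome == goal)."""
    actions = counts.get((state, goal))
    if not actions:
        raise KeyError(f"no rollouts for state={state!r} with outcome={goal!r}")
    names = list(actions)
    weights = [actions[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


counts = fit_counts(ROLLOUTS)
print(sample_action(counts, "left_of_ball"))   # conditioned on winning
print(sample_action(counts, "right_of_ball"))  # conditioned on winning
```

The "agent" here is just a conditional sample from a predictive model. Nothing is optimised at sampling time, and the table never changes after fitting, which is the sense in which this toy has no updating or learning.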

David Duvenaud

David Duvenaud, mrinank_sharma, Raymond Douglas

16d

We’re publishing a new paper that presents the first large-scale analysis of potentially disempowering patterns in real-world conversations with AI.

AI assistants are now embedded in our daily lives—used most often for instrumental tasks like writing code, but increasingly in personal domains: navigating relationships, processing emotions, or advising on major life decisions. In the vast majority of cases, the influence AI provides in this area is helpful, productive, and often empowering.
However, as AI takes on more roles, one risk is that it steers some users in ways that distort rather than inform. In such cases, the resulting interactions may be disempowering: reducing individuals’ ability to form accurate beliefs, make authentic value judgments, and


GD Roundup #4 - inference, monopolies, and AI Jesus

Raymond Douglas

1mo

Probably the biggest recent news was the Phil Trammell and Dwarkesh Patel paper on Capital in the 22nd Century, which provoked many many reactions. I am going to conspicuously not dig into it because of the sheer volume of thoughtful commentary floating around, but I do recommend at least skimming Zvi’s summary.

Instead I’ll mostly be responding to real-world events, like inference cost trends and the rise of AI Jesus. Buckle up!

(Crossposted on Substack if that's more your thing)

Transformative Economics and AI Polytheism

We’ve finally started publishing talks from the December Post-AGI workshop, and I really highly recommend them — so far it’s just the first two keynotes.

Anton Korinek’s talk is, in my opinion, the best... (read 1552 more words →)

When does competition lead to recognisable values?

Jan_Kulveit

Jan_Kulveit, beren, David Duvenaud, Raymond Douglas

1mo

Transcript of Beren Millidge's Keynote at The Post-AGI Workshop, San Diego, December 2025

You know how human values might survive in a very multifarious AI world where there's lots of AIs competing? This is the kind of MOLOCH world that Scott Alexander talks about. And then I realized that to talk about this, I've got to talk about a whole lot of other things as well—hence the many other musings here. So this is probably going to be quite a fast and somewhat dense talk. Let's get started. It should be fun.

Two Visions of AI Futures

The way I think about AI futures kind of breaks down into two buckets. I call them AI... (read 7239 more words →)

The Economics of Transformative AI

Jan_Kulveit

Jan_Kulveit, David Duvenaud, Raymond Douglas

1mo

Anton Korinek is an economist at UVA and the Brookings Institution who focuses on the macroeconomics of AI. This is a lightly edited transcript of a recent lecture where he lays out what economics actually predicts about transformative AI — in our view it's the best introductory resource on the topic, and basically anyone discussing post-labour economics should be familiar with this.

The talk covers how the economic bottleneck has shifted over time: for most of history, land was the bottleneck and humans were disposable. The Industrial Revolution flipped this. AI may flip it again: if labor becomes reproducible, humans are no longer the bottleneck.

Korinek walks through what this implies for growth (potentially dramatic),... (read 5341 more words →)

Replying to On green

Raymond Douglas

2mo

Review for 2024 Review

On green

To my mind, what this post did was clarify a kind of subtle, implicit blind spot in a lot of AI risk thinking. I think this was inextricably linked to the writing itself leaning into a form of beauty that doesn't tend to crop up much around these parts. And though the piece draws a lot of it back to Yudkowsky, I think the absence of green is much wider than him, and in many ways he's not the worst offender.

It's hard to accurately compress the insights: the piece itself draws a lot on soft metaphor and on explaining what green is not. But personally it made me realise that the posture I... (read more)

Replying to The Choice Transition

Raymond Douglas

2mo

Review for 2024 Review

The Choice Transition

One year later, I am pretty happy with this post, and I still refer to it fairly often, both for the overall frame and for the specifics about how AI might be relevant.

I think it was a proper attempt at macrostrategy, in the sense of trying to give a highly compressed but still useful way to think about the entire arc of reality. And I've been glad to see more work in that area since this post was published.

I am of course pretty biased here, but I'd be excited to see folks consider this.

Replying to The Checklist: What Succeeding at AI Safety Will Involve

Raymond Douglas

2mo

Review for 2024 Review

The Checklist: What Succeeding at AI Safety Will Involve

I think this post is on the frontier for some mix of:

- Giving a thorough plan for how one might address powerful AI
- Conveying something about how people in labs are thinking about what the problem is and what their role in it is
- Not being overwhelmingly filtered through PR considerations

Obviously one can quibble with the plan and its assumptions but I found this piece very helpful in rounding out my picture of AI strategy - for example, in thinking about how to decipher things that have been filtered through PR and consensus filters, or in situating work that focuses on narrow slices of the wider problem. I still periodically refer back to it when I'm trying to think about how to structure broad strategies.

Raymond Douglas

2mo

Sorry! I realise now that this point was a bit unclear. My sense of the expanded claim is something like:

- People sometimes talk about AI UBI/UBC as if it were basically a scaled-up version of the UBI people normally talk about, but it's actually pretty substantially different
- Global UBI right now would be incredibly expensive
- In between now and a functioning global UBI we'd need some mix of massive taxes and massive economic growth (which could indeed just be the latter!)
- But either way, the world in which that happened would not be economics as usual
- (And maybe it is also a huge mess trying to get this set up beforehand so that it's robust to the


Gradual Disempowerment Monthly Roundup #3

Raymond Douglas

2mo

Farewell to Friction

So sayeth Zvi: “when defection costs drop dramatically, equilibria break”. Even if AI makes individual tasks easier, this can still cause all kinds of societal problems because for many features of the world, the difficulty is load-bearing. John Stone gives a reflection on this phenomenon, ironically with the editing help of GPT5. He draws a nice parallel with the Jevons paradox: now that AI is making certain tasks like job applications easier, people are just spamming them in a way that overwhelms the system.

And the problem is a lot broader than applications and filtering processes. Last year, two Harvard students plugged some smart glasses into facial recognition software so that they could automatically identify people... (read 1110 more words →)

Raymond Douglas's Shortform

Raymond Douglas

2mo

This is a special post for quick takes (aka "shortform"). Only the owner can create top-level comments.

Raymond Douglas

2mo

Quick Take

Last week we wrapped the second post-AGI workshop; I'm copying across some reflections I put up on twitter:

The post-AGI question is very interdisciplinary: whether an outcome is truly stable depends not just on economics and the shape of future technology but also on things like the nature of human ideological progress and the physics of interplanetary civilizations
Some concrete takeaways:
- proper global UBI is *enormously* expensive (h/t @yelizarovanna)
- instead of ‘lower costs’, we should talk about relative prices (h/t @akorinek)
- lots of human values are actually pretty convergent - they’re shared by many animals (h/t @BerenMillidge)
Among the many tensions in perspective, one of the more productive ones was between the ‘alignment is easy so let’s try to solve


Raymond Douglas

3mo

Very nice! A couple months ago I did something similar, repeatedly prompting ChatGPT to make images of how it "really felt" without any commentary, and it did mostly seem like it was just thinking up plausible successive twists, even though the eventual result was pretty raw.

Pictures in order

Gradual Disempowerment Monthly Roundup #2

Raymond Douglas

3mo

Another month, another wave of concerning behaviour, now also available on substack (tell your friends!). Since it’s early days, I’d gladly welcome thoughts on the format, or sources to include next time.

Also, if you’re a real diehard for this stuff, I and some of the other gradual disempowerment coauthors are helping to organise a one-day workshop on the effects of AGI on society in San Diego on December 4th — you can apply here.

Machine Culture

At least six AI or AI-assisted artists debuted on the Billboard charts in the past two months. One even earned a multi-million dollar deal. All the major wins have still had humans actually writing the lyrics, but it’s clear that... (read 1605 more words →)

Upcoming Workshop on Post-AGI Economics, Culture, and Governance

David Duvenaud

David Duvenaud, Raymond Douglas, Jan_Kulveit, scasper, MariaK

4mo

This is an announcement and call for applications to the Workshop on Post-AGI Economics, Culture, and Governance taking place in San Diego on Wednesday, December 3, overlapping with the first day of NeurIPS 2025.

This workshop aims to bring together a diverse range of expertise to deepen our collective understanding of how the world might evolve after the development of AGI, and what we can do about it now.

The draft program features:

Anton Korinek on the Economics of Transformative AI
Alex Tamkin of Anthropic on "The fractal nature of automation vs. augmentation"
Anders Sandberg on "Cyborg Leviathans and Human Niche Construction"
Beren Millidge of Zyphra on "When does competition lead to recognisable values?"
Anna Yelizarova of Windfall Trust on "What would UBI actually entail?"
Ivan Vendrov of


Replying to Gradual Disempowerment Monthly Roundup

Raymond Douglas

4mo

Gradual Disempowerment Monthly Roundup

Are people interested in a regular version of this, probably on a Substack? Any other thoughts on the format are welcome too.


Gradual Disempowerment Monthly Roundup

Raymond Douglas

4mo

Since publishing the original Gradual Disempowerment paper, my coauthors and I have been keeping an eye out for the obvious warning signs, as well as any rare glimmers of hope. Recently there’s been enough that I figured it was worth collecting it all in one place in case anyone else was curious.

If it seems like there’s appetite, we might make this a regular thing — I’ll put a comment at the bottom to react/reply to.

Governance By AI

Albania has appointed an AI called Diella as a government minister: a scaffolded system built by the Albanian government in collaboration with Microsoft, on top of OpenAI models. She recently addressed parliament for the first time, to... (read 1617 more words →)


Raymond Douglas

6mo

best guesses: valuable, hat tip, disappointed, right assumption wrong conclusion, +1, disgusted, gut feeling, moloch, subtle detail, agreed, magic smell, broken link, link redirect, this is the diff

I wonder if it would be cheap/worthwhile to just get a bunch of people to guess for a variety of symbols to see what's actually intuitive?


Summary of our Workshop on Post-AGI Outcomes

David Duvenaud

David Duvenaud, Raymond Douglas, Nora_Ammann, Jan_Kulveit

6mo

Last month we held a workshop on Post-AGI outcomes. This post is a list of all the talks, with short summaries, as well as my personal takeaways.

The first keynote was @Joe Carlsmith on “Can Goodness Compete?”. He asked: can anyone compete with “Locusts”: those who want to use all resources to replicate as fast as possible?

Longer version with transcript

The second keynote was @Richard_Ngo on “Flourishing in a highly unequal world”. He argued that future beings will vary greatly in power and intelligence, so we should aim for “healthy asymmetric” relations, analogous to that between parent and child.

Morgan MacInnes of U Toronto Political Science spoke on "The history of technologically provoked welfare... (read 757 more words →)


Replying to ‘AI for societal uplift’ as a path to victory

Raymond Douglas

7mo

‘AI for societal uplift’ as a path to victory

Ah! Ok, yeah, I think we were talking past each other here.

I'm not trying to claim here that the institutional case might be harder than the AI case. When I said "less than perfect at making institutions corrigible" I didn't mean "less compared to AI", I meant "overall not perfect". So the square brackets you put in (2) were not something I intended to express.

The thing I was trying to gesture at was just that there are kind of institutional analogs for lots of alignment concepts, like corrigibility. I wasn't aiming to actually compare their difficulty -- I think like you I'm not really sure, and it does feel pretty hard to pick a fair standard for comparison.

Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development

Decomposing Agency — capabilities without desires

‘AI for societal uplift’ as a path to victory

The Choice Transition

Raymond Douglas

Disempowerment patterns in real-world AI usage

GD Roundup #4 - inference, monopolies, and AI Jesus

When does competition lead to recognisable values?

The Economics of Transformative AI

Gradual Disempowerment Monthly Roundup #3

Raymond Douglas's Shortform

Gradual Disempowerment Monthly Roundup #2
