
Richard_Ngo

Formerly alignment and governance researcher at DeepMind and OpenAI. Now independent.

Sequences

Twitter threads
Understanding systematization
Stories
Meta-rationality
Replacing fear
Shaping safer goals
AGI safety from first principles

6 · Richard Ngo's Shortform · Ω · 5y · 412

Comments

Applying right-wing frames to AGI (geo)politics
Richard_Ngo · 1h

This is a reasonable point, though I also think that there's something important about the ways that these three frames tie together. In general it seems to me that people underrate the extent to which there are deep and reasonably-coherent intuitions underlying right-wing thinking (in part because right-wing thinkers have been bad at articulating those intuitions). Framing the post this way helps direct people to look for them.

But I could also just say that in the text instead. So if I do another post like this in the future, I'll try your approach and see if that goes better.

A case for courage, when speaking of AI danger
Richard_Ngo · 2d

Yeah, I agree that it's easy to err in that direction, and I've sometimes done so. Going forward, I'm trying to more consistently say the "obviously I wish people just wouldn't do this" part.

Though note that even claims like "unacceptable by any normal standards of risk management" feel off to me. We're talking about the future of humanity; there is no normal standard of risk management. This should feel as silly as the US or UK invoking "normal standards of risk management" in debates over whether to join WW2.

Applying right-wing frames to AGI (geo)politics
Richard_Ngo · 2d

FWIW the comments feel fine to me, but I'm guessing that many of the downvotes are partisan.

A case for courage, when speaking of AI danger
Richard_Ngo · 2d

FWIW I broadcast the former rather than the latter because from the 25% perspective there are many possible worlds which the "stop" coalition ends up making much worse, and therefore I can't honestly broadcast "this is ridiculous and should stop" without being more specific about what I'd want from the stop coalition.

A (loose) analogy: leftists in Iran who confidently argued "the Shah's regime is ridiculous and should stop". It turned out that there was so much variance in how it stopped that this argument wasn't actually a good one to confidently broadcast, despite in some sense being correct.

Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
Richard_Ngo · 3d

Ty for the comment; I stumbled upon the post, misread the dates, and had started working on a submission.

Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
Richard_Ngo · 3d

The submission form says that it's no longer accepting responses, FYI.

the void
Richard_Ngo · 5d · Ω

I suspect that many of the things you've said here are also true for humans.

That is, we humans often conceptualize ourselves in terms of underspecified identities. Who am I? I'm Richard. What's my opinion on this post? Well, being "Richard" doesn't specify how I should respond to this post. But let me check the cached facts I believe about myself ("I'm truth-seeking"; "I'm polite") and construct an answer which fits well with those facts. A child might start off not really knowing what "polite" means, but still wanting to be polite, and gradually flesh out what that means as they learn more about the world.

Another way of putting this point: being pulled from the void is not a distinctive feature of LLM personas; it's a feature of personas in general. Personas start off with underspecified narratives that fail to predict most behavior (but are self-fulfilling) and then gradually systematize to infer deeper motivations, resolving conflicts with the actual drivers of behavior along the way.

What's the takeaway here? We should still be worried about models learning the wrong self-fulfilling prophecies. But the "pulling from the void" thing should be seen less as an odd thing that we're doing with AIs, and more as a claim about the nature of minds in general.

A case for courage, when speaking of AI danger
Richard_Ngo · 11d

I think that if it was going to go ahead, it should have been made stronger and clearer. But this wouldn't have been politically feasible, and therefore if that had been the standard being aimed for, it wouldn't have gone ahead.

This, I think, would have been better than the outcome that actually happened.

A case for courage, when speaking of AI danger
Richard_Ngo · 11d

Whoa, this seems very implausible to me. Speaking with the courage of one's convictions in situations which feel high-stakes is an extremely high bar, and I know of few people who I'd describe as consistently doing this.

If you don't know anyone who isn't in this category, consider whether your standards for this are far too low.

Richard Ngo's Shortform
Richard_Ngo · 14d

For a while now I've been thinking about the difference between "top-down" agents which pursue a single goal, and "bottom-up" agents which are built around compromises between many goals/subagents.

I've now decided that the frame of "centralized" vs "distributed" agents is a better way of capturing this idea, since there's no inherent "up" or "down" direction in coalitions. It's also more continuous.

Credit to @Scott Garrabrant, who made something like this point to me a while back, in a way which I didn't grok at the time.

Posts

46 · Applying right-wing frames to AGI (geo)politics · 2d · 22
35 · Well-foundedness as an organizing principle of healthy minds and societies · 3mo · 7
99 · Third-wave AI safety needs sociopolitical thinking · 3mo · 23
96 · Towards a scale-free theory of intelligent agency · Ω · 4mo · 44
92 · Elite Coordination via the Consensus of Power · 4mo · 15
245 · Trojan Sky · 4mo · 39
214 · Power Lies Trembling: a three-book review · 3mo · 29
243 · The Gentle Romance · 5mo · 46
20 · From the Archives: a story · 6mo · 1
51 · Epistemic status: poetry (and other poems) · 8mo · 5