Raemon

LessWrong team member / moderator. I've been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I've been interested in improving my own epistemic standards and helping others to do so as well.

Sequences

Feedbackloop-First Rationality
The Coordination Frontier
Privacy Practices
Keep your beliefs cruxy and your frames explicit
LW Open Source Guide
Tensions in Truthseeking
Project Hufflepuff
Rational Ritual
Drawing Less Wrong

Comments

Answer by Raemon

One thing is that I'm definitely able to spin up side projects that I just would not have been able to do before, because I can do them with my "tired brain."

Some of them might turn out to be real projects, although they're still early-stage.

Raemon

My current guess is:

1. This is more relevant for up to the first couple of generations of "just barely superintelligent" AIs.

2. I don't really expect it to be the deciding factor after many iterations of end-to-end RSI, which get you to the point of being "able to generate novel scientific or engineering insights much faster than a human or institution could."

I do think it's plausible that the initial bias towards "evil/hackery AI" could start it off in a bad basin of attraction, but a) even if you completely avoided that, I would still basically expect the AI to rediscover this on its own as it gained superhuman levels of competence, and b) one of the things I most want to use a slightly-superhuman AI for is robustly aligning massively superhuman AI, and I don't really see how to do that without directly engaging with knowledge of the failure modes there.

I think there are other plans that route more through "use STEM AI to build an uploader or bioenhancer, and then have an accelerated human psyche do the technical philosophy necessary to handle the unbounded alignment case." I could see that being the right call, and I could imagine the bias from "already knows about deceptive alignment, etc." being large enough in magnitude to matter in the initial process. [edit: In those cases I'd probably want to filter out a lot more than just "unfriendly AI strategies."]

But, basically, how this applies depends on what it is you're trying to do with the AI, and what stage/flavor of AI you're working with and how it's helping.

Raemon

It'd be nice to have the key observations/evidence in the tl;dr here. I'm worried about this, but would like to stay grounded in exactly how bad it is.

Raemon

I think I became at least a little wiser reading this sentence. I know you're mostly focused on other stuff, but I think I'd benefit from some words connecting more of the dots.

Raemon

I think the Gears Which Turn The World sequence, and Specializing in Problems We Don't Understand, and some other scattered John posts I don't remember as well, are a decent chunk of an answer.

Raemon

Curated. I found this a clearer explanation of "how to think about bottlenecks, and things that are not-especially-bottlenecks-but-might-be-helpful" than I previously had. 

Previously, I had thought about major bottlenecks, and I had some vague sense of "well, it definitely seems like there should be more ways to be helpful than just tackling central bottlenecks, but also a lot of ways to do that misguidedly." But I didn't have any particular models for thinking about it, and I don't think I could have explained it very well.

I think there are better ways of doing forward-chaining and backward-chaining than the ones listed here (ways that roughly correspond to "the one who thought about it a bit," but with a bit more technique for getting traction).

I do think the question "to what degree is your field shaped like 'there's a central bottleneck that is, to a first approximation, the only thing that matters here'?" is an important one that hasn't really been argued for here. (I can't recall offhand whether John has previously written a post doing exactly that in those terms, although the Gears Which Turn the World sequence is at least looking at the same problem space.)

Raemon

Update: In a Slack I'm in, someone said:

> A friend of mine who works at US AISI advised: "My sense is that relevant people are talking to relevant people (don't know specifics about who/how/etc.) and it's better if this is done in a carefully controlled manner."

And another person said:

> Per the other thread, a bunch of attention on this from EA/xrisk-coded people could easily be counterproductive, by making AISI stick out as a safety thing that should be killed.

And while I don't exactly wanna trust "the people behind the scenes have it handled", I do think the failure mode here seems pretty real.

Raemon

I guess I'm just kinda surprised "perspective" feels metaphorical to you – it seems like that's exactly what it is.

(I think it's a bit of a long, clunky word, so not obviously right here, but I'm still surprised by your take.)
