Is it worth avoiding detailed discussions of expectations about agency levels of powerful AIs?

Mar 16, 2023

I don't think this question is specific to agency. I think this is about the entire concept of infohazards, and your arguments are fully general against all AI infohazards discussion.

Personally, I'm of the view that the worst idea in history is the idea of "bad ideas" that need to be hidden. I think the alignment community is shooting itself in the foot by trying to suppress ideas that the capabilities community is already fully aware of.

Seth Herd

Mar 17, 2023

I think this is a tricky tradeoff. There's effectively a race between alignment and capabilities research. Better theories of how AGI is likely to be constructed will help both efforts. Which one it will help more is tough to guess.

The one thought I'd like to add is that the AI safety community may think more broadly and creatively about approaches to building AI. So I wouldn't assume that all of this thinking has already been done.

I don't have an answer on this, and I've thought about it a lot since I've been keeping some potential infohazard ideas under my hat for maybe the last ten years.

Max H

Mar 17, 2023

I've read the 2021 MIRI conversations sequence, and various other writings by Nate and Eliezer. I found their explanations of convergent instrumental goals, agency, and various other topics convincing and explanatory, without much further thinking of my own.

I think in most or all cases, they were doing their best to explain clearly, without worrying much about infohazards. But the concepts are complicated and counter-intuitive, and sometimes when their explanations weren't landing, they decided to move on to other topics.

So, I think you should feel free to try communicating as clearly as you can, without holding back because of worries about infohazards. Perhaps you'll succeed in explaining where others have failed.

Max H

(Also, if you do succeed in writing what you think is an infohazardously-good explanation, you can just ask someone you trust to read it privately before posting it publicly.)

baturinsky

Mar 16, 2023

It looks like estimating the architecture of future AGI is considered an "infohazard" too, even though knowing it could be very useful for figuring out how we will have to align it.

Mar 17, 2023

If you think up obvious things to do and then look at AI papers six months later, you will see a form of "intelligence convergence": everything you thought of will have been tried. Therefore, do not worry about 'creating an idea'. Assume whatever you thought of is either already being tried or doesn't work.

I think, for example, that talk about how AI might be a winner-takes-all game might have encouraged the "full speed ahead" approach to developing AGI.
