22nd Sep 2022

AI Risk Skepticism, Inferential Distance, AI

1 answer, sorted by top scoring

Gordon Seidoh Worley

Sep 23, 2022

83

I think many people working in AI safety, and specifically those focused on alignment, basically don't understand what values are and, alarmingly to me, haven't invested much effort in figuring it out. I think this is motivated stopping: figuring out what values are and how they work is hard, and it's a problem that doesn't lend itself easily to the methods AI researchers know well, so it gets ignored or pushed off as something AI will figure out for us. As a result, values are usually treated as a black box at worst and as an abstract mathematical construct akin to preferences at best. The latter is a step beyond simply ignoring the fact that we don't know what values are, but it's nowhere close to where I think we'll need to be to build aligned AI.

I've tried to address this in the past, but don't know how much impact I've really had. The recent work on the shard theory of values gives me hope the situation is changing.

2 comments, sorted by top scoring

Shmi (3y)

Were the questions eventually answered to your satisfaction? If so, what/who did it? Or did you end up concluding that the AI Safety people have no idea what they are talking about when they mention "intelligence"? Or was the inferential gap just too large and you ended up doing all the work on your own? Or did something else happen?

Aris (3y)

The inferential gap didn't end up being worked out through conversation. I mainly worked it out by reading (Superintelligence, The Precipice, and AGI Safety Fundamentals, in that order) and connecting the information from the other side with my own. I think this was pretty unfortunate time-wise, though. Some of the things that were helpful included:

- Increased understanding on my end of how ML worked such that I could understand what "learning" looked like. Once I understood this, it was easier to see how my initial questions might have sounded irrelevant to someone working on AI Safety.
- A better understanding of what an AI planning multiple steps in advance (such as behaving until a treacherous turn) might look like.
- Encountering terms like APS or TAI, which communicate the ideas without relying on the phrase "general intelligence".

I'd mostly thank AGI Safety Fundamentals for these! I don't regret reading any of those resources, but I do think I'd have come to see AI Safety as important more quickly if someone had addressed my questions with more understanding of my own background in the early stages.

[ Question ]

What Do AI Safety Pitches Not Get About Your Field?
