[ Question ]

What are some open exposition problems in AI?

by Sai Sasank Y, 16th Aug 2021

Basically, topics or ideas that are not well explained, but could benefit from good expositions.
2 Answers

Daniel Kokotajlo, Aug 17, 2021

Scott Garrabrant once said something that really stuck with me: (paraphrasing) "You can get a mind in two ways -- by building it, or by searching for it. The way things are going with deep learning right now, it seems like we aren't going to build AGI, we are going to search for it. And that means trouble."

I feel like this is ultimately a pretty simple point, but I think it's really important: it's a good foundation for thinking about (and taking seriously) things like inner alignment issues. Some comments:

--Obviously it's not binary; there's a spectrum between "building/designing" and "searching."

--Given our current level of understanding, I claim that doing stochastic gradient descent to train big neural networks is by default pretty far towards the "searching" end of the spectrum. The thing we get out the other end is a giant messy glob of neurons, and we don't know in advance what it will be like (and often we don't even understand it afterwards, not even a little bit). Contrast this with traditional software, which is pretty far along the "building/designing" end of the spectrum.
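Here's a minimal toy sketch of the distinction (assuming PyTorch is installed; XOR is chosen purely for illustration, and the architecture and hyperparameters are arbitrary):

```python
import torch
import torch.nn as nn

# "Building": we write the computation down ourselves and understand every step.
def built_xor(a: int, b: int) -> int:
    return a ^ b

# "Searching": we specify only an architecture, a loss, and a search procedure
# (SGD); the resulting computation is whatever weights the search happens to find.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

net = nn.Sequential(nn.Linear(2, 8), nn.Tanh(), nn.Linear(8, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.5)

for _ in range(2000):  # may need more steps, depending on the random init
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(X), y)
    loss.backward()
    opt.step()

# The found "program" is a blob of numbers; reading the weights tells us little
# about how it computes XOR, only that it happens to fit the training data.
print(built_xor(1, 0))                     # designed: 1, and we know why
print(net(X).detach().round().squeeze())   # searched: ~[0, 1, 1, 0], but opaque
```

The weights of `net` are the analogue of the "giant messy glob of neurons": they were selected by the search, not designed, and nothing in the training loop required us to understand them.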


Daniel Kokotajlo, Aug 17, 2021

I'd love to see someone explain the prior of neural networks (I think it's called the "minimal circuit" prior or something like that), comparing and contrasting it with e.g. the Solomonoff prior or the Levin prior. It would answer questions like these:

--What is the neural network prior? (My tentative, shitty answer: it's the probability distribution over types of circuits that tells you how much more likely some types are to occur than others, were you to randomly sample parameter configurations. See the toy sampling sketch below.)

--How is it different from the Solomonoff prior? (My TSA: The Solomonoff prior is over types of programs rather than types of circuits. This isn't by itself a big deal, because circuits are programs and programs can be approximated by circuits. More importantly, there are many circuits that get high NN prior but low Solomonoff prior, and vice versa. In particular, the Solomonoff prior doesn't penalize programs for running a very long time; it instead entirely emphasizes minimum description length. See the formulas below.)

--Why does this matter? (My TSA: As a first approximation, we should think of SGD on big neural networks as selecting the highest-prior circuit that scores perfectly on the training data. It's like making a posterior by conditionalizing the prior on some data; see the formula below. Mingard et al. would argue that this is more than just an approximation, though lots of people disagree with them and think it's more complicated than that. This has important implications for inner alignment issues, see e.g. Paul's stuff.)
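On the first question, the "randomly sample parameter configurations" picture can be made concrete with a toy experiment (a rough sketch assuming PyTorch; the architecture and sample count are arbitrary illustrative choices):

```python
import torch
import torch.nn as nn
from collections import Counter
from itertools import product

# All 8 possible Boolean inputs on 3 variables.
inputs = torch.tensor(list(product([0., 1.], repeat=3)))

# Sample random parameter settings of a small net and record which Boolean
# function (truth table) each sample implements. The resulting frequencies are
# an empirical estimate of the "neural network prior" over these functions.
counts = Counter()
for _ in range(10_000):
    net = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1))
    with torch.no_grad():
        table = tuple((net(inputs).squeeze(1) > 0).int().tolist())
    counts[table] += 1

# The most frequent truth tables under random sampling.
for table, n in counts.most_common(5):
    print(table, n / 10_000)
```

With typical initializations, simple truth tables (e.g. constant functions) tend to show up far more often than complicated ones, which is the sense in which the prior favors some "types of circuits" over others.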
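On the second question, here are rough definitions side by side (a paraphrase; the notation is mine, and the Solomonoff prior is usually stated over output strings rather than functions):

```latex
% Solomonoff prior: weight each program p (for a universal prefix machine U)
% by description length alone; runtime is not penalized at all.
P_{\mathrm{Sol}}(f) \;\propto\; \sum_{p \,:\, U(p) = f} 2^{-|p|}

% Levin-style (speed) prior: also charge the log of the runtime t(p), so slow
% programs are penalized even when they are short.
P_{\mathrm{Lev}}(f) \;\propto\; \sum_{p \,:\, U(p) = f} 2^{-\left(|p| + \log_2 t(p)\right)}

% Neural network prior: the probability that a randomly sampled parameter
% vector theta (from the initialization distribution D) implements f.
P_{\mathrm{NN}}(f) \;=\; \Pr_{\theta \sim \mathcal{D}}\!\left[\, f_\theta = f \,\right]
```

The first line makes the "doesn't penalize runtime" point explicit: only the description length |p| appears in the exponent.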
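On the third question, the "conditionalizing the prior on some data" picture in symbols (a rough formalization of the claim, reusing the P_NN notation above):

```latex
% Approximate the trained network as a draw from the NN prior conditioned on
% fitting the training set D = {(x_i, y_i)} exactly.
P(f \mid D) \;\propto\; P_{\mathrm{NN}}(f)\,\prod_i \mathbb{1}\!\left[\, f(x_i) = y_i \,\right]
```

The disagreement mentioned above is over how good this approximation is: whether SGD really behaves like sampling from (or taking the mode of) this posterior, or whether the dynamics of training add something important on top.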
