I can't help but feel that this sneakily avoids some of the hard parts of the problem by assuming that we know how to find certain things like "the state according to the AI/human" from the start.

Reply

[-]Noumero4y10

Yeah, this is the part I'm confused about as well. I think this proposal involves training a neural network emulating a human? Otherwise I'm not sure how (F( $s_{m}$ ), $o_{h}$ ) is supposed to work. It requires a human to make a prediction about the next step using observations and the direct translation of the machine state, which requires us to have some way to describe the full state in a way that the "human" we're using can understand. This precludes using actual humans to label the data, because I don't think we actually have any way to provide such a description. We'd need to train up a human simulator specifically adapted for parsing this sort of output.

Reply

[-]Noumero4y10

I'm a bit confused about how it'd be work in practice. Could you provide an example of a concrete machine-learning setup, and how its inputs/outputs would be defined in terms of your variables?

Reply

[-]scottviteri4y10

A proper response to this entails another post, but here is a terse explanation of an experiment I am running: Game of Life provides the transition T, in a world with no actions. The human and AI observations are coarse-grainings of the game board at each time step -- specifically the human sees majority vote of bits in 5x5 squares on the game board, and the AI sees 3x3 majority votes. We learn human and AI prediction functions that take in previous state and predicted observation, minimizing difference between predicted observations and next observations given the Game of Life transition function. We then learn F between the AI and the human. We run the AI and human and lockstep and see if the AI can use F to suggest better beliefs in S_H, as measured by ability of the human to predict its future observations.

Reply

Moderation Log

LESSWRONG
is fundraising!
LW

LESSWRONG
is fundraising!
LW

9

REPL's and ELK

9

Ω 6

9

Ω 6