Tags: DeepMind, Machine Learning (ML), Academic Papers, Gaming (videogames/tabletop), AI
[1911.08265] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | Arxiv

by DragonGod
21st Nov 2019

This is a linkpost for https://arxiv.org/abs/1911.08265

Charlie Steiner (6y ago, 5 karma)

Welp, we're doomed (/s), as soon as someone figures out how to get 100 million tries at taking over the world so we can crush the world-taking-over problem with stochastic gradient descent.

gwern (6y ago, 8 karma)

Meta-learning and transfer learning. You take over 100 million different simulated worlds, and the actual real world is a doddle.

FactorialCode (6y ago, 7 karma)

Yeah, it's interesting that this works so well, but I think the best way to think of this is as a middle ground between full model-based RL and model-free RL. Their data efficiency isn't going to be optimal, because they're effectively throwing away the information carried by the observations. However, by making that choice, they don't need to model irrelevant details, so they end up with a very accurate and effective MCTS. As a result, I'd wager that with smaller neural networks or more experience, completely model-free RL would outperform this agent, because all the modelling power can be focused on representing the policy. Likewise, with larger networks or less experience, I would expect this to fall behind MBRL that also predicts observations, because the latter would be more data efficient.

I would have liked it if they had done more investigation into why they were able to outperform AlphaZero (AZ) in Go. At the moment, they seem to have left it to one line of speculation.
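
To make the "middle ground" point concrete, here is a rough sketch of planning inside a learned latent model that never reconstructs observations. Everything in it is an assumption made for illustration: the function names (`represent`, `dynamics`, `value`, `plan`), the dimensions, and the random untrained linear weights are hypothetical and are not taken from the paper. It also swaps MCTS for a plain exhaustive depth-limited search to stay short.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, LATENT_DIM, NUM_ACTIONS = 8, 4, 3  # hypothetical sizes

# Hypothetical, randomly initialised (untrained) linear "networks".
W_repr = rng.normal(size=(LATENT_DIM, OBS_DIM))
W_dyn = rng.normal(size=(NUM_ACTIONS, LATENT_DIM, LATENT_DIM)) * 0.5
w_reward = rng.normal(size=(NUM_ACTIONS, LATENT_DIM))
w_value = rng.normal(size=LATENT_DIM)


def represent(observation):
    """Encode an observation into a latent state. There is no decoder."""
    return np.tanh(W_repr @ observation)


def dynamics(state, action):
    """Predict the next latent state and the reward, never the next observation."""
    next_state = np.tanh(W_dyn[action] @ state)
    reward = float(w_reward[action] @ state)
    return next_state, reward


def value(state):
    """Predict the value of a latent state."""
    return float(w_value @ state)


def plan(observation, depth=2, discount=0.99):
    """Exhaustive depth-limited search in latent space (a stand-in for MCTS)."""
    root = represent(observation)
    best_return, best_first_action = -np.inf, None
    for actions in itertools.product(range(NUM_ACTIONS), repeat=depth):
        state, total = root, 0.0
        for t, a in enumerate(actions):
            state, reward = dynamics(state, a)
            total += discount**t * reward
        total += discount**depth * value(state)  # bootstrap from the leaf value
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action, best_return


action, estimated_return = plan(rng.normal(size=OBS_DIM))
```

The only training signals a model like this would need are rewards, values, and policies, which is where the "throwing away the information carried by the observations" trade-off comes from. An observation-predicting MBRL variant would add a decoder and a reconstruction loss on top of `dynamics`.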

SoerenMind (6y ago, 4 karma)

Posted a little reaction to this paper here.
