How big a deal is this? What, if anything, does it signal about when we get smarter-than-human AI?
Thanks. Key quote:
What this indicates is not that deep learning in particular is going to be the Game Over algorithm. Rather, the background variables are looking more like "Human neural intelligence is not that complicated and current algorithms are touching on keystone, foundational aspects of it." What's alarming is not this particular breakthrough, but what it implies about the general background settings of the computational universe.
His argument proves too much.
You could easily transpose it for the time when Checkers or Chess programs beat professional players: back then the "keystone, foundational aspect" of intelligence was thought to be the ability to do combinatorial search in large solution spaces, and scaling up to AGI was "just" a matter of engineering better heuristics. Sure, it didn't work on Go yet, but Go players were not using a different cortical algorithm than Chess players, were they?
Or you could transpose it for the time when MCTS Go programs reached "dan" (advanced amateur) level. They still couldn't beat professional players, but professional players were not using a different cortical algorithm than advanced amateur players, were they?
AlphaGo succeeded at the current achievement by using artificial neural networks in a regime where they are known to do well. But this regime, and games like Go, Chess, Checkers, Othello, etc., represent a small part of the range of human cognitive tasks. In fact, we probably find this kind of board game fascinating precisely because it is very different from the usual cognitive stimuli we deal with in everyday life.
It...
It's a big deal for Go, but I don't think it's a very big deal for AGI.
Conceptually, Go is like Chess or Checkers: a fully deterministic, perfect-information, two-player game.
Go is more challenging for computers because the search space (and in particular the average branching factor) is larger and known position evaluation heuristics are not as good, so traditional alpha-beta minimax search becomes infeasible.
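For a rough sense of scale, here is a back-of-the-envelope sketch. The figures are assumptions on my part, not numbers from this thread: the commonly cited approximations of a branching factor of about 35 and a game length of about 80 plies for Chess, versus about 250 and 150 for Go.

```python
import math

def log10_tree_size(branching_factor: float, depth: int) -> float:
    """Return log10 of branching_factor ** depth, a crude size for a full game tree."""
    return depth * math.log10(branching_factor)

# Assumed, commonly cited approximations (not measured values).
CHESS_LOG10 = log10_tree_size(35, 80)
GO_LOG10 = log10_tree_size(250, 150)

if __name__ == "__main__":
    print(f"Chess: about 10^{CHESS_LOG10:.0f} positions in a naive full tree")
    print(f"Go:    about 10^{GO_LOG10:.0f} positions in a naive full tree")
```

Under those assumptions the naive trees differ by more than two hundred orders of magnitude, which is why exhaustive alpha-beta search with a weak evaluation function does not get far in Go.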
The first big innovation, already in use by most Go programs for a decade (although the idea is older), was Monte Carlo tree search, which addresses the high branching factor issue: while traditional search either does not expand a node or expands it and recursively evaluates all its children, MCTS stochastically evaluates nodes with a probability that depends on how promising they look, according to some heuristic.
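To make the idea concrete, here is a minimal sketch of MCTS with the standard UCT selection rule, applied to a toy game rather than Go. The game is my own assumption for illustration: a single pile of stones, each player removes 1 to 3, and whoever takes the last stone wins. The names `Node`, `uct_child`, `rollout` and `mcts` are hypothetical, and real Go programs use far better rollout heuristics than uniform random play.

```python
import math
import random

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left after `move` was played
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who just moved into this node

def uct_child(node, c=1.4):
    # Prefer children with a high win rate, plus a bonus for rarely visited ones.
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(stones, rng):
    # Uniform random playout; True if the player to move at the start wins.
    start_player_to_move = True
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return start_player_to_move
        start_player_to_move = not start_player_to_move
    return False  # no stones left at the start: the player to move has already lost

def mcts(root_stones, iterations=3000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded and non-terminal.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one previously untried child.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: reward is from the viewpoint of whoever moved into `node`.
        reward = 0.0 if rollout(node.stones, rng) else 1.0
        # 4. Backpropagation: flip the reward at each level, since players alternate.
        while node is not None:
            node.visits += 1
            node.wins += reward
            reward = 1.0 - reward
            node = node.parent
    # Recommend the most visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move
```

Calling `mcts(n)` returns the number of stones the search suggests taking from a pile of `n`. The point of the sketch is the selection step: promising branches get most of the visits, but the exploration bonus keeps the others from being ignored entirely.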
DeepMind's innovation consists of using a neural network to learn a good position evaluation heuristic in a supervised fashion from a large database of professional games, refining it with reinforcement learning in "greedy" self-play mode, and then using both the refined heuristic and the supervised heuristic in an MCTS engine.
Their approach essentially relies on big...
How big a deal is this? What, if anything, does it signal about when we get smarter-than-human AI?
It shows that Monte Carlo tree search meshes remarkably well with neural-network-driven evaluation ("value networks") and decision pruning/policy selection ("policy networks"). This means that if you have a planning task to which MCTS can be usefully applied, sufficient data to train networks for state evaluation and policy selection, and substantial computation power (a distributed cluster, in AlphaGo's case), you can significantly improve performance on your task (from "strong amateur" to "human champion" level). It's not an AGI-complete result, however, any more than Deep Blue or TD-Gammon were AGI-complete.
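As a rough illustration of how the two kinds of network plug into the search, here is a sketch of a PUCT-style selection rule: the value estimate supplies the exploitation term and the policy prior scales the exploration bonus. This is one common form of the rule, not necessarily AlphaGo's exact implementation; `puct_score`, `select_move`, the dictionary layout and the constant `c_puct` are all hypothetical names and values chosen for the example.

```python
import math

def puct_score(mean_value, prior, parent_visits, child_visits, c_puct=1.0):
    # Exploitation (mean value from search) plus a prior-weighted exploration
    # bonus that shrinks as the child accumulates visits.
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return mean_value + exploration

def select_move(children, c_puct=1.0):
    """Pick the move with the highest PUCT score.

    `children` maps move -> {"prior": float, "visits": int, "value_sum": float},
    where "prior" would come from a policy network and "value_sum" would
    accumulate value-network (or rollout) estimates.
    """
    # Add 1 so the prior still matters before any child has been visited.
    parent_visits = 1 + sum(ch["visits"] for ch in children.values())

    def score(move):
        ch = children[move]
        mean = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        return puct_score(mean, ch["prior"], parent_visits, ch["visits"], c_puct)

    return max(children, key=score)
```

Before any visits, the move with the highest prior is tried first; if its value estimates then come back poor, the shrinking bonus lets lower-prior moves overtake it. That is the sense in which the policy network prunes the search while the value network steers it.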
The "training data" factor is a biggie; we lack this kind of data entirely for things like automated theorem proving, which would otherwise be quite amenable to this 'planning search + complex learned heuristics' approach. In particular, writing provably-correct computer code is a minor variation on automated theorem proving. (Neural networks can already write incorrect code, but this is not good enough if you want a provably Friendly AGI.)
This is a big deal, and it is another sign that AGI is near.
Intelligence boils down to inference. Go is an interesting case because good play for both humans and bots like AlphaGo requires two specialized types of inference operating over very different timescales:
Machines have been strong in planning/search style inference for a while. It is only recently that the slower learning component (2nd order inference over circuit/program structure) is starting to approach and surpass human level.
Critics like to point out that DL requires tons of data, but so does the human brain. A more accurate comparison requires quantifying the dataset human pro Go players train on.
A 30-year-old Asian pro will have perhaps 40,000 hours of play...
I'm not going to argue that you should pay attention to EY. His arguments convince me, but if they don't convince you, I'm not gonna do any better.
What I'm trying to get at is, when you ask "is there any evidence that will result in EY ceasing to urgently ask for your money?"... I mean, I'm sure there is such evidence, but I don't wish to speak for him. But it feels to me that by asking that question, you possibly also think of EY as the sort of person who says: "this is evidence that AI risk is near! And this is evidence that AI risk is near! Everything is evidence that AI risk is near!" And I'm pointing out that no, that's not how he acts.
While we're at it, this exchange between us seems relevant. ("Eliezer has said that security mindset is similar, but not identical, to the mindset needed for AI design." "Well, what a relief!") You seem surprised, and I'm not sure what about it was surprising to you, but I don't think you should have been surprised.
Basically, even if you're right that he's wrong, I feel like you're wrong about how he's wrong. You seem to have a model of him which is very different from my model of him.
(Btw, his opinion seems to be that AlphaGo's methods are what makes it more of a leap than a self-driving car or than Deep Blue, not the results. Not sure that affects your position.)
I also think MIRI should stop hitting people up for money and get a normal funding stream going. You know, let their ideas of how to avoid UFAI compete in the normal marketplace of ideas.
Currently MIRI gets its funding from 1) donations and 2) grants. Isn't that exactly what the normal funding stream for non-profits is?
I should say, getting this working is very impressive, and took an enormous amount of effort. +1 to the team!
An interesting comment:
...The European champion of Go is not the world champion, or even close. The BBC, for example, reported that “Google achieves AI ‘breakthrough’ by beating Go champion,” and hundreds of other news outlets picked up essentially the same headline. But Go is scarcely a sport in Europe; and the champion in question is ranked only #633 in the world. A robot that beat the 633rd-ranked tennis pro would be impressive, but it still wouldn’t be fair to say that it had “mastered” the game. DeepMind made major progress, but the Go journey is still
DeepMind's Go AI, called AlphaGo, has beaten the European champion with a score of 5-0. A match against the top-ranked human player, Lee Se-dol, is scheduled for March.