I'm trying to come up with a set of questions for self-calibration, related to AI and ML.

I've written down what I've come up with so far below. But I am principally interested in what other people come up with -- thus the question metatype -- both for questions and for predictions on those questions.

So far I have an insufficient number of questions to produce anything like a nice calibration curve. I've also struggled with coming up with meaningful questions.
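(For reference, a calibration curve just bins predictions by stated confidence and, for each bin, compares the average stated confidence to the empirical rate at which those predictions came true. A minimal sketch in Python; the function and the example predictions are my own illustration, not anyone's actual track record:)

```python
from collections import defaultdict

def calibration_curve(predictions, n_bins=5):
    """Bin (stated_probability, outcome) pairs and return, per non-empty
    bin, the mean stated confidence and the observed frequency of truth."""
    bins = defaultdict(list)
    for p, outcome in predictions:
        # Map p in [0, 1] to a bin index; clamp p == 1.0 into the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, outcome))
    curve = []
    for idx in sorted(bins):
        entries = bins[idx]
        mean_conf = sum(p for p, _ in entries) / len(entries)
        hit_rate = sum(o for _, o in entries) / len(entries)
        curve.append((mean_conf, hit_rate))
    return curve

# Hypothetical track record: (stated probability, did it happen?)
preds = [(0.75, True), (0.10, False), (0.05, False), (0.05, True)]
print(calibration_curve(preds))
```

A well-calibrated predictor's curve hugs the diagonal (confidence ≈ hit rate), which is why a handful of questions isn't enough: each bin needs several resolved predictions before its hit rate means anything.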

I've rot13'd my predictions to avoid anchoring anyone. I'm pretty uncertain about most of these as point estimates, however.
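(Rot13 shifts each letter 13 places along the alphabet, so applying it twice recovers the original text; anyone who wants to check a prediction after forming their own can decode it with a one-liner. A sketch using Python's stdlib `codecs` module; the sample string is neutral so as not to spoil any prediction below:)

```python
import codecs

# rot13 maps each ASCII letter 13 places along the alphabet;
# applying it twice returns the original text. Non-letters pass through.
def rot13(text: str) -> str:
    return codecs.encode(text, "rot_13")

print(rot13("Uryyb, jbeyq!"))  # decodes to "Hello, world!"
```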

On Explicitly Stated ML / Systems Goals

It is (relatively) easy to determine whether these are fulfilled. The trade-off is that they likely bear little relation to AGI.

1. OpenAI succeeds in defeating top pro teams on unrestricted Dota2

OpenAI has explicitly said that they wish to beat top human teams in the MOBA Dota2. Their latest attempt used self-play and familiar policy-gradient methods trained at massive scale, but the agent still lost to top teams, who won (relatively?) easily.

I'm also interested in people's probabilities on whether OpenAI succeeds, conditional on OpenAI not including genuine algorithmic novelty in their learning methods, although that's a harder question to define because of cloudiness around "algorithmic novelty."

My prediction: Friragl-svir creprag.

2. Tesla succeeds in having a self-driving car drive coast-to-coast without intervention.

Tesla sells cars with (ostensibly) all the hardware necessary for full self-driving, and runs an in-house self-driving research program that uses a mix of ML and hard-coded rules. They have a goal of demonstrating an autonomous coast-to-coast drive, although this goal has been repeatedly delayed. There is widespread skepticism about both the sensor suite in Tesla cars and the maturity of their software.

My prediction: Gra creprag.

3. DeepMind reveals a skilled RL-trained agent for StarCraft II.

After AlphaGo, DeepMind announced that they would try to create an expert-level agent for SCII. They've released preliminary research related to this topic, although what they've revealed is far from such an agent.

My prediction: Nobhg svir creprag.

On Goals Not So Clearly Marked As Targets

1. Someone gets a score above 80% on the Winograd Schema tests.

The Winograd Schemas are a series of tests designed to probe the common-sense reasoning of a system. Modern ML struggles to do much better than random chance -- 50% is approximately random guessing -- and the current state of the art gets less than 70%. (This is the best score I could find; there are several papers which claim better scores, but as far as I can tell these deal with subsets of the Winograd Schemas. [I.e., the classic "Hey, we got a better score... on a more constrained dataset."] I might be wrong about this; if I'm wrong, please enlighten me.)

My prediction: Nobhg svir creprag.

2. Reinforcement Learning Starts Working

This is a bad, unclear goal. I'm not sure how to make it clearer and could use help.

There are a lot of articles about how reinforcement learning doesn't work: it generalizes incredibly poorly, and succeeds on complex tasks like Dota2 only by playing literal centuries' worth of games. If some algorithm were discovered such that RL worked with roughly the same regularity that supervised learning does, that would be amazing. I'm still struggling to state this rigorously. A 10x improvement in sample efficiency on the Atari suite would (probably?) fulfill it, but I'm not sure what else would. And it's quite a PITA to keep track of the current state of the art on Atari anyhow.

My prediction: Need to get this more defined.



elityre has done work on this for BERI, suggesting >30 questions.

Regarding the question metatype, Allan Dafoe has offered a set of desiderata in the appendix to his AI governance research agenda.

Prediction: yelp.com stops using CAPTCHA as part of the process of creating a new account, by end of 2019. CAPTCHA is defined here as any puzzle that the user is asked to solve to prove they are human.

My credence:


I find your predictions 1 through 3 not clearly defined.

Does the OpenAI bot need to defeat a pro team in unconstrained Dota 2 at least once during 2019? Or does it need to win at least one game, and more than 50% of games, against pro teams in 2019?

Suppose Tesla releases video footage or a report of their car driving from one coast to the other, but with some minor or not-so-minor problems along the way. How minor would those have to be to count? Are humans allowed to help it recharge or anything like that?

How do you define "skilled" in SC II?

Fair. For (1), more than 50%, because that is how they've been defining victories in these tournaments. For (2), no unplanned interventions -- i.e., it's fine if they choose to drive it onto a gravel driveway they know the thing cannot handle, or fill it up at the Supercharger because the car clearly cannot do that itself, but in general no interventions where the car would potentially crash in a situation it (ostensibly) should handle. And for (3), meh, "can beat the native scripted AI" seems reasonable.
