Eli's shortform feed

New post: Capability testing as a pseudo fire alarm

[epistemic status: a thought I had]

It seems like it would be useful to have very fine-grained measures of how smart / capable a general reasoner is, because this would allow an AGI project to carefully avoid creating a system smart enough to pose an existential risk.

I’m imagining slowly feeding a system more training data (or, alternatively, iteratively training a system with slightly more compute), and regularly checking its capability. When the system reaches “chimpanzee level” (whatever that means), you…
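The loop described above could be sketched roughly as follows. This is a toy illustration only: the names (`train_increment`, `eval_suite`, the `ceiling` value standing in for "chimpanzee level") are hypothetical, and no real capability-evaluation suite is assumed.

```python
def capability_gated_training(train_increment, eval_suite, data_stream, ceiling):
    """Apply small training increments, halting once measured capability
    reaches `ceiling` (the "smart enough to be dangerous" threshold)."""
    state = 0.0      # stand-in for model parameters
    history = []     # capability score after each increment
    for batch in data_stream:
        state = train_increment(state, batch)  # one small step of training
        score = eval_suite(state)              # fine-grained capability probe
        history.append(score)
        if score >= ceiling:                   # e.g. "chimpanzee level"
            break                              # stop before crossing further
    return state, history

# Toy stand-ins: "training" just accumulates, "capability" is the state itself.
state, history = capability_gated_training(
    train_increment=lambda s, b: s + b,
    eval_suite=lambda s: s,
    data_stream=[0.1] * 100,   # 100 small increments of training data
    ceiling=0.5,               # hypothetical danger threshold
)
```

The interesting design question this surfaces is how fine-grained the increments and the probe need to be: the scheme only works if capability grows smoothly enough between checks that you cannot overshoot the ceiling in a single increment.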

jimrandomh · 5mo · 14 points

In There’s No Fire Alarm for Artificial General Intelligence [https://intelligence.org/2017/10/13/fire-alarm/] Eliezer argues: If I have a predetermined set of tests, this could serve as a fire alarm, but only if you've successfully built a consensus that it is one. This is hard, and the consensus would need to be quite strong. To avoid ambiguity, the test itself would need to be demonstrably resistant to being clever-Hans [https://thegradient.pub/nlps-clever-hans-moment-has-arrived/]-ed. Otherwise it would be just another milestone.

I very much agree.

by elityre · 1 min read · 2nd Jun 2019 · 76 comments


I'm mostly going to use this to crosspost links to my blog for less polished thoughts, Musings and Rough Drafts.