Leonard Dung and I have a new draft – preprint here & here – arguing against the view that major actors in AI development should, out of self-interest, race to build advanced AI. We argue (roughly) that this pro-racing view 1) underestimates the risks, 2) overestimates the...
I recently co-wrote a paper with Leonard Dung (accepted at Philosophical Studies) with the above title; preprint here. To post something short rather than nothing, below is the abstract: Creating systems that are aligned with our goals is seen as a leading approach to creating safe and beneficial AI in...
Abstract
The question of how the increasing intelligence of AIs influences central problems in AI safety remains neglected. We use the framework of reinforcement learning to discuss what continuous increases in the intelligence of AI systems imply for central problems in AI safety. We first argue that predicting the actions of an...
TL;DR
* The alignment problem is less fundamental for AI safety than the problem of predicting the actions of AI systems, especially if they are more intelligent than oneself. We dub this the prediction problem.
* The prediction problem may be insoluble.
* If the prediction problem is insoluble, predicting the...