Series: How to Purchase AI Risk Reduction
I recently explained that one major project undergoing cost-benefit analysis at the Singularity Institute is that of a scholarly AI risk wiki. The proposal is exciting to many, but as Kaj Sotala points out:
This idea sounds promising, but I find it hard to say anything about "should this be funded" without knowing what the alternative uses for the money are. Almost any use of money can be made to sound attractive with some effort, but the crucial question in budgeting is not "would this be useful" but "would this be the most useful thing".
Indeed. So here is another thing that donations to SI could purchase: good research papers by skilled academics.
Our recent grant of $20,000 to Rachael Briggs (for an introductory paper on TDT) provides an example of how this works:
- SI thinks of a paper it wants to exist but doesn't have the resources to write itself (e.g. a clearer presentation of TDT).
- SI looks for a few productive academics well-suited to write the paper we have in mind, and approaches them directly with the grant proposal. (Briggs is an excellent choice for the TDT paper because she is a good explainer and has had two of her past decision theory papers selected as among the 10 best papers of the year by The Philosopher's Annual.)
- Hopefully, one of these academics says "yes." We award them the grant in return for a certain kind of paper published in one of a pre-specified set of journals. (In the case of the TDT grant to Rachael Briggs, we specified that the final paper must be published in one of the following journals: Philosophers' Imprint, Philosophy and Phenomenological Research, Philosophical Quarterly, Philosophical Studies, Erkenntnis, Theoria, Australasian Journal of Philosophy, Noûs, The Philosophical Review, or Theory and Decision.)
- SI gives regular feedback on the author's outlines and article drafts.
- Paper gets submitted, revised, and published!
For example, SI could award grants for the following papers:
- "Objections to CEV," by somebody like David Sobel (his "Full Information Accounts of Well-Being" remains the most significant unanswered attack on ideal-preference theories like CEV).
- "Counterfactual Mugging," by somebody like Rachael Briggs (see the original post by Vladimir Nesov).
- "CEV as a Computational Meta-Ethics," by somebody like Gert-Jan Lokhorst (see his paper "Computational Metaethics").
- "Non-Bayesian Decision Theory and Normative Uncertainty," by somebody like Martin Peterson (the problem of normative uncertainty is a serious one, and Peterson's approach differs both from the one pursued by Nick Bostrom, Toby Ord, and Will Crouch and from the one pursued by Andrew Sepielli).
- "Methods for Long-Term Technological Forecasting," by somebody like Bela Nagy (Nagy is the lead author on one of the best papers in the field).
- "Convergence to Rational Economic Agency," by somebody like Steve Omohundro (Omohundro's 2007 paper argues that advanced agents will converge toward the rational economic model of decision-making. If true, this would make it easier to predict the convergent instrumental goals of advanced AIs, but his argument, as currently formulated, leaves much to be desired in persuasiveness).
- "Value Learning," by somebody like Bill Hibbard (Dewey's 2011 paper and Hibbard's 2012 paper make interesting advances on this topic, but there is much more work to be done).
- "Learning Preferences from Human Behavior," by somebody like Thomas Nielsen (Nielsen's 2004 paper with Finn Jensen described the first computationally tractable algorithms capable of learning a decision maker’s utility function from potentially inconsistent behavior. Their solution was to interpret inconsistent choices as random deviations from an underlying “true” utility function. But the data from neuroeconomics suggest a different solution: interpret inconsistent choices as deviations from an underlying “true” utility function that are produced by non-model-based valuation systems in the brain, and use the latest neuroscientific research to predict when and to what extent model-based choices are being “overruled” by the non-model-based valuation systems).
(These are only examples. I don't necessarily think these particular papers would be good investments.)