A Scholarly AI Risk Wiki




lukeprog

Series: How to Purchase AI Risk Reduction

One large project proposal currently undergoing cost-benefit analysis at the Singularity Institute is a scholarly AI risk wiki. Below I will summarize the project proposal, because:

  • I would like feedback from the community on it, and
  • I would like to provide just one example of the kind of x-risk reduction that can be purchased with donations to the Singularity Institute.

 

 

The Idea

Think Scholarpedia:

  • Open-access scholarly articles written at roughly the "Scientific American" level of difficulty.
  • Runs on MediaWiki, but articles can only be created and edited by carefully selected authors, and curated by experts in the domain relevant to each article. (The editors would be SI researchers at first, and most of the authors and contributors would be staff researchers, research associates, or "remote researchers" from SI.)

But the scholarly AI risk wiki would differ from Scholarpedia in these respects:

  • A focus on AI risk and related subjects.
  • No formal peer review system. The articles would, however, be continuously revised in response to comments from experts in the relevant fields, many of whom already work in the x-risk field or are knowledgeable participants on LessWrong and in the SIAI/FHI/etc. communities.
  • Articles will be written for a broader educated audience, not just for domain experts. (Many articles on Scholarpedia aren't actually written at the Scientific American level, despite that stated intent.)
  • A built-in citations and references system, Biblio (perhaps with the BibTeX addition).

Example articles: Eliezer Yudkowsky, Nick Bostrom, Ben Goertzel, Carl Shulman, Artificial General Intelligence, Decision Theory, Bayesian Decision Theory, Evidential Decision Theory, Causal Decision Theory, Timeless Decision Theory, Counterfactual Mugging, Existential Risk, Expected Utility, Expected Value, Utility, Friendly AI, Intelligence Explosion, AGI Sputnik Moment, Optimization Process, Optimization Power, Metaethics, Tool AI, Oracle AI, Unfriendly AI, Complexity of Value, Fragility of Value, Church-Turing Thesis, Nanny AI, Whole Brain Emulation, AIXI, Orthogonality Thesis, Instrumental Convergence Thesis, Biological Cognitive Enhancement, Nanotechnology, Recursive Self-Improvement, Intelligence, AI Takeoff, AI Boxing, Coherent Extrapolated Volition, Coherent Aggregated Volition, Reflective Decision Theory, Value Learning, Logical Uncertainty, Technological Development, Technological Forecasting, Emulation Argument for Human-Level AI, Evolutionary Argument for Human-Level AI, Extensibility Argument for Greater-Than-Human Intelligence, Anvil Problem, Optimality Notions, Universal Intelligence, Differential Intellectual Progress, Brain-Computer Interfaces, Malthusian Scenarios, Seed AI, Singleton, Superintelligence, Pascal's Mugging, Moore's Law, Superorganism, Infinities in Ethics, Economic Consequences of AI and Whole Brain Emulation, Creating Friendly AI, Cognitive Bias, Great Filter, Observation Selection Effects, Astronomical Waste, AI Arms Races, Normative and Moral Uncertainty, The Simulation Hypothesis, The Simulation Argument, Information Hazards, Optimal Philanthropy, Neuromorphic AI, Hazards from Large-Scale Computation, AGI Skepticism, Machine Ethics, Event Horizon Thesis, Acceleration Thesis, Singularitarianism, Subgoal Stomp, Wireheading, Ontological Crisis, Moral Divergence, Utility Indifference, Personhood Predicates, Consequentialism, Technological Revolutions, Prediction Markets, Global Catastrophic Risks, Paperclip Maximizer, Coherent Blended Volition, Fun Theory, Game Theory, The Singularity, History of AI Risk Thought, Utility Extraction, Reinforcement Learning, Machine Learning, Probability Theory, Prior Probability, Preferences, Regulation and AI Risk, Godel Machine, Lifespan Dilemma, AI Advantages, Algorithmic Complexity, Human-AGI Integration and Trade, AGI Chaining, Value Extrapolation, 5 and 10 Problem.

Most of these articles would contain previously unpublished research (not published even in blog posts or comments), because most of the AI risk research that has been done has never been written up in any form but sits in the brains and Google docs of people like Yudkowsky, Bostrom, Shulman, and Armstrong.

 

Benefits

More than a year ago, I argued that SI would benefit from publishing short, clear, scholarly articles on AI risk. More recently, Nick Beckstead expressed the point this way:

"Most extant presentations of SIAI's views leave much to be desired in terms of clarity, completeness, concision, accessibility, and credibility signals."

Chris Hallquist added:

"I've been trying to write something about Eliezer's debate with Robin Hanson, but the problem I keep running up against is that Eliezer's points are not clearly articulated at all. Even making my best educated guesses about what's supposed to go in the gaps in his arguments, I still ended up with very little."

Of course, SI has long known it could benefit from clearer presentations of its views, but producing them has been too costly. Scholarly authors of Nick Bostrom's skill and productivity are extremely rare, and almost none of them care about AI risk. But now, let's be clear about what a scholarly AI risk wiki could accomplish:

  • Provide a clearer argument for caring about AI risk. Journal-published articles like Chalmers (2010) can be clear and scholarly, but the linear format is not ideal for analyzing such a complex thing as AI risk. Even a 65-page article like Chalmers (2010) can't hope to address even the tiniest fraction of the relevant evidence and arguments. Nor can it hope to respond to the tiniest fraction of all the objections that are "obvious" to some of its readers. What we need is a modular presentation of the evidence and the arguments, so that those who accept physicalism, near-term AI, and the orthogonality thesis can jump right to the sections on why various AI boxing methods may not work, while those who aren't sure what to think of AI timelines can jump to those articles, and those who accept most of the concern for AI risk but think there's no reason to assert humane values over arbitrary machine values can jump to the article on that subject. (Note that I don't presume all the analysis that would go into building an AI risk wiki would end up clearly recommending SI's current, very specific positions on AI risk, but I'm pretty sure it would clearly recommend some considerable concern for AI risk.)
  • Provide a clearer picture of our AI risk situation. Without clear presentations of most of the relevant factors, it is very costly for interested parties to develop a clear picture of our AI risk situation. If you wanted to get roughly as clear a picture of our AI risk situation as can be had today, you'd have to (1) read several books, hundreds of articles and blog posts, and the archives of SI's decision theory mailing list and several forums, (2) analyze them in detail to try to fill in all the missing steps in the reasoning presented in these sources, and (3) have dozens of hours of conversation with the leading experts in the field (Yudkowsky, Bostrom, Shulman, Armstrong, etc.). With a scholarly AI risk wiki, a decently clear picture of our AI risk situation will be much cheaper to acquire. Indeed, it will almost certainly clarify the picture of our situation even for the leading experts in the field.
  • Make it easier to do AI risk research. A researcher hoping to do AI risk research is in much the same position as the interested reader hoping to gain a clearer picture of our AI risk situation. Most of the relevant material is scattered across hundreds of books, articles, blog posts, forum comments, mailing list messages, and personal conversations. And those presentations of the ideas leave "much to be desired in terms of clarity, completeness, concision, accessibility..." This makes it hard to do research, in big-picture conceptual ways, but also in small, annoying ways. What paper can you cite on Thing X and Thing Y? When the extant scholarly literature base is small, you can't cite the sources that other people have dug up already. You have to do all that digging yourself.

There are some benefits to the wiki structure in particular:

  • Some wiki articles can largely be ripped/paraphrased from existing papers like Chalmers (2010) and Muehlhauser & Salamon (2012).
  • Many wiki articles can be adapted into journal articles, if they prove valuable enough. Probably, 1-3 wiki articles could be developed, then adapted and combined into a journal article and published, and then the original wiki article(s) could be published on the wiki (while citing the now-published journal article).
  • It's not an all-or-nothing project. Some value is gained by having some articles on the wiki, more value is gained by having more articles on the wiki.
  • There are robust programs and plugins for managing this kind of project (MediaWiki, Biblio, etc.).
  • Dozens or hundreds of people can contribute, though they will all be selected by editors. (SI's army of part-time remote researchers is already more than a dozen strong, each with different skills and areas of domain expertise.)

 

Costs

This would be a large project with significant costs. I'm still estimating the costs, but here are some ballpark numbers for a scholarly AI risk wiki containing all the example articles above:

  • 1,920 hours of SI staff time (80 hrs/month for 24 months). This comes out to about $48,000, depending on who is putting in these hours.
  • $384,000 paid to remote researchers and writers ($16,000/mo for 24 months; our remote researchers generally work part-time, and are relatively inexpensive).
  • $30,000 for wiki design, development, and hosting costs.
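
As a rough sanity check on the ballpark numbers above, here is a minimal sketch that recomputes them. It takes the 1,920-hour staff total, the $48,000 staff cost, the $16,000/mo remote-researcher budget, and the $30,000 wiki figure as given. The variable and function names are my own, and the "total" covers only the three items listed above, so treat it as illustrative rather than as a final budget.

```python
# Ballpark figures copied from the cost bullets above.
# All names here are hypothetical, chosen for illustration only.
MONTHS = 24
STAFF_HOURS_TOTAL = 1_920        # SI staff time over the whole project
STAFF_COST = 48_000              # approximate dollar cost of that staff time
REMOTE_COST_PER_MONTH = 16_000   # remote researchers and writers
WIKI_COST = 30_000               # design, development, hosting


def cost_summary():
    """Recompute the derived quantities implied by the ballpark figures."""
    remote_cost = REMOTE_COST_PER_MONTH * MONTHS
    return {
        # Staff hours spread evenly over the project duration.
        "staff_hours_per_month": STAFF_HOURS_TOTAL / MONTHS,
        # Average hourly rate implied by the staff cost estimate.
        "implied_staff_rate": STAFF_COST / STAFF_HOURS_TOTAL,
        "remote_cost": remote_cost,
        # Sum of the three listed items only.
        "total_listed": STAFF_COST + remote_cost + WIKI_COST,
    }


if __name__ == "__main__":
    for name, value in cost_summary().items():
        print(f"{name}: {value:,.0f}")
```

The staff figure works out to 80 hours per month at an implied average of about $25/hour, and the remote-researcher line accounts for the large majority of the listed spending.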