Series: How to Purchase AI Risk Reduction

I recently explained that one major project undergoing cost-benefit analysis at the Singularity Institute is that of a scholarly AI risk wiki. The proposal is exciting to many, but as Kaj Sotala points out:

This idea sounds promising, but I find it hard to say anything about "should this be funded" without knowing what the alternative uses for the money are. Almost any use of money can be made to sound attractive with some effort, but the crucial question in budgeting is not "would this be useful" but "would this be the most useful thing".

Indeed. So here is another thing that donations to SI could purchase: good research papers by skilled academics.


Our recent grant of $20,000 to Rachael Briggs (for an introductory paper on TDT) provides an example of how this works:

  1. SI thinks of a paper it wants to exist but doesn't have the resources to write itself (e.g. a clearer presentation of TDT).
  2. SI looks for a few productive academics well-suited to write the paper we have in mind, and approaches them directly with the grant proposal. (Briggs is an excellent choice for the TDT paper because she is a good explainer and has had two of her past decision theory papers selected as among the 10 best papers of the year by The Philosopher's Annual.)
  3. Hopefully, one of these academics says "yes." We award them the grant in return for a certain kind of paper published in one of a pre-specified set of journals. (In the case of the TDT grant to Rachael Briggs, we specified that the final paper must be published in one of the following journals: Philosophers' Imprint, Philosophy and Phenomenological Research, Philosophical Quarterly, Philosophical Studies, Erkenntnis, Theoria, Australasian Journal of Philosophy, Noûs, The Philosophical Review, or Theory and Decision.)
  4. SI gives regular feedback on outline drafts and article drafts prepared by the article author.
  5. Paper gets submitted, revised, and published!

For example, SI could award grants for the following papers:

  • "Objections to CEV," by somebody like David Sobel (his "Full Information Accounts of Well-Being" remains the most significant unanswered attack on ideal-preference theories like CEV).
  • "Counterfactual Mugging," by somebody like Rachael Briggs (here is the original post by Vladimir Nesov).
  • "CEV as a Computational Meta-Ethics," by somebody like Gert-Jan Lokhorst (see his paper "Computational Metaethics").
  • "Non-Bayesian Decision Theory and Normative Uncertainty," by somebody like Martin Peterson (the problem of normative uncertainty is a serious one, and Peterson's approach differs from the one pursued by Nick Bostrom, Toby Ord, and Will Crouch, and also from the one pursued by Andrew Sepielli).
  • "Methods for Long-Term Technological Forecasting," by somebody like Bela Nagy (Nagy is the lead author on one of the best papers in the field).
  • "Convergence to Rational Economic Agency," by somebody like Steve Omohundro (Omohundro's 2007 paper argues that advanced agents will converge toward the rational economic model of decision-making. If true, this would make it easier to predict the convergent instrumental goals of advanced AIs, but his argument, as currently formulated, leaves much to be desired in persuasiveness).
  • "Value Learning," by somebody like Bill Hibbard (Dewey's 2011 paper and Hibbard's 2012 paper make interesting advances on this topic, but there is much more work to be done).
  • "Learning Preferences from Human Behavior," by somebody like Thomas Nielsen (Nielsen's 2004 paper with Finn Jensen described the first computationally tractable algorithms capable of learning a decision maker’s utility function from potentially inconsistent behavior. Their solution was to interpret inconsistent choices as random deviations from an underlying “true” utility function. But the data from neuroeconomics suggest a different solution: interpret inconsistent choices as deviations from an underlying “true” utility function that are produced by non-model-based valuation systems in the brain, and use the latest neuroscientific research to predict when and to what extent model-based choices are being “overruled” by the non-model-based valuation systems).

(These are only examples. I don't necessarily think these particular papers would be good investments.)



I'm curious as to why you chose to target this paper at academic philosophers. Decision theory isn't my focus, but it seems that while the other groups of researchers in this area (mathematicians, computer scientists, economists, etc) talk to one another (at least a little), the philosophers are mostly isolated. The generation of philosophers trained while it was still the center of research in logic and fundamental mathematics is rapidly dying off and, with them, the remaining credibility of such work in philosophy.

Of course, philosophers are the only group that pay any attention to things like Newcomb's problem so, if you were writing for another group, you'd probably have to devote one paper to justifying the importance of the problem. Also, given some of the discussions on here, perhaps the goal is precisely to write this in an area isolated from actual implementation to avoid the risk of misuse (can't find the link, but I recall seeing several comment threads discussing the risks of publishing this at all).

while the other groups of researchers in this area (mathematicians, computer scientists, economists, etc) talk to one another (at least a little), the philosophers are mostly isolated.

A more relevant question is whether mathematicians, CS folks and economists would talk to one another about foundational issues in decision theory. It appears that this subtopic has mostly been classified under academic philosophy, which explains why other DT researchers would pay little attention to Newcomb's problem. (It's true that other DT researchers have considered the related problem of making credible precommitments, and this literature should probably be cited in any introduction to UDT/TDT.) A good review of the one journal paper would suffice to raise interest among folks dealing with the more applied kind of DT.


What if the researchers reach conclusions you don't expect, or disagree with? Do you have a plan for what happens to the money if, after a few months of working on it, Briggs informs you she no longer believes the TDT ideas are workable?

It's complicated, but here's one thought...

Notice that one of my example papers was a paper of objections to CEV. Right now we're at the stage of making the arguments and concepts in play formalized enough that they can be defended or attacked rigorously. If somebody formalizes and clarifies an argument well enough to properly attack, they've done at least half our work for us.


Have you looked into how other private agencies (Sloan, Templeton, Pioneer) go about purchasing research? This seems like a new model to me, and might be more fraught than you think.

Also, there are standard euphemisms such as "supporting" and "funding" rather than "purchasing."

Changed to "Funding" in the title.

Seems quite reasonable. But I don't have a clear picture of your general strategy. Do you have a path (read: a likely conjunction of paths) to getting a world-class mathematician to take an interest in forming a new decision theory? Talking about the details of CEV seems premature to me if we don't know that certain kinds of extrapolation are theoretically possible.

We have many plans, as this is something we strategize about a lot. I do actually plan to write up an explanation of more of our plans within the next month.

informs you she no longer believes the TDT ideas are workable?

If anything, that would be the most bang for the buck. There is a non-zero chance that TDT is a blind alley, and if so, discovering it early would save SI's mission of preventing x-risk from AGI lots of time and money. Litany of Tarski and such.


If the folks at SIAI respect Briggs enough to reason "well, if she can't do it, no one can" and move on to the next thing, that could be bang for buck.

If the folks at SIAI respect Briggs enough to reason "well, if she can't do it, no one can" and move on to the next thing, that could be bang for buck.

That seems incredibly unlikely. There's a rather huge prior improbability of the latter that needs to be overcome.


Yes, if their respect were misplaced, it would not be bang for buck.

It costs twenty thousand dollars to get someone to write an example of this sort of paper? How long is it?

The TDT paper is particularly difficult to write, and will require many months of work even from someone as skilled as Rachael. It's hard to figure out which points to emphasize and how to present the ideas, and then it's doubly hard to figure out how to squeeze 200 pages worth of material into the 20-30 pages that is suitable for journal publication.

OK, so now I want to know how you got such a bargain!

Academics make peanuts, relatively speaking.

Tenured professors often make less than $100k a year. Adjunct professors tend to make much less than $20k per year per school they're teaching at.

If the aim is to get peer-reviewed journal articles for the sake of it, then $20,000 is a lot of money. If the aim is to get peer-reviewed journal articles to raise the profile of FAI issues or to raise the reputation of SI (with philosophers), then I suggest Rachael Briggs is worth the money: she's very well respected in academic philosophy circles, and a single paper from her will garner far more attention than (I suspect, literally) dozens of papers from other people.

The paper I wrote for the DoD this last year cost them $2,000 a page (only about a quarter went to me), and so that seems in line with general academic funding.
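The per-page comparison can be made explicit with a rough sketch. It assumes the finished TDT paper lands in the 20-30 page range mentioned earlier in the thread and that the whole $20,000 grant is attributed to it; the variable names are illustrative only.

```python
GRANT = 20_000             # grant size in dollars (from the post)
PAGE_RANGE = (20, 30)      # journal-suitable length quoted in the thread
DOD_COST_PER_PAGE = 2_000  # figure quoted in the comment above

# Fewer pages means a higher cost per page, so the short end gives the upper bound.
cost_per_page_high = GRANT / PAGE_RANGE[0]
cost_per_page_low = GRANT / PAGE_RANGE[1]

print(f"SI grant: ${cost_per_page_low:,.2f} to ${cost_per_page_high:,.2f} per page")
print(f"DoD paper: ${DOD_COST_PER_PAGE:,.2f} per page")
```

Under these assumptions the grant works out to less per page than the DoD figure, which is consistent with the claim that it is within the normal range.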


Why a page?

That is, why am I measuring their expenditure in terms of pages, rather than months? Because Alicorn asked how long the paper was, and I couldn't help but do that calculation when I saw the grant proposal.


Exactly. I thought you meant they actually paid per page, something I have heard is customary with course books in the States. I have always found it rather strange.

If her consulting rates are like mine, this will buy about 12 days of her time.

One day of your time would pay for one month of my life. Funny how much that hurts, even though I already know there are trillions of dollars zipping back and forth in the world. I can only hope that I cross the threshold of fundability as soon as possible.

If it helps, that doesn't reflect what I'm paid - I'm a salaried employee of a consultancy.

That is more in line with what I imagine to be the socioeconomic level of the regulars on this site.

And I guess what you said wasn't in itself "hurtful", it just forcibly reminded me of the evil aspect of my own situation, which (optimistically expressed) is the terribly long time it's taking to get into a position where I can really start to act.

At that rate, that's an annual salary of over $600,000. No way a pre-tenure academic philosopher is that expensive!

I get (52⋅5−8−20)⋅($20000/12) ≈ $386666.67 - this of course is an absolute upper bound on how much I can make for my employer, not an estimate. But agreed, chargeout rates for philosophers may differ.
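Both annual figures in this exchange can be reproduced from the "12 days for $20,000" day rate. The sketch below is only a check of the arithmetic: it assumes the "over $600,000" figure bills every calendar day, and that the 8 and 20 in the formula are days off (for example public holidays and leave), which the comment does not state.

```python
GRANT = 20_000    # grant size in dollars (from the post)
DAYS_BOUGHT = 12  # days of consulting time the grant would buy (from the comment)

day_rate = GRANT / DAYS_BOUGHT

# Assumption: billing all 365 calendar days gives the "over $600,000" estimate.
calendar_year_total = day_rate * 365

# Formula from the comment: 52 weeks of 5 days, minus 8 and 20 days off.
working_days = 52 * 5 - 8 - 20
working_year_total = day_rate * working_days

print(f"day rate:              ${day_rate:,.2f}")
print(f"365 billable days:     ${calendar_year_total:,.2f}")
print(f"{working_days} billable days:     ${working_year_total:,.2f}")
```

The gap between the two totals comes entirely from how many billable days are assumed per year, not from the day rate.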

Here is a comparison with a real funding agency, American Educational Research Association:

Awards: Awards for Research Grants are up to $20,000 for 1-year projects, or up to $35,000 for 2-year projects. In accordance with AERA's agreement with the funding agencies, institutions may not charge indirect costs or overhead on these awards. Approximately 15 Research Grants will be awarded per year.

Reporting requirements: All Research Grantees will be required to submit a brief (3-6 pages) progress report mid-way through the grant period. A final report will be submitted at the end of the grant period. The final report should be an article based on the proposed research and of the quality and in the format for submission to a journal for publication.

Note that they do not even require a publication, only a submission.

Hang on, is this document saying that they expect to be able to completely cover the salary costs and expenses of a full-time academic, for two years, for $35,000? That can't be right. Can anyone help?

Academics apply for multiple grants, and their salary is partly supported by each grant (and also by a salary they draw from the University they are affiliated with for teaching classes). Universities also "tax" grants at 50%+ rates.

For example, it is not uncommon to be supported by 5-6 grants each covering a bit less than 20% of your salary. The rough idea is if a grant covers X% of your salary, you will generally spend X% of your time working on the research covered by this grant.
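As a rough illustration of that rule of thumb, here is a sketch with made-up numbers: five grants each covering 19% of salary, a hypothetical $90,000 salary, and a 40-hour week. None of these figures come from a real budget, and university overhead is left out.

```python
# Hypothetical figures for illustration only.
ANNUAL_SALARY = 90_000     # assumed salary in dollars
HOURS_PER_WEEK = 40        # assumed working week
grant_shares = [0.19] * 5  # five grants, each a bit less than 20% of salary

covered = sum(grant_shares)
teaching_share = 1 - covered  # remainder drawn from the university for teaching

for i, share in enumerate(grant_shares, start=1):
    # A grant covering X% of salary buys roughly X% of the researcher's time.
    print(f"grant {i}: ${share * ANNUAL_SALARY:,.0f} of salary, "
          f"about {share * HOURS_PER_WEEK:.1f} h/week")
print(f"teaching salary covers the remaining {teaching_share:.0%}")
```

On this picture a single small grant buys only a slice of an academic's year, not the whole of it.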

OK, thank you! So now I really can't work out what this tells us about the SI grant.

Seconding the recommendation of Alicorn for any job that involves writing.

I wasn't proposing myself. Although depending on how much of the job is "research" and how much of it is "writing", and whether twenty thousand dollars was a typo, maybe I should. (But I don't have publishing credits outside of an essay in a "Pop Culture and Philosophy" book for a popular audience so maybe I wouldn't do regardless of ability to produce a finished product.)

Dear Luke,

I applaud your efforts to try to fund mainstream research. I have one reservation, namely that your requirement that the research be published in a particular list of journals is quite strong. Unless the person in question has a strong publication track record in these kinds of journals (something regular grant agencies also look for) it is very difficult to guarantee that the paper will in fact pass peer review in these journals.

Unless the person in question has a strong publication track record in these kinds of journals (something regular grant agencies also look for) it is very difficult to guarantee that the paper will in fact pass peer review in these journals.

The use of a high status name and academic affiliations sufficient to get accepted by a target journal is quite possibly the most valuable element of the service being purchased.

As well, when it will be published could be a big deal. A paper spending years in the review process is not all that unlikely, and so if the writer is only paid when it's actually accepted for publication, that could mean the money spends a long time in limbo.

Well, she agreed to the deal and is as much an academic as either you or Ilya, so either the deal has details not mentioned which make it seem less harsh, or she believes it's not so hard or slow as you do.

the deal has details not mentioned which make it seem less harsh

Very likely.

she believes it's not so hard or slow as you do

I am amused by the possibility that accepting a grant to do decision theory work is itself a poor decision.


One imagines Briggs's mental monologue: "maybe I should calculate the EV of this paper... Come on, Rachael, this is serious!"

In the side-by-side comparison, I'd prioritize ensuring that there is a maintained, coherent place to start to understand SI's case, and do this work second. This work could attract valuable attention, but little of that attention will turn into donations, recruitment etc without the first part.

Good strategy!

Seems to me that a better thing to do would be to (a) pick the topic that you want, (b) pick the dollar amount you are willing to spend on some research on that topic, and (c) advertise a paper prize funded at that amount.

Then you get potentially a lot of good research on the topic from lots of different perspectives.

Even better if you can find either a sympathetic editor of a good technical journal that does special editions (I would have suggested Synthese, but some recent political nonsense has probably decreased their value) or make a deal with an academic press to publish the best eight or ten papers together. Alternatively, you give out the award and then release the copyright back to the authors with some comments and encouragement to publish in visible journals.

You might do better with one very good researcher that you choose rather than getting contributions from whoever is willing to work on spec.

Yes, that is possible. I suspect, though, that many good, motivated researchers would be overlooked by more targeted approaches. Partly that is just because targeting one researcher necessarily excludes all other researchers. Partly it is because not every good, interested researcher has a lot of name-recognition. Also, a paper prize need not target a single academic discipline, and since the topics at issue are interdisciplinary, advertising a paper prize across philosophy, computer science, statistics, economics, psychology, etc. seems like a good idea to me.

But perhaps a better answer is to do a bit of both? SI could take a $20k investment and split it to do an experimental comparison. I, for one, would like to see an experimental comparison of the quality and quantity of research produced by a $10k focused grant and a $10k paper prize.