Impactful Forecasting Prize for forecast writeups on curated Metaculus questions

elifland; sam_atis; yagudin

cross-posted to EA Forum

TLDR

To encourage more people to forecast on impactful questions and write up their reasoning, we’re giving out $4,000 to the best forecast writeups on these Metaculus questions, submitted via this form by March 11.

Motivation

We believe that forecasting on impactful questions is a great candidate for an activity more EA-interested people should partake in, given that it:

Provides direct value when done on decision-relevant questions.
Improves and demonstrates good judgment and research skills.
Can be fun: provides a concrete and gamified framing for activities that look similar to research.
Leads to generally learning more about the world.
Helps match up bright, like-minded collaborators.

This is informed by personal experience: we have learned a lot and found great collaborators through forecasting.

However, it’s difficult to start doing impactful forecasting right now. Metaculus has lots of questions, most of which aren’t selected for impact, so it can be overwhelming to navigate and find questions that are both impactful and interesting. It may also be difficult to know where to start in an analysis. Additionally, it can be scary to share your thoughts publicly without a push, and if your reasoning is good, the incentives are usually against you.^[1]

How to participate

We curated 25 Metaculus questions for prediction in this Airtable. Either Eli Lifland or Misha Yagudin has forecasted on each of these to provide a starting point for further analysis. The table has cause area tags to allow filtering for interest and expertise.

To participate, do the following by March 11, 2022, 11:59 PM Anywhere on Earth:

Make a forecast on one of the selected questions.
Write up the reasoning for your forecast, on Metaculus or elsewhere (e.g. a forum/blog post). We require you to share your reasoning publicly, unless there are infohazards, in which case we will consider private submissions.
Fill out this form with a link to your writeup and contact info; anonymous writeups and email addresses are okay.

You may submit up to one writeup per question; if you submit multiple, your final one will be used. You may submit entries for as many questions as you please.

We will host meetups in this gather.town space to chat about the contest and facilitate forecasting on the questions together. The meetups will be on Wed Feb 16 and Wed Mar 2, 6:30 - 8:30 PM UTC. See this ICS file for calendar events.
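For anyone converting the submission deadline to their own time zone: "Anywhere on Earth" (AoE) corresponds to UTC-12, so the deadline falls on March 12 in UTC. The snippet below is a minimal sketch of that conversion; the names `AOE`, `DEADLINE_AOE`, and `is_before_deadline` are ours for illustration and are not part of the contest's tooling.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" is the UTC-12 offset.
AOE = timezone(timedelta(hours=-12), name="AoE")

# Deadline stated in the post: March 11, 2022, 11:59 PM AoE.
DEADLINE_AOE = datetime(2022, 3, 11, 23, 59, tzinfo=AOE)
DEADLINE_UTC = DEADLINE_AOE.astimezone(timezone.utc)


def is_before_deadline(submitted_at: datetime) -> bool:
    """Return True if a timezone-aware timestamp is at or before the deadline."""
    if submitted_at.tzinfo is None:
        raise ValueError("submitted_at must be timezone-aware")
    return submitted_at <= DEADLINE_AOE


print(DEADLINE_UTC.isoformat())  # 2022-03-12T11:59:00+00:00
```

To see the deadline in your own zone, call `DEADLINE_AOE.astimezone()` with no arguments, which uses the system's local time zone.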

Prize details

We will distribute a total of $4,000 to the best forecast writeups of questions in the curated Airtable. Eli and Misha will judge which writeups are most valuable. Sam Glover may help with preliminary evaluations. The first prize will be a maximum of $2,000 and we will distribute prizes for a maximum of 15 writeups.

We will rate each analysis individually, and there is no limit on the number of prizes a participant can win. Collaborative writeups are allowed and encouraged: collaborators should select one person as the contact and may distribute the prize among themselves as they please.

The primary criterion we will use for judging entries is how much each one changes our minds relative to our initial forecasts. In that sense, this can be viewed as a large forecasting amplification experiment.

The following caveats apply:

We may adjust based on a subjective notion of how strongly held our previous views were, such that changing a strongly held view is rewarded more than changing a weakly held view.
Writeups that are less counterfactually valuable, e.g. widespread news updates that would be shared anyway, will be less likely to receive prizes.
We may reward clarity and conciseness, and reserve the right to skim especially long writeups.
We will attempt to avoid double counting similar insights across writeups on multiple questions, e.g. across the QuALITY questions resolving in 2025 and 2040.
If multiple entrants share very similar arguments, we will give more credit to the one shared first.
We reserve the right to diverge from the main criterion in other ways that we think will lead to the fairest distribution of prizes.

You may also submit writeups on these questions written before this post for retroactive prizes, though we expect to not award many of these as we chose the questions in part due to neglectedness of contributions. We will review entries and announce the winners by March 31.

Who should participate

If you’re reading this and feel interested to any extent, we’d encourage you to browse the Airtable and try forecasting on at least 1 question.

That being said, to give examples of our target audience:

College students interested in decision making with plenty of free time.
Professionals with experience related to one of the highlighted questions.
Aspiring generalist researchers at any stage in their career.

Question selection

We selected 50 candidate questions by browsing Metaculus, then subjectively rated questions on 4 dimensions from 1-5:

Decision importance: The importance of the decisions which will be affected by this question. This combines the importance of the cause area with the question's importance within that cause area.
Decision relevance: How much of an impact would this have on actual decisions if the forecast changed by a substantial amount? This factor is re-used from Nuño Sempere’s An estimate of the value of Metaculus questions.
Ease of contribution: How easy will it be for a "median generalist forecaster" to make a contribution to the analysis on this question within a few hours? For example, questions requiring lots of domain expertise or background reading would score low here.
Neglectedness of contributions: How few contributions have there been on this specific question so far? How in need of attention is it? This should be subjectively evaluated using the existing count of forecasts and quantity + quality of comments/writeups.^[2]

We calculated a curation score, giving decision importance twice the weight of the other three factors because it felt like the most important one.^[3] We chose a set of 25 questions based mainly on the curation score, while also aiming for a diversity of cause areas and question types.
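As a rough illustration of this kind of scoring, here is a minimal sketch. It assumes the score is a weighted average of the four 1-5 ratings, with decision importance weighted twice as heavily as each other factor. The exact formula we used is not spelled out above, so the normalization, the function names, and the example questions and ratings below are all hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed weights: decision importance counts double, the rest count once.
WEIGHTS: Dict[str, float] = {
    "decision_importance": 2.0,
    "decision_relevance": 1.0,
    "ease_of_contribution": 1.0,
    "neglectedness": 1.0,
}


def curation_score(ratings: Dict[str, int]) -> float:
    """Weighted average of the four 1-5 ratings (assumed formula)."""
    for name in WEIGHTS:
        if not 1 <= ratings[name] <= 5:
            raise ValueError(f"{name} must be rated from 1 to 5")
    weighted_sum = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
    return weighted_sum / sum(WEIGHTS.values())


def top_questions(
    candidates: Dict[str, Dict[str, int]], k: int
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring questions, best first."""
    scored = [(title, curation_score(r)) for title, r in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical candidates with made-up ratings, for illustration only.
example = {
    "Question A": {"decision_importance": 5, "decision_relevance": 3,
                   "ease_of_contribution": 4, "neglectedness": 2},
    "Question B": {"decision_importance": 2, "decision_relevance": 5,
                   "ease_of_contribution": 5, "neglectedness": 5},
}
print(top_questions(example, k=1))
```

Under these assumed weights, Question A scores (2·5 + 3 + 4 + 2) / 5 = 3.8 and Question B scores (2·2 + 5 + 5 + 5) / 5 = 3.8 as well, which shows how the doubled weight lets a high decision-importance rating offset weaker ratings elsewhere. Dividing by the total weight only rescales the score back to the 1-5 range; it does not change the ranking.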

We also included 4 questions we wrote ourselves: these two on QuALITY performance in 3 and 18 years for informing AI strategy and these two on climate change.

We may add a few impactful questions to the curated Airtable in the next week or so. If so, we will write a comment announcing their addition.

Acknowledgments

Thanks to Nuño Sempere for reviewing an earlier version of the curated questions and selection criteria and to Jehan Azad for feedback on this post. All mistakes are our own.

[1] See also Bottlenecks to more impactful crowd forecasting.
[2] A more thorough evaluation could also investigate neglectedness of similar questions and the topic area generally, rather than just the Metaculus question considered.
[3] A more thorough selection framework might be better mathematically motivated, like the ITN framework.
