Software Engineer at Ought
Someone who is near the top of the leaderboard is both accurate and highly experienced
I think this unfortunately isn't true right now, and just copying the community prediction would place very highly (I'm guessing if made as soon as the community prediction appeared and updated every day, easily top 3 (edit: top 10)). See my comment below for more details.
You can look at someone's track record in detail, but we're also planning to roll out more ways to compare people with each other.
I'm very glad to hear this. I really enjoy Metaculus, but my main gripe with it has always been (as others have pointed out) the lack of a way to distinguish between quality and quantity. I'm looking forward to a more comprehensive selection of metrics to help with this!
If the user is interested in getting into the top ranks, this strategy won't be anything like enough.
I think this isn't true empirically for a reasonable interpretation of top ranks. For example, I'm ranked 5th on questions that have resolved in the past 3 months due to predicting on almost every question.
Looking at my track record, for questions resolved in the last 3 months, evaluated at all times, here's how my log score looks compared to the community:
So if anything, I've done a bit worse than the community overall, and am in 5th by virtue of predicting on all questions. It's likely that the predictors significantly in front of me are that far ahead in part due to having predicted on (a) questions that have resolved recently but closed before I was active and (b) a longer portion of the lifespan for questions that were open before I became active.
I discovered that the question set changes when I evaluate at "resolve time" and filter for the past 3 months; I'm not sure exactly why. Numbers at resolve time:
I think this weakens my case substantially, though I still think a bot that just predicts the community as soon as it becomes visible and updates every day would currently be at least top 10.
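To illustrate the quality-versus-quantity point, here is a minimal sketch using a generic binary log score measured against a 50% baseline. This is not Metaculus's actual points formula, and the two forecasters and all of their forecasts are made-up data chosen only to show how a summed score can favor volume over accuracy.

```python
import math

def log_score(p_yes: float, outcome: bool) -> float:
    """Generic binary log score in bits, relative to a 50% baseline.

    Positive means better than always answering 50%; negative means worse.
    (Assumption: an illustrative rule, not Metaculus's points formula.)
    """
    p_assigned = p_yes if outcome else 1.0 - p_yes
    return math.log2(p_assigned / 0.5)

def summarize(forecasts):
    """Return the question count, summed score, and mean score."""
    scores = [log_score(p, outcome) for p, outcome in forecasts]
    return {"n": len(scores), "total": sum(scores), "mean": sum(scores) / len(scores)}

# Hypothetical data: (probability assigned to "yes", actual outcome).
# A prolific forecaster: 50 questions, moderately confident, right 80% of the time.
prolific = [(0.7, True)] * 40 + [(0.7, False)] * 10
# A selective forecaster: 10 questions, more confident, right 90% of the time.
selective = [(0.9, True)] * 9 + [(0.9, False)] * 1

prolific_summary = summarize(prolific)
selective_summary = summarize(selective)

print("prolific: ", prolific_summary)
print("selective:", selective_summary)
```

With these made-up numbers, the prolific forecaster has the higher total but the lower mean per question, so a leaderboard that ranks by summed score would place them above the more accurate forecaster.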
Anything much worse than that, yes, people could have negative overall scores - which, if they've predicted on a decent number of questions, is pretty strong evidence that they really suck at forecasting
I agree that this should have some effect of being less welcoming to newcomers, but I'm curious to what extent. I have seen plenty of people with worse Brier scores than the median continuing to predict on GJO rather than being demoralized and quitting (disclaimer: survivorship bias).
There's also a Metaculus question about this:
It looks like people can change their predictions after they initially submit them. Is this history recorded somewhere, or just the current distribution?
We do store the history. You can view it by going to https://elicit.org/binary then searching for the question, e.g. https://elicit.org/binary?binaryQuestions.search=Will%20there%20be%20more%20than%2050. Although as noted by Oli, we currently only display predictions that haven't been withdrawn.
Is there an option to have people "lock in" their answer? (Maybe they can still edit/delete for a short time after they submit or before a cutoff date/time)
Not planning on supporting this on our end in the near future, but could be a cool feature down the line.
Is there a way to see in one place all the predictions I've submitted an answer to?
As of right now, not if you make the predictions via LW. You can view questions that you've submitted a prediction on via Elicit at https://elicit.org/binary?binaryQuestions.hasPredicted=true if you're logged in, and we're working on allowing for account linking so your LW predictions would show up in the same place.
The first version of account linking will involve contacting someone at Ought, after which we'll manually run a script.
Edit: the first version of account linking is ready, email firstname.lastname@example.org with your LW username and Elicit email and I can link them.
Epistemic status: extremely uncertain
I created my Elicit forecast by:
[I work for Ought.]
I must admit I haven't followed the discussions you're referring to but if I were to spend more time forecasting this question I would look into them.
I didn't include effects of COVID in my forecast as it looks like the Zillow Home Value Index for Seattle has remained relatively steady since March (2% drop). I'm skeptical that there are likely to be large effects from COVID in the future when there hasn't been a large effect from COVID thus far.
A few reasons I could be wrong:
My forecast is based on:
I don't have a background in quantum computing, so there's a chance I'm misinterpreting the question in some way, but I learned a lot doing the research for the forecast (like that there's a lot of controversy regarding whether quantum supremacy has been achieved yet).
Amusingly, during my research I stumbled upon this Metaculus question about when a >49 qubit quantum computer would be created which resolved ambiguously due to the issue of how well-controlled the qubits are. For the purposes of this forecast I assumed it would resolve based on the raw number of qubits, without adjusting for control.
My forecast is based on historical data from Zillow. I explained my reasoning in the notes. The summary is that housing prices haven't changed very much in Seattle since April 2019 (on the whole they've risen 1%). On the other hand, prices in more expensive areas have stayed the same or declined slightly. I settled on a boring median of the price staying the same. Due to how stable the prices have been recently, I think most of the variation will come from the individual house and which neighborhood it's in, with an outside chance of large Seattle home value fluctuations.
I think it's >1% likely that one of the first few surveys Rohin conducted would result in a fraction of >0.5.
Evidence from When Will AI Exceed Human Performance?, in the form of median survey responses of researchers who published at ICML and NIPS in 2015:
These seem like fairly safe lower bounds compared to the population of researchers Rohin would evaluate, since concern regarding safety has increased since 2015 and the survey included all AI researchers rather than only those whose work is related to AGI.
These responses are more directly related to the answer to Question 3 ("Does X agree that there is at least one concern such that we have not yet solved it and we should not build superintelligent AGI until we do solve it?") than Question 2 ("Does X broadly understand the main concerns of the safety community?"). I feel very uncertain about the percentage that would pass Question 2, but think it is more likely to be the "bottleneck" than Question 3.
Given these considerations, I increased the probability before 2023 to 10%, with 8% below the lower bound. I moved the median | not never up to 2035 as a higher probability pretty soon also means a sooner median. I decreased the probability of “never” to 20%, since the “not enough people update on it / consensus building takes forever / the population I chose just doesn't pay attention to safety for some reason” condition seems not as likely.
I also added an extra bin to ensure that the probability continues to decrease on the right side of the distribution.
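As a rough check on how the "never" mass interacts with the conditional median, here is a small sketch. It assumes the 10% before 2023 is an unconditional probability, and the function name is made up for illustration.

```python
def unconditional_mass_at_conditional_quantile(quantile: float, p_never: float) -> float:
    """Unconditional cumulative probability that must fall at or before the date
    which is the given quantile of the distribution conditional on 'not never'."""
    return quantile * (1.0 - p_never)

p_never = 0.20        # probability assigned to "never"
p_before_2023 = 0.10  # assumed to be unconditional

# A median | not never of 2035 means this much total probability lies at or before 2035.
mass_by_2035 = unconditional_mass_at_conditional_quantile(0.5, p_never)

# The same 10% expressed conditional on the event happening at all.
p_before_2023_given_not_never = p_before_2023 / (1.0 - p_never)

print(f"Unconditional P(by 2035):      {mass_by_2035:.3f}")
print(f"P(before 2023 | not never):    {p_before_2023_given_not_never:.3f}")
```

Under these assumptions, 40% of the total probability lies at or before 2035, and the 10% before 2023 corresponds to 12.5% conditional on "not never".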
Note: I'm interning at Ought and thus am ineligible for prizes.