tenthkrige — LessWrong

Good points well made. I'm not sure what you mean by "my expected log score is maximized" (and would like to know), but in any case it's probably your average world rather than your median world that does it?

Are language models good at making predictions?

tenthkrige2y32

Very interesting!

From eyeballing the graphs, it looks like the average Brier score is barely below 0.25. This indicates that GPT-4 is better than a dart-throwing monkey (i.e. predicting a random %age, score of 0.33), and barely better than chance (always predicting 50%, score of 0.25).

It would be interesting to see the decompositions for those two naive strategies for that set of questions, and compare to the sub-scores GPT-4 got.

You could also check if GPT-4 is significantly better than chance.
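The two naive baselines and the "better than chance" check can be sketched as follows. This is a minimal illustration rather than the post's analysis: the outcomes are simulated coin flips, not the actual question set, and the function names (`brier_terms`, `brier_score`, `z_against_always_half`) are made up for this example. The z statistic also assumes questions are independent, which may not hold for real forecasting questions.

```python
import math
import random


def brier_terms(forecasts, outcomes):
    """Per-question squared error between forecast probability and 0/1 outcome."""
    return [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]


def brier_score(forecasts, outcomes):
    """Mean of the per-question squared errors (lower is better)."""
    terms = brier_terms(forecasts, outcomes)
    return sum(terms) / len(terms)


def z_against_always_half(forecasts, outcomes):
    """z statistic of the mean per-question score against the constant 0.25
    earned by always predicting 50%. Negative means better than that baseline."""
    diffs = [t - 0.25 for t in brier_terms(forecasts, outcomes)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        raise ValueError("per-question scores are constant; z is undefined")
    return mean / math.sqrt(var / n)


rng = random.Random(0)
n = 100_000
# Hypothetical resolutions: simulated fair coin flips, NOT the post's questions.
outcomes = [rng.randint(0, 1) for _ in range(n)]

# Strategy 1: always predict 50%. Every term is 0.25 regardless of outcome.
always_half = brier_score([0.5] * n, outcomes)

# Strategy 2: the "dart-throwing monkey", a uniformly random probability.
monkey_forecasts = [rng.random() for _ in range(n)]
monkey = brier_score(monkey_forecasts, outcomes)
monkey_z = z_against_always_half(monkey_forecasts, outcomes)

print(f"always 50%: {always_half:.3f}")
print(f"uniform random: {monkey:.3f}")
print(f"z of uniform random vs always 50%: {monkey_z:.1f}")
```

The 0.33 figure is the expected value of a squared uniform draw, which is 1/3 whichever way the question resolves. To test a real forecaster, one would pass its forecasts and the actual resolutions to `z_against_always_half` and look for a clearly negative value.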

A proposed method for forecasting transformative AI

tenthkrige3y28

dieontrasted

Typo?

Metaculus and medians

tenthkrige3y20

people who are focused on providing—and incentivized to provide—estimates of the expected number of cases

Can you say more about this? Would users forecast a single number? Would they get scored on how close their number is to the actual number? Could they give confidence intervals?

Assigning probabilities to metaphysical ideas

tenthkrige4y10

I think that's how I'd use this as well.

Assigning probabilities to metaphysical ideas

tenthkrige4y20

I don't think that solves the problem though. There are a lot of people, and many of them believe very unlikely models. Any model we (LessWrong-ish people) spend time discussing is going to be vastly more likely than a randomly selected human-thought-about model. I realise this is getting close to reference class tennis, sorry.

Assigning probabilities to metaphysical ideas

tenthkrige4y20

Cool idea. Any model we actually spend time talking about is going to be vastly above the base rate, though, because most human-considered models are nonsensical or very unlikely.

Values Form a Shifting Landscape (and why you might care)

tenthkrige4y10

At first I was dubious about the framing of a "shifting" n-dimensional landscape, because in a sense the landscape is fixed in 2n dimensions (I think?), but you've convinced me this is a useful tool to think about/discuss these issues. Thanks for writing this!

Ethics in Many Worlds

tenthkrige5y10

Epistemic status: gross over-simplification, and based on what I remember from reading this 6 months ago.

This paper resolved many questions I still had about MWI. Relevantly here, I think it argues that the number of worlds doesn't grow because there was already an infinity of them through space.

Observing an experiment is then equivalent to locating yourself in space. Worlds splitting is the process where identical regions of the universe become different.

Covid 2/11: As Expected

tenthkrige5y170

The scoring system incentivizes predicting your true credence (gory details here).

I think Metaculus rewarding participation is one of the reasons it has participation. Metaculus can discriminate good predictors from bad predictors because it has their track record (I agree this is not the same as discriminating good/bad predictions). This info is incorporated in the Metaculus prediction, which is hidden by default, but which you can unlock with on-site fake currency.
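To illustrate what "incentivizes predicting your true credence" means, here is a minimal sketch using a plain logarithmic scoring rule. This is an assumption made for illustration: it is not Metaculus's actual scoring formula, and the names `expected_log_score`, `credence`, and `best_report` are hypothetical.

```python
import math


def expected_log_score(report, credence):
    """Expected log score from reporting `report` when you believe the event
    happens with probability `credence`."""
    return credence * math.log(report) + (1 - credence) * math.log(1 - report)


credence = 0.7  # hypothetical true belief
# Candidate reports from 1% to 99% in 1% steps.
grid = [i / 100 for i in range(1, 100)]
best_report = max(grid, key=lambda r: expected_log_score(r, credence))

print(f"true credence: {credence}, best report on the grid: {best_report}")
```

Under this rule, shading the report above or below your credence lowers the expected score, which is the sense in which a scoring rule is called proper.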
