Using prediction markets to generate LessWrong posts

Given the success of this experiment, we should propose a modified version of futarchy where laws are similarly written letter by letter!

elifland's Shortform

Thanks, I agree with this and it's probably not good branding anyway. 

I was thinking the "challenge" was just doing the intervention (e.g. being vegan), but agree that the framing is confusing since it refers to something different in the clinical context. I will edit my shortforms to reflect this updated view.

elifland's Shortform

[crossposted from EA Forum]

Reflecting a little on my shortform from a few years ago, I think I wasn't ambitious enough in trying to actually move this forward.

I want there to be an org that does "human challenge"-style RCTs across lots of important questions that are extremely hard to get at otherwise, including (top 2 are repeated from previous shortform):

  1. Health effects of veganism
  2. Health effects of restricting sleep
  3. Productivity of remote vs. in-person work
  4. Productivity effects of blocking out focused/deep work

Edited to add: I no longer think "human challenge" is really the best way to refer to this idea (see comment that convinced me); I mean to say something like "large scale RCTs of important things on volunteers who sign up on an app to randomly try or not try an intervention." I'm open to suggestions on succinct ways to refer to this.

I'd be very excited about such an org existing. I think it could even grow to become an effective megaproject, pending further analysis on how much it could increase wisdom relative to power. But, I don't think it's a good personal fit for me to found given my current interests and skills. 

However, I think I could plausibly provide some useful advice/help to anyone who is interested in founding a many-domain human-challenge org. If you are interested in founding such an org or know someone who might be and want my advice, let me know. (I will also be linking this shortform to some people who might be able to help set this up.)


Some further inspiration I'm drawing on to be excited about this org:

  1. Freakonomics' RCT on measuring the effects of big life changes like quitting your job or breaking up with your partner. This makes me optimistic about the feasibility of getting lots of people to sign up.
  2. Holden's note on doing these types of experiments with digital people. He mentions some difficulties with running these types of RCTs today, but I think an org specializing in them could help.

Votes/considerations on why this is a good or bad idea are also appreciated!

elifland's Shortform

(epistemic status: exploratory)

I think more LessWrong readers in high school or college should consider trying Battlecode. It's somewhat similar to The Darwin Game, which was pretty popular on here, and I think the type of people who like LessWrong will generally both enjoy and be good at Battlecode. (Edited to add: a short description of Battlecode is that you write a bot to beat other bots at a turn-based strategy game. Each unit executes its own code, so communication/coordination is often one of the most interesting parts.)

I did it with friends for 6 years (junior year of high school - end of undergrad), and I think it at least helped me gain legible expertise in strategizing and coding quickly, but plausibly also helped me pick up skills in these areas as well as teamwork.

If any students are interested (I believe PhD students can qualify as well, but it may not be worth their time), there are still 2-3 weeks left in this year's game, which is plenty of time. If you're curious to learn more about my experiences with Battlecode, see the README and postmortem here.

Feel free to comment or DM me if you have any questions.

Conversation on technology forecasting and gradualism

Your prior is for discontinuities throughout the entire development of a technology, so shouldn't your prior be for discontinuity at any point during the development of AI, rather than discontinuity at or around the specific point when AI becomes AGI? It seems this would be much lower, though we could then adjust upward based on the particulars of why we think a discontinuity is more likely at AGI.

Forecasting Thread: AI Timelines

Holden Karnofsky wrote on Cold Takes:

I estimate that there is more than a 10% chance we'll see transformative AI within 15 years (by 2036); a ~50% chance we'll see it within 40 years (by 2060); and a ~2/3 chance we'll see it this century (by 2100).

I copied these bins to create Holden's approximate forecasted distribution (note that Holden's forecast is for Transformative AI rather than human-level AGI):

Compared to the upvote-weighted mixture in the OP, it puts more probability on longer timelines, with a median of 2060 vs. 2047 and 1/3 vs. 1/5 on after 2100. Holden gives a 10% chance by 2036 while the mixture gives approximately 30%. Snapshot is here.
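
To make the bins concrete, here is a minimal sketch of one way to turn the three quoted numbers into an approximate cumulative distribution. It treats "more than 10%" as exactly 10%, assumes zero probability before 2021, and interpolates linearly between anchors; all three are simplifying assumptions of this sketch and not necessarily how the snapshot was built.

```python
from bisect import bisect_right

# (year, cumulative probability) anchors from the quoted forecast.
# The (2021, 0.0) starting point is an assumption made for illustration.
ANCHORS = [(2021, 0.0), (2036, 0.10), (2060, 0.50), (2100, 2 / 3)]


def cdf(year):
    """Approximate P(transformative AI by `year`) via linear interpolation."""
    if year <= ANCHORS[0][0]:
        return 0.0
    if year >= ANCHORS[-1][0]:
        # The quoted forecast says nothing about how the mass after 2100 is spread.
        return ANCHORS[-1][1]
    years = [y for y, _ in ANCHORS]
    i = bisect_right(years, year) - 1
    (y0, p0), (y1, p1) = ANCHORS[i], ANCHORS[i + 1]
    return p0 + (p1 - p0) * (year - y0) / (y1 - y0)


# Probability left for "after 2100" under these anchors.
after_2100 = 1 - cdf(2100)
```

Under these assumptions `cdf(2060)` is 0.5, matching the stated median, and `after_2100` is 1/3.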

What will be the aftermath of the US intelligence lab leak report?

It's very likely that when the US intelligence community reports on 25 August on their data about the origins of COVID-19, they will conclude that it was a lab leak.

Are you open to betting on this? GJOpen community is at 9% that the report will conclude that lab leak is more likely than not, I’m at 12%.

In particular, my actual credence in lab leak is higher (~45%) but I’m guessing the most likely outcome of the report is that it’s inconclusive, and that political pressures will play a large role in the outcome.

Covid 2/11: As Expected

Someone who is near the top of the leaderboard is both accurate and highly experienced

I think this unfortunately isn't true right now, and just copying the community prediction would place very highly (I'm guessing if made as soon as the community prediction appeared and updated every day, easily top 3 (edit: top 10)). See my comment below for more details.

You can look at someone's track record in detail, but we're also planning to roll out more ways to compare people with each other.

I'm very glad to hear this. I really enjoy Metaculus, but my main gripe with it has always been (as others have pointed out) the lack of a way to distinguish between quality and quantity. I'm looking forward to a more comprehensive selection of metrics to help with this!

Covid 2/11: As Expected

If the user is interested in getting into the top ranks, this strategy won't be anything like enough.

I think this isn't true empirically for a reasonable interpretation of top ranks. For example, I'm ranked 5th on questions that have resolved in the past 3 months due to predicting on almost every question.

Looking at my track record, for questions resolved in the last 3 months, evaluated at all times, here's how my log score looks compared to the community:

  • Binary questions (N=19): me: -.072 vs. community: -.045
  • Continuous questions (N=20): me: 2.35 vs. community: 2.33

So if anything, I've done a bit worse than the community overall, and am in 5th by virtue of predicting on all questions. It's likely that the predictors significantly in front of me are that far ahead in part due to having predicted on (a) questions that have resolved recently but closed before I was active and (b) a longer portion of the lifespan for questions that were open before I became active.


I discovered that the question set changes when I evaluate at "resolve time" and filter for the past 3 months, not sure why exactly. Numbers at resolve time:

  • Binary questions (N=102): me: .598 vs. community: .566
  • Continuous questions (N=92): me: 2.95 vs. community: 2.86

I think this weakens my case substantially, though I still think a bot that just predicts the community as soon as it becomes visible and updates every day would currently be at least top 10.

Anything much worse than that, yes, people could have negative overall scores - which, if they've predicted on a decent number of questions, is pretty strong evidence that they really suck at forecasting

I agree that this should have some effect of being less welcoming to newcomers, but I'm curious to what extent. I have seen plenty of people with worse Brier scores than the median continuing to predict on GJO rather than being demoralized and quitting (disclaimer: survivorship bias).
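
For readers unfamiliar with the metric, here is a minimal sketch of the textbook Brier score for binary questions, where lower is better. The function name is hypothetical, and GJO's exact scoring variant may differ from this standard form.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty forecasts and outcomes")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Always answering 50% scores 0.25 regardless of how the questions resolve.
coin_flip = brier_score([0.5, 0.5], [1, 0])
```

A forecaster with a score above 0.25 under this definition is doing worse than someone who answers 50% on everything.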

Covid 11/26: Thanksgiving

There's also a Metaculus question about this:
