1 min read · 11th Jan 2022 · 7 comments
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
This is a special post for quick takes by elifland. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.



[crossposted from EA Forum]

Reflecting a little on my shortform from a few years ago, I think I wasn't ambitious enough in trying to actually move this forward.

I want there to be an org that does "human challenge"-style RCTs across lots of important questions that are extremely hard to get at otherwise, including (the top 2 are repeated from my previous shortform):

  1. Health effects of veganism
  2. Health effects of restricting sleep
  3. Productivity of remote vs. in-person work
  4. Productivity effects of blocking out focused/deep work

Edited to add: I no longer think "human challenge" is really the best way to refer to this idea (see the comment that convinced me); I mean something like "large-scale RCTs of important things on volunteers who sign up on an app to randomly try or not try an intervention." I'm open to suggestions on succinct ways to refer to this.
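For concreteness, the core mechanic such an app would need — randomly assigning each volunteer to try or not try an intervention — is simple. Here's a minimal sketch; the hashing scheme, function names, and experiment IDs are my own assumptions for illustration, not a description of any real system:

```python
import hashlib

def assign_arm(user_id: str, experiment: str) -> str:
    """Deterministically assign a volunteer to intervention or control
    by hashing their ID together with the experiment name, so the same
    person always gets the same arm for a given experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "intervention" if int(digest, 16) % 2 == 0 else "control"

# Example: assignment is stable across calls for the same volunteer,
# and independent across experiments.
arm = assign_arm("volunteer-42", "veganism-trial")
assert assign_arm("volunteer-42", "veganism-trial") == arm
```

A deterministic hash (rather than a stored random draw) keeps assignment reproducible and auditable, which matters if you want outside researchers to trust the randomization.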

I'd be very excited about such an org existing. I think it could even grow to become an effective megaproject, pending further analysis on how much it could increase wisdom relative to power. But, I don't think it's a good personal fit for me to found given my current interests and skills. 

However, I think I could plausibly provide some useful advice/help to anyone who is interested in founding a many-domain human-challenge org. If you are interested in founding such an org or know someone who might be and want my advice, let me know. (I will also be linking this shortform to some people who might be able to help set this up.)

--

Some further inspiration I'm drawing on to be excited about this org:

  1. Freakonomics' RCT measuring the effects of big life changes like quitting your job or breaking up with your partner. This makes me optimistic about the feasibility of getting lots of people to sign up.
  2. Holden's note on doing these types of experiments with digital people. He mentions some difficulties with running these types of RCTs today, but I think an org specializing in them could help.

Votes/considerations on why this is a good or bad idea are also appreciated!

I'm confused why these would be described as "challenge" RCTs, and worry that the term will create broader confusion in the movement to support challenge trials for disease. In the usual clinical context, the word "challenge" in "human challenge trial" refers to the step of introducing the "challenge" of a bad thing (e.g., an infectious agent) to the subject, to see if the treatment protects them from it. I don't know what a "challenge" trial testing the effects of veganism would look like.

(I'm generally positive on the idea of trialing more things; my confusion+comment is just restricted to the naming being proposed here.)

Thanks, I agree with this and it's probably not good branding anyway. 

I was thinking the "challenge" was just doing the intervention (e.g. being vegan), but agree that the framing is confusing since it refers to something different in the clinical context. I will edit my shortforms to reflect this updated view.

Just made a bet with Jeremy Gillen that may be of interest to some LWers; I'd be curious for opinions:

[cross-posting from blog]

I made a spreadsheet for forecasting the 10th/50th/90th percentiles for how you think GPT-4.5 will do on various benchmarks (allowing 6 months after release for the model to actually be applied to the benchmarks, and for post-training enhancements). Copy it here to register your forecasts.

If you’d prefer, you could also use it to predict for GPT-5, or for the state of the art at a certain time, e.g. end of 2024 (my predictions would be pretty similar for GPT-4.5 and for end of 2024).

You can see my forecasts made with ~2 hours of total effort on Feb 17 in this sheet; I won’t describe them further here in order to avoid anchoring.

There might be a similar tournament on Metaculus soon, but I'm not sure of the timeline for that (and the spreadsheet might be lower friction). If someone wants to take the time to make a form for predicting, tracking, and resolving the forecasts, be my guest and I'll link it here.
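As an aside on how 10th/50th/90th percentile forecasts like these can eventually be scored: one simple check is whether realized benchmark scores land inside each forecaster's 10th–90th percentile interval about 80% of the time. A minimal sketch of that check — the benchmark names and numbers below are made up purely for illustration, not real forecasts or results:

```python
def interval_calibration(forecasts, outcomes):
    """Fraction of outcomes landing inside the forecaster's 10th-90th
    percentile interval; a well-calibrated forecaster should be near 0.8.
    `forecasts` maps benchmark name -> (p10, p50, p90)."""
    hits = sum(
        1 for name, (p10, _, p90) in forecasts.items()
        if p10 <= outcomes[name] <= p90
    )
    return hits / len(forecasts)

# Hypothetical benchmarks and scores, just to show the mechanics.
forecasts = {"BenchA": (82, 86, 90), "BenchB": (90, 94, 97), "BenchC": (75, 82, 88)}
outcomes = {"BenchA": 86.4, "BenchB": 95.0, "BenchC": 67.0}
print(interval_calibration(forecasts, outcomes))  # 2 of 3 inside -> ~0.667
```

With only a handful of benchmarks the calibration estimate is noisy, which is one argument for a spreadsheet or Metaculus tournament covering many benchmarks at once.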

(epistemic status: exploratory)

I think more people who are into LessWrong and in high school through college should consider trying Battlecode. It's somewhat similar to The Darwin Game, which was pretty popular on here, and I think the type of people who like LessWrong will generally both enjoy and be good at Battlecode. (Edited to add: a short description of Battlecode is that you write a bot to beat other bots at a turn-based strategy game. Each unit executes its own code, so communication/coordination is often one of the most interesting parts.)
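To give a flavor of the "each unit executes its own code" aspect: every unit makes decisions with only local information plus whatever limited shared signals your team sets up. This is a toy sketch of that structure in Python, not the actual Battlecode API (the real game uses its own engine and per-unit bytecode limits):

```python
def unit_turn(unit, shared_flags):
    """Hypothetical per-unit logic: fight if an enemy is adjacent,
    otherwise move toward a shared rally point, otherwise explore."""
    if unit["adjacent_enemies"]:
        return ("attack", unit["adjacent_enemies"][0])
    target = shared_flags.get("rally_point")
    if target is not None:
        return ("move_toward", target)
    return ("explore", None)

# Each unit runs the same function independently each turn;
# coordination happens only through the shared flags.
flags = {"rally_point": (10, 4)}
scout = {"adjacent_enemies": [], "pos": (0, 0)}
print(unit_turn(scout, flags))  # ('move_toward', (10, 4))
```

The interesting strategy problems come from that constraint: units can't just read each other's state, so you end up designing signaling and coordination schemes.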

I did it with friends for 6 years (from junior year of high school to the end of undergrad), and I think it at least helped me gain legible expertise in strategizing and coding quickly, and plausibly also helped me build those skills, as well as teamwork.

If any students are interested (I believe PhD students can qualify as well, though it may not be worth their time), there are still 2-3 weeks left in this year's game, which is plenty of time. If you're curious to learn more about my experiences with Battlecode, see the README and postmortem here.

Feel free to comment or DM me if you have any questions.

Along the same lines but more commercial is the game Screeps, which has both ongoing and seasonal servers run by the developers, as well as private servers (you can run your own).