11 heuristics for choosing (alignment) research projects

by Orpheus16, danesherbs
27th Jan 2023
Comments
Raemon:

Man I really like how short this post is.

WilliamKiely:

Re: 1: Do Dane's Guesstimate models ever yield >1 microdoom estimates for solo research projects? That sounds like a lot.

WilliamKiely:

IIRC Linch estimated in an EA Forum post that we should spend up to ~$100M to reduce x-risk by 1 basis point, i.e. ~$1M per microdoom. Maybe nanodooms would be a better unit.

Lone Pine:

If your efforts improve the situation by 1 nanodoom, you've saved 8 people alive today.
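The unit conversions in this thread are easy to sanity-check: $100M per basis point of x-risk works out to ~$1M per microdoom, and with roughly 8 billion people alive, one nanodoom corresponds to about 8 expected lives. A quick check (the $100M and 8-billion figures are just the ones used in the comments, not independent estimates):

```python
# Units: 1 doom = certain extinction; 1 basis point = 1e-4 dooms;
# 1 microdoom = 1e-6 dooms; 1 nanodoom = 1e-9 dooms.
DOLLARS_PER_BASIS_POINT = 100e6  # Linch's ~$100M figure, per the comment above

dollars_per_doom = DOLLARS_PER_BASIS_POINT / 1e-4  # $1e12 per doom
dollars_per_microdoom = dollars_per_doom * 1e-6
print(f"${dollars_per_microdoom:,.0f} per microdoom")  # → $1,000,000

WORLD_POPULATION = 8e9  # rough current world population
lives_per_nanodoom = WORLD_POPULATION * 1e-9
print(f"{lives_per_nanodoom:.0f} expected lives per nanodoom")  # → 8
```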

Yonatan Cale:

This seems like great advice, thanks!

I'd be interested in an example of what "a believable story in which this project reduces AI x-risk" looks like, if Dane (or someone else) would like to share.


I recently spoke with Dane Sherburn about some of the most valuable things he learned as a SERI MATS scholar.

Here are 11 heuristics he uses to prioritize between research projects:

  1. Impact: Can I actually tell myself a believable story in which this project reduces AI x-risk? (Or, better yet: can I make a Guesstimate model that helps me estimate the microdooms averted by this project?)
  2. Clarity of research question: Can I easily explain my core research question in a few sentences?
  3. Relevance of research approach: Will my research project actually help me reduce uncertainty on my research question? When I imagine the possible results, are there scenarios where I actually update? Or do I already know (with high probability) what I’m likely to learn?
  4. Mentorship: Would my mentor be able to give me meaningful guidance on this project? If not, would I be able to find one who could?
  5. Feedback loops: Will I be able to get feedback within the first week? First day? Will I have to wait several weeks or months before I know if things are working?
  6. Efficiency: How efficiently will I be able to collect information or run experiments? Will I need to spend a lot of time fine-tuning models? Is there a way to do something similar with pretrained models, so I can run experiments 10-100X more quickly?
  7. Resources: Will this project need datasets? Large models? Compute? Money? How likely is it that I’ll get the resources I need, and how long will it take?
  8. Excitement: How much does the project subjectively excite me? Do I feel energized about the project?
  9. Timespan: How long would it take to do this project? Would it fit into a window of time that I’m actually willing to devote to it?
  10. Downsides/capabilities externalities: To what extent does the project have capabilities externalities? Could it increase x-risk?
  11. Leaveability: How easy would it be to leave this project if I realize it’s not working out, or I find something better?
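Heuristic 1 suggests building a Guesstimate-style model of microdooms averted. As a rough illustration of what such a model might compute, here is a toy Monte Carlo sketch; every distribution and number in it is an invented placeholder, not an estimate from the post or its author:

```python
import random

def microdooms_averted(n_samples=100_000, seed=0):
    """Toy Monte Carlo model of expected microdooms averted by a project.

    All ranges below are illustrative placeholders; a real model would
    replace them with considered estimates for the specific project.
    """
    rng = random.Random(seed)
    total_dooms = 0.0
    for _ in range(n_samples):
        # P(the project succeeds at its stated research goal)
        p_success = rng.uniform(0.05, 0.3)
        # P(a success meaningfully feeds into alignment progress)
        p_relevant = rng.uniform(0.01, 0.1)
        # x-risk reduction conditional on both, measured in dooms
        risk_reduction = rng.uniform(1e-7, 1e-5)
        total_dooms += p_success * p_relevant * risk_reduction
    mean_dooms = total_dooms / n_samples
    return mean_dooms * 1e6  # convert dooms to microdooms

print(f"~{microdooms_averted():.3f} microdooms averted (toy numbers)")
```

With these placeholder ranges the expected value lands well below one microdoom, which is one way to see why the first commenter found >1 microdoom for a solo project surprising.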