What should my college major be if I want to do AI alignment research? — LessWrong

[-]mesaoptimizer3y20

There seem to be three key factors that would influence your decision:

Your belief about how valuable the problem is to work on
Your belief about how hard the problem is to solve and how well the current alignment community is doing at solving it
Your belief about how long we have until we run out of time

Based on your LW comment history, you probably already have rough models about the alignment problem that inform these three beliefs of yours. I think it would be helpful if you could go into detail about them so people can give you more specific advice, or perhaps help you answer another question further upstream of the one you asked.

[-]metachirality3y40

I think getting an extra person to do alignment research can give massive amounts of marginal utility considering how few people are doing it and how it will determine the fate of humanity. We're still in the stage where adding an extra person removes a scarily large amount from p(doom), like up to 10% for an especially good individual person, which probably averages to something much smaller but still scarily large when looking at your average new alignment researcher. This is especially true for agent foundations.
I think it's very possible to solve the alignment problem. Stuff like QACI, while not a full solution yet, makes me think that this is conceivable and you could probably find a solution if you threw enough people at the problem.
I think we'll get a superintelligence at around 2050.

[-]mesaoptimizer3y10

2050? That's quite far off, and it makes sense that you are considering university given you expect to have about two decades.

Given such a scenario, I would recommend trying to do a computer science/math major, specifically focusing on the subjects listed in John Wentworth's Study Guide that you find interesting. I expect that three years of such optimized undergrad-level study will easily make someone at least SERI MATS scholar level (assuming they start out as a high school student). Since you are interested in agent foundations, I expect you will find John Wentworth's recommendations more useful, since his work seems close to (but not quite) agent foundations.

Given your timelines, I expect doing an undergrad (that is, a bachelor's degree) would also give you traditional credentials, which are useful to survive in case you need a job to fund yourself.

Honestly, I recommend you simply dive right in if possible. One neglected but extremely useful resource I've found is Stampy. The AGI Safety Fundamentals technical course won't happen until September, it seems, but perhaps you can register your interest for it. You can begin reading the curriculum -- at least the stuff you aren't yet familiar with -- almost immediately. Dive deep into the stuff that interests you.

Well, I assume you have already done this, or something close to this, and if that is the case, you can ignore the previous paragraph. If possible, could you go into some detail as to why you expect we will get a superintelligence at around 2050? It seems awfully far to me, and I'm curious as to the reasoning behind your belief.

[-]metachirality3y*32

I've checked out John Wentworth's study guide before, mostly doing CS50.

Part of the reason I'm considering getting a degree is so I can get a job if I want and not have to bet on living rent-free with other rationalists or something.

The people I've talked to the most have timelines centering around 2030. However, I don't have a detailed picture of why because their reasons are capabilities exfohazards. From what I can tell, their reasons are tricks you can implement to get RSI even on hardware that exists right now, but I think most good-sounding tricks don't actually work (no one expected transformer models to be the closest to AGI in comparison with other architectures) and I think superintelligence is more contingent on compute and training data than they think. It also seems like other people in AI alignment disagree in a more optimistic direction. Now that I think about it though, I probably overestimated how long the timelines of optimistic alignment researchers were so it's probably more like 2040.

[-]mesaoptimizer3y10

"Part of the reason I’m considering getting a degree is so I can get a job if I want and not have to bet on living rent-free with other rationalists or something."

Yeah, that's a hard problem. You seem smart: have you considered finding rationalists or rationalist-adjacent people who want to hire you part-time? I expect that the EA community in particular may have people willing to do so and that would give you both experience (to show future employers / clients), connections (to find more part-time / full-time jobs), and money.

"Now that I think about it though, I probably overestimated how long the timelines of optimistic alignment researchers were so it’s probably more like 2040."

You just shortened your timelines by a decade after what would be between 5 minutes and half an hour of tree-of-thought style reflection. Your reasoning also seems entirely social (that is, dependent on other people's signalled beliefs), which is not something I would recommend if you want to do useful alignment research.

The problem with relying on social evidence for your beliefs about scientific problems is that you both end up with bad epistemics and end up taking negative expected value actions. First: if other people update their beliefs due to social evidence the same way you do, you are vulnerable to a cascade of belief changes (mundane examples: tulip craze, cryptocurrency hype, NFT hype, cult beliefs) in your social community. This is even worse for the alignment problem because of the significant amount of disagreement within the alignment research community itself about the details of the problem. Relying on social reasoning in such an epistemic environment will leave you constantly uncertain, because of how uncertain you perceive the community to be about core parts of the problem. Next: if you do not have inside models of the alignment problem, you will fail to update accurately given evidence about the difficulty of the problem. Even if you rely on other researchers who have inside / object-level models and update accurately, there is bound to be disagreement between them. Whom do you decide to believe?

The first thing I recommend you do is to figure out your beliefs and model of the alignment problem using reasoning at the object-level, without relying on what anyone else thinks about the problem.

[-]metachirality3y10

I have a strong inside view of the alignment problem and what a solution would look like. The main reason I don't have as concrete an inside view of AI timelines is that I don't know enough about ML, so I have to defer to others to get a specific decade. The biggest gap in my model of the alignment problem is what a solution to inner misalignment would look like, although I think it would be something like trying to find a way to avoid wireheading.

[-]mesaoptimizer3y10

My bad. I'm glad to hear you do have an inside view of the alignment problem.

If knowing enough about ML is your bottleneck, perhaps that's something you can directly focus on? I don't expect it to be hard for you -- perhaps only about six months -- to get to a point where you have coherent inside models about timelines.