tl;dr: If your reason for cramming AI knowledge into your brain is stress, then don't do it. You can still be useful, but walk away from the front lines where people are directly responsible. 

Disclaimer: 1) This is an obvious problem that has already been noticed and addressed by many LessWrong users. 2) This is not an original solution but rather a specific framing of the problem and some food for thought. 3) I could be gravely mistaken, and your best bet might be putting your all into research after all. If that is the case, though, you might want to emerge from lurking and actually do something. 4) The rather romantic and optimistic tone employed here is deliberate and is not meant as an accurate description of the reality we are in right now.


The heroes who will save the world will be alignment researchers: when, dot by dot, the stars grow dark and a dark dawn rises, we will all have to buy them a beer. If you are not among this small band of heroes who will guarantee us the best possible future, you may feel an urgent need to join them promptly: I beg of you only to consider whether this is a good idea. 

If you think that you should become an alignment researcher in a matter of months,[1] I will not try to stop you. But it's probably worth a few days' worth of cranial compute to establish whether you are deploying yourself in the best way you could.

I'll set the parameters of the problem. "Becoming an alignment researcher" is a spectrum: the more you learn about alignment, the more capable you are at navigating the front lines of the alignment project. Certainly, understanding the tenets of alignment is a laudable goal; but at what point will you be faced with diminishing returns? If you are not planning on single-handedly solving the alignment problem, are there not better uses of your time?[2] 

There are many instrumental goals that serve the terminal goal of completing the alignment project, and they might be more worthy of your time:  

  • The logistics of the alignment project: educating and hiring all the AI experts out there who could be working directly on the front lines. How many Von Neumanns are locked in Sub-Saharan Africa who could be shipped to the front lines of alignment in a few years? How many engineers at OpenAI could be hired to work on the alignment problem instead of on the next variation of GPT? 
  • The economics of the project: financing alignment research, finding economic incentives for AI companies to stride forward more safely and spend R&D budgets on alignment, and optimizing how many resources in the general economy are spent toward this goal. Money is always useful: what exactly could we use it for? Is open-sourcing the alignment project possible? 
  • The politics of the project: If you know anybody in politics anywhere, it might be a good idea to try and convince them to pay attention to this AGI thing. If you know anybody who knows anybody in politics anywhere, that's good too. The power of government and its resources already hugely determine what the AI field looks like. 
  • The PR of the project: More people should probably be taking AGI risk seriously, and it should be prioritized over weaker risks such as climate change,[3] which tend to monopolize all the attention. Misconceptions should be cleared up, the tenets of AI safety should be clearly understood, and the reality of how difficult it is to build AGI and to make it aligned should be well-known. Getting the crowds over to our side won't help much directly, because it is alignment researchers, not crowds, who will save the world; but it could increase the number of smart people joining our ranks, and it could help economically and politically as well. This also increases human dignity.
  • Convincing AI researchers to join our side: lc notes here what an actually practical containment strategy would look like: how one can contact engineers at AI projects and actually try to convince them to join our side. Given how aware those people already are of the difficulties of alignment and the dangers AI poses, this is perhaps the most effective solution in this whole list. However, so that it does not backfire and push people away, try discussing it first and organizing something more systematic and convincing. Given how few people in the world, proportionally, are working on AI, and how much risk is concentrated in AI, convincing an individual to do a full 180 and go from speeding up AI development to cautioning against it can have a huge effect! 
  • The mental health aspect of the project: We must keep alignment researchers' minds intact. There are some resources out there addressing this problem, but mental health is extremely specific to each human being, and so having as many "how to be okay" takes as possible is probably a good idea. The current landscape seems to be: "AI progress is careening toward a future in which AI systems are smarter than us, have orthogonal values, and are inscrutable black boxes." This landscape justly terrifies many of us. Coming up with new truth-based reasons to keep fighting and remain composed is crucial. 
  • The "you are a human" aspect of the project: You are more useful to us when you are not stressed. I won't deny that you are personally responsible for the entire destiny of the universe, if that's what you think, because I won't lie to you: but we have no use for a broken tool. Don't let responsibility be such a burden that you are incapable of being at your best. Work on the alignment problem only when you're at peace. You are a specifically human general intelligence, which means that your set of determining variables for problem-solving is specifically human. You cannot ignore the parts of your mind that make you human, so spend time on them too.[4] 
  • The practical project: There are practical things you could be doing right now that might disproportionately increase our odds of survival. Cooking pasta one night for an alignment researcher you know. Fixing their toilet or babysitting their kids. Being friendly to strangers. Picking up trash on the street. Gifting good books to children. I don't know. Making the world a little better will achieve the following: increase our odds of survival by making the front lines more bearable; increase human dignity; diminish distractions; solve part of the "you are a human" problem. Just be a good person. You'll be contributing to the practical project.[5]
  • Sit in a room. Your next logical action two minutes from now might not be contributing to the alignment project at all: may I suggest sitting alone in a room? Compose your thoughts! Let your mind roam free. Let it engage in play and take stock. In realities in which the alignment problem is solved, it is most likely thanks to some lone researcher sitting in a room, dancing gracefully between ideas and spotting patterns before inspiration strikes. Boredom can produce miracles, and cramming knowledge often just drives you into burnout. To increase our log odds of survival, just do nothing. You have a lot to think about anyhow. 
  • The creative solution to the project: One of the many uses of sitting alone in a room is that it's the path to creative solutions. Humans are unimaginably unique, so if you can contribute something to the effort that nobody will have thought of before, I urge you to think particularly about that. People much smarter and more knowledgeable than us have thought about this problem, and the meta-problem around it, a lot: your only hope of not straying down some painfully obvious false path is having a unique take on the problem. 


If you feel stressed right now and have thus decided to spend your time scrantically[6] reading LessWrong posts about AGI projections and alignment solutions while breathing heavily... just don't. Don't become an alignment scientist today because you are stressed. Don't sacrifice doing what you love, because there's a good chance you can help us by doing what you love. Solve the "you are human" problem first, and then perhaps solve one of the others, so that you are not directly involved on the front lines where responsibility is direct. You are just as responsible for the universe as the rest of us: but you are responsible for results, not effort, and that could mean walking away from the front lines. 

Ah, and if you're too stressed: breathe three times using the whole capacity of your lungs, smile at a mirror, then eat some chocolate. I bid you an excellent day. Spend some time looking at flowers or something. Then you can return to heroics. 


  1. ^

    It's not enough to become an alignment researcher. You must become a useful alignment researcher, which is of course an even harder target to attain. 

  2. ^

    I'm conflicted because there might be too many people writing and reading on the blog instead of spending hours in solitude attempting to actually find a solution to alignment. I deeply respect people who do the latter, and we need more of them. More on that later.

  3. ^

    "Weaker risks" does not mean they should not be addressed: if climate change starts hampering alignment research, e.g. by slowing down development in poor countries, it should receive proportionate attention. How important minor risks are depends on your AGI forecast model (there are a dozen on LessWrong). But the point is, almost all the existential risk is concentrated in AGI, and so the importance of all other problems should be correlated with their relevance to the alignment problem. 

  4. ^

    The cool aspect of this problem is that you can't be stressed by it. The other problems are external in nature: but the whole point of this problem is that you must be at peace for it to be solved, meaning that you can't rush your way through it, half-ass it, or have a breakdown while doing it. Take a walk outside or something. 

  5. ^

    False hope is a dangerous thing, and I do not mean to supply it here. If we all recycle our pizza boxes, the world won't be saved. But taking away some of the distractions and burdens that alignment researchers may be plagued with seems like an excellent use of time. And being a good person is just generally a good thing: that is, don't drop everything, including your morals, to give your all to the alignment project. There's a lot to say about arrogance of this kind: think of Raskolnikov from Crime and Punishment. AGI is not an excuse for you to forget basic duties.

  6. ^

    Rushing around LessWrong, with its abundance of footnotes and references, with the goal of learning something and clarifying the picture for yourself will accomplish nothing but fragmenting your mind and increasing your stress levels. Knowledge must be digested.

14 comments

All the obvious alternate routes to participating in the alignment project seem to have been mentioned here--are there any more I should write down? I'm aware this is a flawed post and would like to make it more complete as time goes on.

I'm really glad you wrote this post, because Tsvi's post is different and touches on very different concepts! That post is mainly about fun and exploration being undervalued as a human being. Your post seems to have one goal: ensure that up-and-coming alignment researchers do not burn themselves out or hyperfocus on only one strategy for contributing to reducing AI extinction risk.

Note, this passage seems to be a bit... off to me.

This one is slightly different from the last because it is an injunction to take care of your mental health. You are more useful to us when you are not stressed. I won’t deny that you are personally responsible for the entire destiny of the universe, because I won’t lie to you: but we have no use for a broken tool.

People aren't broken tools. People have limited agency, and claiming they are "personally responsible for the entire destiny of the universe" is misleading. One must have an accurate sense of the agency and influence they have when it comes to reducing extinction risk if they want to be useful.

The notion that alignment researchers and people supporting them are "heroes" is a beautiful and intoxicating fantasy. One must be careful that it doesn't lead to corruption in our epistemics, just because we want to maintain our belief in this narrative.

The passage on "you are responsible for the entire destiny of the universe" was mostly addressing the way it seems many EAs feel about the nature of responsibility. We indeed have limited agency in the world, but people around here tend to feel they are personally responsible for literally saving the world alone. The idea was not to flatly deny that or to run against heroic responsibility, but rather to say that while the responsibility won't go away, there's no point in becoming consumed by it. You are a less effective tool if you are too heavily burdened by responsibility to function properly. I wrote it that way because I'm hoping the harsh and utilitarian tone will reach the target audience better than something more clichéd would. There's enough romanticization as it is here.

I definitely romanticized the alignment researchers being heroes part. I'll add a disclaimer to mention that the choice of words was meant to paint the specific approach, the specific picture that up-and-coming alignment researchers might have when they arrive here. 

As for which narrative to follow, this one might be as good as any. As the mental health post I referenced here mentioned, the "dying with dignity" approach Eliezer is following might not sit well with a number of people, even when it is in line with his own predictions. I'm not sure to what degree what I described is a fantasy. In a universe where alignment is solved, would this picture be inaccurate? 

Thanks for the feedback!

Out of curiosity, what role do you see yourself playing?

Interesting question. The implied question might be "how much of this post was written for you," and the answer would be "probably a lot." I don't think I have the mind or time or stamina to work on the front lines, and so, so far, my most concrete plan is writing a few more LessWrong posts based on various helpful-seeming ideas. This post outlined a few options that I, and others in the same position as me, have. Do you have any more ideas?

This is a good post, so I’d definitely encourage you to write up a few more posts.

I know very little about you, so it’d be hard for me to make good suggestions, but here’s two possibilities for your consideration:

  • Help other people figure out how they can contribute, particularly those looking to contribute in a non-technical way. If this is something you'd be interested in doing, I'd probably invest some more time in understanding the strategic landscape first (before someone starts advising, it's important to have a robust model of what potential downside risks exist).
  • If you run out of post ideas, find others with things they’d like to write up if they had time and help them write it up

Hello! I thought about what you suggested and have been doing my best to understand the technicalities of alignment and the general coordination landscape, but that's still ongoing. I'll write more posts myself, but did you have anyone in mind for that last part, finding others who'd like posts written up? 

I didn’t particularly have anyone in mind unfortunately.

The heroes ... heroes ... heroics. 


If you notice that alignment is a problem and you think you can do something about it and you start doing something about it - you are about as heroic as somebody who starts swimming after falling into the water. 

Orwell's original title for 1984 was The Last Man in Europe, by which he meant that Winston, the hero of the novel, was the last sane man left on the entire continent. I would argue that because literally everyone else around him was insane and was essentially drowning in the water, he was a hero for swimming. The number of people working on alignment in the world is far below 1% of the general population--I know it's a romanticized qualification, which is kind of the point here, but this falls under my definition of "hero". 

I mean what even is your definition of hero? 

Sacrificing or taking a significant risk of sacrifice to do what is right. 

Someone who wins a sporting competition is not a hero - even if it was very difficult and painful to do. Somebody who is correct, where most people are wrong is not a hero. 

I know we all want our heroes to be competent and get it done, but to me that's not what's heroic. 

When it comes to alignment researchers: 

If you are at the beginning of your career and you decide to become an alignment researcher, you are not sacrificing much, if anything. AI is booming, alignment is booming--if you do actually relevant work, you will be at the front line of the most important technology for years to come. 

If you are deeply embedded into the EA and rationalist community, you'll be high status where it matters to you. 

That doesn't mean your work is less important, but it does mean you are not being heroic. 

How about this as advice to be less stressed out: Don't think of your life as an epic drama. Just do what you think is necessary and don't fret about it.

The best example I can recall of what you're describing is the members of La Résistance in France during WW2. These people risked their lives and the lives of their families in order to blow up German rail lines, smuggle out Jews, and kill key Gestapo operatives. They did not consider themselves heroes, because for them this was simply the logical course of action, the only way to act for a person with a shred of common sense. Most of them are dead now, but throughout their lives they repeated that if France considered them to be heroes (which it did), that would defeat the point: that doing what they did should not be extraordinary, but common sense. 

You're right about the epic drama thing. Poetic flair can be useful in certain situations, I imagine, although there is a fine line between using it as motivation and spoiling rationality. (Poetry, as in beauty and fun, is a terminal goal of humanity, so I would also advise against ignoring it entirely.) 

Excellent post. One part I disagree with though:

“ If you know anybody in politics anywhere, it might be a good idea to try and convince them to pay attention to this AGI thing” - It wouldn’t surprise me if this was net-negative and the default outcome of informing actors about AGI is for them to attempt to accelerate it.

Another part I’d disagree with is lionising technical researchers over everyone else.

The point of the post was not to lionize them over everyone else. The target audience I had in mind (which may not even exist at this point) was people who wanted to become alignment researchers because that's where the front lines are. My point is that that may not be the best idea in some cases. At the end of the day, if we solve the alignment problem, it will be directly thanks to those researchers--that's what I mean. 

As for the politics thing, that's interesting; I hadn't thought of it backfiring horribly in that way. I mean, the goal would be to explain to them why alignment is necessary, which shouldn't be an impossible task. There's a lot of legal and economic power coming from the government, so just ignoring that actor seems like a mistake. 

Thanks for the feedback!