Disclaimer: These are entirely my thoughts. I'm posting this before it's fully polished because it never will be.

Epistemic status: Moderately confident. Deliberately provocative title. 

Apparently, the Bay Area rationalist community has a burnout problem. I have no idea if it's worse than base rate, but I've been told it's pretty bad. I suspect that the way burnout manifests in the rationalist community is uniquely screwed up.

I was crying the other night because our light cone is about to get ripped to shreds. I'm gonna do everything I can to do battle against the forces that threaten to destroy us. You've heard this story before. Short timelines. Tick. Tick. I've been taking alignment seriously for about a year now, and I'm ready to get serious. I've thought hard about what my strengths are. I've thought hard about what I'm capable of. I'm dropping out of Stanford, I've got something that looks like a plan, I've got the Rocky theme song playing, and I'm ready to do this.

A few days later, I saw this post. And it reminded me of everything that bothers me about the EA community. Habryka covered the object level problems pretty well, but I need to communicate something a little more... delicate.

I understand that everyone is totally depressed because qualia is doomed. I understand that we really want to creatively reprioritize. I completely sympathize with this. 

I want to address the central flaw of Akash+Olivia+Thomas's argument in the Buying Time post, which is that actually, people can improve at things.

There's something deeply discouraging about being told "you're an X% researcher, and if X>Y, then you should stay in alignment. Otherwise, do a different intervention." No other effective/productive community does this. I don't know how to put this, but the vibes are deeply off. 

The appropriate level of confidence to have about a statement like "I can tell how good of an alignment researcher you will be after a year of you doing alignment research" feels like it should be pretty low. At the one-year mark, there are almost certainly ways to improve that haven't been tried. Especially in a community so memetically allergic to the idea of malleable human potential. 

Here's a hypothesis. I in no way mean to imply that this is the only mechanism by which burnout happens in our community, but I think it's probably a pretty big one.  It's not nice to be in a community that constantly hints that you might just not be good enough and that you can't get good enough.

Our community seems to love treating people like mass-produced automatons with a fixed and easily assessable "ability" attribute. (Maybe you flippantly read that sentence and went "yeah it's called g factor lulz." In that case, maybe reflect on how good of a correlate g is, in absolute terms, for the things you care about.)

If we want to actually accomplish anything, we need to encourage people to make bigger bets, and to stop stacking up credentials so that fellow EAs think they have a chance. It's not hubris to believe in yourself.

68 comments

I am (was) an X% researcher, where X<Y. I wish I had given up on AI safety earlier. I suspect it would've been better for me if AI safety resources explicitly said things like "if you're less than Y, don't even try", although I'm not sure if I would've believed them. Now, I'm glad that I'm not trying to do AI safety anymore and instead I just work at a well-paying, relaxed job doing practical machine learning. So I think pushing too many EAs into AI safety will lead to those EAs suffering much more, which is what happened to me. I don't want that to happen, and I don't want the AI Alignment community to stop saying "You should stay if and only if you're better than Y".

Actually, I wish there were more selfish-oriented resources for AI Alignment. Like, with normal universities and jobs, people analyze how to get into them, have a fulfilling career, earn good money, not burn out, etc. As a result, people can read this and properly analyze if it makes sense for them to try to get into those jobs or universities for their own good. But with a career in AI safety, this is not the case. All the resources look out not only for the reader, but also for the whole EA project. I think this can easily burn people out.

In 2017, I remember reading 80K and thinking I was obviously unqualified for AI alignment work. I am glad that I did not heed that first impression. The best way to test goodness-of-fit is to try thinking about alignment and see if you're any good at it.

That said, I apparently am the only person of whom [community-respected friend of mine] initially had an unfavorable impression, which later became strongly positive.

Sorry to hear that you didn't make it as an AI Safety researcher, but thank you for trying.

You shouldn't feel any pressure, but have you considered trying to be involved in another way, such as a) helping to train people trying to break into the field, b) providing feedback on people's alignment proposals, or c) assisting in outreach (this one is more dependent on personal fit, and it's easier to do net harm)?

I think it's a shame how training up in AI Safety is often seen as an all-or-nothing bet, when many people have something valuable to contribute even if that's not through direct research.

Philip, but were the obstacles that made you stop technical (such as, after your funding ran out, you tried to get new funding or a job in alignment, but couldn't) or psychological (such as, you felt worried that you are not good enough)?

Oh man.

Yeah, this really sucks. I still think having more people try and fail is preferable to telling people to not try.

(Disclaimer: The Law of Equal and Opposite Advice probably applies to some people here.)


I think that by default, people should spend more time doing things, and less time worrying about if they're smart enough to do things. Then, if they don't enjoy the work, they should 100% feel free to do something else!

That is, try lots of things, fail fast, then try something else. 

Here are some reasons why:

  • As you say, people can improve at things. I think this is less of a factor than you do, but it's probably still underrated on the margin. Taking time to study math or practice coding will make you better at math or coding; while effort is no guarantee of success, it's still generally related. 
  • People can be pretty bad at judging aptitude. I think this is the main factor. I've definitely misjudged people's aptitude in the past, where people I thought were pretty competent turned out not to be, and vice versa. In general, the best predictor of whether someone can or cannot do a thing is seeing if they can do the thing. 
  • Aptitude is often context-dependent. People are often more-or-less competent depending on the context. PhD programs and depressing jobs are two common factors that make people seem significantly less competent. Antidepressants also exist, and seem to be a contributing factor in helping people be more functional.
  • It's healthier to try than to worry. I think people often end up sad and unmotivated because they feel incompetent before they've put in a solid effort to do good things. Empirically, focusing too much on feelings of inferiority is really bad for productivity and momentum; I've often found that people can do many more things when they put in some effort than when they're beating themselves up. 

Overall, I think people should lean more toward work trials/work trial tasks (and being more okay with not being hired after work trials). I also think we should have more "just do the thing" energy and more "let's move on to the next thing" energy, and less "stuck worrying about whether you can do the thing" energy. 

==========

I do think the AI alignment community should try to be slightly more careful when dismissing people, since it can be quite off-putting or discouraging to new people. It also causes people to worry more about "does someone high status think I can do it?" instead of "can I actually do it?". This is the case especially when the judgments might be revised later. 

For what it's worth, "you're not good enough" is sometimes stated outright, instead of implied. To share a concrete example: in my first CFAR workshop in 2015, I was told by a senior person in the community that I was probably not smart enough to work on AI Alignment (in particular, it was suggested that I was not good enough at math, as I avoided math in high school and never did any contest math). This was despite me having studied math and CS for ~2 hours a day for almost 7 months at that point, and it was incredibly disheartening. Thankfully, other senior people in the community encouraged me to keep trying, which is why I stuck around at all. (In 2017, said senior person told me that they were wrong. Though, like, I haven't actually done that much impressive alignment research yet, so we'll see!)

Insofar as we should be cautious about implying that people are not good enough, we should be even more cautious about directly stating this to people. 
 

There's something deeply discouraging about being told "you're an X% researcher, and if X>Y, then you should stay in alignment. Otherwise, do a different intervention." No other effective/productive community does this. (Emphasis added.) I don't know how to put this, but the vibes are deeply off. 

 

I think this is common, actually.

We apply the logic of only taking top people to other areas. Take medicine. The cost of doing medicine badly is significant, so tons of filters exist. Don't do well in organic chemistry? You can't be a doctor. Low GPA? Nope. Can't get a pretty good score on the MCAT? No again. Get into med school but can't get an internship? Not gonna be able to practice.

It's similar for many other high-stakes fields. The US military has a multi-decade-long weeding-out process to end up as a general. Most corporations effectively do the same. Academic research is brutal in similar ways. All of these systems are broken, but not because they have a filter; more because the filter works poorly, because of moral mazes, and so on.

Alignment work that's not good can be costly, and can easily be very net negative. But it's currently mostly happening outside of institutions with well-defined filters. So I agree that people should probably try to improve their skills if they want to help, but they should also self filter to some extent.

I think this is common, actually.

We apply the logic of only taking top people to other areas. Take medicine. The cost of doing medicine badly is significant, so tons of filters exist. Don't do well in organic chemistry? You can't be a doctor. Low GPA? Nope. Can't get a pretty good score on the MCAT? No again. Get into med school but can't get an internship? Not gonna be able to practice.

I think most fields don't state it as explicitly as the intersection of EAs and AI Safety researchers tends to do. For example, I've heard far fewer explicit "yeah, you might not be smart enough"s in every other community I've been in, even communities that are highly selective in other ways. Most other fields/communities tend to have more of a veneer of a growth mindset, I guess?

I do think it's true that filtering is important. Given this fact, it probably does make sense to encourage people to be realistic. But my guess is that too many people run into the jarring "you might not be smart enough" attitude and self-filter way too aggressively, which is what the post is pushing back against.

All of these systems are broken, but not because they have a filter; more because the filter works poorly, because of moral mazes, and so on.

A complementary explanation is that if the system can't train people (because nobody there knows how to; or because it's genuinely hard, e.g. despite the Sequences we don't have a second Yudkowsky), then the only way to find competent people is to filter for outliers. E.g. if you can't meaningfully raise the IQ of your recruits, instead filter for higher-IQ recruits.

As pointed out in the linked essay, this strategy makes sense if outcomes are heavy-tailed, i.e. if exceptional people provide most of the value. E.g. if an exceptional general is 1000x as valuable as a good one, then it makes sense to filter for exceptional generals; not so much if the difference is only 3x.

How would you identify a second Yudkowsky? I really don’t like this trope.

By writing ability?

For instance by being acknowledged by the first Yudkowsky as the second one. I was referring here mostly to the difficulty of trying to impart expertise from one person to another. Experts can write down and teach legible insights, but the rest of their expertise (which is often the most important stuff) is very hard to teach.

No military big enough to require multiple layers of general-level positions filters for exceptional generals; they all filter for loyalty.

So filter for exceptional loyalty. The extent to which that's worth it depends on how valuable an exceptionally loyal general is relative to a merely very loyal one, and on the degree to which you can train loyalty.

Our community seems to love treating people like mass-produced automatons with a fixed and easily assessable "ability" attribute.

Have you considered the implied contradiction between the "culturally broken community" you describe and the beliefs - derived from that same community - which you espouse below?

I was crying the other night because our light cone is about to get ripped to shreds. I'm gonna do everything I can to do battle against the forces that threaten to destroy us.

Your doom beliefs derive from this "culturally broken community" - you probably did not derive them from first principles yourself. Is it broken in just the right ways to reach the correct conclusion that the world is doomed - even when experts knowledgeable in the relevant subject matters that would actually lead to doom find this laughable?

Consider a counterfactual version of yourself that never read HPMOR, the sequences, and LW during the critical periods of your upper cortical plasticity window, and instead spent all that time reading/studying deep learning, systems neuroscience, etc. Do you think that alternate version of yourself - after then reviewing the arguments that led you to the conclusion that the "light cone is about to get ripped to shreds" - would reach that same conclusion? Do you think you would win a debate against that version of yourself? Have you gone out of your way to read the best arguments of those who do not believe "our light cone is about to get ripped to shreds"? The sequences are built on a series of implicit assumptions about reality, some of which are likely incorrect based on recent evidence.

Mau

experts knowledgeable in the relevant subject matters that would actually lead to doom find this laughable

This seems overstated; plenty of AI/ML experts are concerned. [1] [2] [3] [4] [5] [6] [7] [8] [9]

Quoting from [1], a survey of researchers who published at top ML conferences:

The median respondent’s probability of x-risk from humans failing to control AI was 10%

Admittedly, that's a far cry from "the light cone is about to get ripped to shreds," but it's also pretty far from finding those concerns laughable. [Edited to add: another recent survey puts the median estimate of extinction-level extremely bad outcomes at 2%, lower but arguably still not laughable.]

To be clear I also am concerned, but at lower probability levels and mostly not about doom. The laughable part is the specific "our light cone is about to get ripped to shreds" by a paperclipper or the equivalent, because of an overconfident and mostly incorrect EY/LW/MIRI argument involving supposed complexity of value, failure of alignment approaches, fast takeoff, sharp left turn, etc.

I of course agree with Aaro Salosensaari that many of the concerned experts were/are downstream of LW. But this also works the other way to some degree: beliefs about AI risk will influence career decisions, so it's obviously not surprising that most working on AI capability research think risk is low and those working on AI safety/alignment think the risk is greater.

Hyperbole aside, how many of those experts linked (and/or contributing to the 10% / 2% estimate) have arrived at their conclusion with a thought process that is "downstream" from the thoughtspace the parent commenter thinks suspect? Then it would not qualify as independent evidence or rebuttal, as it is included as the target of criticism.

Mau

One specific concern people could have with this thoughtspace is the concern that it's hard to square with the knowledge that an AI PhD [edit: or rather, AI/ML expertise more broadly] provides. I took this point to be strongly suggested by the author's suggestions that "experts knowledgeable in the relevant subject matters that would actually lead to doom find this laughable" and that someone who spent their early years "reading/studying deep learning, systems neuroscience, etc." would not find risk arguments compelling. That's directly refuted by the surveys (though I agree that some other concerns about this thoughtspace aren't).

(However, it looks like the author was making a different point to what I first understood.)

I really don't want to entertain this "you're in a cult" stuff.

It's not very relevant to the post, and it's not very intellectually engaging either. I've dedicated enough cycles to this stuff. 

That's not really what I'm saying: it's more like this community naturally creates nearby phyg-like attractors which take some individually varying effort to avoid. If you don't have any significant differences of opinion/viewpoint you may already be in the danger zone. There are numerous historical case examples of individuals spiraling too far in, if you know where to look.

phyg

If you want to talk about cults, just say "cult".

Hm? Seems a little pro-social to me to rot-13 things that you don't want to unfairly show up high in Google search. (Though the first time you do it, it's good to hyperlink to rot13.com so that everyone who reads can understand what you're saying.)

About a decade ago people were worried that LW and cults would be associated through search [1] and started using "phyg" instead. Having a secret ingroup word for "cult" to avoid being associated with it is actually much more culty than a few search results, and I wish we hadn't done it.

[1] https://www.lesswrong.com/posts/hxGEKxaHZEKT4fpms/our-phyg-is-not-exclusive-enough?commentId=4mSRMZxmopEj6NyrQ

I disagree, I think the rest of the world has built a lot of superweapons around certain terms to the point where you can't really bring them up. I think it's a pretty clever strategy to be like "If you come here to search for drama, you cannot get to drama with search terms." For instance, I think if someone wanted to talk about some dynamics of enpvfz or frkhny zvfpbaqhpg in a community, they might be able to make a far less charged and tense discussion if everyone collectively agreed to put down the weapon "I might use a term while criticizing your behavior that means whenever a random person wants to look for dirt on you they can google this and use it to try to get you fired".

Taking this line of reasoning one step further, it seems plausible to me that it’s pro social to do this sometimes anyway, just to create plausible deniability about why you’re doing it, similar to why it’s good to use Signal for lots of communications.

I thought they were calling me a flying Minecraft pig https://aether.fandom.com/wiki/Phyg

Well, let me be the first to say that I don't think you're a passive mob that can be found in the aether.

Raemon

One distinction I want to make here is between people who are really excited to work on AI Alignment (or, any particular high-impact-career), and who are motivated to stick with it for years (but who don't seem sufficiently competent), vs people who are doing it out of a vague sense of obligation, don't feel excited (and don't seem sufficiently competent).

For the first group, a) I can imagine them improving over time, b) if they're excited about it and find it fulfilling, like, great! It's the second group I feel most worried about, and I really worry about the vague existential angst driving people to throw themselves into careers they aren't actually well suited for. (and for creative research I suspect you do need a degree of enthusiasm in order to make it work)

I think this distinction is very important. In my experience, EAs/Rationalists tend to underestimate the impact of personal fit; if you're completely unexcited and doing things only out of a vague sense of obligation, it's likely that the job just isn't for you, regardless of your level of competence. 

people who are doing it out of a vague sense of obligation

I want to put a bit of concreteness on this vague sense of obligation, because it doesn't actually seem that vague at all; it seems like a distinct set of mental gears, and the mental gears are just THE WORLD WILL STILL BURN and YOU ARE NOT GOOD ENOUGH.

If you earnestly believe that there is a high chance of human extinction and the destruction of everything of value in the world, then it probably feels like your only choices are to try preventing that regardless of pain or personal cost, or to gaslight yourself into believing it will all be okay. 

"I want to take a break and do something fun for myself, but THE WORLD WILL STILL BURN. I don't know if I'm a good enough AI researcher, but if I go do any other things to help the world but we don't solve AI then THE WORLD WILL STILL BURN and render everything else meaningless."

The doomsday gauge is 2 minutes to midnight, and sure, maybe you won't succeed in moving the needle much or at all, and maybe doing that will cost you immensely, but given that the entire future is gated behind doomsday not happening, the only thing that actually matters in the world is moving that needle and anything else you could be doing is a waste of time, a betrayal of the future and your values. So people get stuck in a mindset of "I have to move the needle at all costs and regardless of personal discomfort or injury, trying to do anything else is meaningless because THE WORLD WILL STILL BURN so there's literally no point."

So you have a bunch of people who get themselves worked up and thinking that any time they spend on not saving the world is a personal failure, the stakes are too high to take a day off to spend time with your family, the stakes! The stakes! The stakes!

And then locking into that gear to make a perfect soul crushing trap, is YOU ARE NOT GOOD ENOUGH. Knowing you aren't Eliezer Yudkowsky or Nick Bostrom and never will be, you're just fundamentally less suited to this project and should do something else with your life to improve the world. Don't distract the actually important researchers or THE WORLD WILL BURN.

So on one hand you have the knowledge that THE WORLD WILL BURN and you probably can't do anything about it unless you throw your entire life in and jam your whole body into the gears, and on the other hand you have the knowledge that YOU AREN'T GOOD ENOUGH to stop it. How can you get good enough to stop the world from burning? Well first, you sacrifice everything else you value in life to Moloch, then you throw yourself into the gears and have a psychotic break.

A quote from Eliezer's short fanfic Trust in God, or, The Riddle of Kyon that you may find interesting:

Sometimes, even my sense of normality shatters, and I start to think about things that you shouldn't think about. It doesn't help, but sometimes you think about these things anyway.

I stared out the window at the fragile sky and delicate ground and flimsy buildings full of irreplaceable people, and in my imagination, there was a grey curtain sweeping across the world. People saw it coming, and screamed; mothers clutched their children and children clutched at their mothers; and then the grey washed across them and they just weren't there any more. The grey curtain swept over my house, my mother and my father and my little sister -

Koizumi's hand rested on my shoulder and I jerked. Sweat had soaked the back of my shirt.

"Kyon," he said firmly. "Trying to visualize the full reality of the situation is not a good technique when dealing with Suzumiya-san."

How do you handle it, Koizumi!

"I'm not sure I can put it in words," said Koizumi. "From the first day I understood my situation, I instinctively knew that to think 'I am responsible for the whole world' is only self-indulgence even if it's true. Trying to self-consciously maintain an air of abnormality will only reduce my mind's ability to cope."

Also: I agree that people who want to do alignment research should just go ahead and do alignment research, without worrying about credentials or whether or not they're smart enough. On a problem as wickedly difficult as alignment, it's more important to be able to think of even a single actually-promising new approach than to be very intelligent and know lots of math. (Though even people who don't feel they're suited for thinking up new approaches can still work on the problem by joining an existing approach.)

The linked post talks about the large value of buying even six months of time, but six months ago it was May 2022. What has been accomplished in AI alignment since then? I think we urgently need to figure out how to make real progress on this problem, and it would be a tragedy if we turned away people who were genuinely enthusiastic about doing that for reasons of "efficiency". Allocating people to the tasks for which they have the most enthusiasm is efficient.

I don't really see the why behind the assertions in your post here. For example:

It's not nice to be in a community that constantly hints that you might just not be good enough and that you can't get good enough.

Ok, it's not nice. It's understandable that many people don't want to think they're not good enough. But if they truly are not good enough, then the effort they spend toward solving alignment, in ways they can't actually contribute, doesn't help solve alignment. The niceness of the situation has little bearing on how effective the protocols are.

If we want to actually accomplish anything, we need to encourage people to make bigger bets, and to stop stacking up credentials so that fellow EAs think they have a chance

Ok, this is a notion, maybe it's right, maybe it's not, I'm just not getting much from this post telling me why it's right.

Upvoted. I think these are legitimate critiques of my post. I feel strongly that most people can improve significantly more than anticipated. This is largely because most people do not try very hard to self improve.

To be more specific, I think that EAs are severely overconfident in our ability to gauge people’s potential, and that we often say things that create a sort of… memetic vibe that encourages this thinking.

I think in general having memes that lead to incorrect object level beliefs is bad, and there’s a case to be made that how EAs talk about people contributes to this.

Are you sure you want to drop out of Stanford?

You will have significantly more prestige, more capability to choose your own future, and more leverage with that degree. And depending on your degree subject, some very useful skills.

I don't know your situation, but I recommend getting viewpoints from off this forum. Relatives, uni student support services, etc. This may be a symptom, not a cause. Harsh, but potentially helpful.

What's a symptom?

I'll rephrase. Wanting to take a drastic career turn could be a symptom of many other things. Degree burnout. Wanting to do a different subject. Depression. Chronic fatigue from an unknown medical cause. Wanting to get out of a situation (relationship, etc). I do not know you, so any guess I would make would not be helpful. But my gut feel is that it is worth getting second opinions from those close to you with more information. This is an online forum - I suggest you get a second opinion before making drastic decisions. I know of several people who took on PhDs without having things clear in their minds. That didn't work out well.

Lol idk why people get the impression that I’m relying on LW for career advice.

I’m not.

I’ve made this decision offline a long time ago.

Your life, your choice. Just saying, as a career machine learning specialist: make sure your plans are robust. Leaving academia to go into what could be considered ML research raised major red flags for me. But I don't know your situation - you may have a golden opportunity. Job offer from OpenAI et al! Farewell and good luck.

I want to address the central flaw of Akash+Olivia+Thomas's argument in the Buying Time post, which is that actually, people can improve at things.

Why do you think the Buying Time post suggested otherwise? Note that the authors did not make a blanket statement telling "bad" alignment researchers to stop working on alignment. One recommendation is for alignment researchers to consider working on technical research that concretely presents important alignment concepts or empirically demonstrates concerning alignment failures.

I was crying the other night because our light cone is about to get ripped to shreds. I'm gonna do everything I can to do battle against the forces that threaten to destroy us.

If you find yourself in a stereotypically cultish situation like that, consider that you might have been brainwashed, like countless others before you, and countless others after you. "This time is different" is the worst argument in the world, because it is always wrong. "But really, this time the AI/Comet/Nibiru/... will kill us all!" Yeah. Sure. Every time.

Consider that I have carefully thought about this for a long time, and that I’m not going to completely override my reasoning and ape the heuristic “if internet stranger thinks I’m in a cult then I’m in a cult.”

That was not the heuristic I referred to. More like "what is a reference class of cults, and does this particular movement pattern-match it? What meta-level (not object-level) considerations distinguish it from the rest of the reference class?" I assume that you "have carefully thought about this for a long time", and have reasonably good answers, whatever they are.

Humans continuously pick their own training data and generally aren't especially aware of the implicit bias this causes and consequent attractor dynamics. This could be the only bias that really matters strongly, and ironically it is not especially recognized in the one community supposedly especially concerned about cognitive biases.

Debates of "who's in what reference class" tend to waste arbitrary amounts of time while going nowhere. A more helpful framing of your question might be "given that you're participating in a community that culturally reinforces this idea, are you sure you've fully accounted for confirmation bias and groupthink in your views on AI risk?". To me, LessWrong does not look like a cult, but that does not imply that it's immune to various epistemological problems like groupthink.

Upvoted. 

Also a thing about that post which struck me as sort of crazy, and maybe I should have commented, was that it seemed to model things as like, there's different types of people; different types of people work on different things; you should figure out which type you are using thresholds and such. In particular this idea is silly because someone can do more than one thing in their life. On the margin, "buy time people" should be spending more time thinking about object-level technical alignment, more so than "technical alignment people" should be spending more time thinking about buying time.

I just want to clarify that there are also "create more alignment researchers" people, not just "buy time" people and "technical alignment" people. I am legally and morally obligated to avoid anything related to "buying time". And I also don't touch it with a ten foot pole because it seems much much much easier and safer to double the number of people working on alignment than to halve the annual R&D of the global AI industry.

If halving the annual R&D of the global AI industry is equivalent to doubling the length of time before AGI, then I think that would be substantially more valuable than doubling the number of people working on alignment. I don't think "grow alignment researchers by X" and "lengthen timelines by X" are equally valuable.

There aren't types of people!

maybe I should have commented, was that it seemed to model things as like, there's different types of people; different types of people work on different things

I do think people are in fact importantly different here. I think there exist unhealthy and inaccurate ways to think about it, but you need to contend with it somehow.

The way I normally think of this is: people have talent coefficients, which determine the rate at which they improve at various skills. You might have a basketball talent coefficient of 0.1, a badminton talent coefficient of 0.5, and a drawing coefficient of 1 (this happens to be roughly true for me personally). So, an hour spent deliberately practicing drawing will result in 10x as much skill gain as an hour practicing basketball.
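To make that toy model concrete, here's a minimal Python sketch. The linear skill-gain assumption and the specific coefficient values are purely illustrative (they're just the numbers from the paragraph above), not a claim about how learning actually works:

```python
# Toy model: skill gained = talent coefficient * hours of deliberate practice.
# The coefficients are the illustrative numbers from the comment above.
talent_coefficients = {"basketball": 0.1, "badminton": 0.5, "drawing": 1.0}

def skill_gain(activity: str, hours: float) -> float:
    """Linear toy model of skill gained from `hours` of deliberate practice."""
    return talent_coefficients[activity] * hours

# An hour of drawing practice yields 10x the gain of an hour of basketball.
print(skill_gain("drawing", 1.0) / skill_gain("basketball", 1.0))  # 10.0
```

Of course, the next two paragraphs complicate this simple picture: learning is lumpy, and enjoyment matters at least as much as the coefficient.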

This is further complicated by "learning is lumpy: the first 20 hours spent learning a thing typically have more low-hanging fruit than hours 21-50" (but also you can jump around to related skillsets, gaining different types of related skills).

Also, the most determining factor (related to but not quite-the-same as talent coefficients) is "how much do you enjoy various things?". If you really like programming, you might be more motivated to spend hundreds of hours on it even if your rate-of-skill-gain is low, and that may result in you becoming quite competent.

The problem with doing original research is that feedback loops are often kinda bad, which makes it hard to improve.

This is all to say: different people are going to be differently suited for different things. The exact math of how that shakes out is somewhat complex. If you are excited to put a lot of hours in, it may be worth it even if you don't seem initially great at it. But there are absolutely some people who will struggle so long and hard with something that it just really doesn't make sense to make a career out of it (especially when there are alternative careers worth pursuing).

Obviously different people are better or worse at doing and learning different things, but the implication that one is supposed to make a decision that's like "work on this, or work on that" seems wrong.  Some sort of "make a career out of it" decision is maybe an unfortunate necessity in some ways for legibility and interoperability, but one can do things on the side. 

I don’t think the kind of work we’re talking about here is really possible without something close to ‘making a career of it’ - at least a sustained, serious hobby for years.

How do you know that? How would anyone know that without testing it?

My beliefs here are based on hearing from various researchers over the years what timescale good research takes. I've specifically heard that it's hard to evaluate research output for less than 6 months of work, and that 1-2 years is honestly more realistic. 

John Wentworth claims, after a fair amount of attempting to train researchers and seeing how various research careers have gone, that people have about 5 years' worth of bad ideas they need to get through before they start producing actually possibly-good ideas. I've heard secondhand from another leading researcher that a wave of concentrated effort they oversaw from the community didn't produce any actually novel results. My understanding is that Eliezer thinks there has basically been no progress on the important problems. 

My own epistemic status here is secondhand, and there may be other people who disagree with the above take, but my sense is that there's been a lot of "try various ways of recruiting and training researchers over the years", and that it's at least nontrivial to get meaningful work done.

How does that imply that one has to "pick a career"? If anything, that sounds like a five-year hobby is better than a two-year "career". 

It’s hard but not impossible to put 10k hours of deliberate practice into a hobby.

I think the amount of investment into a serious hobby is basically similar to a career change, so I don't really draw a distinction. It's enough investment, and has enough of a track-record of burnout, that I think it's totally worth strategizing about based on your own aptitudes. 

(To be clear, I think "try it out for a month and see if it feels good to you" is a fine thing to do; my comments here are mostly targeted at people who are pushing themselves to do it out of consequentialist reasoning/obligation)

I think we agree that pushing oneself is very fraught. And we agree that one is at least fairly unlikely to push the boundaries of knowledge about AI alignment without "a lot" of effort. (Though maybe I think this a bit less than you? I don't think it's been adequately tested to take brilliant minds from very distinct disciplines and have them think seriously about alignment. How many psychologists, how many top-notch philosophers, how many cognitive scientists, how many animal behaviorists have seriously thought about alignment? Might there be relatively low-hanging fruit from the perspective of those bodies of knowledge?)

What I'm saying here is that career boundaries are things to be minimized, and the referenced post seemed to be career-boundary-maxing. One doesn't know what would happen if one made even a small hobby of AI alignment; maybe it would become fun + interesting / productive and become a large hobby. Even if the way one is going to contribute is not by solving the technical problem, understanding the technical problem still helps quite a lot with other methods of helping. So in any case, cutting off that exploration because one is the wrong type of guy is stupid, and advocating for doing that is stupid. 

[fixed link to Habryka's comment]

I wasn't aware of the "you're < Y%" vibe. At least not explicitly. I felt that my edge wasn't in the advanced-math early AI Safety work MIRI was doing. Maybe I'm not sensitive enough to the burn-out dynamics, but more likely I just wasn't exposed to them much, being based in Germany. I'm also pretty resilient/anti-fragile. 

Anyway, I wanted to chime in to say that even if you are objectively below Y% on hard alignment math skill, there are so many areas where you can support AI Alignment, even as a researcher. This may not have been the case in the early days but is now: AI Alignment teams need organizers, writers, managers, cyber security professionals, data scientists, data engineers, software engineers, system administrators, frontend developers, designers, neuroscientists, psychologists, educators, communicators, and more. We no longer have these small, closely-knit AI teams. With a growing field, specialization increases. And all of these skills are valuable and likely needed for alignment to succeed and should get recognition. 

Why do I say so? I have been in organizations growing from a handful to a few hundred people all my life. I'm in one such company right now and in parallel, I'm running the operations of a small AI Alignment project. I like being a researcher, but out of need, I take care of funding, outreach, and infrastructure. In bigger teams or the community at large, I see a need for the following: 

  • Organizers for running events or small team daily work, as simple as coordinating meeting invites. Without this, teams often fall apart. Shared community helps with Common Good Commitment.
  • Writers who write summaries, essays, grant applications, and blog posts, whether technical, social, or organizational. And who promote Alignment Mindset. Where do you think LW posts come from?
  • Managers, no, not moral maze middle managers, but people who find the right people to work together and make sure work goes smoothly. Also, who establish Trustworthy Command and Research Closure. Not sure what else you call that role. 
  • Cybersecurity professionals who secure the sensitive core of a project, matching Strong Opsec, as the project grows.
  • Data scientists and data engineers who, well, deal with the amount of data that inevitably goes into and out of projects that have to deal with the real world earlier or later. Also, who maintain Research Closure.
  • Software engineers to build all the glue stuff between websites, build processes, data exchange, SaaS parts, and whatnot. 
  • System administrators, ditto, provisioning and administering the above while maintaining Strong Opsec. 
  • Frontend developers - you want to see the results - graphs, data, videos, right? 
  • Designers - and it should be easy to use and understand.
  • Neuroscientists, cognitive scientists, psychologists, psychiatrists, therapists, and pedagogues who provide hard evidence about the biological, psychological, or cognitive plausibility of models or behaviors of hypothetical or simulated agents or who can suggest directions to explore or validate.    
  • Educators and communicators who reach out to interested parties and who promote Alignment Mindset and Common Good Commitment.
  • Accountants, finance professionals, financial advisors, and of course, sponsors who help establish, maintain and monitor Requisite Resource Levels.
  • Lawyers who set up a legal entity that aspires to Trustworthy Command, Research Closure, and Common Good Commitment.
  • Everybody else who can help by affirming the Common Good Commitment thus creating common knowledge that this is a worthy cause.

In the above, Trustworthy Command, Research Closure, Strong Opsec, Common Good Commitment, Alignment Mindset, Requisite Resource Levels refer to the Six Dimensions of Operational Adequacy in AGI Projects.

Hi, thanks for writing this. Sorry to hear that things are hard. I would really like if you can help me to understand these points:

A few days later, I saw this post. And it reminded me of everything that bothers me about the EA community. Habryka covered the object level problems pretty well, but I need to communicate something a little more... delicate.

What bothers you about the EA community specifically? At times, I am not sure if you are talking about the EA community, the AIS technical research community, the rationalist community, or the Berkeley AIS community. I think of them all as being very different.


I want to address the central flaw of Akash+Olivia+Thomas's argument in the Buying Time post, which is that actually, people can improve at things.

I feel I don't properly understand what you think of this argument and why you think it is flawed. 

If a fantastic programmer who could prove her skills in a coding interview doesn't have a degree from an elite college, could she get a job in alignment?

I don't think not going to an elite college has much at all to do with someone's ability to contribute to technical alignment research. If they are a fantastic programmer, that is an indicator of some valuable skills. Not the only indicator, and if someone isn't a fantastic programmer but is interested in alignment, I don't think they should automatically count themselves out.

The question of whether someone could get a job in alignment right now is very uncertain though. It's a very volatile situation since the implosion of FTX this past week and the leadership resignation from Future Fund, which had been a big source of funding for AI safety projects. Funding in AI alignment will likely be much more constrained at least for the next few months.

I hope technical alignment doesn't permanently lose people because of the (hopefully) temporary loss of funds. The CS student looking for a job who would like to go into alignment might instead be lost forever to big tech because she couldn't get an alignment job.

As someone who's been feeling a similar portfolio cocktail of emotions about alignment work under short timelines: thank you.