It seems likely to me that maybe 50% of people who start seriously studying or working on AI safety in the next year will be below the intelligence escape velocity, where they forever lag behind frontier AI systems in AI research ability. If I were working in capacity building, I would already start to deprioritize the earlier parts of the funnel for this reason. For reference, my TAI timelines are 2028.
Very little of the impact of people working in AI Safety is downstream of their research, so this seems wrong. Maybe you also think the same will be true for policy/communications/evals work?
Separately, I don't super buy this even for research. Capacity-building work attracts the smartest people, and those people will have been using the models to upgrade themselves anyways. It's not the case that you need to have 5+ years of research to usefully contribute, especially with models providing uplift to fill in the gaps.
Even if new people's impact isn't downstream of research, the inability to contribute to research significantly hinders their progress, because potential mentors and collaborators won't benefit from working with them, and they generally get fewer opportunities to advance their careers.
Yeah this argument applies much less to policy/communications work.
I agree that you don't need 5+ years of research to contribute, but I expect that at some point soon, if you don't have some minimum amount of context on AI safety, your job (managing a bunch of LLM agents) can just be automated by an LLM agent.
and they generally get fewer opportunities to advance their careers.
I don't know what this means? Like, they won't get invited to talk to policy makers? Or won't be successful writers informing people about the risks? Are you talking about lab employment here?
I agree that you (hopefully) won't get that many new lab employees out of recent recruitment efforts, but that's a plus not a minus!
I guess one consideration for prioritizing direct work over capacity-building is AI timelines. Do you think this should be an important consideration?
I think it bites somewhat on work engaging with much younger age groups (e.g. high school students). Outside of that, I think empirically the turnaround time I've seen between people getting engaged and doing useful work is really short in many cases (like < 1 year, even for some university students), such that even in very short timelines, most capacity-building work still looks very good.
My big concern is, to what extent do capacity-building efforts create pipelines into AI capabilities work that shortens timelines? This happens to at least some extent, but I don't know how big an effect it is.
Loosely construed, the foundings of DeepMind, OpenAI, and Anthropic were the result of x-risk capacity-building. If you count those (and it's not clear that you should), then capacity-building has probably been strongly net negative to date.
I don't think it's obvious that even if you count those, capacity-building has been strongly net-negative to date, but I do think it's pretty plausible.
Like, if you were to count the costs as broadly as "all the labs are downstream of capacity-building work" then you also need to count the benefits as broadly. A broadly known public track record of being concerned about these problems for a long time, of being motivated by altruism, of having tried to solve the problem for a long time, and of being one of the few memetic centers in the world that people draw on to figure out what to do about this whole AI situation is quite valuable, possibly more valuable than the acceleration effects of things like DeepMind, OpenAI, and Anthropic.
(that said, my actual take here is that the biggest issue with most capacity-building work is that it actively undermines the things that other capacity-building work has been highly successful at, so that ultimately some capacity building work is predictably extremely good for the world, and some is predictably extremely bad for the world).
the biggest issue with most capacity-building work is that it actively undermines the things that other capacity-building work has been highly successful at, so that ultimately some capacity building work is predictably extremely good for the world, and some is predictably extremely bad for the world
Wait what does this mean? Is there some kind of dichotomy I'm not aware of?
Maybe? I am not saying the dichotomy is common-knowledge, but I feel pretty confident predicting which capacity-building work will be quite bad in-expectation and which will be quite good (this doesn't mean there isn't variance within those categories with many orgs or people having sign-flipped impact from their reference class, but that I am happy to register predictions at the class level with like reasonably-high confidence).
I would then like to know which is which (DM is okay if you feel that would be somewhat controversial, it's also alright if you want to keep your opinions to yourself)
Sorry, I am not saying there is a classifier here that is like one sentence long. At a high level I think "is it largely funneling people into places where the incentives will point towards building more powerful AI systems and/or becoming personally more powerful, or is it putting people into positions where their primary incentives are to help other people make sense of what is going on, with some grounding in accuracy of their beliefs" is the best short classifier I have, but I didn't intend to communicate that there is some super short description of the classifier!
I’ve heard variants of this argument, and I overall haven't found them that persuasive for reasons close to the ones Habryka gives-- I think if you carve things up such that capacity-building work has been responsible for e.g. speeding up the creation of frontier AI labs, you should also credit it for the broader movement focused on catastrophic risks, and my intuition is the counterfactual world without that movement would be worse off overall. I don’t buy that the acceleration has been substantial enough that the unaccelerated world would have bought a lot of time for societal improvements useful for addressing catastrophic risks; instead, it feels to me like the unaccelerated world would be facing these risks more blindly and with less time to usefully prepare.
I also think given the massive amount of non-GCR-related interest and resources in AI now, the forward-looking acceleration effects seem likely to be much smaller than any historic effect. I generally think the ratio of “meaningfully adding to the talent pool of people working on catastrophic risks” to “meaningfully accelerating AI capabilities” for most CB programs will look extremely favorable.
See Ryan Kidd on How MATS addresses “mass movement building” concerns. One useful-ish framing from that post is that safety is more neglected than capabilities, so you should expect the marginal FTE on safety to add more than the marginal FTE on capabilities.
I'll make up numbers to illustrate the point: let's say there are 10,000 capabilities researchers and 1,000 AI safety researchers — that's a rough 10:1 ratio. Now say MATS funnels 500 people into capabilities and 500 into AI safety. Now it's a 10,500:1,500 ratio, or 7:1, which is much better. I think 50% of the MATS mentees going into capabilities is very pessimistic.
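The arithmetic above can be sketched as a quick check (all numbers here are the made-up illustrative figures from this comment, not real data):

```python
def ratio(capabilities: int, safety: int) -> float:
    """Capabilities-to-safety researcher ratio."""
    return capabilities / safety

# Made-up baseline: 10,000 capabilities researchers, 1,000 safety researchers.
before = ratio(10_000, 1_000)            # 10.0, i.e. 10:1
# Pessimistic assumption: a program adds 500 people to each side.
after = ratio(10_000 + 500, 1_000 + 500)  # 7.0, i.e. 7:1
print(before, after)
```

Even under the pessimistic 50/50 split, the ratio improves, because safety is the smaller pool and the same absolute inflow moves it proportionally more.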
According to the MATS Alumni Impact Analysis, the ratios are 78%:
If we (perhaps unfairly) ignore the independent researcher, then that's 3 to capabilities for every 100 to safety!
This also ignores the effects of having capabilities researchers more clued up / familiar with AI safety — you might think this is net-positive (bc they are marginally more likely to update on new evidence and pivot to safety) or net-negative (bc they misuse our technical insights). I think this probably shakes out to be net-positive.
But if locally robust alignment makes faster progress on capabilities without being a version of locally robust alignment that can become asymptotically robust alignment and give us a win node, then it's not only less obviously good, but might well mostly backfire. I don't believe the most extreme version of this claim, but I used to, and I know people who do. The "people going into capabilities" count also includes "local alignment that doesn't generalize to asymptotic alignment".
The naive way to get asymptotic alignment, and possibly the only way, is to have locally robust alignment that is reliably always ahead of the impactfulness of deployment. But if impactfulness of deployment is always operating at 110% of local robustness, that seems like a recipe for disaster. So the question is ultimately one about how to stay in the local validity window indefinitely, even as that local window gets harder to be sure we've maintained.
Also there's the whole thing about how alignment is fundamentally about figuring out what to ask for, and being robust about that is a confusing question at best. Most of MATS seems too focused on the (very hard!) problems of achieving local-alignment-good-enough-to-keep-going to spend the necessary effort to succeed at the really hard parts later (which is not obviously a mistake on MATS's part, but might well be).
It's much easier to get reliable robustness for local alignment when you don't have to solve "what should a thing overwhelmingly smarter than us do?" yet, but if there's a hole coming where a thing moderately smarter than us has the ability to confuse us into choosing a plan that seems good to it but which is not what we want, we may fall off the manifold we would have wanted to be on if we had been able to see things coming without getting confused.
One additional reason that capacity-building for AI safety seems good right now is that very soon I expect there to be a lot more funding available for AI safety work, from Anthropic donors (see Front-Load Giving Because of Anthropic Donors? and this comment of mine) as well as a broader societal wakeup about risks from AI.
When money becomes more available, the bottleneck becomes "good opportunities/people to spend money on", which is what capacity-building produces. Also, starting asap seems important -- capacity-building takes time to set up and bear fruit, and some kinds of capacity-building have snowball-y effects (e.g. MATS).
TL;DR:
Cross-posted from Multiplier
I work on the capacity-building team on the Global Catastrophic Risks half of Coefficient Giving (formerly known as Open Philanthropy). Our remit is, roughly, to increase the amount of talent aiming to prevent unprecedented, globally catastrophic events. These days, we’re mostly focused on AI, and we’ve funded a number of projects and grantees that readers of this post might be familiar with, including MATS, BlueDot Impact, Constellation, 80,000 Hours, CEA, The Curve, FAR.AI’s events, university groups, and many other workshops and projects.
The post aims to make the case that broadly, capacity-building work (including on AI risk) has been and continues to be extremely impactful, and to encourage people to consider pursuing relevant projects and careers.
This post is written from my personal perspective; that said, my sense is that a number of CG staff and others in the AI safety space share my views. I include some quotes from them at the end of this post.
I’m writing this post partly to correct what I perceive as an asymmetry between how excited I and others at Coefficient Giving are about this kind of work and how excited people in the EA and AI safety communities seem to be to work on it. The capacity-building team is one of three major teams working on AI risk at Coefficient; we currently have 11 staff, which is ⅓ of the total AI grantmaking capacity, and gave away over $150M in 2025. I started my stint at Coefficient Giving in 2021, working half-time on technical AI safety grantmaking and half-time on capacity-building grantmaking; I ultimately switched to working full-time on capacity-building, among other reasons because my sense was that the latter team was several times (maybe an order of magnitude) more impactful. Things seem somewhat different to me now (I think the set of opportunities in technical AI safety grantmaking looks significantly better than it did in 2021), but my sense is that capacity-building as an area of work is still massively underrated relative to its impact.
The case for capacity-building work
The naive case for this kind of work (often called the multiplier effect argument) goes something like this: say you can spend a little time doing direct work yourself, or spend that same amount of time getting one of your equally talented friends into direct work for the rest of their life. Getting your friend into direct work is most likely the more impactful option, because you get to “multiply” your lifetime impact (in this case, by almost a factor of 2) by getting a whole additional person to spend their career on work you think is important.
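The multiplier arithmetic can be made concrete with a small sketch (the function and all parameter values below are illustrative assumptions, not figures from the post):

```python
def impact_multiplier(career_years: float, recruiting_years: float,
                      p_success: float) -> float:
    """Expected career-years of direct work if you spend `recruiting_years`
    trying to bring one equally talented person into the field, relative
    to just doing direct work yourself the whole time."""
    direct = career_years
    recruited = (career_years - recruiting_years) + p_success * career_years
    return recruited / direct

# Illustrative assumption: 40-year careers, 1 year spent recruiting,
# and the recruitment is certain to succeed.
print(impact_multiplier(40, 1, 1.0))  # 1.975, i.e. "almost a factor of 2"
```

The interesting lever is `p_success`: the multiplier stays above 1 whenever the chance of success exceeds the fraction of your career spent recruiting, which is why the next section's question of tractability matters so much.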
In fact, whether this argument goes through depends on a few premises: namely, how good the direct work you would have done would be, and how tractable it is to convince others as talented as you. I’m going to skip over the first premise for now (and attempt to address it in a later section) and present evidence that our team has collected over the years that makes me think this work is very tractable, and in particular, that there are easy-to-execute interventions that reliably influence people’s career trajectories in substantial ways. A priori, you might think that people’s career choices happen randomly and chaotically enough that it’s difficult to make a substantive impact by trying to change what people work on. But in fact, both anecdotal evidence we’ve observed and larger-scale data collection we’ve attempted (both presented below) suggest that intentional efforts make a big difference to individual career trajectories (including those of individuals who go on to do highly impactful work). I think that core stylized fact makes up the main case for why capacity-building work is worthwhile.
I will briefly note that while the below case is focused on successes from capacity-building, I do think this work has the potential for harm, though my overall view is that efforts in this space executed by thoughtful, high-context individuals will be very positive in expectation. I discuss this in this appendix.
Surveys
In 2020 and 2023, our team ran two similar, in-depth surveys where we asked low-hundreds of people currently working on (or relatively likely to work on) impactful GCR work what influenced their career trajectories. Survey respondents included employees at AI labs, staff at key technical, policy, and capacity-building organizations in AI, and promising-seeming early career individuals. The aim of the surveys was to provide some evaluation of the impacts of the grants our team had made, as well as to generate some evidence informing Coefficient Giving’s views on capacity-building work as a whole.
The survey used a variety of prompts to elicit evidence from respondents about what had influenced their career choices. One of the sections asked respondents to list, unprompted, the top 4 influences that they thought were most important to their current career trajectory (these included things like “my partner”, “inherent curiosity”, etc.).
In 2023, 60% of respondents listed a capacity-building program or organization that our team was funding in their top four influences, with the most common being university groups (listed by 25% of respondents), 80,000 Hours (listed by 20% of respondents), and EAG/EAGxes (listed by 12% of respondents).
See the table below for a longer list of the commonly listed influences, sorted manually into (somewhat subjectively decided) buckets. Note that:
[Table: commonly listed influences, with counts out of 329]
Testimonials
I’m not able to share the individual free-write responses from the survey above, but I recently personally asked some individuals who I think are doing high-impact work to tell me how they came to be doing that work, followed by what they thought the most important or counterfactual influences on their trajectories were.
Below, I include Claude summaries of their overall stories along with their description of the most important influences, lightly edited. Some notes on the testimonials I've included:
Neel Nanda (Senior Research Scientist at Google DeepMind)
“Here's a list of the salient influences on me:
Max Nadeau (Associate Program Officer (Technical AI Safety) at Coefficient Giving)
Claude’s summary:
Max got it into his head in high school that human-level AI was coming during his lifetime and that it was important to make sure the process went well, but he had no idea anyone was working on it. In college, he got connected with Stephen Casper, where he learned practical ML skills, and to someone who connected him to the people running the Impact Generator retreat [Asya note: this was a small GCR-focused workshop series run in the Bay in 2022], which he was later invited to. He talked to Tao Lin at that retreat, and Tao offered him a TA position at the ML bootcamp Redwood was running, with three weeks to learn the material. He thought he'd be in the Bay for three days, but stayed six weeks. TA'ing turned into an internship at Redwood, which he took a semester off college to do. While interning he got to know Ajeya, and by the time he graduated she offered him a job.
Max on what was most important:
Rachel Weinberg (founder and former head of The Curve, currently at AI Futures Project)
Claude’s summary:
Rachel got into effective altruism in high school through friends, and started a group at her university. She spent some time interning running retreats and ended up helping with Future Forum, a futurism conference that required a last-minute venue switch. She took a semester off to study AI safety, but decided she wasn't interested in research, and did web dev for a while. After running Manifest 2024, she started The Curve, and is now working on other field-building projects.
Rachel on what was most important:
Marius Hobbhahn (CEO and founder of Apollo Research)
Claude’s summary:
During his first week of university in 2015, someone handed him Superintelligence. He studied cognitive science, did a CS bachelor's in parallel, then a machine learning master's and PhD to prepare for AI safety work. In 2022 he started doing AI safety research on the side with a grant from the Long-Term Future Fund. He paused his PhD, did MATS in early 2023, concluded that deceptive alignment was the biggest problem and that no one was doing evals for it, and started Apollo, which he’s been running since.
Marius on what was most important:
Adam Kaufman (member of technical staff at Redwood Research)
Claude’s summary:
Adam knew from an early age that superintelligence would be scary if someone built it, but assumed it wasn't going to happen in his lifetime. When he got to college, he joined the AI Safety Fundamentals reading group that the Harvard AI Safety group (HAIST) was running, thought the people were extremely cool, and made most of his close friends there. He became increasingly convinced the problem was urgent as language models kept getting smarter. He met Buck Shlegeris at a HAIST retreat, talked to him, and applied to MATS. He did MATS at Redwood, enjoyed it so much he took time off school, and has been working there since.
Adam on what was most important:
Gabriel Wu (member of technical staff (alignment) at OpenAI)
Claude’s summary:
Gabe was given a copy of The Precipice when he started as a freshman at Harvard. There was no formal AI safety team at the time, but a group of 7-10 people would gather weekly to talk about x-risk in a dining hall, so he joined, and ended up going to a long workshop in Orinda [California]. He did REMIX [Asya note: this was a mechanistic interpretability bootcamp] the following winter, which introduced him to the Constellation community, and then applied for a Redwood internship for the next summer. After others graduated, he became the new director of HAIST (the Harvard AI Safety Team). He worked with the Alignment Research Center, applied to labs, and was eventually convinced by several people to join OpenAI.
Gabe on what was most important:
Catherine Brewer (Senior Program Associate (AI Governance) at Coefficient Giving)
Claude’s summary:
Catherine found 80,000 Hours before university through internet searching about careers, then read Doing Good Better. They engaged with the Oxford effective altruism university group, going to events and helping run programming. Through the group they made friends who were into AI safety and argued with them a bunch, which got them interested in AI safety. They applied for the ERA fellowship (then called CERI) after someone from the group told them to, and spent a summer thinking about AI safety with other people. Then they did the GovAI fellowship, which they found even more helpful, via meeting people and developing their own takes on relevant topics. After that they were interested in AI governance, and applied to Open Philanthropy when they were graduating.
Catherine on what was most important:
Aric Floyd (video host for AI in Context)
Claude’s summary:
Aric found GiveWell by Googling for the most effective charities in his late teens, but didn't find the broader effective altruism community until 2020, when a friend found an online student summit that CEA ran. He knew the people who led the Stanford effective altruism group, but never had time to get involved, and was then invited by those people to help with some community-building efforts at MIT. He was also invited to Icecone [Asya note: this was an AI-risk-focused workshop run in 2022], and came out of it persuaded that AI safety was a big deal, but less convinced that theoretical alignment work was the way to proceed. He did a bunch of short sprints of community-building work and met Chana Messinger while teaching at the Atlas Fellowship, and later the Apollo program in the UK. When 80K started thinking about video production, Chana brought him on because they'd worked well together before, and because Aric had prior experience in film & television acting. Aric had previously been encouraged by [experienced EA leaders / Will MacAskill, among others] to do public-facing content creation, and decided to give it a shot.
Aric on what was most important:
Ryan Kidd (Director of MATS)
Claude’s summary:
Ryan read HPMOR and LessWrong in high school, but he didn't anticipate near-term AGI until rediscovering the idea through effective altruism around 2020. He co-organized the effective altruism group at the University of Queensland during his physics PhD, where his interest in catastrophic risk evolved from climate change activism to nuclear winter modeling to AI risk after reading The Precipice. He completed the first AI Safety Fundamentals course, applied unsuccessfully to FHI and CLR, then did the SERI MATS pilot program. He attended Icecone [Asya note: this was an AI-risk-focused workshop run in 2022] in Berkeley, where he met Holden Karnofsky, Ajeya Cotra, Buck Shlegeris, and many future colleagues. While completing the MATS research phase with John Wentworth as his mentor, he sent the co-organizer a document explaining how he would improve the program and got invited to join the organizing team. He's co-led MATS with Christian Smith since late 2022.
What Ryan says was most significant (in order of importance):
What tends to work?
While some of the interventions affecting people’s career trajectories are fairly idiosyncratic, we’ve noticed a few broad categories that tend to be impactful on people’s careers (many of which are featured in the testimonials above).
Notably, unlike content, in our experience programs and events can have a sizable impact even if they don’t meet an exceedingly high-quality bar, making them a good bet for a wider range of people to work on. Generalizing from anecdotes, I speculate that programs and events (especially in-person ones with other participants at a similar point in their careers) often have the effect of causing someone to take changing their career more seriously as a possibility, whereas previously they had been engaging e.g. online in a fairly abstract or detached way.
What’s good to do now?
Our recent request for proposals gives some examples of the kinds of projects we’d be interested in seeing on the current margin. Briefly highlighting some specific things that I or others on my team think would be good, based on our sense of both what’s worked in the past and the current AI risk landscape:
Who should be doing this work?
The above makes the case for why you might think capacity-building work is valuable, but doesn’t in itself provide a point of comparison for what someone could be doing otherwise (namely direct work, which itself could have its own capacity-building benefits, e.g. by creating evidence that there’s important work to be done in an area).
I don’t have a rigorous method of comparing the value of potential direct vs. CB interventions, and I think there’s room to make a variety of plausible cases. That said, I will share my intuitions, as well as the intuitions of some others at Coefficient.
I generally encourage people to think about their career choices at an individual level, but from an overall talent allocation perspective, my current take is that many of the marginal hires at larger organizations doing technical or policy work right now (including e.g. Apollo, Redwood, METR, RAND TASP, GovAI, Epoch, UKAISI, and Anthropic’s safety teams) would be capable of founding or being an early strategy-setting employee at a top capacity-building organization, and would have more impact by doing so.
I think individuals who are most well-suited to capacity-building work are those who are (some subset of) entrepreneurial, socially skilled, operationally strong, or strong communicators in the relevant subject areas. I think work running programs or events is particularly loaded on the first three of these, whereas e.g. producing content is much more loaded on the last.
What would doing this work look like?
If you think you might be someone who should plausibly be doing capacity-building work, here are some things you could consider:
Working at an organization doing good work in the space
There are a number of actively-hiring organizations that I think are doing impactful capacity-building work (see some of them in this filtered 80K job board), but here I’m going to plug some organizations where I feel a strong hire could be particularly impactful.
If you think you might be interested in any of the below but are on the fence, you can DM me or fill out this form and I’ll aim to take at least a 15-minute call with you (and longer if it seems useful; up to a limit of 20 such calls).
Constellation - CEO
Constellation is a research center and field-building organization located in Berkeley, California, that hosts a number of organizations and individuals doing impactful work in the AI safety space. In addition to running the space itself, it’s historically run programming through the space, including the Astra Fellowship, the Visiting Fellows Program, and a number of one-off workshops and events.
Given the dense concentration of high-context talent working there, I think Constellation has huge potential to be impactful both as a convening place for people doing this work, and as a host of a number of programs and events, including (potentially) ones aiming to engage policymakers, AI lab employees, and other high-stakes actors relevant to the AI space.
Constellation is looking for a new CEO who I expect to be the primary individual setting Constellation’s strategic direction. I think that position will be extremely impactful and I'd like them to get a strong hire.
Kairos – various early generalist positions
Kairos runs SPAR, a remote AI safety research mentorship program, provides advice and monetary support for AI safety university groups, and has taken on running workshops for promising young people. I think there’s a massive amount of evidence about the effectiveness of all three of these interventions (some of which you can see in the testimonials above), and I think university groups and workshops for young people in particular are (still) extremely neglected relative to their historic impact.
I think Kairos has a very strong leadership team and important, neglected priorities (plus, Agus is a great Tweeter), and I think it would be very impactful for them to have early hires who are strong generalists that could own priority areas-- they plan to open multiple new hiring rounds very soon, and you can fill out their General Expression of Interest form to be added to their potential candidate pool for those roles.
Starting or running your own capacity-building project or organization
Our team is always accepting applications for funding. This section above as well as our request for proposals describes some kinds of projects in AI capacity-building that we might be particularly excited to fund, but I also encourage people to form their own views about what might be effective and not anchor too strongly to past work.
Working on a capacity-building project part-time
We’ve seen a lot of successful capacity-building work started or run entirely by people or organizations doing it on the side of their day-to-day work, including MATS (which was started by full-time Stanford students), a number of impactful workshops and events, and a lot of widely-read public communications.
Subscribing to Multiplier, a Substack with thoughts from our team (and other AI grantmaking staff at CG)
Letting our team know
If you think you might be interested or a good fit for this kind of work, but aren’t sure where to start, we would love it if you let us know by filling out this very short expression of interest form. We’ll reach out if there are projects or opportunities on our radar that we think might be a particularly good fit for you. (Note that we don’t expect to reach out to most respondents).
Social proof
This post is coming from my personal perspective, but my sense is my position here is directionally shared by at least some at CG and elsewhere in the AI safety space. I asked a few people who were not working on capacity-building, but who I felt had substantial context on capacity-building efforts, to share their takes below:
Julian Hazell, AI governance and policy at Coefficient Giving
“As I've written about before, I'm really into capacity building.
Funny enough, a Coefficient Giving career development grant and the GovAI fellowship were very important inputs into my current career trajectory. I probably would've eventually found my way into AI governance work regardless, but these programs jumpstarted my career and turned me into a useful contributor much faster than I otherwise would've been.
On the grantmaking side, I funded a number of projects where capacity building was a core part of the theory of change, and I've seen results that have been genuinely exciting.
If I could wave a magic wand to reorganize talent allocation in the AI safety community at my whim, I'd move a decent number of people currently in research and policy roles into capacity building. I think it's that underrated.”
Trevor Levin, AI governance and policy at Coefficient Giving
“I co-sign this post. There's so much to do to make the world more ready for transformative AI, and the ecosystem is full of projects that need a founder or are a couple more great hires from being much more impactful. We desperately need more talented and motivated people to keep showing up. Also, for me and I think for many others, the work can be deeply rewarding -- it often has more social contact and shorter feedback loops than other types of work.”
Ryan Greenblatt, Chief Scientist at Redwood Research:
"I agree with Asya's post and think that capacity-building work is underdone and underrated. One delta is that I would emphasize the importance of capacity-building-type work done by people who are doing object-level work in the field: I think object-level work is complementary to capacity building, and people doing object-level work should spend a larger fraction of their time doing or helping with capacity building."
Buck Shlegeris, CEO of Redwood Research
Asya: I'd broadly be interested in you giving your take on the kind of work that my team funds.
Buck: I don’t know the current distribution.
Asya: Our biggest grantees are MATS, CEA, Constellation, BlueDot, LISA, Tarbell, 80K, FAR AI's events, a bunch of university groups, and a bunch of other stuff.
Buck: Many of those seem pretty good. I think that overall, capacity building that tries to get people to think through a bunch of issues related to transformative AI, especially helping people develop scope-sensitive beliefs about it, has gone quite well historically and put us in probably a much better position than we'd be in without it. I'm excited for that work happening on the margin, and I feel like every year we're somewhat better off because of capacity building that was done that year or the previous year, or because of projects done by those organizations. That all seems great.
Asya: A claim I make in my post is that ‘many of the marginal hires at larger organizations doing technical or policy work right now (including e.g. Apollo, Redwood, METR, RAND TASP, GovAI, Epoch, UKAISI, and Anthropic’s safety teams) would be capable of founding or being an early strategy-setting employee at a top capacity-building organization, and would have more impact by doing so.’ I'm curious for your immediate takes on that proposition.
Buck: I don't know how many of them have that capability. I think if they have that capability, they should strongly consider doing so.
Maybe something like this: I think MATS and Redwood represented two different philosophies on how to increase the amount of technical AI safety research done, and it's very unclear which one was better. MATS looks at the very least competitive. It's been involved in the production of a huge amount of AI safety research that I'm happy exists. And a heuristic that would have suggested you shouldn't work on MATS early on seems to have gotten wrecked by posterity.
Asya: Cool, those are the main questions I want to ask you. Any other commentary you'd want to include here?
Buck: Capacity-building work seems good. I encourage Redwood staff to participate in capacity-building work; I think it's worth their time on the margin. I'm going to be involved in a bunch of it myself.
Appendix
My post in large part focuses on the case for successes from capacity building, but I do think there are a number of mechanisms through which work in this category can do harm, e.g. by misrepresenting key ideas to broad audiences, alienating people who would otherwise have been sympathetic to this work, or empowering individuals who ultimately make the ecosystem worse. While these effects are real and material, my overall view is that the negative impacts in the space have likely been substantially outweighed by the positives. My expectation is that most efforts in this space executed by thoughtful, high-context individuals will be very positive in expectation, so I feel good about publishing broad encouragement to pursue this work on the current margin.
Without going into detail, my intuitions here come from an overall assessment of the work done by global-catastrophic-risk-focused groups over the years, which my personal best guess is has been very positive on net, even accounting for substantial negatives (e.g. the actions of Sam Bankman-Fried). That said, I've heard a number of arguments for why that may not be the case, or for why certain large classes of efforts may have been disproportionately harmful, which I largely won't cover here. Ultimately, addressing these is not the main focus of this post; if this feels like a major crux for your views on this kind of work, I encourage you to come chat with me about it in person sometime.
I will briefly say that I think it makes sense to evaluate capacity building at the level of individual interventions affecting specific groups of people, and that being skeptical of some work is compatible with being excited about other work. Given that this work is (according to me) very high-leverage, I'd encourage even broadly skeptical individuals to think about whether there are specific interventions that it would make sense for them to pursue.