I wish the median AI safety researcher were much more ambitious with the problems they choose to tackle. Unfortunately, job and funding incentives are biased against research ambition.
I’d thus like to celebrate the people who have taken risks to pursue ambitious AI safety research directions that have not panned out (or have yet to pan out!). These people will not have received riches and accolades from the field, or even their close peers.
For much of the work that falls in this category, it might be obvious to many others at the time that it isn’t going to lead anywhere. Unless their efforts were likely to be actively harmful, I still want to celebrate these people for their courage, and for pushing against the incentive gradient.
I was planning on making a long list of work that I thought fell into this category, but quickly felt uncomfortable including and excluding people’s work without more thorough analysis that I didn’t think was worth it. Maybe some day I will.
If you think that you’ve done work that falls into this category, thank you for your +EV. Humanity is grateful for your efforts.
I wish you had said ambitious AI safety research directions that you're glad (at least ex ante) that people have pursued; it's hard for me to otherwise know what I think of this.
I am glad ex ante about ARC's work, which counts. I think mech interp was probably pretty bad for the AI safety field overall (because of opportunity cost of people who worked on it), and I think its vibe of ambitiousness probably made the research go substantially worse (e.g. I wish people had either had the pragmatic interpretability attitude earlier, or that they'd been much more careful when making claims about their progress on ambitious goals). I can't think of ambitious research directions that I wish were more strongly incentivized by the AI safety community; I think LWers are overall too enthusiastic and credulous about novel ambitious research directions.
> Unfortunately, job and funding incentives are biased against research ambition.
I agree with job incentives (because many jobs are at AI companies that end up pushing people to contribute to the problems that are currently important), but I don't really agree re funding incentives.
There are two (overlapping) groups that I wanted to thank and incentivise which I bundled together in the OP. In hindsight I probably should have separated them out.
1. Those that pursue research because they think it’s valuable and are forgoing big bucks and external credibility. This could be ambitious or incremental work.
2. Those that aim to solve fundamental problems that would lead to step changes in our ability to control/guide AI.
Point 1 is uncontroversial. 2 is cruxy to the extent that people disagree about the expected value safety-wise of ambitious and incremental work.
My guess is that I think E(ambitious) - E(incremental) is larger than you do. Miscellaneous grab bag of ambitious work that I think is or was high EV:
- some mech interp (fwiw I’d call work that e.g. applies SAEs to a new problem or tries to minorly improve them incremental research rather than ambitious.)
- deep learning theory
- agent foundations
- Steven Byrnes brain-like AGI stuff
Then yeah, I also claimed that jobs and funding are biased against ambitious research. As you mentioned, the case for jobs is clear. Re funding, I think it’s very hard to evaluate ambitious proposals, which typically don’t have good short-term milestones. It at least seems like CG are actively trying to overcome that. More varied funding sources would help too.
I think I'm glad Steven Byrnes is doing what he's doing, but notably he is in fact funded to do it! I don't know how hard it was for him to get funding.
I don't really feel excited for people being more encouraged to do deep learning theory or agent foundations. I do appreciate that people tried out agent foundations back in the day.
Oh, I didn’t mean to imply that getting funded at all for ambitious research wasn’t possible or was extremely difficult. The directions I mentioned above are all funded to some extent. Just that e.g. I expect Steven could get 10x+ financial compensation by joining a lab and doing less ambitious AI safety research (or, with less confidence, get more compensation and credit by starting a for-profit or even a non-profit doing less ambitious work). And that people making these sacrifices should be lauded.
Here's a brief attempt to ground/concretize people's understanding of how much funding CG provides for "more ambitious" AI safety research: I can quickly think of around $40m worth of grants we've made over the last year (a lot in the last few months, so not all disbursed/posted yet) that meet the following criteria (and probably there are more I didn't think of):
A few preemptive clarifications:
CG would like to spend more money on more ambitious work, and I am trying to make it happen.
Is the bottleneck lack of grantmaking capacity, especially for active grantmaking?
what could be done to change those job/funding incentives? are existing grantmakers too timid? is there room to start a new grantmaking org that makes more ambitious bets?
I’m not very clued in to the current grantmaker space. But I know that top researchers in interp are queried about funding proposals far less often than they could handle. I expect it’s the same for other fields.
My thoughts on this are pretty uncertain but here's some semi hot takes. Listed as they come to me in no particular attempt to make a coherent case for anything.
* We want ambitious but also good research. Easier said than done obviously.
* Having an ambitious research program is anti-correlated with experience. This is too bad, because research experience is extremely important and likely undervalued or hard to value (as is the case with the ever elusive taste, which I think one can develop through experience).
* Ambitious research is not worth much without all the other things that make good research good.
* It's not obvious to me the issue is primarily one of incentives.
* The main thing one should do to get more people doing ambitious research is to do good ambitious research yourself, and to show by example what good ambitious research looks like, and convince others through object level things that this is a good path to take.
* It's harder to do good ambitious research than it is to do good research that's less ambitious. That's a real tradeoff!
* I think funding opportunities are there for ambitious research to happen, but the bar is higher for it to be funded, and that seems reasonable?
* I'm still unclear on what people actually mean when they say ambitious research. For instance, sometimes "ambitious interp" is used for things I don't personally find all that ambitious. I have no doubt that things I think are ambitious, others find quite pedestrian. Maybe it's a taste thing.
My understanding is that ‘ambitious’ proposals are often highly illegible, such that only a few dozen humans are equipped to seriously evaluate them (even with substantial effort), and those people often have strong opinions that weigh unfavorably on their appraisal of proposals, as well as having directions they’re more excited about that weigh heavily on their time.
Like, when I imagine the people I’d like to see evaluating the proposals, it’s sort of the same 10-20 people I wish were doing everything, because research taste is extremely scarce, and perhaps the most important resource in grant making.
It’s also not very institutionally scalable.
(80 percent confidence; I’ve been around a lot of grant making and fieldbuilding, but haven’t been In The Room for an explicit funding decision regarding projects outside of the dominant, lab-supported ML paradigm of safety.)
Given this, it seems like we probably should invest more resources in getting people with that sort of research taste.
Fieldbuilders are trying this already (eg courting experienced scientists in other fields, building bridges to academia, etc). It’s not really clear to me how well this works (or how often research taste generalizes cross-field, since it’s a heavily context-dependent skill).
Further, the most likely kind of person to poach successfully from another field is an MLE, and they’re more likely to pursue various ML approaches that I expect our OP would consider ‘unambitious’.
MATS, at least through 2024 (and maybe still; I don’t know), put a lot of emphasis on trying to help people develop research taste. I think results were sort of mixed, because this is an extremely difficult thing to teach, or even describe in a way that someone who doesn’t feel motivated to learn it already will understand.
> Fieldbuilders are trying this already (eg courting experienced scientists in other fields, building bridges to academia, etc). It’s not really clear to me how well this works (or how often research taste generalizes cross-field, since it’s a heavily context-dependent skill).
I don't think research taste in other fields implies very much alignment taste (e.g.).
> MATS, at least through 2024 (and maybe still; I don’t know), put a lot of emphasis on trying to help people develop research taste. I think results were sort of mixed, because this is an extremely difficult thing to teach, or even describe in a way that someone who doesn’t feel motivated to learn it already will understand.
I think it failed because there weren't good enough feedback loops to assess whether people got any better at taste.
I agree with both of these things (although I’m not sure I would say MATS failed, rather than ‘it’s going worse than one might hope, and feedback loops are one major reason for that’).
So what did you mean by ‘getting people with that sort of research taste’? Like, somehow incentivizing people in that small group of 10-20 or so to spend more time evaluating grants?
I think funding incentives are really quite fucked up. I roughly stand behind my summary of funding incentives in this comment (not perfectly, but roughly): https://www.lesswrong.com/posts/wn5jTrtKkhspshA4c/michaeldickens-s-shortform?commentId=zoBMvdMAwpjTEY4st
Flowery words, and appreciated ones, but they won't stave off any burnout or shake anyone more established awake, let alone pay rent. If only it were otherwise.
When people try to be more ambitious it often makes their research worse, because it closes them off to interesting research directions that they don't yet understand how to scale.
I'm more excited about encouraging people to do creative and beautiful research. Here's an example of creative research, and here's an example of beautiful research, both from Jascha Sohl-Dickstein. In the long term I expect creative and beautiful research to uncover much more interesting and important phenomena.
To be clear, when I said “unfortunately, job and funding incentives are biased against research ambition”, this wasn’t a claim that this is an issue that can and should be easily fixed by AI safety grantmakers. Maybe it could be, I don’t know enough to make a call there. I’d make the same statement about probably any research field. Grantmaking to projects that don’t have short-term milestones, often involving researchers without legible experience, would be very difficult. Same goes for giving jobs to people that do that kind of research.
While I’m glad people are discussing grantmaking, this was simply an appreciation post for people who have done or are doing this kind of research without 7-figure salaries or much external credibility.
Who is the median AI safety researcher? Probably they are new to AI safety, and maybe they are new to research. In which case, I do not think they should be much more ambitious. If you mean the 'experience-weighted median researcher hour' or something similar then I'd likely agree.
A bit of a nitpick, but there's high value to narrowly-scoped research for getting tight feedback loops. On the other hand, favoring such problems can skew the community view of what is important and what research is in general. I see that as more of a comms problem than anything though.
My first sentence was very much a finger to the wind. Classifying people as AI safety researchers would be a post in itself.
Agreed that there’s high value to narrowly-scoped research. I encourage people to do it, especially new researchers. But people should be mindful that they’re doing it at all and why they’re doing it.