Give me feedback! :)
I am a Manifund Regrantor. In addition to general grantmaking, I have requests for proposals in the following areas:
You might be interested in this breakdown of gender differences in the research interests of the 719 applicants to the MATS Summer 2024 and Winter 2024-25 Programs who shared their gender. The plot shows the difference between the percentage of male applicants who indicated interest in specific research directions and the percentage of female applicants who indicated interest in the same.
The most male-dominated research interest is mech interp, possibly due to the high male representation in software engineering (~80%), physics (~80%), and mathematics (~60%). The most female-dominated research interest is AI governance, possibly due to the high female representation in the humanities (~60%). Interestingly, cooperative AI was a female-dominated research interest, which seems to match the result from your survey where female respondents were less in favor of "controlling" AIs relative to men and more in favor of "coexistence" with AIs.
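For anyone who wants to reproduce this kind of comparison on their own data, here is a minimal sketch of the percentage-point gap described above. The function name, the male-minus-female sign convention, and all counts below are assumptions for illustration; they are not the MATS applicant data.

```python
def interest_gap(male_counts, female_counts, n_male, n_female):
    """Percentage-point gap (male minus female) for each research interest."""
    gaps = {}
    for topic in male_counts:
        male_pct = 100 * male_counts[topic] / n_male
        female_pct = 100 * female_counts.get(topic, 0) / n_female
        gaps[topic] = male_pct - female_pct
    return gaps


# Made-up counts for illustration only; NOT the MATS applicant data.
example = interest_gap(
    male_counts={"mech interp": 60, "AI governance": 20},
    female_counts={"mech interp": 10, "AI governance": 15},
    n_male=100,
    n_female=50,
)
```

Under this convention, a positive gap means a larger share of male applicants selected the interest and a negative gap means a larger share of female applicants did. Comparing shares rather than raw counts keeps the result from being driven by the overall gender ratio of the applicant pool.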
1% are "Working/interning on AI capabilities."
Erratum: previously, this statistic was "7%", which erroneously included two alumni who did not complete the program before Winter 2023-24, which is outside the scope of this report. Additionally, two of the three alumni from before Winter 2023-24 who selected "working/interning on AI capabilities" first completed our survey in Sep 2024 and were therefore not included in the data used for plots and statistics. If we include those two alumni, this statistic would be 3/74 = 4.1%, but this would be misrepresentative as several other alumni who completed the program before Winter 2023-24 filled in the survey during or after Sep 2024.
Scholars working on safety teams at scaling labs generally selected "working/interning on AI alignment/control"; some of these also selected "working/interning on AI capabilities", as noted. We are independently researching where each alumnus ended up working, as the data is incomplete from this survey (but usually publicly available), and will share separately.
Great suggestion! We'll publish this in our next alumni impact evaluation, given that we will have longer-term data (with more scholars) soon.
Cheers!
I think you might have implicitly assumed that my main crux here is whether or not take-off will be fast. I actually feel this is less decision-relevant for me than the other cruxes I listed, such as time-to-AGI or "sharp left turns." If take-off is fast, AI alignment/control does seem much harder and I'm honestly not sure what research is most effective; maybe attempts at reflectively stable or provable single-shot alignment seem crucial, or maybe we should just do the same stuff faster? I'm curious: what current AI safety research do you consider most impactful in fast take-off worlds?
To me, agent foundations research seems most useful in worlds where:
I just left a comment on PIBBSS' Manifund grant proposal (which I funded with $25k) that people might find interesting.
Main points in favor of this grant
- My inside view is that PIBBSS mainly supports “blue sky” or “basic” research, some of which has a low chance of paying off, but might be critical in “worst case” alignment scenarios (e.g., where “alignment MVPs” don’t work, or “sharp left turns” and “intelligence explosions” are more likely than I expect). In contrast, of the technical research MATS supports, about half is basic research (e.g., interpretability, evals, agent foundations) and half is applied research (e.g., oversight + control, value alignment). I think the MATS portfolio is a better holistic strategy for furthering AI alignment. However, if one takes into account the research conducted at AI labs and supported by MATS, PIBBSS’ strategy makes a lot of sense: they are supporting a wide portfolio of blue sky research that is particularly neglected by existing institutions and might be very impactful in a range of possible “worst-case” AGI scenarios. I think this is a valid strategy in the current ecosystem/market and I support PIBBSS!
- In MATS’ recent post, “Talent Needs of Technical AI Safety Teams”, we detail an AI safety talent archetype we name “Connector”. Connectors bridge exploratory theory and empirical science, and sometimes instantiate new research paradigms. As we discussed in the post, finding and developing Connectors is hard: their development time is often on the order of years, and there is little demand on the AI safety job market for this role. However, Connectors can have an outsized impact on shaping the AI safety field and the few that make it are “household names” in AI safety and usually build organizations, teams, or grant infrastructure around them. I think that MATS is far from the ideal training ground for Connectors (although some do pass through!) as our program is only 10 weeks long (with an optional 4-month extension) rather than the ideal 12-24 months, we select scholars to fit established mentors’ preferences rather than on the basis of their original research ideas, and our curriculum and milestones generally focus on building object-level scientific skills rather than research ideation and “gap-identifying”. It’s thus no surprise that most MATS scholars are “Iterator” archetypes. I think there is substantial value in a program like PIBBSS existing, to support the development of “Connectors” and pursue impact in a higher-variance way than MATS.
- PIBBSS seems to have a decent track record for recruiting experienced academics in non-CS fields and helping them repurpose their advanced scientific skills to develop novel approaches to AI safety. Highlights for me include Adam Shai’s “computational mechanics” approach to interpretability and model cognition, Martín Soto’s “logical updatelessness” approach to decision theory, and Gabriel Weil’s “tort law” approach to making AI labs liable for their potential harms on the long-term future.
- I don’t know Lucas Teixeira (Research Director) very well, but I know and respect Dušan D. Nešić (Operations Director) a lot. I also highly endorsed Nora Ammann’s vision (albeit while endorsing a different vision for MATS). I see PIBBSS as a highly competent and EA-aligned organization, and I would be excited to see them grow!
- I think PIBBSS would benefit from funding from diverse sources, as mainstream AI safety funders have pivoted more towards applied technical research (or more governance-relevant basic research like evals). I think Manifund regrantors are well-positioned to endorse more speculative basic research, but I don’t really know how to evaluate such research myself, so I’d rather defer to experts. PIBBSS seems well-positioned to provide this expertise! I know that Nora had quite deep models of this while Research Director, and in talking with Dušan I have had a similar impression. I hope to talk with Lucas soon!
Donor's main reservations
- It seems that PIBBSS might be pivoting away from higher variance blue sky research to focus on more mainstream AI interpretability. While this might create more opportunities for funding, I think this would be a mistake. The AI safety ecosystem needs a home for “weird ideas” and PIBBSS seems the most reputable, competent, EA-aligned place for this! I encourage PIBBSS to “embrace the weird”, albeit while maintaining high academic standards for basic research, modelled off the best basic science institutions.
- I haven’t examined PIBBSS’ applicant selection process and I’m not entirely confident it is the best version it can be, given how hard MATS has found applicant selection and my intuitions around the difficulty of choosing a blue sky research portfolio. I strongly encourage PIBBSS to publicly post and seek feedback on their applicant selection and research prioritization processes, so that the AI safety ecosystem can offer useful insight. I would also be open to discussing these more with PIBBSS, though I expect this would be less useful.
- My donation is not very counterfactual here, given PIBBSS’ large budget and track record. However, there has been a trend in typical large AI safety funders away from agent foundations and interpretability, so I think my grant is still meaningful.
Process for deciding amount
I decided to donate the project’s minimum funding ($25k) so that other donors would have time to consider the project’s merits and potentially contribute. Given the large budget and track record of PIBBSS, I think my funds are less counterfactual here than for smaller, more speculative projects, so I only donated the minimum. I might donate significantly more to PIBBSS later if I can’t find better grants, or if PIBBSS is unsuccessful in fundraising.
Conflicts of interest
I don't believe there are any conflicts of interest to declare.
I don't think I'd change it, but my priorities have shifted. Also, many of the projects I suggested now exist, as indicated in my comments!
More contests like ELK with well-operationalized research problems (i.e., that clearly explain what builder/breaker steps look like), clear metrics of success, and a well-considered target audience (who is being incentivized to apply and why?) and user journey (where do prize winners go next?).
We've seen a profusion of empirical ML hackathons and contests recently.
Why does the AI safety community need help founding projects?