MATS Winter 2023-24 Retrospective

utilistrutil; LauraVaughan; deus_ex_maki; Christian Smith; Juan Gil; Henry Sleight; Matthew Wearden; Ryan Kidd

11th May 2024

AI Alignment Fieldbuilding, MATS Program, Postmortems & Retrospectives

Orpheus16 (2y)

Somewhat striking that the top 3 orgs on the career interest survey are Anthropic, DeepMind, and OpenAI.

I personally suspect that these are not the most impactful places for most MATS scholars to work (relative to say, UKAISI/USAISI, METR, starting new orgs/projects).

Regardless, curious if you have any thoughts on this & if it reflects anything about the culture/epistemics in MATS.

(And to be clear, I think the labs do have alignment teams that care about making progress & I suspect that there are some cases where joining a frontier lab alignment team is the most impactful thing for a scholar.)

Ryan Kidd (2y)

I think the high interest in working at scaling labs relative to governance or nonprofit organizations can be explained by:

Most of the scholars in this cohort were working on research agendas for which there are world-leading teams based at scaling labs (e.g., 44% interpretability, 17% oversight/control). Fewer total scholars were working on evals/demos (18%), agent foundations (8%), and formal verification (3%). Therefore, I would not be surprised if many scholars wanted to pursue interpretability or oversight/control at scaling labs.
There seems to be an increasing trend in the AI safety community towards the belief that most useful alignment research will occur at scaling labs (particularly once there are automated research assistants) and external auditors with privileged frontier model access (e.g., METR, Apollo, AISIs). This view seems particularly strongly held by proponents of the "AI control" metastrategy.
Anecdotally, scholars seemed generally in favor of careers at an AISI or evals org, but would prefer to continue pursuing their current research agenda (which might be overdetermined given the large selection pressure they faced to get into MATS to work on that agenda).
Starting new technical AI safety orgs/projects seems quite difficult in the current funding ecosystem. I know of many alumni who have founded or are trying to found projects who express substantial difficulties with securing sufficient funding.

Note that the career fair survey might tell us little about how likely scholars are to start new projects as it was primarily seeking interest in which organizations should attend, not in whether scholars should join orgs vs. found their own.

Orpheus16 (2y)

Thanks for these explanations– I think they're reasonable & insightful. A few thoughts:

"Most of the scholars in this cohort were working on research agendas for which there are world-leading teams based at scaling labs"

I suspect there's probably some bidirectional causality here. People want to work at scaling labs because they're interested in the research that scaling labs are doing, and people want to focus on the research the scaling labs are doing because they want to work at scaling labs.

"There seems to be an increasing trend in the AI safety community towards the belief that most useful alignment research will occur at scaling labs"

I think this is true among a subset of the AI safety community, but I don't think this characterizes the AI safety community as a whole. For example, another (even stronger IMO) trend in the AI safety community has been towards the belief that policy work & technical governance work is more important than many folks previously expected it to be (see, e.g., Paul joining USAISI, MIRI shifting to technical governance, UKAISI being established, not to mention the general surge in interest among policymakers).

One perspective on this could be "well, MATS is a technical research program, and we're adding some governance mentors, so shrug." Another perspective on this could be "well, it seems like perhaps MATS is shifting more slowly than one might've imagined, resulting in a culture/ecosystem/mentor cohort/selection process/fellow cohort that disproportionately wants to join scaling labs."

RE shifting more slowly or having a disproportionate focus, note that the ERA fellowship has prioritized toward governance and technical governance– 2/3 of their fellows will be focused on governance + technical governance projects. I'm not necessarily saying this is what would be best for MATS, but it at least points out that we should be seeing MATS' focus on incubating "technical researchers that want to work at scaling labs" as something that's part of its design.

I might be a bit "biased" in that I work in AI policy and my worldview generally suggests that AI policy (as well as technical governance) is extremely neglected. I personally think it's harder to make the case that giving scaling labs better alignment talent is as neglected– it's still quite important, but scaling labs are extremely popular & I think their ability to hire (and pay for) top technical talent is much stronger than that of governments.

"Anecdotally, scholars seemed generally in favor of careers at an AISI or evals org, but would prefer to continue pursuing their current research agenda"

Again, I think my primary response here is something like: the research interests of the MATS cohort are a function of the program and its selection process, not an immutable characteristic of the world. The ERA example is a "strong" example of prioritizing people with other interests, but I imagine there are plenty of "weaker" things MATS could be doing to select/prioritize fellows who have an interest in governance & technical governance. (Or put differently, my guess is that there are ways in which the current selection process and mentor pool disproportionately attract/favor those who are interested in the kinds of topics you mentioned.)

If I could wave a magic wand, I would probably have MATS add many more governance & technical governance mentors and shift to something closer to ERA's breakdown. This would admittedly be a rather big shift for MATS, and perhaps current employees/leaders/funders wouldn't want to do it. I think it ought to be seriously considered, though, and if I were a MATS exec person or a MATS funder I would probably be pushing for this. Or at least asking some serious questions along the lines of "do we really feel like the most impactful thing a training program could be doing right now is serving as an upskilling program for the scaling labs?" (With all due respect to the importance of getting great people to the scaling labs, acknowledging the importance of technical research at scaling labs, agreeing with some of Neel's points etc.)

Ryan Kidd (2y)

It seems plausible to me that at least some MATS scholars are somewhat motivated by a desire to work at scaling labs for money, status, etc. However, the value alignment of scholars towards principally reducing AI risk seems generally very high. In Winter 2023-24, our most empirical research dominated cohort, mentors rated the median scholar's value alignment at 8/10 and 85% of scholars were rated 6/10 or above, where 5/10 was “Motivated in part, but would potentially switch focus entirely if it became too personally inconvenient.” To me this is a very encouraging statistic, but I’m sympathetic to concerns that well-intentioned young researchers who join scaling labs might experience value drift, or find it difficult to promote safety culture internally or sound the alarm if necessary; we are consequently planning a “lab safety culture” workshop in Summer. Notably, only 3.7% of surveyed MATS alumni say they are working on AI capabilities; in one case, an alumnus joined a scaling lab capabilities team and transferred to working on safety projects as soon as they were able. As with all things, maximizing our impact is about striking the right balance between trust and caution, and I’m encouraged by the high apparent value alignment of our alumni and scholars.

We additionally believe:

Advancing researchers to get hired at lab safety teams is generally good;
We would prefer that the people on lab safety teams have more research experience and are more value-aligned, all else equal, and we think MATS improves scholars on these dimensions;
We would prefer lab safety teams to be larger, and it seems likely that MATS helps create a stronger applicant pool for these jobs, resulting in more hires overall;
MATS creates a pipeline for senior researchers on safety teams to hire people they have worked with for up to 6.5 months in-program, observing their competency and value alignment;
Even if MATS alumni defect to work on pure capabilities, we would still prefer them to be more value-aligned than otherwise (though of course this has to be weighed against the boost MATS gave to their research abilities).

Regarding “AI control,” I suspect you might be underestimating the support that this metastrategy has garnered in the technical AI safety community, particularly among prosaic AGI safety thought leaders. I see Paul’s decision to leave ARC in favor of the US AISI as a potential endorsement of the AI control paradigm over intent alignment, rather than necessarily an endorsement of an immediate AI pause (I would update against this if he pushes more for a pause than for evals and regulations). I do not support AI control to the exclusion of other metastrategies (including intent alignment and Pause AI), but I consider it a vital and growing component of my strategy portfolio.

It’s true that many AI safety projects are pivoting towards AI governance. I think the establishment of AISIs is wonderful; I am in contact with MATS alumni Alan Cooney and Max Kauffman at the UK AISI and similarly want to help the US AISI with hiring. I would have been excited for Vivek Hebbar’s, Jeremy Gillen’s, Peter Barnett’s, James Lucassen’s, and Thomas Kwa’s research in empirical agent foundations to continue at MIRI, but I am also excited about the new technical governance focus that MATS alumni Lisa Thiergart and Peter Barnett are exploring. I additionally have supported AI safety org accelerator Catalyze Impact as an advisor and Manifund Regrantor and advised several MATS alumni founding AI safety projects; it's not easy to attract or train good founders!

MATS has been interested in supporting more AI governance research since Winter 2022-23, when we supported Richard Ngo and Daniel Kokotajlo (although both declined to accept scholars past the training program) and offered support to several more AI gov researchers. In Summer 2023, we reached out to seven handpicked governance/strategy mentors (some of whom you recommended, Akash), though only one was interested in mentoring. In Winter 2023-24 we tried again, with little success. In preparation for the upcoming Summer 2024 and Winter 2024-25 Programs, we reached out to 25 AI gov/policy/natsec researchers (who we asked to also share with their networks) and received expressions of interest from 7 further AI gov researchers. As you can see from our website, MATS is supporting four AI gov mentors in Summer 2024 (six if you count Matija Franklin and Philip Moreira Tomei, who are primarily working on value alignment). We’ve additionally reached out to RAND, IAPS, and others to provide general support. MATS is considering a larger pivot, but available mentors are clearly a limiting constraint. Please contact me if you’re an AI gov researcher and want to mentor!

Part of the reason that AI gov mentors are harder to find is that programs like the RAND TASP, GovAI, IAPS, Horizon, ERA, etc. fellowships seem to be doing a great job collectively of leveraging the available talent. It’s also possible that AI gov researchers are discouraged from mentoring at MATS because of our obvious associations with AI alignment (it’s in the name) and the Berkeley longtermist/rationalist scene (we’re talking on LessWrong and operate in Berkeley). We are currently considering ways to support AI gov researchers who don’t want to affiliate with the alignment, x-risk, longtermist, or rationalist communities.

I’ll additionally note that MATS has historically supported much research that indirectly contributes to AI gov/policy, such as Owain Evans’, Beth Barnes’, and Francis Rhys Ward’s capabilities evals, Evan Hubinger’s alignment evals, Jeffrey Ladish’s capabilities demos, Jesse Clifton’s and Caspar Oesterheld’s cooperation mechanisms, etc.

habryka (2y)

"In Winter 2023-24, our most empirical research dominated cohort, mentors rated the median scholar's value alignment at 8/10 and 85% of scholars were rated 6/10 or above, where 5/10 was ‘Motivated in part, but would potentially switch focus entirely if it became too personally inconvenient.’"

Wait, aren't many of those mentors themselves working at scaling labs or working very closely with them? So this doesn't feel like a very comforting response to the concern of "I am worried these people want to work at scaling labs because it's a high-prestige and career-advancing thing to do", if the people whose judgements you are using to evaluate have themselves chosen the exact path that I am concerned about.

Ryan Kidd (2y)

Of the scholars ranked 5/10 and lower on value alignment, 63% worked with a mentor at a scaling lab, compared with 27% of the scholars ranked 6/10 and higher. The average scaling lab mentor rated their scholars' value alignment at 7.3/10 and rated 78% of their scholars at 6/10 and higher, compared to 8.0/10 and 90% for the average non-scaling lab mentor. This indicates that our scaling lab mentors were more discerning of value alignment on average than non-scaling lab mentors, or had a higher base rate of low-value alignment scholars (probably both).

I also want to push back a bit against an implicit framing of the average scaling lab safety researcher we support as being relatively unconcerned about value alignment or the positive impact of their research; this seems manifestly false from my conversations with mentors, their scholars, and the broader community.

habryka (2y)

"implicit framing of the average scaling lab safety researcher we support as being relatively unconcerned about value alignment or the positive impact of their research"

Huh, not sure where you are picking this up. I am of course very concerned about whether researchers at scaling labs are capable of evaluating their positive impact with respect to their choice of working at a scaling lab (their job does, after all, depend on them not believing that it is harmful), but of course they are not unconcerned about their positive impact.

habryka (2y)

"This indicates that our scaling lab mentors were more discerning of value alignment on average than non-scaling lab mentors, or had a higher base rate of low-value alignment scholars (probably both)."

The second hypothesis here seems much more likely (and my guess is your mentors would agree). My guess is after properly controlling for that you would find a mild to moderate negative correlation here.

But also, more importantly, the set of scholars from which MATS is drawing is heavily skewed towards the kind of person who would work at scaling labs (especially since funding has been heavily skewing towards funding the kind of research that can occur at scaling labs).

Orpheus16 (2y)

Thanks for this (very thorough) answer. I'm especially excited to see that you've reached out to 25 AI gov researchers & already have four governance mentors for summer 2024. (Minor: I think the post mentioned that you plan to have at least 2, but it seems like there are already 4 confirmed and you're open to more; apologies if I misread something though.)

A few quick responses to other stuff:

I appreciate a lot of the other content presented. It feels to me like a lot of it is addressing the claim "it is net positive for MATS to upskill people who end up working at scaling labs", whereas I think the claims I made were a bit different. (Specifically, I think I was going for more "Do you think this is the best thing for MATS to be focusing on, relative to governance/policy" and "Do you think there are some cultural things that ought to be examined to figure out why scaling labs are so much more attractive than options that at-least-to-me seem more impactful in expectation".)
RE AI control, I don't think I'm necessarily underestimating its popularity as a metastrategy. I'm broadly aware that a large fraction of the Bay Area technical folks are excited about control. However, I think when characterizing the AI safety community as a whole (not just technical people), the shift toward governance/policy macrostrategies is (much) stronger than the shift toward the control macrostrategy. (Separately, I think I'm more excited about foundational work in AI control that looks more like the kind of thing that Buck/Ryan have written about, which is separate from typical prosaic work (e.g., interpretability), even though lots of typical prosaic work could be argued to be connected to the control macrostrategy.)
+1 that AI governance mentors might be harder to find for some of the reasons you listed.

Ryan Kidd (2y)

"Do you think there are some cultural things that ought to be examined to figure out why scaling labs are so much more attractive than options that at-least-to-me seem more impactful in expectation?"

As a naive guess, I would consider the main reasons to be:

People seeking jobs in AI safety often want to take on "heroic responsibility." Work on evals and policy, while essential, might be seen as "passing the buck" onto others, often at scaling labs, who have to "solve the wicked problem of AI alignment/control" (quotes indicate my caricature of a hypothetical person). Anecdotally, I've often heard people in-community disparage AI safety strategies that primarily "buy time" without "substantially increasing the odds AGI is aligned." Programs like MATS emphasizing the importance of AI governance and including AI strategy workshops might help shift this mindset, if it exists.
Roles in AI gov/policy, while impactful at reducing AI risk, likely have worse quality-of-life features (e.g., wages, benefits, work culture) than similarly impactful roles in scaling labs. People seeking jobs in AI safety might choose between two high-impact roles based on these salient features without considering how many others making the same decisions will affect the talent flow en masse. Programs like MATS might contribute to this problem, but only if the labs keep hiring talent (unlikely given poor returns on scale) and the AI gov/policy orgs don't make attractive offers (unlikely given that METR and Apollo offer pretty good wages, high status, and work cultures comparable to labs; AISIs might be limited because government roles don't typically pay well, but it seems there are substantial status benefits to working there).
AI risk might be particularly appealing as a cause area to people who are dispositionally and experientially suited to technical work and scaling labs might be the most impactful place to do many varieties of technical work. Programs like MATS are definitely not a detriment here, as they mostly attract individuals who were already going to work in technical careers, expose them to governance-adjacent research like evals, and recommend potential careers in AI gov/policy.

Ryan Kidd (2y)

Cheers, Akash! Yep, our confirmed mentor list updated in the days after publishing this retrospective. Our website remains the best up-to-date source for our Summer/Winter plans.

"Do you think this is the best thing for MATS to be focusing on, relative to governance/policy?"

MATS is not currently bottlenecked on funding for our current Summer plans and hopefully won't be for Winter either. If further interested high-impact AI gov mentors appear in the next month or two (and some already seem to be appearing), we will boost this component of our Winter research portfolio. If ERA disappeared tomorrow, we would do our best to support many of their AI gov mentors. In my opinion, MATS is currently not sacrificing opportunities to significantly benefit AI governance and policy; rather, we are rate-limited by factors outside of our control and are taking substantial steps to circumvent these, including:

Substantial outreach to potential AI gov mentors;
Pursuing institutional partnerships with key AI gov/policy orgs;
Offering institutional support and advice to other training programs;
Considering alternative program forms less associated with rationality/longtermism;
Connecting scholars and alumni with recommended opportunities in AI gov/policy;
Regularly recommending scholars and alumni to AI gov/policy org hiring managers.

We appreciate further advice to this end!

"Do you think there are some cultural things that ought to be examined to figure out why scaling labs are so much more attractive than options that at-least-to-me seem more impactful in expectation?"

I think this is a good question, but it might be misleading in isolation. I would additionally ask:

"How many people are the AISIs, METR, and Apollo currently hiring and are they mainly for technical or policy roles? Do we expect this to change?"
"Are the available job opportunities for AI gov researchers and junior policy staffers sufficient to justify pursuing this as a primary career pathway if one is already experienced at ML and particularly well-suited (e.g., dispositionally) for empirical research?"
"Is there a large demand for AI gov researchers with technical experience in AI safety and familiarity with AI threat models, or will most roles go to experienced policy researchers, including those transitioning from other fields? If the former, where should researchers gain technical experience? If the latter, should we be pushing junior AI gov training programs or retraining bootcamps/workshops for experienced professionals?"
"Are existing talent pipelines into AI gov/policy meeting the needs of established research organizations and think tanks (e.g., RAND, GovAI, TFS, IAPS, IFP, etc.)? If not, where can programs like MATS/ERA/etc. best add value?"
"Is there a demand for more organizations like CAIP? If so, what experience do the founders require?"

Austin Chen (2y)

"Starting new technical AI safety orgs/projects seems quite difficult in the current funding ecosystem. I know of many alumni who have founded or are trying to found projects who express substantial difficulties with securing sufficient funding."

Interesting - what's like the minimum funding ask to get a new org off the ground? I think something like $300k would be enough to cover ~9 mo of salary and compute for a team of ~3, and that seems quite reasonable to raise in this current ecosystem for pre-seeding an org.
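
As a rough check on what that estimate implies per person, here is a minimal sketch that only divides the approximate figures quoted above ($300k, ~3 people, ~9 months). The variable names are illustrative, and the split between salary and compute is not specified in the comment, so the result is a combined figure.

```python
# Rough per-person budget implied by the figures in the comment above.
# All inputs are the comment's approximate numbers, not actual budget data.
total_funding_usd = 300_000   # proposed pre-seed raise
team_size = 3                 # "~3" people
runway_months = 9             # "~9 mo" of runway

person_months = team_size * runway_months
per_person_month = total_funding_usd / person_months
per_person_year = per_person_month * 12

print(f"{person_months} person-months of runway")
print(f"${per_person_month:,.0f} per person-month (salary plus compute)")
print(f"${per_person_year:,.0f} per person, annualized")
```

Under these assumptions the budget works out to roughly $11k per person-month, covering both salary and compute.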

Ryan Kidd (2y)

Yeah, that amount seems reasonable, if on the low side, for founding a small org. What makes you think $300k is reasonably easy to raise in this current ecosystem? Also, I'll note that larger orgs need significantly more.

Neel Nanda (2y)

(EDIT: I just saw Ryan posted a comment a few minutes before mine, I agree substantially with it)

As a Google DeepMind employee I'm obviously pretty biased, but this seems pretty reasonable to me, assuming it's about alignment/similar teams at those labs? (If it's about capabilities teams, I agree that's bad!)

I think the alignment teams generally do good and useful work, especially those in a position to publish on it. And it seems extremely important that whoever makes AGI has a world-class alignment team! And some kinds of alignment research can only really be done with direct access to frontier models. MATS scholars tend to be pretty early in their alignment research career, and I also expect frontier lab alignment teams are a better place to learn technical skills especially engineering, and generally have a higher talent density there.

UK AISI/US AISI/METR seem like solid options for evals, but basically just work on evals, and Ryan says down thread that only 18% of scholars work on evals/demos. And I think it's valuable both for frontier labs to have good evals teams and for there to be good external evaluators (especially in government), I can see good arguments favouring either option.

44% of scholars did interpretability, where in my opinion the Anthropic team is clearly a fantastic option, and I like to think DeepMind is also a decent option, as is OpenAI. Apollo and various academic labs are the main other places you can do mech interp. So those career preferences seem pretty reasonable to me there for interp scholars.

17% are on oversight/control, and for oversight I think you generally want a lot of compute and access to frontier models? I am less sure for control, and think Redwood is doing good work there, but as far as I'm aware they're not hiring.

This is all assuming that scholars want to keep working in the same field they did MATS for, which in my experience is often but not always true.

I'm personally quite skeptical of inexperienced researchers trying to start new orgs - starting a new org and having it succeed is really, really hard, and much easier with more experience! So people preferring to get jobs seems great by my lights.

Orpheus16 (2y)

Thanks, Neel! I responded in greater detail to Ryan's comment but just wanted to note here that I appreciate yours as well & agree with a lot of it.

My main response to this is something like "Given that MATS selects the mentors and selects the fellows, MATS has a lot of influence over what the fellows are interested in. My guess is that MATS' current mentor pool & selection process overweights interpretability and underweights governance + technical governance, relative to what I think would be ideal."

Neel Nanda (2y)

I see this is strongly disagree voted - I don't mind, but I'd be curious for people to reply with which parts they disagree with! (Or at least disagree react to specific lines). I make a lot of claims in that comment, though I personally think they're all pretty reasonable. The one about not wanting inexperienced researchers to start orgs, or "alignment teams at scaling labs are good actually" might be spiciest?

Orpheus16 (2y)

Thank you for explaining the shift from scholar support to research management— I found that quite interesting and I don’t think I would’ve intuitively assumed that the research management frame would be more helpful.

I do wonder if, as the summer progresses, the role of the RM should shift from writing the reports for mentors to helping the fellows prepare their own reports for mentors. IMO, fellows getting into the habit of providing these updates & learning how to “manage up” when it comes to mentors seems important. I suspect something in the cluster of “being able to communicate well with mentors/manage your mentor+collaborator relationships” is one of the most important “soft skills” for research success. I suspect a transition from “tell your RM things that they include in their report” to “work with your RM to write your own report” would help instill this skill.

LauraVaughan (2y)

This is a good question. I agree that “managing up” is a very important skill in general! It’s one of the primary focuses of our research manager training.

However, I want to acknowledge that whether to focus on this with scholars seems to be a question of tradeoffs regarding MATS’s priorities: to what extent are we prioritizing scholars upskilling in deep technical/research understanding available at MATS, versus them upskilling in generalizable soft skills (that they could theoretically learn elsewhere)? If we were to theoretically prioritize solely the former, maximizing time and efficiency between scholars and mentors through taking this burden off of them seems better, as this allows us time to improve research skills + time spent on their projects. If we were to prioritize soft skills, though, focusing on them managing up well seems like a good option. (And FWIW, we already do the latter indirectly - but not as a structured offering. We focus much more on things like project management + unblocking scholars with our remaining time.)

To me MATS has primarily been about providing the best environment we can for AI safety mentorship, and increasing the amount of AI safety talent+collabs in the world. I can see an argument here that teaching scholars to manage up does in fact benefit their trajectories holistically, but I would want to balance this against the tradeoff of marginal time spent helping them directly in the counterfactual, be it project management or otherwise preparing for meeting with their mentor. During the main, 10-week phase of MATS, scholars are incredibly time crunched to get a research project done. This pushes me slightly against the idea of spending much concentrated effort on this during the main phase, but not necessarily against some amount of time on this.

That all being said, this seems like a potentially good fit for:

a workshop towards the late-middle-or-end of the main 10-week phase, or
sometime over the 4-month extension phase, where scholars continue working with their mentors in an increasingly independent fashion.

…and maybe some time spent on this towards the end of the 10-week phase in 1-1s, but I’d want to allow wiggle room for prioritizing more critical work as needed.

Henry Sleight (2y)

"the research management frame would be more helpful."

I think btw it gets more value than scholar support because it's a proactive service we offer to all scholars on a given stream, rather than waiting for them to only come to us when there's a problem.

"the role of the RM should shift from writing the reports for mentors to helping the fellows prepare their own reports for mentors."

I spend a fair amount of time on my projects helping people prep for meetings with their supervisors, yeah. I also used to have scholars edit my written reports before sending to Ethan.

OliverHH (2y)

I'm noticing there are still many interp mentors for the current round of MATS -- was the "fewer mech interp mentors" change implemented for this cohort, or will that start in Winter or later?

Ryan Kidd (2y)

Last program, 44% of scholar research was on interpretability, 18% on evals/demos, 17% on oversight/control, etc. In summer, we intend for 35% of scholar research to be on interpretability, 17% on evals/demos, 27% on oversight/control, etc., based on our available mentor pool and research priorities. Interpretability will still be the largest research track and still has the greatest interest from potential mentors and applicants. The plot below shows the research interests of 1331 MATS applicants and 54 potential mentors who have applied for our Summer 2024 or Winter 2024-25 Programs.
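
To make the planned shift easier to read, here is a small sketch that subtracts the Winter 2023-24 shares from the intended Summer shares for the three tracks named above. The dictionary names are illustrative, and the tracks covered by "etc." are omitted because their percentages are not given here.

```python
# Share of scholar research by track, in percent, as quoted in the comment above.
# Tracks summarized as "etc." in the comment are omitted: their values are not given.
winter_2023_24 = {"interpretability": 44, "evals/demos": 18, "oversight/control": 17}
summer_2024_plan = {"interpretability": 35, "evals/demos": 17, "oversight/control": 27}

# Change in percentage points from the last program to the intended Summer mix.
shift = {track: summer_2024_plan[track] - winter_2023_24[track] for track in winter_2023_24}

for track, delta in shift.items():
    print(f"{track}: {delta:+d} percentage points")
```

On these figures, interpretability falls by 9 points and oversight/control rises by 10, while evals/demos is roughly unchanged.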

Neel Nanda (2y)

Note that the number of scholars is a much more important metric than the number of mentors when it comes to evaluating MATS resources, as scholars per mentor varies a bunch (e.g., over winter I had 10 scholars, which is much more than most mentors). Harder to evaluate from the outside though!

Erik Jenner (2y)

I don't know the answer to your actual question, but I'll note there are slightly fewer mech interp mentors than mentors listed in the "AI interpretability" area (though all of them are at least doing "model internals"). I'd say Stephen Casper and I aren't focused on interpretability in any narrow sense, and Nandi Schoots' projects also sound closer to science of deep learning than mech interp. Assuming we count everyone else, that leaves 11 out of 39 mentors, which is slightly less than ~8 out of 23 from the previous cohort (though maybe not by much).
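
For a quick sense of scale, the two mentor fractions quoted above can be converted to percentages. This is only a sketch using the approximate counts from the comment (11 of 39 and ~8 of 23); the helper name `share` is illustrative.

```python
def share(part: int, total: int) -> float:
    """Return part/total as a fraction between 0 and 1."""
    return part / total

# Approximate counts quoted in the comment above.
current = share(11, 39)    # mech interp mentors in the current cohort
previous = share(8, 23)    # "~8 out of 23" in the previous cohort

difference_pp = (previous - current) * 100  # gap in percentage points

print(f"current cohort:  {current:.1%}")
print(f"previous cohort: {previous:.1%}")
print(f"difference:      {difference_pp:.1f} percentage points")
```

With these counts the share drops from roughly 35% to roughly 28%, though the previous-cohort count is itself approximate.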

Sheikh Abdur Raheem Ali (2y)

I love this report! Shed a tear at not seeing Microsoft on the organization interest chart though 🥲. We could be a better Bing T_T.

Ryan Kidd (2y)

Oh, I think we forgot to ask scholars if they wanted Microsoft at the career fair. Is Microsoft hiring AI safety researchers?

Sheikh Abdur Raheem Ali (2y)

Yes, here’s an open position: Research Scientist - Responsible & OpenAI Research. Of course, responsible AI differs from interpretability, activation engineering, or formal methods (e.g., safeguarded AI, singular learning theory, agent foundations). I’ll admit we are doing less of that than I’d prefer, partially because OpenAI shares some of its ‘secret safety sauce’ with us, though not all, and not immediately.

Note from our annual report that we are employing 1% fewer people than this time last year, so headcount is a very scarce resource. However, the news reported we invested ~£2.5b in setting up a new AI hub in London under Jordan Hoffman, with 600 new seats allocated to it (officially, I can neither confirm nor deny these numbers).

I’m visiting there this June after EAG London. We’re the only member of the Frontier Model Forum without an alignment team. MATS scholars would be excellent hires for such a team, should one be established. Some time ago, a few colleagues helped me draft a white paper to internally gather momentum and suggest to leadership that starting one there might be beneficial. Unfortunately, I am not permitted to discuss the responses or any future plans regarding this matter.

Ryan Kidd (2y)

This is potentially exciting news! You should definitely visit the LISA office, where many MATS extension program scholars are currently located.

Sheikh Abdur Raheem Ali (2y)

I’m a LISA member already!
