Copying over a comment from the EA forum (and my response) because it speaks to something that was in some earlier drafts, that I expect to come up, and that is worth just going ahead and addressing imo.
IMO it would help to see a concrete list of MIRI's outputs and budget for the last several years. My understanding is that MIRI has intentionally withheld most of its work from the public eye for fear of infohazards, which might be reasonable for soliciting funding from large private donors but seems like a poor strategy for raising substantial public money, both prudentially and epistemically.
If there are particular projects you think are too dangerous to describe, it would still help to give a sense of what the others were, a cost breakdown for those, and anything you can say about the more dangerous ones (e.g. number of work hours that went into them, what class of project they were, whether they're still live, any downstream effect you can point to, and so on).
My response:
(Speaking in my capacity as someone who currently works for MIRI)
I think the degree to which we withheld work from the public for fear of accelerating progress toward ASI might be a little overstated in the above. We adopted a stance of closed-by-default research years ago for that reason, but that's not why, e.g., we don't publish concrete and exhaustive lists of outputs and budget.
We do publish some lists of some outputs, and in some years we do publish some degree of budgetary breakdown.
But mainly, we think of ourselves as asking for money from only one of the two kinds of donors. MIRI feels that it's pretty important to maintain strategic and tactical flexibility: to be able to do a bunch of different weird things that we think each have a small chance of working out, without exhaustive justification (or post-hoc litigation) of each one, and to avoid the trap of focusing only on clearly legible, short chains of this → that (as opposed to trying both legible and less-legible things).
(A colleague of mine once joked that "wages are for people who can demonstrate the value of their labor within a single hour; I can't do that, which is why I'm on a salary." A similar principle applies here.)
In the past, funding MIRI led to outputs like our alignment research publications and the 2020/2021 research push (that didn't pan out). In the more recent past, funding MIRI has led to outputs like the work of our technical governance team, and the book (and its associated launch campaign and various public impacts).
That's enough for some donors—"If I fund these people, my money will go into various experiments that are all aimed at ameliorating existential risk from ASI, with a lean toward the sorts of things that no one else is trying, which means high variance and lots of stuff that doesn't pan out and the occasional home run."
Other donors are looking to more clearly purchase a specific known product, and those donors should rightly send fewer of their dollars to MIRI, because MIRI has never been and does not intend to ever be quite so clear and concrete and locked-in.
(One might ask "okay, well, why post on the EA forum, which is overwhelmingly populated by the other kind of donor, who wants to track the measurable effectiveness of their dollars?" and the answer is "mostly for the small number who are interested in MIRI-like efforts anyway, and also for historical reasons since the EA and rationality and AI safety communities share so much history." Definitely we do not feel entitled to anyone's dollars, and the hesitations of any donor who doesn't want to send their money toward MIRI-like efforts are valid.)
Most of the seven Extended Discussions under chapter 4 of the supplemental materials to Nate and Eliezer's book are basically an expansion of this thesis (which I also agree with and think is true).
Example: suppose someone says, "I can imagine an atomic copy of ourselves which isn't conscious; therefore consciousness is non-physical," and I say, "No, I can't imagine that."
Or the followup by Logan Strohl, which bears even more directly on this.
Just noting for the audience that the edits which Anna references in her reply to CronoDAS, as if they had substantively changed the meaning of my original comment, were to add:
It did not originally specify undisclosed conflicts of interest in any way that the new version doesn't. Both versions contained the same core (true) claim: that several of the staff members common to both CFAR!2017 and CFAR!2025 often had various agendas (i.e. not only the AI stuff) which would bump participants' best interests to second, third, or even lower on the priority ladder.
I've also added, just now, a clarifying edit to a higher comment: "Some of these staff members are completely blind to some centrally important axes of care." This seemed important to add, given that Anna, below, is making claims of having seen, modeled, and addressed the problems (a refrain I have heard from her directly, in multiple epochs, and have taken damage from naively trusting more than once). More (abstract, philosophical) detail on my views about this sort of dynamic here.
I claim to be as aware of, and as sensitive to, all of these considerations as you are. I think I am being as specific as possible, given constraints (many of which I wish were not there; I have a preference for speaking more clearly than I can here).
I know of one parent who puts three dollars aside each time they violate the bodily sovereignty of their infant - taking something out of their mouth, or restricting where they can go.
It's me, by the way. Happy to identify myself.
(I have more agreement than disagreement with the authors on many points, here.)
It's going to depend a lot on the social bubble/group of friends in question. It's not outrageous for the social circles I run in, which are pretty liberal/West Coast, but it would be outrageous for some bubbles I consider otherwise fun and fine and healthy.
Mainly it leans into the archetype of games like Truth or Dare, or Hot Seat, which are sort of canonically teenage party games and thus are often trying to loosen those particular strictures.