hi! i'm tammy :3
i research the QACI plan for formal-goal AI alignment at orthogonal.
check out my blog and my twitter.
even the very vague general notion that the government is regulating at all could maybe help make investment in AI more risky, which is a good thing.
the main risk i'm worried about is that it brings more attention to AI and causes more people to think of clever AI engineering tricks.
one solution to this problem is to simply never use that capability (running expensive computations) at all, or to not use it before the iterated counterfactual researchers have developed proofs that any expensive computation they run is safe, or before they have very slowly and carefully built dath-ilan-style corrigible aligned AGI.
yes, the eventual outcome is hard to predict. but my plan looks like the kind of plan that would fail in X-risky rather than S-risky ways, when it fails.
i don't use the Thing-line nomenclature very much anymore and i only use U/X/S.
i am concerned about the other paths as well but i'm hopeful we can figure them out within the QACI counterfactuals.
i'll say that while i'm absolutely horrified at the possibility of S-risks, i think they're somewhat small, and that the work i'm doing now (fairly S-risk-resistant alignment) is pretty convergent to both S-risk and X-risk reduction.
in particular, an aligned AI trades away part of its lightcone to get baby-eating aliens to eat their babies less, and in general a properly aligned AI will try its hardest to ensure that what we care about (including reducing suffering) is satisfied, so alignment is convergent to both.
but some wonkier approaches could be pretty scary.
(note that another reason i don't think about S-risks too much is that i don't think my mental health could handle worrying about them a lot, and i need all the mental health i can get to solve alignment.)
i think yudkowsky is trying to convey the fact that reality is the line on the right, not the line on the left:
see also my favorite part from AGI Ruin:
Trolley problems are not an interesting subproblem in all of this; if there are any survivors, you solved alignment. At this point, I no longer care how it works, I don't care how you got there, I am cause-agnostic about whatever methodology you used, all I am looking at is prospective results, all I want is that we have justifiable cause to believe of a pivotally useful AGI 'this will not kill literally everyone'. Anybody telling you I'm asking for stricter 'alignment' than this has failed at reading comprehension. The big ask from AGI alignment, the basic challenge I am saying is too difficult, is to obtain by any strategy whatsoever a significant chance of there being any survivors.
he sees some people say "oh no what if AI misalignment causes some people to die but not others" (typically "what if some group in control of the AI survives but everyone else dies or becomes subservient") and he's trying to get across the information that unaligned AI isn't selective: it does kill actually literally everyone. if you get even just a few survivors, you have already solved almost all of alignment and you're not far from being able to save actually everyone. (i don't remember where he said that, but he definitely did say somewhere something along the lines of "if you did get a few survivors, you solved so much of the problem that your solution can be modified to get everyone to survive".)
what do you mean by intensive property, and why do you think i don't want that?