Regarding a Loss of Control scenario, I think that one of the biggest potential factors for one that I think is very generalisable is the idea that proxies are not necessarily good measures of whatever they actually seek to represent. It becomes an epistemological issue, as unless we radically overhaul our current paradigm and methodologies for AI design and deployment (a mathematically provable “safety by design” approach), there is always going to be some such possibility that whatever proxies for alignment we implement may not actually reflect whatever the model is “thinking.” Of course, there remains the sentiment that “if only we could just *understand* our models, then all of our safety issues would be solved!” but perhaps in that case our *proxies for understanding* are somehow invalid. Although these proxies of course remain useful, large-scale deployment of any superhuman AI would still carry such a risk of misaligned behaviour. Related to this is power-seeking behaviour, which I think is particularly interesting as it almost becomes intuitive or at least sounds like an inevitability when you rationalise it as being a part of instrumental goals that underpin any higher objective, *no matter* what that higher objective may be — obviously, models must prevent their own death or alteration because then this is inarguably a detriment to achieving the higher objective (as can be seen with, for example, game-playing AIs that simply opt to pause the game indefinitely rather than lose). These two factors are again both among the most generalisable, as they remain true of practically all in-use AIs and AI evaluation frameworks. There is an argument to be made based on this evidence that if autonomous AGI or ASI were to ever be constructed, the collapse of society and maybe even the death of all humanity would become *inevitabilities* rather than just possible or even likely outcomes. One last factor I think is interesting to note would be the human reliance on and complacency towards AI, causing critical thinking skills to atrophy at an individual level (a là grade-school students who GPT every single essay) as well as disincentivising the pursuit of higher education on a societal level (e.g. medical schools begin to shut down or raise cost of attendance as AI doctors take over, meaning that even people who still want to become doctors have less access to the required qualifications). In this sense humanity is delivered to an almost poetic justice, as the paradox of systemic change (seeming so impossible to every individual actor that nobody pursues it, thus rendering it *genuinely* impossible) and our willingness to be carried along by the inertia of our own economic and social dynamics rather than fight observable negative change is our downfall.
Regarding programs to scale AI safety, I can speak from personal experience in saying that I was one of many Computer Science majors who applied to our AI Safety club simply because it has “AI” in the name. I’m now in SPAR, have a future internship from EAG, and help run our fellowships, thanks mostly to enlightening conversations with high-context upperclassmen who gave me necessary foresight, support, and encouragement. Given my own success story, I propose an “Office Hours” matching network, whereby high school and university students (especially those with little exposure to AI safety) can receive 1:1 mentorship calls with industry professionals, student organisers, and other high-context individuals. Even a short chat from such people could carry immense weight, helping students both find opportunities and the confidence to pursue them. Many clubs and alumni programs at my school already offer such 1:1 chats with industry professionals; I’ve found them useful as a participant, but consistently can’t go deep on career advice with them because “I’m not in research”; “I’m not in AI/ML”; or “I don’t know anything about AI safety”. I’m lucky to have access to experienced peers to fill this gap, but most students might not have that chance. Scaling the reach of high-context people, from maybe only mentoring 2-3 people deeply to having meaningful 30-minute conversations with 20+ students/year, could multiply otherwise scarce attention. Much like my school club did for me, it could pipeline students otherwise on the periphery of the field into campus/local groups, advanced programs, and a supportive community in which to collectively thrive.
Regarding fundamental institutional rights impacting AI regulation, I believe that the biggest issues a new fundamental right should be aimed at mitigating would be vagueness in legislation and accountability during arbitrage as well as what I term the “manufactured consent” issue. Firstly, one of the primary functions of an explicitly articulated fundamental right would be to define a set of legal first-principles as scaffolding upon which future generations of legislation and legal precedent may be built. Existing normative frameworks do prove insufficient here in the sense that consumer protections and other laws which seek to institutionalise responsibility for damages are short-sighted and would fail to concretely capture the consequences of a technology so increasingly ubiquitous as AI. With a framework to point to when or even before damages are incurred, perpetrators (hackers, engineers, or corporate upper management) are given a crime to be prosecuted for, rather than a hodgepodge post-hoc profile of the consequences of their actions. Further, as AI becomes increasingly integrated in and influential over people’s actual decisions and natural world models, concerns of manipulation of worldview become salient; like with legacy and social media, companies may easily manufacture consent or other opinions among user populations without necessarily explicitly violating “damages-oriented” laws. But by enshrining truth and loyalty to principles as a fundamental right, this destructive phenomenon of epistemic disempowerment becomes outright illegal and prosecutable.
Regarding OpenAI's most recent Prepardness Framework, I think it might fail for these reasons: