It seems that the alignment problem is unlikely to be solved before the emergence of the first misaligned AGI.
This means humanity has made a deadly mistake.
But what is the nature of the mistake, and when did it happen?
From my understanding of the history of the field, I draw the following conclusions.
Firstly, there wasn’t a single event, a single mistake, that created the current perilous situation. Several trends contributed to it, each of them seemingly inevitable:
- Researchers are fascinated with the idea of AGI (and want to brag about their research)
- Gamers love realistic 3D graphics, driving progress in GPUs
- The massively parallel hardware built for 3D graphics is easily repurposed for artificial neural networks, due to the similar nature of the computation
- The Internet generates massive amounts of training data for AI, and makes AI profitable.
- As it turned out, AGI is surprisingly easy to create. Feed enough data and compute to a relatively simple artificial neural network, and it becomes smart enough and general enough to beat humans on many cognitive tasks. And it’s trivial to make it smarter: just give it more data and compute.
This means there is no realistic way to significantly slow down AGI progress by fixing some single big mistake, or by slowing down one of the trends.
Secondly, some people and companies have had a disproportionately big impact on accelerating the progress towards AGI. Among them:
- Hardware: Nvidia, Google...
- Algorithms: Schmidhuber, Hutter, Hochreiter, LeCun, Krizhevsky…
- Funding and management: Page, Brin, Musk, Hassabis, Altman…
- Cultural influence: Vinge, M. Banks, M. More, Kurzweil…
While AGI was inevitable with or without them, the collective action of these guys has probably shortened the timelines by years, if not by decades, without contributing much to alignment research. And now we're running out of time. As much as I respect many of them, they deserve some blame.
Thirdly, progress towards an alignment solution has mostly been driven by a few random nerds of the “herding cats” variety, with some assistance from a few random philanthropists. There is no virtuous cycle where alignment research yields gold nuggets that you can invest in hiring more alignment researchers. There is no "Google-Inc-for-Alignment". The alignment research community is likely to remain a "bunch of random nerds", even if some of the aforementioned guys decide to flood the field with big money.
So, unless the alignment solution is much easier than creating any kind of AGI (which is very unlikely), the first recursively-self-improving AGI will be misaligned. That's the reality of our situation.
Thus, it is not enough to focus on alignment research. We must prepare ourselves for the world inhabited by misaligned AGIs.
The approaching storm
It's fair to say that AGI has already been created. ChatGPT and PaLM-E are both sufficiently general to deserve the label.
It's also fair to say that the year 2022 was the first year of the Technological Singularity (in the Vingean meaning of the term: beyond the Singularity, most of our long-term predictions will fail). Way too many AI breakthroughs have happened in a single year, some of which are already affecting the lives of millions of people:
BIG-bench, Chinchilla AI, Perceiver AR, PaLM, GATO, GNAL, AlphaCode, OPT, Stable Diffusion, GitHub Copilot, Code as Policies, DALL-E 2, Midjourney, ChatGPT, the first AI-designed GPUs...
The AGIs are not yet capable of fast recursive self-improvement, and the Technological Singularity has only just begun to accelerate.
We must invest some deep thought into preparing to survive the approaching storm.
We don't yet have an airtight solution, but enough approaches have been explored that could push back the ETA of doom. Perhaps once we have a proto-AGI to test things on, we can refine them enough to buy a few more years, and then a few more, and so on.
Also, people did not take AI risks seriously when AI was not in the spotlight. Now interest in AI safety is increasing rapidly. Sadly, so is interest in AI capabilities.