I want to bring up a point that I almost never hear talked about in AGI discussions, but that to me feels like the only route for humans to have a good future. I'm putting this out for people who already largely share my view on the trajectory of AGI. If you don't agree with the main premises but are interested, there are lots of other posts that go into why these might be true.
A) AGI seems inevitable.
B) It seems impossible that humans (as they are now) won't lose control soon after AGI. None of the arguments for us retaining control seem to grasp that AI isn't just another tool. I haven't seen any that grapple with what it really means for a machine to be intelligent.
C) It seems very hard to get AGI aligned with what humans care about. These systems are just so alien. Maybe we can align it for a little while, but that alignment will be unstable. It is very hard to see how alignment is maintained with a thing that is way smarter than us and is evolving on its own.
D) Even if I’m wrong about B or C, humans are not intelligent/wise enough to deal with our current technology level, much less super powerful AI.
Let's say we manage this incredibly difficult task of aligning or controlling AI to humans’ will. There are many amazing humans but also many many awful ones. The awful ones will continue to do awful things with way more leverage. This scenario seems pretty disastrous to me. We don’t want super powerful humans without an increase in wisdom.
To me the conclusion from A+B+C+D is: There is no good outcome (for us) without humans themselves also becoming super intelligent.
So I believe our goal should be to ensure humans stay in control long enough to augment our minds with extra capability (or to upload, but that seems further off). I'm not sure how this will work, but I feel like the things that Neuralink or science.xyz are doing, developing brain-computer interfaces (BCI), are steps in that direction. We also need to figure out scalable technological ways to work on trauma/psychology/fulfilling needs/reducing fears. Humans will somehow have to connect with machines to become much wiser, much more intelligent, and much more enlightened. Maybe we can become something like the amygdala of the neo-neo-cortex.
There are two important timelines in competition here: the length of time until we can upgrade, and the length of time we can maintain control. We need to upgrade before we lose control. Unfortunately, in my view, on the current trajectory we will lose control before we are able to upgrade. I think we must work to make sure this isn't the case.
Time Till Upgrade:
- My current estimate is ~15 years (very big error bars here)
- Ways to shorten
  - AI that helps people do this science
  - AGI that is good at science and is aligned long enough to help us on this
  - More people doing this kind of research
  - More funding
  - More status to this kind of research
  - Maybe better interfaces to the current models will help in the short run and make people more productive, thus speeding this development
Time Left With Control:
- My current estimate is ~6 years
  - AGI ~3-4 years (less big error bars)
  - Loss of control 2-3 years after AGI (pretty big error bars)
- Ways it could be longer
  - AI research slows down
  - Hope for safety
  - Hope we aren't as close as it seems
  - Hope for a slowness to implement agentic behavior
  - Competing Agents
  - Alignment is pretty good and defense is easier than offense
  - ?
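To make the race between the two timelines concrete, here is a rough sketch of the arithmetic using only the point estimates above. The variable names are mine, and the numbers carry the same big error bars as the estimates themselves.

```python
# Point estimates taken from the two lists above (years from now).
time_till_upgrade = 15          # ~15 years until we can upgrade
agi_arrival = (3, 4)            # AGI in ~3-4 years
control_after_agi = (2, 3)      # loss of control 2-3 years after AGI

# Time left with control = AGI arrival + years of control after AGI.
control_low = agi_arrival[0] + control_after_agi[0]
control_high = agi_arrival[1] + control_after_agi[1]
control_mid = (control_low + control_high) / 2

# Years by which the upgrade arrives too late on these estimates.
gap = time_till_upgrade - control_mid

print(f"control lasts {control_low}-{control_high} years (midpoint {control_mid})")
print(f"gap to close: about {gap} years")
```

Every item under "ways to shorten" reduces `time_till_upgrade`, and every item under "ways it could be longer" raises the control range. The goal is to drive `gap` to zero or below.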
In short, one of the most underrepresented ways to work on AI safety is to work on BCI.
The only way forward is through!
I suspect that humans will turn out to be relatively simple to encode - quite small amounts of low-resolution memory that we draw on, with detailed understanding maps - smaller than LLMs that we're creating. Added to which there is an array of motivation factors that will be quite universal but of varying levels of intensity in different dimensions for each individual.
If that take on things is correct, then it may be possible to emulate a human by training a skeleton AI on constant video streaming etc. over a 10-20 year period (about how long neurons last before replacement), optimizing it to better and better predict the behaviour of the human being modelled. That would eventually arrive at an AI with almost exactly the same beliefs and behaviours as the human being emulated.
Without physically carving up brains and attempting to transcribe synaptic weightings etc., that might prove the most viable means of effective uploading and of creating highly aligned AI with human-like values. And perhaps it would create something closer to being our true children-of-the-mind.
For AGI alignment, it seems like there will at minimum need to be one, or perhaps multiple, blind and independent hierarchies of increasingly smart AIs continually checking that the AIs one level up are maintaining alignment, with active monitoring of their activities. As AIs get smarter, their ability to fool monitoring systems will likely grow as the relative gulf between monitored and monitoring intelligence grows.
I think a wide array of AIs is a bad idea. If there is a non-zero chance that an AI goes 'murder clippy' and ends humans, then that probability accumulates (roughly additively while the per-AI chance is small) - more independent AIs = higher chance of doom.
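As a quick check on how that risk scales: if each AI fails independently with the same probability p, the chance that at least one of n fails is 1 - (1 - p)^n, which is close to n * p only while p is small. A minimal sketch, where the value of p is purely hypothetical and independence is an assumption:

```python
def p_any_failure(p: float, n: int) -> float:
    """Chance that at least one of n independent AIs fails, each with chance p."""
    return 1.0 - (1.0 - p) ** n


p = 0.01  # hypothetical per-AI chance of catastrophe, for illustration only
for n in (1, 10, 100):
    exact = p_any_failure(p, n)
    additive = min(1.0, n * p)  # the simple "just add them up" estimate
    print(f"n={n}: exact={exact:.3f}, additive estimate={additive:.3f}")
```

The two estimates agree for a handful of AIs and drift apart as n grows, but the direction of the argument holds either way: more independent AIs means a higher chance that at least one goes wrong.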
That's the premise of Greg Egan's "Jewel" stories. I think it's wrong. A person who never saw a spider will still get scared when seeing one for the first time, because human...