I want to bring up a point that I almost never hear discussed in AGI conversations, but that to me feels like the only route for humans to have a good future. I’m putting this out for people who already largely share my view on the trajectory of AGI. If you don’t agree with the main premises but are interested, there are lots of other posts that go into why they might be true.
A) AGI seems inevitable.
B) It seems impossible that humans (as they are now) retain control for long after AGI. The arguments for us retaining control don’t seem to appreciate that AI isn’t just another tool; I haven’t seen any that grapple with what it really means for a machine to be intelligent.
C) It seems very hard for AGI to be aligned with what humans care about. These systems are just so alien. Maybe we can align one for a little while, but that alignment will be unstable. It is very hard to see how alignment is maintained with something that is far smarter than us and evolving on its own.
D) Even if I’m wrong about B or C, humans are not intelligent or wise enough to handle our current technology level, much less super-powerful AI.
Suppose we manage the incredibly difficult task of aligning AI or keeping it under human control. There are many amazing humans, but also many, many awful ones. The awful ones will keep doing awful things, now with far more leverage. That scenario seems pretty disastrous to me: we don’t want super-powerful humans without a corresponding increase in wisdom.
To me the conclusion from A+B+C+D is: There is no good outcome (for us) without humans themselves also becoming super intelligent.
So I believe our goal should be to ensure humans stay in control long enough to augment our minds with extra capability (or to upload, but that seems further off). I’m not sure exactly how this will work, but the work that Neuralink and science.xyz are doing on brain-computer interfaces feels like a step in that direction. We also need scalable technological ways to work on trauma, psychology, fulfilling needs, and reducing fears. Humans will somehow have to connect with machines to become much wiser, much more intelligent, and much more enlightened. Maybe we can become something like the amygdala of the neo-neo-cortex.
There are two important timelines in competition here: the time until we can upgrade, and the time we can maintain control. We need to upgrade before we lose control. Unfortunately, on the current trajectory, I expect us to lose control before we are able to upgrade. I think we must work to make sure this isn’t the case.
Time Till Upgrade:
- My current estimate is ~15 years. (very big error bars here)
- Ways to shorten it:
  - AI that helps people do this science
  - AGI that is good at science and stays aligned long enough to help us with it
  - More people doing this kind of research
  - More funding
  - More status for this kind of research
  - Better interfaces to current models might help in the short run by making researchers more productive, speeding this development
Time Left With Control:
- My current estimate is ~6 years
- AGI ~3-4 years (less big error bars)
- Loss of control 2-3 years after AGI (pretty big error bars)
- Ways it could be longer:
  - AI research slows down
  - Hope for safety
  - Hope we aren’t as close as it seems
  - Hope that agentic behavior is slow to be implemented
  - Competing agents
  - Alignment is pretty good and defense is easier than offense
- ?
In short, one of the most underrepresented ways to work on AI safety is to work on BCI.
The only way forward is through!
I strongly agree that we should upgrade in this sense.
I also think a lot of this work might initially be doable with high-end non-invasive BCIs (which are somewhat less risky, and can also be deployed much faster). High-end EEG already appears usable for decoding the images a person is looking at: https://www.biorxiv.org/content/10.1101/787101v3 And the computer can adjust its audio-visual output in real time to aim for particular EEG changes, so fairly tight coupling is possible, which carries with it both opportunities and risks.
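The real-time coupling described above amounts to a feedback loop: measure an EEG feature, compare it to a target, and adjust the audio-visual stimulus to push the feature toward that target. Here is a minimal toy sketch of that loop; `read_eeg_band_power` and `render_stimulus` are hypothetical stand-ins (mocked with a noisy linear response), and the controller is a plain proportional update, not a real BCI pipeline:

```python
# Toy sketch of a non-invasive closed loop: read an EEG feature,
# compare it to a target, and nudge the audio-visual stimulus so the
# feature moves toward the target. All signal functions are mocked.

import random

def read_eeg_band_power(stimulus_intensity: float) -> float:
    """Hypothetical stand-in for an EEG feature extractor.
    Here the measured band power loosely tracks the stimulus, plus noise."""
    return 0.8 * stimulus_intensity + random.gauss(0.0, 0.05)

def render_stimulus(intensity: float) -> None:
    """Hypothetical stand-in for adjusting the audio-visual output."""
    pass  # a real system would change brightness, sound, etc.

def closed_loop(target: float, steps: int = 300, gain: float = 0.3) -> float:
    """Proportional controller: adjust stimulus until the feature ~ target."""
    intensity = 0.0
    for _ in range(steps):
        measured = read_eeg_band_power(intensity)
        error = target - measured
        intensity += gain * error  # proportional update
        render_stimulus(intensity)
    return intensity

if __name__ == "__main__":
    print(closed_loop(target=0.5))
```

Even in this toy form, the loop illustrates the safety point made below: the machine side is actively steering the biological side's measured state, which is exactly why tight closed loops need careful oversight.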
I have a possible post sitting in the Drafts, and it says the following among other things:
Speaking from the experimental viewpoint, we should ponder feasible experiments in creating hybrid consciousness between tightly coupled biological entities and electronic circuits. Such experiments might start shedding some empirical light on the capacity of electronic circuits to support subjective experience, and might constitute initial steps towards eventually being able "to look inside the other entity's subjective realm".
Neuralink-like BCIs are not a hard requirement in this sense. A sufficiently tight coupling can probably be achieved by taking EEG and polygraph-like signals from the biological entity and giving appropriately sculpted audio-visual signals from the electronic entity; I think it's highly likely that such non-invasive coupling will suffice for initial experiments. Tight closed loops of this kind pose formidable safety issues even with non-invasive connectivity, and since this line of research assumes that human volunteers will eventually try it, observe the resulting subjective experiences, and report on them, ethical and safety considerations will have to be dealt with.
Nevertheless, assuming one finds a way for such experiments to go ahead, one can try various things. For example, one can train a variety of differently architected electronic circuits to approximate the same input-output function, and see whether the observed subjective experiences differ substantially depending on the architecture of the circuit in question. A positive answer would be a first step towards figuring out how the activity of an electronic circuit can be directly associated with subjective experiences.
If people start organizing for this kind of work, I'd love to collaborate.