Rationalists Should Win. Not Dying with Dignity and Funding WBE.

CitizenTen

This is somewhat of a response to the Dying with Dignity post.

Rationalists should win. If you are taking actions that lead to failure, you aren’t being rational, you aren’t playing for dignity points, you are losing. (More applause lights here.)

But the core argument needs to be addressed. AI alignment looks really hard. We have more unsolved problems than you can shake a stick at (here, here, etc.), and there seem to be good reasons to think that AGI will be unaligned by default. (Mind-space is big, after all.) The argument continues: currently, we have more money than ideas. We’re trying everything we can think of, yet nobody has a clear solution or even an idea for a real path forward. It’s so bad that we aren’t even sure we know how confused we are about the problem. And machine learning progress is barreling down a hill, setting very short deadlines for solving this problem. Basically, unless the heavens themselves part and give us a deus ex machina, we’re dead, flayed, and roasted alive a million times over.

So I have to ask: are we doing everything we can think of? Set a 5-minute timer, forget everything you know about AI alignment and computer science, and make yourself as nontechnical as possible. Ask yourself: if you were to build an AGI, how would you go about doing it? Give yourself full license to think as scrappily as possible. Gloves are off. Cheating is allowed. It doesn’t matter how you do it, as long as it works.

My first thought: this sounds hard. Can I copy someone else’s work? And we can! We genuinely can! We have AGIs running around all the time. They're called humans! This strategy is generally called human emulation.

So we know a strategy that will work. We have actual evidence this is true. Humans exist and are (generally) aligned with human values. And emulating a human lets us sidestep all the tricky alignment/philosophical problems we're stuck on. We treat the emulation as a black box, run it at 10x speed, and there you go. Problem solved^[1]. And I’m part of a community that wants to build AGI. So what’s the current funding situation for this? The Dying with Dignity post seems to assume we’re trying everything we can think of and have more money than ideas. So it’s gotta be pretty high. 50 million? 20 million? 5 million? 1 million? *Checks notes* Zero dollars!? And not only are we not funding it, but nobody even bothered to really try in the last 10 years? This area of research is so underfunded that people are able to say with a straight face that an UNDERGRADUATE deciding to work on this problem for a month is around 8% of the entire WORLD'S research in this area. Or there's the comment that a huge step in the field could (probably) have been solved for a million dollars, but the effort ran out of money.

Then there's OpenWorm. They’re trying to run an emulation of the nervous system of the simplest organism they can find. Take a look at their website. If you tell me that this is the best humanity has to offer when we get serious about winning, and you're able to say that with a straight face, you deserve to get shot. Twice.^[2]

So here are a few^[3] arguments I’ve been able to dig up on why we aren’t funding brain emulations. The first is that yes, it will work, but the technology isn’t there. By the time we develop full emulations, we’ll already have developed the technology to build unaligned AGI. Emulation won't be developed in time to save us. The second is that developing brain emulation technology will make it easier to build an unaligned AGI sooner than an aligned one.

To the first argument, that the technology is not there: have you heard of the Manhattan Project?

In 1938 the theory behind nuclear fission was developed.
In 1942 the Manhattan Project was started.
In 1945 the problem was solved.

Get that straight. We went from literal theory, of like yes, this is allowed within the rules of physics, to developing a fission bomb in less than 8 years. And most of the real research was done in under four. Research can move fast when we’re serious about it. You might complain that we don’t have Manhattan Project-sized resources! But yes. Yes we do.^[4] Also, I hear we’re desperate to fund large-scale megaprojects.

Why do I bring up the Manhattan Project? I think it’s the closest analogue to what development in brain emulation would look like if we were serious about developing emulation technology in time for it to be usable. Here's the situation both projects were in when they started, and why I think emulation research could be scalable.

We know it's theoretically possible.
We have a crisp understanding of what the goal is.
We know exactly what the end result looks like.

Knowing exactly what the goal is and what a solution looks like goes a long way toward making it possible to research something quickly. We want all the neurons in the brain to be emulated on a computer. How hard is this? I dunno. But compared to a problem such as achieving world peace, solving philosophy, or solving AI alignment, it looks extremely tractable. Here, it's pretty clear whether what's being tried works or not. This means you get fast feedback loops and, hopefully, real progress.

On the second argument: developing brain emulation technology will make it easier to make an unaligned intelligence, and therefore we shouldn't do this. To quote someone else, “I'm standing there thinking: Does it even occur to you that this might be a time to try for something other than the default outcome?” Get this. We know a strategy that will work. Everything else we’re trying is sorta a crapshoot^[5]. Ergo, we won’t research this area at all, as it will make building an unaligned intelligence easier. Well no sh*t! There’s no world in which humans can build an aligned intelligence without also being able to build an unaligned one. Furthermore, this would only be an important consideration if there were no other paths toward building an unaligned AGI that people are working on. That is definitely wrong. The argument also presupposes that people will only look at the brain to improve ML techniques in worlds where we fund brain emulations, which has already been shown to be wrong. Modern neural networks are somewhat based on the brain and its neurons. Furthermore, organizations can keep secrets. The Manhattan Project was classified. Why can’t this one be as well? (If the concern is really that great.)

Basically, I think that research into brain emulation is way underfunded relative to its potential. It's just completely neglected. It's tractable. And (probably) scalable. If we're playing to win, it's an obvious choice. Or even worse, if we're currently screwed and need a miracle, this is a real Hail Mary that could work. Rationalists should win, damn it.

^[1]
I mean, as much as giving full power to a single human's goals/values "solves" the problem. Though it sure beats dying from a completely unaligned one.
^[2]
No malice intended toward the OpenWorm project. (They're doing good work.) Just making the point that it's completely underfunded relative to its potential expected value.
^[3]
https://intelligence.org/files/SS11Workshop.pdf
https://www.lesswrong.com/posts/evtKwDCgtQQ7ozLn4/randal-koene-on-brain-understanding-before-whole-brain
https://www.lesswrong.com/posts/PTkd8nazvH9HQpwP8/building-brain-inspired-agi-is-infinitely-easier-than
^[4]
The Manhattan Project cost around 2 billion dollars in 1945, which is around 30 billion today. Furthermore, around 90% of the cost went into building plants and producing the fissile material, something I don’t imagine large-scale research into brain emulation would need. Also, it was run by the government, so it may have been less efficient than a private organization doing the same thing. (High uncertainty on this last point.)
^[5]
Sorry current AI researchers. It looks like that from the outside. I really hope I'm wrong and you're actually on the path toward finding a silver bullet.

I definitely agree that emulating a simple worm is worth all the resources that a project like that can absorb. Maybe start with something simpler than a worm. An Openmoeba.

"Are we doing everything we can think of?"

Doing everything we can think of means doing a lot of things that are bad ideas that do more harm than good.

"So we know a strategy that will work. We have actual evidence this is true. Humans exist and are (generally) aligned with human values."

If you are employing Bob and Dave, and Bob provides more value for your business than Dave, you can't simply use Dave's resources to run a copy of Bob.

The inability of humans to be copied produces slack that we can use to get good things in spite of optimization pressures.

I agree with you almost perfectly. I'd been working on a (very long-shot) plan for it, myself, but having recently realized that other people may be working on it, too, I've started looking for them. Do you (or anyone reading this) know of anybody seriously willing to do, or already engaged in, this avenue of work? Specifically, working towards WBE, with the explicit intent of averting unaligned AGI disaster.

I'm also interested. Have you made any progress since your comment?

Please note that Eliezer's view is not at all the consensus view; there are comparably many alignment researchers who believe that we are likely to succeed, and that current approaches are directly on the path to alignment. While WBE is an interesting possibility to keep in mind, it is not correct to strategize as if we are >90% doomed and need to take a "Hail Mary" regardless of the risks.

The brain is the most complex information exchange system in the known Universe. Whole Brain Emulation is going to be really hard. I would probably go with a different solution. I think myopic AI has potential.

EDIT: It may also be worth considering building an AI with no long-term memory. If you want it to do a thing, you put in some parameters ("build a house that looks like this"), and they are automatically wiped out once the goal is achieved. Since the neural structure is fundamentally static (not sure how to build it, but it should be possible?), the AI cannot rewrite itself to avoid losing its memory. If it doesn't remember things, it probably can't come up with a plan to prevent itself from being reset or turned off, or to kill all humans, or to build a new AI with no limitations. And then you also reset the whole thing every day just in case.

"Complex" doesn't imply "hard to emulate". We likely won't need to understand the encoded systems, just the behavior of the neurons. In high school I wrote a simple simulator of charged particles. The rules I needed to encode were simple, but it displayed behaviors I had neither programmed in nor expected, and which were, in fact, real phenomena that really happen.

I would argue that the most complex information exchange system in the known Universe will be "hard to emulate". I don't see how it can be any other way. We already understand neurons well enough to emulate them. This is not nearly enough. You will not be able to do whole brain emulation without an understanding of the inner workings of the system.

I have, completely seriously, been thinking that uploading a human who is aligned with human values (Eliezer Yudkowsky?) might be the safest option for creating FAI. Though I didn't think of it as a serious possibility, technically speaking, in the kind of timeframe we have available (and I'm still not sure it is; how do I evaluate that?).

I have also been thinking about furthering my understanding of Friendliness, that is, human ethics. What would we want FAI to do (the part of it that is possible for a human to understand, anyway)? That seems important both for AI alignment and for human alignment.

An alternative idea I've had: if it's possible for a human to reason well about human ethics and to answer difficult moral questions about the kind of decisions FAI would need to make, we could explore the mental processes of such humans (who are good moral reasoners) and "scale them up" to build FAI.
Maybe literally (brain-like AGI?).

I don't know about human values, but Yudkowsky is certainly not aligned with my values on at least a couple of significant points. Still beats either Clippy or Xi Jinping, of course.

"So we know a strategy that will work. We have actual evidence this is true. Humans exist and are (generally) aligned with human values."

The above is false. Humans aren't really aligned with human values. Most humans are heavily constrained in their actions. When we see very unconstrained humans (Vladimir Putin, Adolf Hitler, Joseph Stalin, Xi Jinping, Mao Zedong, Deng Xiaoping) a large proportion are not aligned with human values.

(I've stayed with the moderns, but a review of ancient rulers will yield similar results.)

The people you list are all in contexts where they are forced to juggle many forces having different goals, capacity for violence, sizes, etc. Leaders like these gain and keep power by {cajoling, rewarding, punishing, threatening} the military, capitalists, businessmen, revolutionaries, farmers, peasants, bureaucrats, and foreign nations; and by making themselves Basilisks of violence (I think that's why Hitler and Mussolini loved each other). It's not obvious to me that these forces *made* e.g. Hitler what he was, but they seem pretty different in nature, and plausibly also in effect, from the incentives of a genius in a vat in a basement with a few genius buddies with a mission to save the world.

Yeah. That's why I said "generally." Obviously if you emulate Stalin, bad things will happen. Slightly off topic, but I'd say those leaders weren't "unconstrained" people revealing something about human nature; they were as constrained as any ruler and had to follow the Dictator's Handbook to stay in power.

It is very unclear that generally humans are unlike Stalin. Maybe! But most humans have far too little power to reveal their preferences-with-lots-of-power. And we seem to have sayings like "power corrupts", but it's not at all clear to me whether power corrupts, only the corrupt can gain power, or power simply reveals.

It bears mention that, compared to the median predicted unaligned AGI, I'd hands-down accept Hitler as supreme overlord. It seems probable that humans would still exist under Hitler, and in a fairly recognizable form, even if there were many troubling things about their existence. Furthermore, I suspect that an average human would be better than Hitler, and I'm fairly optimistic that most individuals striving to prevent the AGI apocalypse would make for downright pleasant overseers (or whatever).