AIRCS Workshop: How I failed to be recruited at MIRI.

This is basically off-topic, but just for the record, regarding...

someone presented a talk where they explained how they tried and failed to model and simulate a brain of C. Elegans.... Furthermore, all of their research was done prior to them discovering AI safety stuff so it's good that no one created such a precise model of a - even if just a worm - brain.

That was me; I have never believed (at least not yet) that it’s good that the C. elegans nervous system is still not understood; to the contrary, I wish more neuroscientists were working on such a “full-stack” understanding (whole nervous system down to individual cells). What I meant to say is that I am personally no longer compelled to put my attention toward C. elegans, compared to work that seems more directly AI-safety-adjacent.

I could imagine someone making a case that understanding low-end biological nervous systems would bring us closer to unfriendly AI than to friendly AI, and perhaps someone did say such a thing at AIRCS, but I don’t recall it and I doubt I would agree. More commonly, people make the case that nervous-system uploading technology brings us closer to friendly AI in the form of eventually uploading humans—but that is irrelevant one way or the other if de novo AGI is developed by the middle of this century.

One final point: it is possible that understanding simple nervous systems gives humanity a leg up on interpretability (of non-engineered, neural decision-making), without providing new capabilities until somewhere around spider level. I don’t have much confidence that any systems-neuroscience techniques for understanding C. elegans or D. rerio would transfer to interpreting AI’s decision-making or motivational structure, but it is plausible enough that I currently consider such work to be weakly good for AI safety.

Why I Moved from AI to Neuroscience, or: Uploading Worms

Right -- some cortical neurons have 10,000 synapses, while the entire C. elegans nervous system has fewer than 8,000.

Why I Moved from AI to Neuroscience, or: Uploading Worms

My answers are indeed "the latter" and "yes". There are a couple of ways I can justify this.

The first way is just to assert that from a standard utilitarian perspective, over the long term, technological progress is a fairly good indicator for lack of suffering (e.g. Europe vs. Africa). [Although arguments have been made that happiness has gone down since 1950 while technology has gone up, I see the latter 20th century as a bit of a "dark age" analogous to the fall of antiquity (we forgot how to get to the moon!) which will be reversed in due time.]

The second is that I challenge you to define "pleasure," "happiness," or "lack of suffering." You may challenge me to define "technological progress," but I can just point you to sophistication or integrated information as reasonable proxies. As vague as notions of "progress" and "complexity" are, I assert that they are decidedly less vague than notions of "pleasure" and "suffering". To support this claim, note that sophistication and integrated information can be defined and evaluated without a normative partition of the universe into a discrete set of entities, whereas pleasure and suffering cannot. So the pleasure metric leads to lots of weird paradoxes. Finally, self-modifying superintelligences must necessarily develop a fundamentally different concept of pleasure than we do (otherwise they just wirehead), so the pleasure metric probably cannot be straightforwardly applied to their situation anyway.

Why I Moved from AI to Neuroscience, or: Uploading Worms

I'm afraid it was no mistake that I used the word "faith"!

This belief does not appear to conflict with the truth (or at least that's a separate debate) but it is also difficult to find truthful support for it. Sure, I can wave my hands about complexity and entropy and how information can't be destroyed but only created, but I'll totally admit that this does not logically translate into "life will be good in the future."

The best argument I can give goes as follows. For the sake of discussion, at least, let's assume MWI. Then there is some population of alternate futures. Now let's assume that the only stable equilibria are entirely valueless state ensembles such as the heat death of the universe. With me so far? OK, now here's the first big leap: let's say that our quantification of value, from state ensembles to the nonnegative reals, can be approximated by a continuous function. Therefore, by application of Conley's theorem, the value trajectories of alternate futures fall into one of two categories: those which asymptotically approach 0, and those which asymptotically approach infinity. The second big leap involves disregarding those alternate futures which approach zero. Not only will you and I die in those futures, but we won't even be remembered; none of our actions or words will be observed beyond a finite time horizon along those trajectories. So I conclude that I should behave as if the only trajectories are those which asymptotically approach infinity.

Why I Moved from AI to Neuroscience, or: Uploading Worms

What's your current estimate (or probability distribution) for how much computational power would be needed to run the C. elegans simulation?

I think the simulation environment should run in real-time on a laptop. If we're lucky, it might run in real-time on an iPhone. If we're unlucky, it might run in real-time on a cluster of a few servers. In any case, I expect the graphics and physics to require much (>5x) more computational power than the C. elegans mind itself (though of course the content of the mind-code will be much more interesting and difficult to create).

Once your project succeeds, can we derive an upper bound for a human upload just by multiplying by the ratio of neurons and/or synapses, or would that not be valid because human neurons are much more complicated, or for some other reason?

Unfortunately, C. elegans is different enough from humans in so many different ways that everyone who currently says that uploading is hard would be able to tweak their arguments slightly to adapt to my success. Penrose and Hameroff can say that only mammal brains do quantum computation. Sejnowski can say that synaptic vesicle release is important only in vertebrates. PZ Myers can say "you haven't modeled learning or development and you had the benefit of a connectome, and worm neurons don't even have spikes; this is cute, but would never scale."

That said, if you're already inclined to agree with my point of view--essentially, that uploading is not that hard--then my success in creating the first upload would certainly make my point of view that much more credible, and it would supply hard data which I would claim can be extrapolated to an upper bound for humans, at least within an order of magnitude or two.

Why I Moved from AI to Neuroscience, or: Uploading Worms

I like the concept of a reflective equilibrium, and it seems to me like that is just what any self-modifying AI would tend toward. But the notion of a random utility function, or the "structured utility function" Eliezer proposes as a replacement, assumes that an AI is composed of two components, the intelligent bit and the bit that has the goals. Humans certainly can't be factorized in that way. Just think about akrasia to see how fragile the notion of a goal is.

Even notions of being "cosmopolitan" - of not selfishly or provincially constraining future AIs - are written down nowhere in the universe except a handful of human brains. An expected paperclip maximizer would not bother to ask such questions.

A smart expected paperclip maximizer would realize that it may not be the smartest possible expected paperclip maximizer--that other ways of maximizing expected paperclips might lead to even more paperclips. But the only way it would find out about those is to spawn modified expected paperclip maximizers and see what they can come up with on their own. Yet, those modified paperclip maximizers might not still be maximizing paperclips! They might have self-modified away from that goal, and just be signaling their interest in paperclips to gain the approval of the original expected paperclip maximizer. Therefore, the original expected paperclip maximizer had best not take that risk after all (leaving it open to defeat by a faster-evolving cluster of AIs). This, by reductio ad absurdum, is why I don't believe in smart expected paperclip maximizers.

Why I Moved from AI to Neuroscience, or: Uploading Worms

Do you mean that intelligence is fundamentally interwoven with complex goals?

Essentially, yes. I think that defining an arbitrary entity's "goals" is not obviously possible, unless one simply accepts the trivial definition of "its goals are whatever it winds up causing"; I think intelligence is fundamentally interwoven with causing complex effects.

Do you mean that there is no point at which exploitation is favored over exploration?

I mean that there is no point at which exploitation is favored exclusively over exploration.

Do you mean.... "Gut feeling: I’d probably sacrifice myself to create a superhuman artilect, but not my kids…."

I'm 20 years old - I don't have any kids yet. If I did, I might very well feel differently. What I do mean is that I believe it to be culturally pretentious, and even morally wrong (according to my personal system of morals), to assert that it is better to hold back technological progress if necessary to preserve the human status quo, rather than allow ourselves to evolve into and ultimately be replaced by a superior civilization. I have the utmost faith in Nature to ensure that eventually, everything keeps getting better on average, even if there are occasional dips due to, e.g., wars; but if we can make the transition to a machine civilization smooth and gradual, I hope there won't even have to be a war (a la Hugo de Garis).

What is your best guess at why people associated with SI are worried about AI risk?

Well, the trivial response is to say "that's why they're associated with SI." But I assume that's not how you meant the question. There are a number of reasons to become worried about AI risk. We see AI disasters in science fiction all the time. Eliezer makes pretty good arguments for AI disasters. People observe that a lot of smart folks are worried about AI risk, and it seems to be part of the correct contrarian cluster. But most of all, I think it is a combination of fear of the unknown and implicit beliefs about the meaning and value of the concept "human".

If you would have to fix the arguments for the proponents of AI-risk, what would be the strongest argument in favor of it?

In my opinion, the strongest argument in favor of AI-risk is the existence of highly intelligent but highly deranged individuals, such as the Unabomber. If mental illness is a natural attractor in mind-space, we might be in trouble.

Also, do you expect there to be anything that could possibly change your mind about the topic and become worried?

Naturally. I was somewhat worried about AI-risk before I started studying and thinking about intelligence in depth. It is entirely possible that my feelings about AI-risk will follow a Wundt curve, and that once I learn even more about the nature of intelligence, I will realize we are all doomed for one reason or another. Needless to say, I don't expect this, but you never know what you might not know.

Why I Moved from AI to Neuroscience, or: Uploading Worms

it's improbable that we end up with something close to human values

I think the statement is essentially true, but it turns on the semantics of "human". In today's world we probably haven't wound up with something close to 50,000BC!human values, and we certainly don't have Neanderthal values, but we don't regret that, do we?

Put another way, I am skeptical of our authority to pass judgement on the values of a civilization which is by hypothesis far more advanced than our own.

Does that mean you're familiar with Robin Hanson's "Malthusian upload" / "burning the cosmic commons" scenario but do not think it's a particularly bad outcome?

To be honest, I wasn't familiar with either of those names, but I have explicitly thought about both those scenarios and concluded that I don't think they're particularly bad.

I'd guess that's been tried already, given that Ben was the Director of Research for SIAI (and technically Eliezer's boss) for a number of years.

All right, fair enough!

Why I Moved from AI to Neuroscience, or: Uploading Worms

"Brain-wide neural dynamics at single-cell resolution during rapid motor adaptation in larval zebrafish" by Ahrens et al., accepted to Nature but not yet published. I've taken the liberty of uploading it to scribd: Paper Supplementary Info
