Thinking is like kicking a rock down a lane as you walk. If the object is oddly shaped, it may tumble oddly and go off in some oblique direction even if you impelled it forcefully straight. Without care, you're liable to leave the object by the wayside and replace it with another, or with nothing. Tendencies of the object's motion are produced both by the landscape--the slopes and the textures--and by the way you impel it, in big or little steps, with topspin or sidespin. The object may get stuck in a pothole or by the curb, and there's no guarantee you'll have the patience, or the small-foot-ness, needed to free it. People may look at you funny, as though you're acting like a child, and you're certain to leave the project behind without a bit of obliviousness and stand-offishness in you. It's not clear whether you're accomplishing anything with each step, let alone whether there is or what might be the ultimate payoff.

Privacy and Manipulation

It's definitely a huge red flag if someone is pressuring you out of sharing something with your closest friends.

Privacy and Manipulation

Another dimension to get more good and less bad: you can ask for, and Alice can tell you, her reasons for wanting secrecy. Combined with explicitly saying that you're not taking an absolutist posture, this can sometimes give you more leeway to do reasonable stuff.

Morality is Scary
> at least some researchers don't seem to consider that part of "alignment".

It's part of alignment. Also, it seems mostly separate from the part about "how do you even have consequentialism powerful enough to make, say, nanotech, without killing everyone as a side-effect?", and the latter seems not too related to the former.

Biology-Inspired AGI Timelines: The Trick That Never Works

Seems right, IDK. But still, that's a different kind of uncertainty than uncertainty about, like, the shape of algorithm-space.

Morality is Scary

So on the one hand you have values that are easily, trivially compatible, such as "I want to spend 1000 years climbing the mountains of Mars" or "I want to host blood-sports with my uncoerced friends with the holodeck safety on".

On the other hand you have insoluble, or at least apparently insoluble, conflicts: B wants to torture people, C wants there to be no torture anywhere at all. C wants to monitor everyone everywhere forever to check that they aren't torturing anyone or plotting to torture anyone, D wants privacy. E and F both want to be the best in the universe at quantum soccer, even if they have to kneecap everyone else to get that. Etc.

It's simply false that you can just put people in the throne as emperor of the universe, and they'll justly compromise about all conflicts. Or even do anything remotely like that.

How many people have conflictual values that they, effectively, value lexicographically more than their other values? Does decision theory imply that compromise will be chosen by sufficiently well-informed agents who do not have lexicographically valued conflictual values?

Morality is Scary

> All the rest is an act of shared imagination. It’s a dream we weave around a status game.
> They’re part of the dream of reality in which they exist, a dream that feels no less obvious and true to them than ours does to us.
> Moral ‘truths’ are acts of imagination. They’re ideas we play games with.

IDK, I feel like you could say the same sentences truthfully about math, and if you "went with the overall vibe" of them, you might be confused and mistakenly think math was "arbitrary" or "meaningless", or doesn't have a determinate tendency, etc. Like, okay, if I say "one element of moral progress is increasing universalizability", and you say "that's just the thing your status cohort assigns high status", I'm like, well, sure, but that doesn't mean it doesn't also have other interesting properties, like being a tendency across many different peoples; like being correlated with the extent to which they're reflecting, sharing information, and building understanding; like resulting in reductionist-materialist local outcomes that have more of material local things that people otherwise generally seem to like (e.g. not being punched, having food, etc.); etc. It could be that morality has tendencies, but not without hormesis and mutually assured destruction and similar things that might be removed by aligned AI.

Biology-Inspired AGI Timelines: The Trick That Never Works

Hold on, I guess this actually means that for a natural interpretation of "entropy" in "generic uncertainty about maybe being wrong, without other extra premises, should increase the entropy of one's probability distribution over AGI," that statement is actually false. If by "entropy" we mean "entropy according to the uniform measure", it's false. What we should really mean is entropy according to one's maximum entropy distribution (as the background measure), in which case the statement is true.
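A small numerical sketch of the distinction, under assumptions that are mine rather than the essay's: three made-up outcomes, a made-up non-uniform maximum-entropy distribution `m`, and "generic uncertainty about being wrong" modeled as mixing one's current distribution `p` halfway toward `m`. Entropy with `m` as the background measure is taken to be minus the KL divergence from `m`.

```python
import math

def shannon_entropy(p):
    """Entropy with the uniform (counting) measure as background."""
    return -sum(x * math.log(x) for x in p if x > 0)

def entropy_relative_to(p, m):
    """Entropy of p with m as the background measure: minus KL(p || m)."""
    return -sum(x * math.log(x / y) for x, y in zip(p, m) if x > 0)

def mix(p, m, weight):
    """Move p toward m; an assumed stand-in for 'maybe I'm wrong'."""
    return [(1 - weight) * x + weight * y for x, y in zip(p, m)]

# Hypothetical numbers, chosen only for illustration.
m = [0.7, 0.2, 0.1]        # assumed maximum-entropy distribution
p = [1 / 3, 1 / 3, 1 / 3]  # current beliefs, which happen to be uniform
q = mix(p, m, 0.5)         # beliefs after adding generic doubt

print("uniform background:", shannon_entropy(p), "->", shannon_entropy(q))
print("m as background:   ", entropy_relative_to(p, m), "->", entropy_relative_to(q, m))
```

With these numbers the entropy under the uniform measure goes down after adding doubt, while the entropy relative to `m` goes up, which is the sense in which the statement is false under one reading and true under the other.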

Biology-Inspired AGI Timelines: The Trick That Never Works

> I have calculated the number of computer operations used by evolution to evolve the human brain - searching through organisms with increasing brain size - by adding up all the computations that were done by any brains before modern humans appeared. It comes out to 10^43 computer operations. AGI isn't coming any time soon!

> And yet, because your reasoning contains the word "biological", it is just as invalid and unhelpful as Moravec's original prediction.

I agree that the conclusion about AGI not coming soon is invalid, so the following isn't exactly responding to what you say. But: ISTM the evolution thing is somewhat qualitatively different from Moravec or Stack More Layers, in that it softly upper bounds the uncertainty about the algorithmic knowledge needed to create AGI. IDK how easy it would be to implement an evolution that spits out AGI, but that difficulty seems like it should be less conceptually uncertain than the difficulty of understanding enough about AGI to do something more clever with less compute. Like, we could extrapolate out 3 OOMs of compute/$ per decade to get an upper bound: very probably AGI before 2150-ish, if Moore's law continues. Not very certain, or helpful if you already think AGI is very likely soon-ish, but it has nonzero content.
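The extrapolation is just arithmetic, sketched here as a function. The 10^43 target and the 3 OOMs per decade come from the text; the starting year and starting budgets are hypothetical placeholders, since the comment doesn't state what it started from, so this is not a reconstruction of the 2150-ish figure.

```python
def year_budget_reaches(target_log10_ops, start_year, start_log10_ops,
                        ooms_per_decade=3.0):
    """Year when an affordable compute budget reaches the target.

    Assumes the budget grows by a constant number of orders of
    magnitude per decade, i.e. that the trend simply continues.
    """
    ooms_needed = max(0.0, target_log10_ops - start_log10_ops)
    return start_year + 10.0 * ooms_needed / ooms_per_decade

# Hypothetical starting budgets (log10 of operations), for illustration only.
for start_log10_ops in (10, 20, 25):
    year = year_budget_reaches(43, 2020, start_log10_ops)
    print(f"start at 10^{start_log10_ops} ops -> about {year:.0f}")
```

The bound moves by one decade for every 3 OOMs of disagreement about the starting point, so it is loose but not very sensitive to that input.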

Biology-Inspired AGI Timelines: The Trick That Never Works

Now having read the rest of the essay... I guess "maximum entropy" is just straight up confusing if you don't insert the "...given assumptions XYZ". Otherwise it sounds like there's such a thing as "the maximum-entropy distribution", which doesn't exist: you have to cut up the possible worlds somehow, and different ways of cutting them up produce different uniform distributions. (Or in the continuous case, you have to choose a measure in order to do integration, and that measure contains just as much information as a probability distribution; the uniform measure says that all years are the same, but you could also say all orders of magnitude of time since the Big Bang are the same, or something else.) So how you cut up possible worlds changes the uniform distribution, i.e. the maximum entropy distribution. So the assumptions that go into how you cut up the worlds are determining your maximum entropy distribution.
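A toy version of the point, with a made-up horizon of 1 to 1000 years from some reference time (the interval and the 10-year question are assumptions for illustration, not anything from the essay). Both distributions below are "uniform", one over years and one over orders of magnitude, and they give quite different answers to the same question.

```python
import math

def prob_by_uniform_years(t, t_min, t_max):
    """P(event before t) if every year in [t_min, t_max] is equally likely."""
    return (t - t_min) / (t_max - t_min)

def prob_by_uniform_ooms(t, t_min, t_max):
    """P(event before t) if every order of magnitude of elapsed time is equally likely."""
    span = math.log10(t_max) - math.log10(t_min)
    return (math.log10(t) - math.log10(t_min)) / span

# Hypothetical horizon: somewhere between 1 and 1000 years out.
t_min, t_max = 1.0, 1000.0
by_years = prob_by_uniform_years(10.0, t_min, t_max)
by_ooms = prob_by_uniform_ooms(10.0, t_min, t_max)

print(f"P(within 10 years), uniform over years: {by_years:.4f}")
print(f"P(within 10 years), uniform over OOMs:  {by_ooms:.4f}")
```

Uniform over years puts about 1% on the first ten years; uniform over orders of magnitude puts about a third there. Neither is "the" maximum entropy answer until you've said which way of cutting up the possibilities you're assuming.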
