Tom_Breton

@billswift: You were right about Pavlina. I discovered that as I read more of his stuff.

@RT Wolf: Thanks for the Pavlina link. It looks fascinating so far.

Apparently the people who played gatekeeper previously held the idea that it was impossible for an AI to talk its way out. Not just for Eliezer, but for a transhuman AI; and not just for them, but for all sorts of gatekeepers. That's what is implied by saying "We will just keep it in a box".

In other words, and not meaning to cast any aspersions, they all had a blind spot. Failure of imagination, perhaps.

This blind spot may have been a factor in their loss. Having no access to the mysterious transcripts, I won't venture a guess as to how.

a "logically possible" but fantastic being, a descendent of Ned Block's Giant Lookup Table fantasy...

First, I haven't seen how this figures into an argument, and I see that Eliezer has already taken this in another direction, but...

What immediately occurs to me is that there's a big risk of a faulty intuition pump here. He's describing, I assume, a lookup table large enough to describe your response to every distinguishable sensory input you could conceivably experience during your life. The number of entries is unimaginable. But I suspect he's picturing and inviting us to picture a much more mundane, manageable LUT.

I can almost hear the Chinese Room Fallacy already. "You **can't** say that a LUT is conscious, it's just a matrix". Like "...just some cards and some rules" or "...just transistors". That intuition works in a common-sense way when the thing is tiny, but we just said it wasn't.

And let's not slight other factors that make the thing either very big and hairy or very, very, very big.

To work as advertised, it needs some sense of history. Perhaps every instant in our maybe-zombie's history has its own corresponding dimension in the table, or perhaps some field(s) of the table's output at each instant is an additional input at the next instant, representing one's entire mental state. Either way, it's gotta be huge enough to represent every distinguishable history.
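A rough sketch of the two designs, to make the sizes concrete. Everything named here is made up for illustration (the function names, the toy table, the tiny input alphabet); it only shows the shape of the two options, not how any real LUT-being would be specified.

```python
def history_indexed_entries(num_inputs: int, num_steps: int) -> int:
    """Rows needed if the table is keyed by the entire input history.

    One row per distinguishable input sequence of length 1..num_steps.
    """
    return sum(num_inputs ** t for t in range(1, num_steps + 1))


def run_state_feedback(table, inputs, state):
    """Drive a table keyed by (input, state); each row yields (output, next_state)."""
    outputs = []
    for symbol in inputs:
        output, state = table[(symbol, state)]
        outputs.append(output)
    return outputs


# Toy table (an assumption for illustration): 2 inputs x 2 states = 4 rows.
# The fed-back state remembers whether a "poke" has already happened.
TOY_TABLE = {
    ("poke", "calm"): ("ouch", "wary"),
    ("poke", "wary"): ("stop that", "wary"),
    ("wave", "calm"): ("hello", "calm"),
    ("wave", "wary"): ("hmph", "wary"),
}

if __name__ == "__main__":
    print(run_state_feedback(TOY_TABLE, ["wave", "poke", "wave"], "calm"))
    # Even a tiny alphabet blows up: 10 possible inputs over 20 steps.
    print(history_indexed_entries(10, 20))
```

The state-feedback version looks small only because the toy has two states. For the table to do what is claimed, the state field has to encode a whole mental state, so the row count is enormous in either design.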

The input and output formats also correspond to enormous objects capable of fully describing all the sensory input we can perceive in a short time, all the actions we can take in a short time (including habitual, autonomic, everything), and every aspect of our mental state.

This ain't your daddy's 16 x 32 array of unsigned ints.

To put it much more briefly, under the Wesley Salmon definition of "explanation" the epiphenomenal picture is simply not an explanation of consciousness.

Any committed autodidacts want to share how their autodidactism makes them feel compared to traditional schooled learners? I'm beginning to suspect that maybe it takes a certain element of belief in the superiority of one's methods to make autodidactism work.

As Komponisto points out, traditional schooling is so bad at educating that belief in the superiority of one's [own] methods is easily acquired. I first noticed traditional schooling's ineptitude in kindergarten, and this perception was reinforced almost continuously through the rest of my schooling.

PS: I liked the initiation ceremony fiction, Eliezer.

In classical logic, the operational definition of identity is that whenever 'A=B' is a theorem, you can substitute 'A' for 'B' [but it doesn't follow that] I believe 2 + 2 = 4 => I believe TRUE => I believe Fermat's Last Theorem.

The problem is that identity has been treated as if it were absolute, as if when two things are identical in one system, they are identical for all purposes.

The way I see it, identity is relative to a given system. I'd define it thus: A=B in system S just if for every equivalence relation R that can be constructed in S, R(A,B) is true. "Equivalence relation" is defined in the usual way: reflexive, symmetrical, transitive.

My formulation quantifies over equivalence relations, so it's not properly a relation in the system itself. It "lives" in any meta-logic about S that supports the definition's modest components: Ability to distinguish equivalence relations from other types, quantification over equivalence relations in S, ability to apply a variable that's known to be an equivalence relation, and ability to conjoin an arbitrary number of conjuncts. The fact that it's not in the system also avoids the potentially paradoxical situation of including '=' among its own conjuncts.

Given my formulation, it's easily seen that identity needs to be relative to some system. If we were to quantify over all equivalence relations everywhere, we would have to include relations like "Begins with the same letter", "Has the same ASCII representation", or "Is printed at the same location on the page". These relations would fail on A=B and on other equivalences that we certainly should allow at least sometimes. In fact, the `=' test would fail on every two arguments, since the relation "is passed to the NNNNth call to `=' as the same argument index" must fail for those arguments. It could only succeed in a purely Platonic sense. So identity needs to be relative to some system.
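A minimal sketch of the definition, under a loud assumption: each system is represented by a short hand-picked list of equivalence relations, standing in for "every equivalence relation that can be constructed in S", which in general can't be enumerated like this. The names `DENOTES`, `PROVER`, `EDITOR` and `identical_in` are hypothetical.

```python
from typing import Callable, Iterable

Relation = Callable[[str, str], bool]

# Hypothetical denotations for a few terms, standing in for a prover's semantics.
DENOTES = {"2+2": 4, "4": 4, "3+1": 4, "5": 5}


def same_value(a: str, b: str) -> bool:
    """Holds when both terms denote the same number."""
    return DENOTES[a] == DENOTES[b]


def same_spelling(a: str, b: str) -> bool:
    """Holds when the two terms are written identically."""
    return a == b


def same_first_letter(a: str, b: str) -> bool:
    """Holds when the two terms begin with the same character."""
    return a[0] == b[0]


def identical_in(system: Iterable[Relation], a: str, b: str) -> bool:
    """A=B in S just if every equivalence relation listed for S holds of (A, B)."""
    return all(relation(a, b) for relation in system)


# Assumed catalogs: the prover only sees values, the editor only sees text.
PROVER = [same_value]
EDITOR = [same_spelling, same_first_letter]

if __name__ == "__main__":
    print(identical_in(PROVER, "2+2", "4"))
    print(identical_in(EDITOR, "2+2", "4"))
```

With these catalogs, "2+2" and "4" come out identical in the prover and distinct in the editor, which is the relativity the definition is meant to capture.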

How can systems differ in what equivalence relations they allow, in ways that are relevant here? For instance, suppose you write a theorem prover in Lisp. In the Lisp code, you definitely want to distinguish symbols that have different names. Their names might even have decomposable meaning, e.g. in a field accessor like `my-struct-my-field'. So implicitly there is an equivalence relation `has-same-name' about the Lisp. In the theorem prover itself, there is no such relation as has-same-Lisp-name or even has-same-symbol-in-theorem-prover. (You can of course feed the prover axioms which model this situation. That's different, and doesn't give you real access to these distinctions.)

Your text editor in which you write the Lisp code has yet another different catalog of equivalence relations. It includes many distinctions that are sensitive to spelling or location. They don't trip us up here, they are just the sort of things that a text editor should distinguish and a theorem prover shouldn't.

The code in which your text editor is written makes yet other distinctions.

So what about the cases at hand? They are both about logic of belief (doxastic logic). Doxastic logic can contain equivalence relations that fail even on *de re* equivalent objects. For instance, doxastic logic should be able to say "Alice believes A but not B" even when A and B are both true. Given that sort of expressive capability, one can construct the relation "Alice believes either both A and B or neither", which is reflexive, symmetrical, transitive; it's an equivalence relation and it treats A and B differently.

So A and B are not identical here even though *de re* they are the same.
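The Alice relation can be checked by brute force on a toy domain. This is only a sketch: the belief table and the names `ALICE_BELIEVES`, `alice_agrees` and `is_equivalence` are assumptions made up for the example, with A and B both taken to be true and Alice believing only A.

```python
from itertools import product

# Hypothetical belief state: A, B and C are all true, but Alice doesn't believe B.
ALICE_BELIEVES = {"A": True, "B": False, "C": True}


def alice_agrees(p: str, q: str) -> bool:
    """Alice believes either both p and q or neither."""
    return ALICE_BELIEVES[p] == ALICE_BELIEVES[q]


def is_equivalence(relation, domain) -> bool:
    """Brute-force check of reflexivity, symmetry and transitivity on a finite domain."""
    reflexive = all(relation(x, x) for x in domain)
    symmetric = all(
        relation(y, x) for x, y in product(domain, repeat=2) if relation(x, y)
    )
    transitive = all(
        relation(x, z)
        for x, y, z in product(domain, repeat=3)
        if relation(x, y) and relation(y, z)
    )
    return reflexive and symmetric and transitive


if __name__ == "__main__":
    statements = list(ALICE_BELIEVES)
    print(is_equivalence(alice_agrees, statements))
    print(alice_agrees("A", "B"))
```

The relation passes all three tests and still separates A from B, so under the definition above A and B are not identical in this doxastic system.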

Great post, Rolf Nelson.

This seems to me a special case of asking "What actually is the phenomenon to be explained?" In the case of free will, or should I say in the case of the free will question, the phenomenon is the perception or the impression of having it. (Other phenomena may be relevant too, like observations of other people making choices between alternatives).

In the case of the socks, the phenomenon to be explained can be safely taken to be the sock-wearing state itself. Though as Eliezer correctly points out, you can start farther back, that is, you can start with the phenomenon that you think you're wearing socks and ask about it and work your way towards the other.

In other words, "what is well-being?", in such terms that we can apply it to a completely alien situation. This is an important issue.

One red herring, I think, is this:

That could be read two ways. One way is the way that you and these psychologists are reading it. Another interpretation is that the subjects estimated the impact on their future well-being correctly, but after the events, they reported their happiness with respect to their new baseline, which became adjusted to their new situation. The second thing is effectively the derivative of the first. In this interpretation the subjects' mistake is confusing the two.