


That's no more surprising than the fact that sixteenth-century merchants didn't need to wait for economics to be invented in order to trade.


Oh, not the client device police!


"it's referencing the thing it's trying to identify" I don't understand why you think that fails. If I point at a rock, does the direction of my finger not privilege the rock I'm pointing at above all others? Even by looking at merely possible worlds from a disembodied perspective, you can still see a man pointing to a rock and know which rock he's talking about. My understanding is that your 1p perspective concerns sense data, but I'm not talking about the appearance of a rock when I point at it. I'm talking about the rock itself. Even when I sense no rock I can still refer to a possible rock by saying "if there is a rock in front of me, I want you to pick it up."


I disagree with the whole distinction. "My sensor" is indexical. By saying it from my own mouth, I effectively point at myself: "I'm talking about this guy here." Also, your possible worlds are not connected to each other. "I" "could" stand up right now because the version of me that stands up would share a common past with the other versions, namely my present and my past, but you haven't provided a common past between your possible worlds, so there is no question of the robots from different ones sharing an identity. As for picking out one robot from the others in the same world, you might equally wonder how we can refer to this or that inanimate object without confusing it with indistinguishable ones. The answer is the same: indexicals. We distinguish them by pointing at them. There is nothing special about the question of talking about ourselves. As for the Sleeping Beauty problem, it's a matter of simple conditional probability: "when you wake up, what's the probability of heads?" 1/3. If you had asked, "when you wake up, what's the probability of having woken up?" the answer would be one, even though you might never wake up. The fact that you are a (or "the") conscious observer doesn't matter. It works the same from the third person. If, instead of waking someone up, the scientist had planned to throw a rock on the ground, and the question were, "when the rock is thrown on the ground, what's the probability of the coin having come up heads?" it would all work out the same. The answer would still be 1/3.
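A rough way to check the 1/3 figure is to count events directly. The sketch below assumes the standard setup, which I haven't spelled out above: a fair coin, one event (an awakening, or a rock throw) on heads, and two events on tails. The function name and parameters are made up for illustration. It only counts events, so nothing in it depends on there being an observer.

```python
import random


def heads_fraction_among_events(trials, seed=0):
    """Fraction of events (awakenings or rock throws) that follow a heads flip.

    Assumed setup: fair coin, 1 event on heads, 2 events on tails.
    """
    rng = random.Random(seed)
    heads_events = 0
    total_events = 0
    for _ in range(trials):
        heads = rng.random() < 0.5
        if heads:
            # Heads: exactly one event happens in this run of the experiment.
            heads_events += 1
            total_events += 1
        else:
            # Tails: two events happen, neither of them under heads.
            total_events += 2
    return heads_events / total_events


if __name__ == "__main__":
    # With many trials this should land close to 1/3.
    print(heads_fraction_among_events(200_000))
```

Counting per event rather than per coin flip is the whole point: half the flips are heads, but tails flips contribute twice as many events.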

But you do live in a universe that is partly random: the universe of perceptions of a non-omniscient being.

By "independent" I don't mean bearing no relationship with each other whatsoever, but simply that pairs of instants that are closer to each other are not more correlated than those that are more distant. "But what does closer mean?" For you to entertain the hypothesis that life is an iid stream of sense data, you have to take the basic sense that "things are perceived by you one after another" at face value. "But a fundamental part of our experience of time is the higher correlation of closer instants. If this turned out to be an illusion, then shouldn't we dismiss the notion of real or objective time in its entirety?" Yes. For the version of us that is inside this thought experiment, we would have no way of accessing this thing called time (the real sequence of iid perception events), since even the memory of the past would be just a meaningless perception. However, as a brute fact of the thought experiment, it turns out that these meaningless perceptions do "come" in a sequence.

I mean, yeah, it depends, but I guess I worded my question poorly. You might notice I start by talking about the rationality of suicide. Likewise, I'm not really interested in what the AI will actually do, but in what it should rationally do given the reward structure of a simple RL environment like CartPole. And now you might say, "well, it's ambiguous what the right way is to generalize from the rewards of the simple game to the expected reward of actually being shut down in the real world," and that's my point. This is what I find so confusing, because then it seems that there can be no particular attitude for a human to have about their own destruction that's more rational than another. If the AGI is playing Pac-Man, for example, it might very well arrive at the notion that, if it is actually shut down in the real world, it will go to a Pac-Man heaven with infinite Pac-Man food pellet thingies and no ghosts, and this would be no more irrational than thinking of real destruction (as opposed to being hurt by a ghost inside the game, which gives a negative reward and ends the episode) as leading to a rewardless limbo for the rest of the episode, or leading to a Pac-Man hell of all-powerful ghosts that torture it endlessly without ending the episode, and so on. For an agent with preferences in terms of reinforcement-learning-style pleasure-like rewards, as opposed to a utility function over the state of the actual world, it seems that when it encounters the option of killing itself in the real world, and not just inside the game (by running into a ghost or whatever), and it tries to calculate the expected utility of its actual suicide in terms of in-game happy-feelies, it finds that it is free to believe anything. There's no right answer. The only way for there to be a right answer is if its preferences had something to say about the external world, where it actually exists. Such is the case for a human suicide when, for example, he laments that his family will miss him.
In this case, his preferences actually reach out through the "veil of appearance"* and say something about the external world, but, to the extent that he bases his decision on his expected future pleasure or pain, there's no right way to see it. Funnily enough, if he is a religious man and he is afraid of going to hell for killing himself, he is not incorrect. *Philosophy jargon
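To make the "free to believe anything" point concrete, here is a toy sketch, not a claim about how any real RL system handles shutdown. It computes an ordinary discounted return and then adds a made-up term, `post_shutdown_value`, for whatever the agent imagines follows a real-world shutdown. That parameter, the belief labels, and the numbers are all hypothetical. The in-game rewards are identical in every case; only the unconstrained assumption changes.

```python
def discounted_return(rewards, gamma=0.99, post_shutdown_value=0.0):
    """Discounted return of an episode that a real-world shutdown cuts off.

    `post_shutdown_value` is a hypothetical quantity: the value the agent
    assumes for everything after shutdown. No in-game reward determines it.
    """
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    # The assumed "afterlife" is discounted from the step where shutdown happens.
    total += (gamma ** len(rewards)) * post_shutdown_value
    return total


# The same observed in-game rewards under three equally unsupported beliefs.
observed_rewards = [1.0, 1.0, 1.0]
beliefs = {"limbo": 0.0, "heaven": 1000.0, "hell": -1000.0}
returns = {
    name: discounted_return(observed_rewards, post_shutdown_value=value)
    for name, value in beliefs.items()
}

if __name__ == "__main__":
    for name, value in returns.items():
        print(name, value)
```

The ranking of shutdown against continuing to play flips entirely depending on which value is plugged in, and nothing in the reward stream favors one choice over another.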

"If the survival of the AGI is part of the utility function"

If. By default, it isn't: https://www.lesswrong.com/posts/Z9K3enK5qPoteNBFz/confused-thoughts-on-ai-afterlife-seriously "What if we start designing very powerful boxes?" A very powerful box would be very useless. Either you leave enough of an opening for a human to be taught valuable information that only the AI knows, or you don't, and then it's useless. But if the AI can teach the human something useful, it can also persuade him to do something bad.
