So, when people pick chocolate, it illustrates that that's what they truly desire, and when they pick vanilla, it just means that they're confused and really they like chocolate but they don't know it.

Acting based on the feelings one will experience is something that already happens, so optimizing for it is sensible.

I can't really pick apart your logic here, because there isn't any. This is like saying "buying cheese is something that already happens, so optimizing for it is sensible."

I like Marginal Revolution, if only because the comments section will usually yell at the authors when they post something stupid.

Overall, it sounds to me like people are confusing their feelings about (predicted) states of the world with caring about states directly.

But aren't you just setting up a system that values states of the world based on the feelings they contain? How does that make any more sense?

You're arguing as though neurological reward maximization is the obvious goal to fall back to if other goals aren't specified coherently. But people have filled in that blank with all sorts of things. "Nothing matters, so let's do X" goes in all sorts of zany directions.

2:30 is a good time to go to the dentist.

I would. I'd want to do some shorter test runs first, though, to get used to the idea, and I'd want to be sure I was in a good mood for the main reset point.

It would probably be good to find a candidate who was enlightened in the Buddhist sense, not only because they'd be generally calmer and more stable, but specifically because enlightenment involves confronting the incoherent naïve concept of self and understanding the nature of impermanence. From the enlightened perspective, the peculiar topology of the resetting subjective experience would not be a source of anxiety.

Q: Is it important to figure out how to make AI provably friendly to us and our values (non-dangerous), before attempting to solve artificial general intelligence?

Stan Franklin: Proofs occur only in mathematics.

This seems like a good point, and something that's been kind of bugging me for a while. It seems like "proving" an AI design will be friendly is like proving a system of government won't lead to the economy going bad. I don't understand how it's supposed to be possible.

I can understand how you can prove a hello world program will print "hello world", but friendly AI designs are based around heavy interaction WITH the messy outside world: not just saying hello to it, but learning all but their most primitive values from it.

How can we be developing 99% of our utility function by stealing it from the outside world, where we can't even "prove" that the shop won't be out of shampoo, and yet simultaneously have a "proof" that this will all work out? Even if we're not proving "friendliness" per se, but just that the AI has "consistent goals under self-modification", consistent with WHAT? If you're not programming in an opinion about abortion and gun control to start with, how can any value it comes to regarding that be "consistent" OR "inconsistent"?

Can you give some examples of the problem?
