Wiki Contributions


Mathematics are so firmly grounded in the physical reality that when observations don't line up with what our math tells us, we must change our understanding of reality, not of math. This is because math is inextricably tied to reality, not because it is separate from it.

On the other hand...

I could add: Objective punishments and rewards need objective justification.

From my perspective, treating rationality as always instrumental, and never a terminal value is playing around with it's traditional meaning. (And indiscriminately teaching instrumental rationality is like indiscriminately handing out weapons. The traditional idea, going back to st least Plato, is that teaching someone to be rational improves them...changes their values)

I am aware that humans hav a non zero level of life threatening behaviour. If we wanted it to be lower, we could make it lower, at the expense of various costs. We don't which seems to mean we are happy with the current cost benefit ratio. Arguing, as you have, that the risk of AI self harm can't be reduced to zero doesn't mean we can't hit an actuarial optimum.

It is not clear to me why you think safety training would limit intelligence.

Regarding the anvil problem: you have argued with great thoroughness that one can't perfectly prevent an AIXI from dropping an anvil on its head. However, I can't see the necessity. We would need to get the probability of a dangerously unfriendly SAI as close to zero as possible, because it poses an existential threat. However, a suicidally foolish AIXI is only a waste of money.

Humans have a negative reinforcement channel relating to bodily harm called pain. It isn't perfect, but it's good enough to train most humans to avoid doing suicidal stupid things. Why would an AIXI need anything better? Yout might want to answer that there is some danger related to an AIXI s intelligence, but it's clock speed, or whatever, could be throttle, during training.

Also any seriously intelligent .AI made with the technology of today, or the near future, is going to require a huge farm of servers. The only way it could physically interact with the world is through remote controlled body...and if drops an anvil on that, it actually will survive as a mind!

An entity that has contradictory beliefs will be a poor instrumental rationalist. It looks like you would need to engineer a distinction between instrumental beliefs and terminal beliefs. While we're on the subject, you might need a firewall to stop an .AI acting on intrinsically motivating ideas, if they exist. In any case, orthogonality is an architecture choice, not an ineluctable fact about minds.

The OT has multiple forms, as Armstrong notes. An OT that says you could make arbitrary combinations of preference and power if you really wanted to, can't plug into an argument that future .AI will ,with high probability, be a Lovecraftian horror, at least not unless you also aargue that an orthogonal architecture will be chosen, with high probability.

something previously deemed "impossible"

It's clearly possible for some values of "gatekeeper", since some people fall for 419 scams. The test is a bit meaningless without information about the gatekeepers

The problem is that I don't see much evidence that Mr. Loosemore is correct. I can quite easily conceive of a superhuman intelligence that was built with the specification of "human pleasure = brain dopamine levels", not least of all because there are people who'd want to be wireheads and there's a massive amount of physiological research showing human pleasure to be caused by dopamine levels.

I don't think Loosemore was addressing deliberately unfriendly AI, and for that matter EY hasn't been either. Both are addressing intentionally friendly or neutral AI that goes wrong.

I can quite easily conceive of a superhuman intelligence that knows humans prefer more complicated enjoyment, and even do complex modeling of how it would have to manipulate people away from those more complicated enjoyments, and still have that superhuman intelligence not care.

Wouldn't it care about getting things right?

Trying to think this out in terms of levels of smartness alone is very unlikely to be helpful.

Load More