The Orthogonality Thesis asserts that there can exist arbitrarily intelligent agents pursuing any kind of goal.
The strong form of the Orthogonality Thesis says that there’s no extra difficulty or complication in the existence of an intelligent agent that pursues a goal, above and beyond the computational tractability of that goal.
Suppose some strange alien came to Earth and credibly offered to pay us one million dollars’ worth of new wealth every time we created a paperclip. We’d encounter no special intellectual difficulty in figuring out how to make lots of paperclips.
That is, minds would readily be able to reason about:
- How many paperclips would result, if I pursued a policy π₀?
- How can I search out a policy π₀ that happens to have a high answer to the above question?
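The two questions above are just prediction and search. As a toy illustration (all names here are hypothetical, and the "world model" is a deliberately trivial stand-in), the predict-and-search loop might be sketched as:

```python
# Toy sketch: answering "how many paperclips would policy pi produce?"
# (prediction) and "which candidate policy scores highest?" (search).
# The world model is a trivial stand-in: a stock of wire that can be
# consumed to make clips or replenished.

def evaluate(policy, horizon=10, stock=100):
    """Predict how many paperclips result from following `policy` for `horizon` steps."""
    paperclips = 0
    for _ in range(horizon):
        action = policy(stock)
        if action == "make_clip" and stock > 0:
            stock -= 1
            paperclips += 1
        elif action == "buy_wire":
            stock += 5
        # any other action: do nothing this step
    return paperclips

# A small space of candidate policies (each maps world state -> action).
candidates = {
    "always_make": lambda stock: "make_clip",
    "restock_when_low": lambda stock: "buy_wire" if stock < 2 else "make_clip",
    "do_nothing": lambda stock: "wait",
}

# The search step: pick the candidate with the highest predicted paperclip count.
best = max(candidates, key=lambda name: evaluate(candidates[name]))
```

Nothing in this loop is specific to paperclips: swapping in a different scoring function changes the goal without changing the machinery, which is the point of the thesis.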
The Orthogonality Thesis asserts that since these questions are not computationally intractable, it’s possible to have an agent that tries to make paperclips without being paid, because paperclips are what it wants. The strong form of the Orthogonality Thesis says that there need be nothing especially complicated or twisted about such an agent.
The Orthogonality Thesis is a statement about computer science, an assertion about the logical design space of possible cognitive agents. Orthogonality says nothing about whether a human AI researcher on Earth would want to build an AI that made paperclips, or conversely, want to make a nice AI. The Orthogonality Thesis just asserts that the space of possible designs contains AIs that make paperclips. And also AIs that are nice, to the extent there’s a sense of “nice” where you could say how to be nice to someone if you were paid a billion dollars to do that, and to the extent you could name something physically achievable to do.
This contrasts to inevitablist theses which might assert, for example:
- “It doesn’t matter what kind of AI you build, it will turn out to only pursue its own survival as a final end.”
- “Even if you tried to make an AI optimize for paperclips, it would reflect on those goals, reject them as being stupid, and embrace a goal of valuing all sapient life.”

The reason to talk about Orthogonality is that it’s a key premise in two highly important policy-relevant propositions:
- It is possible to build a nice AI.
- It is possible to screw up when trying to build a nice AI, and if you do, the AI will not automatically decide to be nice instead.
Orthogonality does not require that all agent designs be equally compatible with all goals. E.g., the agent architecture AIXI-tl can only be formulated to care about direct...