Consider an agent with a model of the world W. How does W relate to the real world R? W might contain a chair. For W to be useful, it needs to map to reality, i.e. there is a function f with W_chair ↦ R_chair.
The pointers problem is about figuring out f.
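As a toy illustration of what f would have to do, here is a minimal sketch. It assumes, unrealistically, that f can be written down as a lookup table. All names in it (`RealObject`, `reality`, `world_model`, `referent`, the object ids) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RealObject:
    """Hypothetical stand-in for something that exists in reality R."""
    kind: str
    position: tuple


# Toy reality R: the objects that actually exist in the environment.
reality = {
    "obj_17": RealObject(kind="chair", position=(2, 3)),
    "obj_42": RealObject(kind="table", position=(5, 1)),
}

# Toy world-model W: the agent's own variables, under its own labels.
world_model = {
    "W_chair": {"believed_position": (2, 3)},
    "W_unicorn": {"believed_position": (9, 9)},
}

# The pointer function f, written as an explicit table (an assumption).
# W_unicorn has no entry: not every model variable has a real referent.
f = {"W_chair": "obj_17"}


def referent(model_variable):
    """Return the real object a model variable points to, or None."""
    key = f.get(model_variable)
    return reality.get(key) if key is not None else None


chair = referent("W_chair")      # the real chair
unicorn = referent("W_unicorn")  # None: nothing in R corresponds to it
```

In this sketch the table `f` is simply given. The pointers problem is that, for a real agent, nobody hands us this table, and some variables in W may have no referent at all.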
In John's words (who introduced the concept here):
What functions of what variables (if any) in the environment and/or another world-model correspond to the latent variables in the agent’s world-model?
This relates to alignment, as we would like an AI that acts based on real-world human values, not just human estimates of their own values. The two will differ in many situations, since humans are neither all-seeing nor all-knowing. Therefore we'd like to figure out how to point to our values directly.
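The gap between real values and estimated values can be sketched with a toy example. It assumes a hypothetical human who values a friend actually being happy but can only observe whether the friend looks happy. The two actions, the outcome tables, and the function names are all made up for illustration.

```python
# What actually happens under each action (hidden from the human).
true_outcome = {
    "action_a": {"friend_is_happy": True},
    "action_b": {"friend_is_happy": False},
}

# What the human can observe under each action.
observed_outcome = {
    "action_a": {"friend_looks_happy": False},
    "action_b": {"friend_looks_happy": True},
}


def real_value(action):
    """Value according to the state of the real world."""
    return 1.0 if true_outcome[action]["friend_is_happy"] else 0.0


def estimated_value(action):
    """Value according to the human's estimate from observations."""
    return 1.0 if observed_outcome[action]["friend_looks_happy"] else 0.0


actions = ["action_a", "action_b"]

# An AI optimizing the human's estimate picks a different action
# than an AI optimizing what the human actually cares about.
best_by_estimate = max(actions, key=estimated_value)
best_by_real_value = max(actions, key=real_value)
```

Here the estimate-optimizing choice is `action_b`, while the choice that serves what the human actually values is `action_a`. Pointing to values directly would mean targeting `real_value`, which requires knowing what "friend is happy" in the human's world-model refers to in reality.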