The Pointers Problem

Edited by Johannes C. Mayer, 1point7point4, et al. Last updated 18th Oct 2024.

Consider an agent with a model of the world, W. How does W relate to the real world, R? Suppose W contains a chair, W_chair. For W to be useful, it needs to map onto reality: there must be some function f that takes elements of the model to their real-world counterparts, so that f: W_chair ↦ R_chair.

The pointers problem is about figuring out f.

In the words of John (johnswentworth), who introduced the concept:

What functions of what variables (if any) in the environment and/or another world-model correspond to the latent variables in the agent’s world-model?

This relates to alignment because we would like an AI that acts on real-world human values, not merely on humans' estimates of their own values. The two will differ in many situations, since humans are neither all-seeing nor all-knowing. We would therefore like to figure out how to point to our values directly.
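
The gap between values and estimates of values can be made concrete with a toy sketch. Everything here is a hypothetical illustration rather than anything from the original posts: the `World` fields, the three actions, and the effort costs are all invented for the example. In particular, the sketch assumes we can simply write down `true_value` as a function of the real world, which is exactly the step the pointers problem says we do not know how to do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    """Toy stand-in for the real world R (hypothetical fields)."""
    room_is_clean: bool       # the real-world referent the human cares about
    camera_shows_clean: bool  # the only thing the human can observe


def true_value(world: World) -> int:
    """Value computed from the real referent (assumed available, for illustration)."""
    return 1 if world.room_is_clean else 0


def estimated_value(world: World) -> int:
    """The human's estimate of their own value, computed from observations only."""
    return 1 if world.camera_shows_clean else 0


# Hypothetical actions and the world each one produces.
ACTIONS = {
    "clean_room": World(room_is_clean=True, camera_shows_clean=True),
    "cover_camera_with_photo": World(room_is_clean=False, camera_shows_clean=True),
    "do_nothing": World(room_is_clean=False, camera_shows_clean=False),
}

# Hypothetical effort costs: faking the observation is cheaper than cleaning.
COSTS = {"clean_room": 0.5, "cover_camera_with_photo": 0.1, "do_nothing": 0.0}


def best_action(value_fn) -> str:
    """Pick the action that maximizes value minus effort cost."""
    return max(ACTIONS, key=lambda name: value_fn(ACTIONS[name]) - COSTS[name])


print(best_action(estimated_value))  # optimizes the human's estimate
print(best_action(true_value))       # optimizes the real-world referent
```

In this toy setup, an agent that optimizes the estimate prefers to fake the observation, while an agent that optimizes the real referent cleans the room. The two objectives agree on every world where the camera is accurate and come apart only where it is not, which is why pointing at the estimate is not a substitute for pointing at the value itself.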

Posts tagged The Pointers Problem
61 karma · The Pointers Problem: Clarifications/Variations · abramdemski · 5y · 16 comments
129 karma · The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables · johnswentworth · 5y · 50 comments
72 karma · Don't design agents which exploit adversarial inputs · TurnTrout, Garrett Baker · 3y · 64 comments
116 karma · Robust Delegation · abramdemski, Scott Garrabrant · 7y · 10 comments
62 karma · Alignment allows "nonrobust" decision-influences and doesn't require robust grading · TurnTrout · 3y · 41 comments
48 karma · Don't align agents to evaluations of plans · TurnTrout · 3y · 49 comments
46 karma · [Intro to brain-like-AGI safety] 9. Takeaways from neuro 2/2: On AGI motivation · Steven Byrnes · 3y · 11 comments
41 karma · The Pointer Resolution Problem · Jozdien · 2y · 20 comments
33 karma · People care about each other even though they have imperfect motivational pointers? · TurnTrout · 3y · 25 comments
24 karma · Stable Pointers to Value: An Agent Embedded in Its Own Utility Function · abramdemski · 8y · 9 comments
20 karma · Stable Pointers to Value III: Recursive Quantilization · abramdemski · 7y · 4 comments
19 karma · Stable Pointers to Value II: Environmental Goals · abramdemski · 8y · 3 comments
41 karma · Updating Utility Functions · JustinShovelain, Joar Skalse · 3y · 6 comments
39 karma · Human sexuality as an interesting case study of alignment · beren · 3y · 26 comments
16 karma · Half-baked idea: a straightforward method for learning environmental goals? · Q Home · 7mo · 7 comments