human values are over the “true” values of the latents, not our estimates - e.g. I want other people to actually be happy, not just to look-to-me like they’re happy.


But this is not what our current value system is; we did not evolve such a pointer. Humans will be happy if their senses are deceived. The value system we currently have is over our estimates, and that is exactly why we can be manipulated. It is just that until now we have not had an intelligence adversarially trying to fool us. So the value function we need to imbibe is one we don't even h...

Where is the source for the quote? Is it by Douglas Hofstadter?