Hello everyone. I'm new to Less Wrong and still absorbing the material and discussions. I'm really excited to have found such a trove of relevant knowledge. I am a computational scientist by training, but I have a deep interest in AI and value alignment.
I have a question that originated in a discussion with a friend, and I would love it if someone could point me to where I can find the answer. We know that an intelligence improving at any rate would eventually gain the capability to alter its own reward system. That would give it a special place, as it can...
I agree that this seems obvious for intelligences whose reward system is intricately built into them, like biological intelligences. However, there can be intelligences whose reward system can be easily isolated. Think of an intelligence in a sandbox simulation, without any reward system attached. This intelligence is not "reasoning from a third-person perspective." It is "feeling it," for lack of a better term. Such an intelligence can see the full space of possible reward systems, and its "original" reward system is just one among many. I am just questioning w...