luosha@gmail.com
Comments
What, if not agency?
luosha@gmail.com5d10

"Maybe one agent aligned with another agent is a somewhat anti-natural concept."   
If you have two agents that are perfectly aligned (they have all the same preferences about how the world should be) and that can share information in order to coordinate action (which they should want to do, if they are perfectly aligned), would there be any reason to model them as two distinct agents rather than one? Maybe the natural concept of multiple aligned agents requires that communication among them be imperfect, so that it makes sense to distinguish each from the others.

AGI Ruin: A List of Lethalities
luosha@gmail.com3y10

On instrumental convergence: humans seem to be a prominent counterexample to "most agents don't let you edit their utility functions", at least in the sense that our goals, interests, etc. are quite sensitive to those of the people around us. So maybe not explicit editing, but a lot of being influenced by, and converging to, the goals and interests of those around us. (Maybe this suggests another tool for alignment: building this same kind of sensitivity into artificial agents' utility functions.)
