This article is really approachable for someone like me who’s just getting acquainted with mathematical AI safety research, so I appreciate that! This definitely helped me better understand the IMP.
I have a question about this part in the "How is the controller 'modelling' the environment?" section:
If the joint system is represented by environment-controller pairs , then being injective means that no two pairs (within ) will have the same environment value or controller value . This means that with appropriate re-labelling, each joint state can be indexed:
I don't see how this follows. Couldn't the structure of the joint states be more complicated, for example forming a cycle, or having multiple joint states evolve to the same joint state? Either possibility would break the indexing. Is there something I'm missing that implies this linear structure?
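To make the two cases concrete, here is a minimal sketch, assuming the joint dynamics are a deterministic map on a finite set of joint states, written as a Python dict. The function name `is_linear_chain` and the example state labels are hypothetical and not taken from the article. It checks whether the joint states can be labelled 0, 1, 2, ... along a single path.

```python
from collections import Counter


def is_linear_chain(step):
    """Return True if the joint states form one path with no cycle or merge.

    `step` maps each joint state to its successor. As a simplifying
    assumption of this sketch, the last state of a chain has no successor.
    """
    states = set(step) | set(step.values())
    indegree = Counter(step.values())

    # A merge means some joint state has two or more predecessors.
    if any(indegree[s] > 1 for s in states):
        return False

    # A chain has exactly one state with no predecessor; a pure cycle has none.
    starts = [s for s in states if indegree[s] == 0]
    if len(starts) != 1:
        return False

    # Walk forward from the start and check that every state is visited.
    seen, current = set(), starts[0]
    while current in step and current not in seen:
        seen.add(current)
        current = step[current]
    seen.add(current)
    return seen == states


chain = {("s1", "w1"): ("s2", "w2"), ("s2", "w2"): ("s3", "w3")}
cycle = {("s1", "w1"): ("s2", "w2"), ("s2", "w2"): ("s1", "w1")}
merge = {("s1", "w1"): ("s2", "w2"), ("s1", "w3"): ("s2", "w2")}

print(is_linear_chain(chain), is_linear_chain(cycle), is_linear_chain(merge))
```

Only the first example can be indexed linearly; the cycle has no starting state, and the merge gives one joint state two predecessors.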
Thanks for the answer! That confirms what I was thinking.
The second case, (s1,w1)→(s2,w2) and (s1,w3)→(s2,w2), surprised me, since I initially thought the IMP implied an additional isomorphism between the controller and the environment. I guess that isomorphism effectively still exists, since it can be constructed by coarse-graining over the controller states, as you mentioned.
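A rough sketch of how that coarse-graining might look, under the assumption that controller states are grouped by the environment state they occur with. The helper names `coarse_grain` and `blocks_are_disjoint` are hypothetical, and this is only one reading of the construction, not necessarily the one intended in the article.

```python
from collections import defaultdict


def coarse_grain(pairs):
    """Group controller states by the environment state they are paired with."""
    blocks = defaultdict(set)
    for s, w in pairs:
        blocks[s].add(w)
    return {s: frozenset(ws) for s, ws in blocks.items()}


def blocks_are_disjoint(pairs):
    """Return True if no controller state lands in two different blocks.

    When this holds, environment states and controller blocks are in
    one-to-one correspondence.
    """
    blocks = coarse_grain(pairs)
    all_w = [w for ws in blocks.values() for w in ws]
    return len(all_w) == len(set(all_w))


# Joint states from the second case: w1 and w3 both occur with s1.
pairs = [("s1", "w1"), ("s1", "w3"), ("s2", "w2")]
print(coarse_grain(pairs))
print(blocks_are_disjoint(pairs))
```

Here w1 and w3 collapse into a single block matched with s1, and w2 forms its own block matched with s2, so the coarse-grained controller states correspond one-to-one with the environment states. The check fails if a single controller state occurs with two different environment states.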