How much information does an optimal policy contain about its environment?
This post is an informal explainer of our paper, which can be found on arXiv. This work was funded by the Advanced Research + Invention Agency (ARIA) Safeguarded AI Programme through project code MSAI-SE01-P005.

Introduction

There is an intuition that a powerful agent might have to contain some kind of world model as part of its structure in order to achieve its goals[1]. Part of this intuition comes from the fact that the actions taken by a successful agent often contain information about the world. Since the agent's actions are generated by its internal structure, the information present in those actions must correspond to something happening inside the agent. In this way, external behaviour can give us some insight into the internal structure of an agent.

We are interested in quantifying the information contained in the actions of an agent, since it might provide a way to bridge the gap between observed behaviour and internal structure. One way to operationalize this question is to ask: 'if I lack knowledge about the environment but can observe an optimal agent's actions, how much information do I gain about the environment?'

Imagine a mouse who has learned that yellow boxes always contain cheese and that blue boxes are always empty. If we observe that the mouse always chews through a yellow box and ignores a blue box, then the mouse's actions contain information about the true state of the world. If you didn't know which colour of box contained cheese, but you did know that the mouse always successfully finds the cheese, observing the mouse's actions would give you information about the world (i.e. the location of the cheese).

In this sense, we can say that the actions of an optimal agent (the mouse) contain information about the world (the location of the cheese). From this claim, we might want to ask 'for what kind of goals is this true?' and 'how much information do the actions of the mouse contain about thi