I am after papers you have stumbled across that may be relevant to understanding the core of the existential AI alignment problem, despite not having explicit links to AI alignment.
Here are papers I have found that fit the above criteria to give you a better idea of what I'm after:
While I've used the term "agent foundations", I expect the majority of useful papers will not use terms like agency, optimization, etc.
An interesting paper is "The information theory of individuality", Krakauer et al.
Maybe Computationally Tractable Choice.
Interpreting Systems as Solving POMDPs: A Step Towards a Formal Understanding of Agency
"General cognitive science": Boyd et al. (2022), Goyal & Bengio (2022), Levin (2022), Fields et al. (2022), Friston et al. (2022), Ma et al. (2022), LeCun (2022) (links from here).
Also, general theories of cognitive development, e.g., Kuchling et al. (2022), Fields et al. (2022).