I am after papers that you have stumbled across that may be relevant to the core of understanding the existential AI alignment problem, despite not having explicit links to AI alignment.

Here are papers I have found that fit the above criteria to give you a better idea of what I'm after:

While I've used the term "agent foundations", I expect the majority of useful papers will not use terms like agency, optimization, etc.
