Can we learn much by studying the behaviour of RL policies?
Economists sometimes study revealed preferences: preferences that can be inferred from choices. For example, if I am given the choice between an apple and an orange and I choose the apple, I have revealed a preference for apples over oranges. I'm wondering about the revealed preferences of RL...
May 15, 2023