Goodhart in RL with KL: Appendix — LessWrong