Reinforcement Learning: A Non-Standard Introduction (Part 1) — LessWrong