Reinforcement learning and linguistic convention — LessWrong