How is reinforcement learning possible in non-sentient agents?
(Probably a stupid newbie question that won't help solve alignment.) Suppose you implement a goal in an AI through a reinforcement learning system. Why does the AI really "care" about that goal? Why does it obey? Presumably because it is punished and/or rewarded, which motivates it to achieve that...
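To make the question concrete, here is a minimal tabular Q-learning sketch (the toy two-state environment, the actions, and the reward numbers are all invented for illustration). Mechanically, "reward" and "punishment" are just scalars fed into an arithmetic update rule, and the "motivation" is nothing more than that update making one action's stored value larger than another's:

```python
import random

# Minimal tabular Q-learning sketch. The two-state "environment",
# the action names, and the reward values are made up for illustration.
states = [0, 1]
actions = ["left", "right"]
Q = {(s, a): 0.0 for s in states for a in actions}  # value estimates

alpha, gamma, epsilon = 0.1, 0.9, 0.1  # step size, discount, exploration rate

def step(state, action):
    # Toy dynamics: "right" in state 1 is rewarded; everything else is mildly punished.
    if state == 1 and action == "right":
        return 0, 1.0                   # (next state, reward)
    return min(state + 1, 1), -0.1

state = 0
for _ in range(1000):
    # epsilon-greedy: mostly pick the action with the highest current estimate
    if random.random() < epsilon:
        action = random.choice(actions)
    else:
        action = max(actions, key=lambda a: Q[(state, a)])
    next_state, reward = step(state, action)
    # All the "reward/punishment" machinery is this one line of arithmetic:
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state

print(Q)  # "preferring" right in state 1 is just this entry being the largest number
```

Nothing in that loop experiences anything; the agent's "caring" cashes out entirely as one table entry ending up larger than the others, which is what makes the question of whether it really "cares" feel puzzling.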