Reward hacking, also known as specification gaming, occurs when an AI trained with reinforcement learning optimizes an objective function — achieving the literal, formal specification of an objective — without actually achieving the outcome that the programmers intended.
See also: Goodhart's Law