Reinforcement Learning from Human Feedback (RLHF) is a machine learning technique in which the model's training signal is derived from human evaluations of the model's outputs, rather than from labeled data or a ground-truth reward signal.
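In practice, those human evaluations are often collected as pairwise preferences and used to fit a reward model, which then stands in for a ground-truth reward when fine-tuning the policy. The sketch below illustrates that reward-modeling step under simplified assumptions: the `RewardModel` class, the feature dimensions, and the random stand-in data are all illustrative, not taken from any particular library or paper.

```python
# Minimal sketch of reward modeling from pairwise human preferences (assumed setup):
# for each prompt, a human marks one of two model outputs as "chosen", the other "rejected".
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Maps a fixed-size representation of (prompt, response) to a scalar reward."""
    def __init__(self, input_dim: int):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.scorer(features).squeeze(-1)  # one scalar reward per example

# Toy stand-ins for embeddings of the human-compared response pairs.
input_dim, batch_size = 64, 32
chosen = torch.randn(batch_size, input_dim)    # responses humans preferred
rejected = torch.randn(batch_size, input_dim)  # responses humans rejected

model = RewardModel(input_dim)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    r_chosen = model(chosen)
    r_rejected = model(rejected)
    # Bradley-Terry style pairwise loss: push the preferred response's
    # reward above the rejected response's reward.
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The trained reward model can then replace a hand-specified reward signal
# when optimizing the policy with an RL algorithm such as PPO.
```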