# Supervise Process, not Outcomes
We can think about machine learning systems on a spectrum from process-based to outcome-based:

* Process-based systems are built on human-understandable task decompositions, with direct supervision of reasoning steps.
* Outcome-based systems are built on end-to-end optimization, with supervision of final results.

This post explains why Ought is devoted to process-based systems. The argument is:

1. In the short term, process-based ML systems have better differential capabilities: they help us apply ML to tasks where we don’t have access to outcomes. These tasks include long-range forecasting, policy decisions, and theoretical research.
2. In the long term, process-based ML systems help avoid catastrophic outcomes from systems gaming outcome measures and are thus more aligned.
3. Both process- and outcome-based evaluation are attractors to varying degrees: once an architecture is entrenched, it’s hard to move away from it. This lock-in applies much more to outcome-based systems.
4. Whether the most powerful ML systems will primarily be process-based or outcome-based is up in the air.
5. So it’s crucial to push toward process-based training now.

There are almost no new ideas here. We’re reframing the well-known outer alignment difficulties for traditional deep learning architectures and contrasting them with compositional approaches. To the extent that there are new ideas, credit primarily goes to Paul Christiano and Jon Uesato.

We only describe our background worldview here. In a follow-up post, we explain why we’re building Elicit, the AI research assistant.

## The spectrum

### Supervising outcomes

Supervision of outcomes is what most people think about when they think about machine learning. Local components are optimized based on an overall feedback signal:

* SGD optimizes weights in a neural net to reduce its training loss
* Neural architecture search optimizes architectures and hyperparameters to have low validation loss
* Policy gradient optimizes p