The section "The Future": > As we enter 2025, we will have to become more than a lab and a startup — we have to become an enduring company. The Board’s objectives as it considers, in consultation with outside legal and financial advisors, how to best structure OpenAI to advance...
Introduction

The AI safety via debate agenda, proposed by Irving et al. (2018), explores using debates between AI agents to elicit truthful answers from advanced systems. Recently, three key debate settings have been studied with LLMs:

* Information-asymmetric debates: Debaters have access to information unavailable to the judge. This hidden information...
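To make the protocol concrete, here is a minimal sketch of an information-asymmetric debate loop. It is an illustration only, not code from the post: `llm` is a hypothetical stand-in for any text-completion call, and the prompts are invented for the example.

```python
# Information-asymmetric debate: both debaters condition on a hidden
# passage; the judge sees only the transcript of their arguments.
def llm(prompt: str) -> str:
    """Hypothetical language model call; plug in a real API here."""
    raise NotImplementedError

def debate(question: str, hidden_passage: str, rounds: int = 2) -> str:
    transcript: list[str] = []
    for _ in range(rounds):
        for side in ("A", "B"):
            # Debaters see the hidden information; the judge never will.
            prompt = (
                f"Passage (hidden from the judge): {hidden_passage}\n"
                f"Question: {question}\n"
                "Transcript so far:\n"
                + "\n".join(transcript)
                + f"\nYou are debater {side}. Make your strongest argument."
            )
            transcript.append(f"Debater {side}: {llm(prompt)}")
    # The judge decides from the arguments alone, not the passage.
    judge_prompt = (
        f"Question: {question}\n"
        "Debate transcript:\n"
        + "\n".join(transcript)
        + "\nBased only on the transcript, which debater is right, A or B?"
    )
    return llm(judge_prompt)
```

The asymmetry is the point: if the judge can reliably pick the truthful debater from the arguments alone, debate lets weaker overseers supervise systems that know more than they do.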
Some approaches to AI alignment incur upfront costs to the creator (an “alignment tax”). In this post, I discuss “alignment windfalls”: strategies that advance the long-term public good while also reaping short-term benefits for a company. My argument, in short:

1. Just as there...
We want to advance process-based supervision for language models. To make it easier for others to contribute to that goal, we're sharing code for writing compositional language model programs, and a tutorial that explains how to get started:

* The Interactive Composition Explorer (ICE) is a library for writing and...
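For a taste of what an ICE program looks like, here is a minimal recipe in the style of the accompanying tutorial. It is a sketch written against the `recipe.main` / `recipe.agent().complete` interface as documented in the tutorial, so exact names and signatures may differ in the current library.

```python
# A minimal ICE recipe: ask the configured language model agent a
# question and return its completion.
# Sketch based on the ICE tutorial; API details may have changed.
from ice.recipe import recipe

async def answer(question: str = "Why is the sky blue?") -> str:
    # recipe.agent() returns the configured agent (a language model or a
    # human in the loop); complete() sends the prompt and awaits the reply.
    return await recipe.agent().complete(prompt=question)

recipe.main(answer)
```

Because recipes are ordinary async Python functions, they compose: one recipe can call another to break a question into subquestions, which is what makes the reasoning process visible and supervisable step by step.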
Ought will host a factored cognition “Lab Meeting” on Friday, September 16, from 9:30 to 10:30 AM PT. We'll share the progress we've made using language models to decompose reasoning tasks into subtasks that are easier to perform and evaluate. This is part of our work on supervising process, not outcomes....
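As a toy illustration of the kind of decomposition discussed in the meeting, here is a sketch of a factored-cognition loop. `complete` is a hypothetical stand-in for a language model call, and the prompts are invented for the example.

```python
# Toy factored cognition: split a question into subquestions, answer
# each in isolation, then combine. Each subtask is small enough to
# perform and evaluate on its own, which is what enables supervising
# the process rather than only the final outcome.
def complete(prompt: str) -> str:
    """Hypothetical language model call; plug in a real API here."""
    raise NotImplementedError

def answer_by_decomposition(question: str) -> str:
    subquestions = complete(
        f"List, one per line, the subquestions needed to answer: {question}"
    ).splitlines()
    # Answer each subquestion independently; each step can be checked.
    subanswers = [complete(f"Answer concisely: {sq}") for sq in subquestions]
    context = "\n".join(
        f"Q: {sq}\nA: {sa}" for sq, sa in zip(subquestions, subanswers)
    )
    return complete(f"{context}\nGiven these answers, answer: {question}")
```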
Can AI systems substantially help with alignment research before transformative AI? People disagree. Ought is collecting a dataset of alignment research tasks so that we can:

1. Make progress on the disagreement
2. Guide AI research towards helping with alignment

We’re offering a prize of $200–$2,000 for each contribution to...
Ought is an applied machine learning lab. We’re building Elicit, the AI research assistant. Our mission is to automate and scale open-ended reasoning. To get there, we train language models by supervising reasoning processes, not outcomes. This is better for reasoning capabilities in the short run and better for alignment...