LESSWRONG

Debate (AI safety technique)

Edited by Bird Concept, plex. Last updated 15th Jul 2022.

Debate is a proposed technique for allowing human evaluators to get correct and helpful answers from experts, even if the evaluators are not themselves experts or cannot fully verify the answers.[1] The technique was suggested as part of an approach to building advanced AI systems that are aligned with human values, and to safely applying machine learning techniques to problems that are high-stakes but not well-defined (such as advancing science or increasing a company's revenue).[2][3]

  1. https://www.lesswrong.com/posts/Br4xDbYu4Frwrb64a/writeup-progress-on-ai-safety-via-debate-1
  2. https://ought.org/mission
  3. https://openai.com/blog/debate/

Posts tagged Debate (AI safety technique)
Writeup: Progress on AI Safety via Debate, by Beth Barnes, paulfchristiano (103 karma, 5y, 18 comments, Ω)
Briefly thinking through some analogs of debate, by Eli Tyre (20 karma, 3y, 3 comments)
A guide to Iterated Amplification & Debate, by Rafael Harth (75 karma, 5y, 12 comments, Ω)
Debate update: Obfuscated arguments problem, by Beth Barnes (138 karma, 5y, 24 comments, Ω)
An alignment safety case sketch based on debate, by Marie_DB, Jacob Pfau, Benjamin Hilton, Geoffrey Irving (57 karma, 4mo, 21 comments, Ω)
How should AI debate be judged?, by abramdemski, paulfchristiano (49 karma, 5y, 26 comments, QΩ)
Thoughts on AI Safety via Debate, by Vaniver (35 karma, 7y, 12 comments)
AI Safety via Debate, by ESRogs (27 karma, 7y, 14 comments, Ω)
Optimal play in human-judged Debate usually won't answer your question, by Joe Collman (33 karma, 5y, 12 comments, Ω)
An overview of 11 proposals for building safe advanced AI, by evhub (220 karma, 5y, 37 comments, Ω)
My Overview of the AI Alignment Landscape: A Bird's Eye View, by Neel Nanda (127 karma, 4y, 9 comments, Ω)
[New LW Feature] "Debates", by Ruby, RobertM, GPT-4, Claude+ (121 karma, 2y, 35 comments)
Imitative Generalisation (AKA 'Learning the Prior'), by Beth Barnes (107 karma, 5y, 15 comments, Ω)
Why I'm excited about Debate, by Richard_Ngo (75 karma, 5y, 12 comments, Ω)
Why I’m not working on {debate, RRM, ELK, natural abstractions}, by Steven Byrnes (74 karma, 3y, 19 comments, Ω)
(15 of 84 tagged posts shown)