Debate (AI safety technique)

Edited by Bird Concept, plex. Last updated 15th Jul 2022.

Debate is a proposed technique for allowing human evaluators to get correct and helpful answers from experts, even if the evaluators are not themselves experts or able to fully verify the answers.^[1] The technique was suggested as part of an approach to building advanced AI systems that are aligned with human values, and to safely applying machine learning techniques to problems that are high-stakes but not well-defined (such as advancing science or increasing a company's revenue).^[2]^[3]

^[1] https://www.lesswrong.com/posts/Br4xDbYu4Frwrb64a/writeup-progress-on-ai-safety-via-debate-1
^[2] https://ought.org/mission
^[3] https://openai.com/blog/debate/

Posts tagged Debate (AI safety technique)

105 · Writeup: Progress on AI Safety via Debate · Beth Barnes, paulfchristiano · 6y · 18 · 9
20 · Briefly thinking through some analogs of debate · 4y · 3 · 8
76 · A guide to Iterated Amplification & Debate · 6y · 15 · 4
146 · Debate update: Obfuscated arguments problem · 5y · 24 · 4
62 · An alignment safety case sketch based on debate · Marie_DB, Jacob Pfau, Benjamin Hilton, Geoffrey Irving · 1y · 21 · 4
49 · How should AI debate be judged? · abramdemski, paulfchristiano · 6y · 26 · 4
35 · Thoughts on AI Safety via Debate · 8y · 12 · 4
27 · AI Safety via Debate · 8y · 14 · 3
33 · Optimal play in human-judged Debate usually won't answer your question · 5y · 12 · 2
221 · An overview of 11 proposals for building safe advanced AI · 6y · 37 · 2
129 · My Overview of the AI Alignment Landscape: A Bird's Eye View · 4y · 9 · 2
121 · [New LW Feature] "Debates" · Ruby, RobertM, GPT-4, Claude+ · 3y · 35 · 2
107 · Imitative Generalisation (AKA 'Learning the Prior') · 5y · 15 · 2
77 · Why I'm excited about Debate · 5y · 12 · 2
75 · Why I’m not working on {debate, RRM, ELK, natural abstractions} · 3y · 19
