LESSWRONGTags
LW

Tripwire

EditHistorySubscribe
Discussion (0)
Help improve this page (2 flags)
EditHistorySubscribe
Discussion (0)
Help improve this page (2 flags)
Tripwire
Random Tag
Contributors
1plex

In AI safety, a tripwire is a mechanism designed to detect signs of misalignment in an advanced artificial intelligence and shut it down automatically.

Posts tagged Tripwire
Most Relevant
1
24Implications of Quantum Computing for Artificial Intelligence Alignment ResearchΩ
Jsevillamol, PabloAMC
4y
Ω
3
1
17Any work on honeypots (to detect treacherous turn attempts)?Q
David Scott Krueger (formerly: capybaralet)
2y
Q
4
1
14Superintelligence 13: Capability control methods
KatjaGrace
8y
48
1
3Corrigibility thoughts II: the robot operator
Stuart_Armstrong
6y
2
1
3Corrigibility thoughts III: manipulating versus deceiving
Stuart_Armstrong
6y
0
1
2Corrigibility thoughts I: caring about multiple thingsΩ
Stuart_Armstrong
6y
Ω
0
Add Posts