LESSWRONG
Tom DAVID

Comments
Investing in Robust Safety Mechanisms is critical for reducing Systemic Risks
Tom DAVID · 9mo · 20
  • Your first two bullet points are accurate; it would indeed be worthwhile to address these points further.
  • Finally, regarding your last bullet point, I agree. We do not currently know whether it is possible to develop such safeguards, and even if it were, doing so would require time and further research. I fully agree that this should be made more explicit!

A list of core AI safety problems and how I hope to solve them
Tom DAVID · 2y · 30

"Instead of building in a shutdown button, build in a shutdown timer."

-> Isn't that a form of corrigibility with an added constraint? I'm not sure what would prevent the system from convincing humans that respecting the timer is a bad thing, for example. Is it because we will formally verify that we avoid instances of deception? It's not clear to me, but maybe I've misunderstood.

Posts

2 · A Systematic Approach to AI Risk Analysis Through Cognitive Capabilities · 8mo · 0
8 · Investing in Robust Safety Mechanisms is critical for reducing Systemic Risks · 9mo · 3
30 · Call for evaluators: Participate in the European AI Office workshop on general-purpose AI models and systemic risks · 10mo · 0
3 · Workshop Report: Why current benchmarks approaches are not sufficient for safety? · 10mo · 1