Posts

2 · A Systematic Approach to AI Risk Analysis Through Cognitive Capabilities · 10mo

8 · Investing in Robust Safety Mechanisms is critical for reducing Systemic Risks · 11mo

30 · Call for evaluators: Participate in the European AI Office workshop on general-purpose AI models and systemic risks

3 · Workshop Report: Why current benchmarks approaches are not sufficient for safety?

Comments

Investing in Robust Safety Mechanisms is critical for reducing Systemic Risks

Tom DAVID · 11mo · 20

Your first two bullet points are accurate, and it would indeed be worth addressing them further.
Finally, I agree with your last bullet point. We do not currently know whether it is possible to develop such safeguards, and even if it is, doing so would require time and further research. I fully agree that this should be made more explicit!

A list of core AI safety problems and how I hope to solve them

Tom DAVID · 2y · 30

"Instead of building in a shutdown button, build in a shutdown timer."

-> Isn't that a form of corrigibility with an added constraint? I'm not sure what would prevent you from convincing humans that it's a bad thing to respect the timer, for example. Is it because we'll formally verify that we avoid instances of deception? It's not clear to me, but maybe I've misunderstood.

LESSWRONG