Hjalmar_Wijk's Shortform
May 31, 2024
Note: This is a rough attempt to write down a more concrete threshold at which models might pose significant risks from autonomous replication and adaptation (ARA). It is fairly in the weeds and does not attempt to motivate or contextualize the idea of ARA very much, nor is it developed...
This post is an attempt to sketch a presentation of the alignment problem while tabooing words like agency, goals, or optimization as core parts of the ontology.[1] This is not a critique of frameworks that treat these topics as fundamental; in fact, I end up concluding that this is likely...