LESSWRONG

Clairstan — LessWrong

Optimization, loss set at variance in RL

This post proposes a reinforcement learning model with an added conflict between the optimized data and the loss function. The conflict is intended to reduce RL’s intrinsic Omohundro x-risk dynamic. It is thus also an attempt to produce, though without details, a good idea for AI safety, and so as...

Jul 22, 2023 • 1