Bypassing Situational Awareness? Offensive Subliminal Learning!
Offensive Subliminal Learning: Using Trait Transmission for Alignment

A proposal synthesizing recent alignment research into a testable experiment. I have not run this experiment; this is a proposed idea so that researchers with lab access (and more experience than me) can evaluate whether it's worth pursuing.

Overview

The core...
Apr 15