Subliminal Learning Across Models
Tl;dr: We show that subliminal learning can transfer sentiment across models (with some caveats). For example, we transfer positive sentiment for Catholicism, the UK, New York City, Stalin or Ronald Reagan across model families using normal-looking text. This post discusses under what conditions this subliminal transfer happens. — The original...
With all respect, I think this is a weak argument which ignores the reality of the situation. These models will, almost by definition, be assistants and companions simultaneously. Whatever formal distinction one wants to draw between these two roles, we must acknowledge that the AI model which makes government decisions will be intimately related to (and developed using the same principles) as the model which engages me on philosophical questions.