Benji Berczi — LessWrong

Does distilling Claude carry the persona with it?

TL;DR: Both GLM 5.2 and Kimi K3 have reportedly identified as Claude in user conversations, suggesting distillation and/or training-data contamination. To test whether they simply use the name or have inherited a latent, selectable Claude persona, we ran identity swaps across a grid of seven Chinese and Western models...

Jul 2440

Personascope: Measuring how deeply LLMs adopt personas

Benji Berczi, Kyuhee Kim, James Requeima, Sid Black, Cozmin Ududec. This is work done by Benji and Kyuhee during MATS Winter 2026, mentored by Cozmin Ududec and advised by James and Sid. Figure 1. A model can take on a persona fully in voice while not changing its behaviour at...

Jul 739

In-context learning alone can induce weird generalisation

by Cozmin Ududec, Benji Berczi, and Kyuhee Kim

Benji Berczi, Kyuhee Kim, Cozmin Ududec, James Requeima. This is work done by Kyuhee and Benji during MATS Winter 2026, mentored by Cozmin Ududec, and in collaboration with James. TL;DR: Weird generalisation can happen with prompting alone, without fine-tuning. Just by adding benign biographical facts (e.g. facts about Hitler...

Feb 2571