Hubert Plisiecki — LessWrong

Should we train LLMs to be human?

Recent research on human behavioral alignment shows that post-training makes LLMs less human-like in their responses (Binz et al., 2026). The obvious follow-up question is whether this drift is intended, and whether it is optimal. From the broader perspective of goal-alignment, this tendency could...