LESSWRONG
Meg
Posts
Sorted by New
65 · Towards Understanding Sycophancy in Language Models · Ω · 1mo · Ω 0
118 · Paper: LLMs trained on “A is B” fail to learn “B is A” · Ω · 2mo · Ω 71
101 · Paper: On measuring situational awareness in LLMs · Ω · 3mo · Ω 15
Wiki Contributions
Comments