AlexMeinke
Posts
112 · Ablations for “Frontier Models are Capable of In-context Scheming” · 1mo · 1
203 · Frontier Models are Capable of In-context Scheming · Ω · 1mo · 23
61 · Training AI agents to solve hard problems could lead to Scheming · Ω · 2mo · 12
106 · Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs · 6mo · 31
93 · Apollo Research 1-year update · Ω · 8mo · 0
51 · A starter guide for evals · Ω · 1y · 2
45 · Paper: Tell, Don't Show- Declarative facts influence how LLMs generalize · Ω · 1y · 4
Comments