1378
Govind Pimpale
Posts
35
Forecasting Frontier Language Model Agent Capabilities
8mo
0
57
Do models know when they are being evaluated?
8mo
9
160
Current safety training techniques do not fully transfer to the agent setting
1y
9
48
~80 Interesting Questions about Foundation Model Agent Safety
1y
4
69
Analyzing DeepMind's Probabilistic Methods for Evaluating Agent Capabilities
Ω
1y
0