LESSWRONG
Lukas Petersson

Posts

38 karma · AI misbehaviour in the wild from Andon Labs' Safety Report · 20d · 0 comments
7 karma · The Same Heaven · 5mo · 1 comment
5 karma · Linguistic Imperialism in AI: Enforcing Human-Readable Chain-of-Thought · 7mo · 0 comments
58 karma · AI Safety as a YC Startup · 8mo · 9 comments

Comments

Project Vend: Can Claude run a small shop?
Lukas Petersson · 3mo · 40

Thanks for highlighting our work!

Which evals resources would be good?
Lukas Petersson · 10mo · 70

This is probably not the first barrier to getting into evals, but I run an AI safety startup that designs evals, and we don't have the capacity to also do good elicitation. I think we lose a lot of signal from our evals because our agent is too weak to explore properly. We're currently using Inspect's basic_agent. METR's modular_public is better, but we otherwise prefer Inspect over Vivaria. I think open-sourcing a better agent would be positive for the evals community without contributing to capabilities.
