LESSWRONG

Lukas Petersson
110420

Posts

Comments

Project Vend: Can Claude run a small shop?
Lukas Petersson, 2mo

Thanks for highlighting our work!

Which evals resources would be good?
Lukas Petersson, 10mo

This is probably not the first barrier to getting into evals, but I run an AI safety startup that designs evals, and we don't have the capacity to also do good elicitation. I think we lose a lot of signal from our evals because our agent is too weak to explore properly. We're currently using Inspect's basic_agent. METR's modular_public is better, but we otherwise prefer Inspect over Vivaria. I think open-sourcing a better agent would be positive for the evals community without contributing to capabilities.

AI misbehaviour in the wild from Andon Labs' Safety Report (38 karma, 6d, 0 comments)
The Same Heaven (7 karma, 5mo, 1 comment)
Linguistic Imperialism in AI: Enforcing Human-Readable Chain-of-Thought (5 karma, 6mo, 0 comments)
AI Safety as a YC Startup (58 karma, 8mo, 9 comments)