LESSWRONG

This is a great point. I admit I need to better understand what each model provider does behind the scenes in its API. It would be sad if the days of access to the model itself are gone.

You can't eval GPT5 anymore

Lukas Petersson 1mo 100

We thought about that, but then it wouldn't be reproducible if we want to run it on new models later.

You can't eval GPT5 anymore

Lukas Petersson 1mo 172

Thanks, that would be great!

Project Vend: Can Claude run a small shop?

Lukas Petersson 4mo 40

Thanks for highlighting our work!

98 LLM robots can't pass butter (and they are having an existential crisis about it)

158 You can't eval GPT5 anymore

1mo

39 AI misbehaviour in the wild from Andon Labs' Safety Report

2mo

7 The Same Heaven

7mo

5 Linguistic Imperialism in AI: Enforcing Human-Readable Chain-of-Thought

8mo

58 AI Safety as a YC Startup

10mo