Top posts
I. Like most people, my teens and twenties have been confusing and not always the most fun. I’ve struggled to make friends. In high school and university, I didn’t have as many romantic relationships as I wanted. When I was 24, I met a beautiful, wonderful woman who became my...
I reproduced results from ARC Evals' recent report, Evaluating Language-Model Agents on Realistic Autonomous Tasks. For the report, ARC Evals built a set of language model agents, combining a language model like GPT-4 with scaffolding software that lets the language model execute shell commands and interact with a browser. Then,...