agg — LessWrong

Generating the Funniest Joke with RL (according to GPT-4.1)

Language models are not particularly good at generating funny jokes. Asked for their funniest jokes, Claude 3.7 gives us: > Why don't scientists trust atoms? Because they make up everything! o3 gives us: > Why don't scientists trust atoms anymore? Because they make up everything—and they just can't keep their...

May 16, 2025 · 106

Transfer learning and generalization-qua-capability in Babbage and Davinci (or, why division is better than Spanish)

by RP and agg

TL;DR: * Generalisation is a capability in its own right, and in our experiments it scales less quickly than specific capabilities. * Our experiments show that on a range of tasks, finetuning Babbage on task A transfers more to task B than finetuning Davinci on task A transfers to task...

Feb 9, 2024 · 50

Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols

by Arjun Panickssery and agg

This is a summary of https://arxiv.org/abs/2401.05604. When Google announced Gemini Pro, they displayed its ability to solve rebuses, wordplay puzzles which involve creatively adding and subtracting letters from words derived from text and images. We introduce a new benchmark (GitHub) evaluating the performance of multimodal large language models on rebus puzzles....

Jan 15, 2024 · 33

Apply to the Cavendish Labs Fellowship (by 4/15)

Cavendish Labs is a new research organization in Vermont focused on technical work on existential risks. We'd like to invite you to apply to our fellowships in AI safety and biosecurity! Positions are open for any time between June 1 and December 10, 2023. We pay a stipend of $1,500/month,...

Apr 3, 2023 · 11

What's the simplest concrete unsolved problem in AI alignment?

In your preferred area of AI alignment, what is the simplest concrete unsolved problem? By "simplest", ideally the problem has been solved when any of the conditions are weakened. However, this isn't always possible, so a simpler solved version of the problem could also work (e.g., Goldbach's weak conjecture is...

Jan 26, 2023 · 29

Announcing Cavendish Labs

by derikk and agg

We’re excited to announce Cavendish Labs, a new research institute in Vermont focused on AI safety and pandemic prevention! We’re founding a community of researchers who will live together and work on the world’s most pressing problems. Uh, why Vermont? It’s beautiful; it has one of the cheapest costs of...

Jan 19, 2023 · 59