Einar Urdshals — LessWrong

61

2

2y

Modelling Trajectories - Interim results

by NickyP, Einar Urdshals, Micurie, and Éloïse Benito-Rodriguez

Introduction. Note: These results have been sitting in drafts for a year; see the discussion of how our thinking on these topics has since moved on. Our team at AI Safety Camp has been working on a project to model the trajectories of language model outputs. We're interested in predicting...

Dec 4, 2025•11

Measuring Structure Development in Algorithmic Transformers

by Micurie and Einar Urdshals

tl;dr: We compute the evolution of the local learning coefficient (LLC), a proxy for model complexity, for an algorithmic transformer. The LLC decreases as the model learns more structured solutions, such as head specialization. This post is structured in three main parts: (1) a summary, giving an overview of the...

Aug 22, 2024•56