Interpretability (ML & AI), AI
Frontpage

Physics of Language models (part 2.1)

by Nathan Helm-Burger
19th Sep 2024

This is a linkpost for https://youtu.be/bpp6Dz8N2zY?si=RC20soJLynXxNOfv

StefanHex (1y ago, karma 7):

Paper link: https://arxiv.org/abs/2407.20311

(I have neither watched the video nor read the paper yet, just in case someone else was looking for the non-video version)
Logan Riggs (1y ago, karma 2):

Could you dig into why you think it's great interp work?


This is perhaps the best interpretability work I've seen outside of Chris Olah's team.