LESSWRONG
What empirical research directions has Eliezer commented positively on?

by Chris_Leong
15th Apr 2025
Mateusz Bagiński, 5mo (edited)

Self-Other Overlap: https://www.lesswrong.com/posts/hzt9gHpNwA2oHtwKX/self-other-overlap-a-neglected-approach-to-ai-alignment?commentId=WapHz3gokGBd3KHKm

Emergent Misalignment: https://x.com/ESYudkowsky/status/1894453376215388644 

He has made vaguely positive comments about Chris Olah's work, but I think he always/usually caveats them with "capabilities go like this [big slope], Chris Olah's interpretability goes like this [small slope]" (e.g., on the Lex Fridman podcast and, IIRC, some other podcast(s)).

ETA: 

SolidGoldMagikarp: https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldmagikarp-plus-prompt-generation#Jj5yN2YTp5AphJaEd 

He also said that Collin Burns's DLK was a "highly dignified work". Ctrl+F "dignified" here; it doesn't link to the tweet (?), but it should be findable/verifiable.

I'm interested in both work that he's commented on positively after the fact and any comments he might have made on which directions are generally fruitful.