LESSWRONG

2693
fakeanalyst
2020

Comments

Buck's Shortform
fakeanalyst 4mo 30

The usefulness of interpretability research 

Generating the Funniest Joke with RL (according to GPT-4.1)
fakeanalyst 6mo 10

Goodhart's law!
