Leon Lang

I'm a last-year PhD student at the University of Amsterdam working on AI Safety and Alignment, and specifically safety risks of Reinforcement Learning from Human Feedback (RLHF). Previously, I also worked on abstract multivariate information theory and equivariant deep learning. https://langleon.github.io/

Comments
Cole Wyeth's Shortform
Leon Lang · 9d

This is an edge case, but just flagging that it's a bit unclear to me how to apply this to my own post in a useful way. As I've disclosed in the post itself:

OpenAI's o3 found the idea for the dovetailing procedure. The proof of the efficient algorithmic Kraft coding in the appendix is mine. The entire post is written by myself, except the last paragraph of the following section, which was first drafted by GPT-5.

Does this count as Level 3 or 4? o3 provided a substantial idea, but the resulting proof was entirely written down by me. I'm also unsure whether the full drafting of precisely one paragraph (which summarizes the rest of the post) by GPT-5 counts as editing or as writing substantial parts.

Statement of Support for "If Anyone Builds It, Everyone Dies"
Leon Lang · 22d

I don’t know what “all-too-plausibly” means. Depending on the probabilities this implies, I may agree or disagree.

How To Dress To Improve Your Epistemics
Leon Lang · 1mo

Fwiw, my hair has grown longer and people often point that out, but no one has ever followed up with “looks good”.

Trust me bro, just one more RL scale up, this one will be the real scale up with the good environments, the actually legit one, trust me bro
Leon Lang · 1mo

I think the compute they spend on inference will also just get scaled up over time. 

Banning Said Achmiz (and broader thoughts on moderation)
Leon Lang · 1mo

I think people don’t usually even try to figure something like that out, and many aren’t even aware of the option. So if you publicly announce that a user has deactivated their account X times, that’s information almost no one would otherwise ever receive.

I also have the sense that it’s better not to do that, even though I have a hard time putting the reason into words.

Leon Lang's Shortform
Leon Lang · 2mo

A NeurIPS paper on scaling laws from 1993, shared by someone on Twitter.

Leon Lang's Shortform
Leon Lang · 2mo

Is there a way to filter on LessWrong for all posts from the Alignment Forum?

I often like to just see what's on the Alignment Forum, but I dislike that I don't see most LessWrong comments when viewing those posts there.

Futility Illusions
Leon Lang · 2mo

I suspect there’s a basic reason why futility claims are often successful in therapy/coaching: by claiming (and succeeding in convincing the client) that something can’t be changed, you reduce the client’s shame about not having changed it. Without that shame, the client is in a state of mind where change is a priori easier. Focusing the change on aspects the client didn’t fail at in the past further increases the chance of success, since there’s no evidence of failing at those aspects.

However, I also really care about truth, and so I really dislike such futility claims.

Leon Lang's Shortform
Leon Lang · 2mo

I feel like Cunningham's law was confirmed here. I'm really glad about all the things I learned from people who disagreed with me.

Leon Lang's Shortform
Leon Lang · 2mo

Thanks a lot for this very insightful comment!

Posts
32 · The Coding Theorem — A Link between Complexity and Probability (2mo, 4 comments)
160 · X explains Z% of the variance in Y (4mo, 34 comments)
36 · How to work through the ARENA program on your own (4mo, 3 comments)
51 · [Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF [Ω] (1y, 2 comments)
90 · We Should Prepare for a Larger Representation of Academia in AI Safety (2y, 14 comments)
32 · Andrew Ng wants to have a conversation about extinction risk from AI (2y, 2 comments)
26 · Evaluating Language Model Behaviours for Shutdown Avoidance in Textual Scenarios (2y, 0 comments)
48 · [Appendix] Natural Abstractions: Key Claims, Theorems, and Critiques [Ω] (3y, 0 comments)
246 · Natural Abstractions: Key Claims, Theorems, and Critiques [Ω] (3y, 26 comments)
38 · Andrew Huberman on How to Optimize Sleep (3y, 6 comments)