LESSWRONG
Leon Lang

I'm a final-year PhD student at the University of Amsterdam working on AI safety and alignment, specifically the safety risks of Reinforcement Learning from Human Feedback (RLHF). Previously, I also worked on abstract multivariate information theory and equivariant deep learning. https://langleon.github.io/

Posts

Leon Lang's Shortform (2 karma, 3y, 86 comments)

Comments

Statement of Support for "If Anyone Builds It, Everyone Dies"
Leon Lang8d20

I don’t know what “all-too-plausibly” means. Depending on the probabilities that this implies, I may agree or disagree.

How To Dress To Improve Your Epistemics
Leon Lang13d60

FWIW, my hair grew longer and people often point that out, but no one has ever followed up with “looks good”.

Trust me bro, just one more RL scale up, this one will be the real scale up with the good environments, the actually legit one, trust me bro
Leon Lang1mo20

I think the compute they spend on inference will also just get scaled up over time. 

Banning Said Achmiz (and broader thoughts on moderation)
Leon Lang1mo179

I think people usually don’t even try to figure something like that out, or aren’t even aware of the option. So if you publicly announce that a user has deactivated their account X times, then this is information that almost no one would otherwise ever receive.

I also have the sense that it’s better to not do that, even though I have a hard time explaining in words why that is.

Leon Lang's Shortform
Leon Lang1mo150

A NeurIPS paper on scaling laws from 1993, shared by someone on Twitter.

Leon Lang's Shortform
Leon Lang1mo120

Is there a way to filter on LessWrong for all posts from the Alignment Forum?

I often like to just see what's on the Alignment Forum, but I dislike that I don't see most LessWrong comments when viewing those posts on the Alignment Forum.

Futility Illusions
Leon Lang1mo63

I suspect there’s a basic reason why futility claims are often successful in therapy/coaching: by claiming (and succeeding in convincing the client) that something can’t be changed, you reduce the client’s shame about not changing the thing. Now the client is without shame, which is a state of mind that makes change a priori easier. Focusing the change on aspects the client didn’t fail on in the past additionally increases the chance of success, since there’s no evidence of past failure on those aspects.

However, I also really care about truth, and so I really dislike such futility claims.

Leon Lang's Shortform
Leon Lang1mo40

I feel like Cunningham's law got confirmed here. I'm really glad about all the things I learned from people who disagreed with me. 

Leon Lang's Shortform
Leon Lang1mo30

Thanks a lot for this very insightful comment!

Leon Lang's Shortform
Leon Lang1mo30

I think we may not disagree about any truth-claims about the world. I'm just satisfied that the north star of Solomonoff induction exists at all, and that it is as computable (albeit only semicomputable), well-predicting, science-compatible and precise as it is. I expected less from a theory that seems so unpopular. 

> > It predicts well: It's provenly a really good predictor
>
> So can you point to any example of anyone ever predicting anything using it?

No, but crucially, I've also never seen anyone using any other method predict as well as someone using Solomonoff induction :)

The Coding Theorem — A Link between Complexity and Probability (32 karma, 2mo, 4 comments)
X explains Z% of the variance in Y (160 karma, 3mo, 34 comments)
How to work through the ARENA program on your own (36 karma, 4mo, 3 comments)
[Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF (Ω, 51 karma, 1y, 2 comments)
We Should Prepare for a Larger Representation of Academia in AI Safety (90 karma, 2y, 14 comments)
Andrew Ng wants to have a conversation about extinction risk from AI (32 karma, 2y, 2 comments)
Evaluating Language Model Behaviours for Shutdown Avoidance in Textual Scenarios (26 karma, 2y, 0 comments)
[Appendix] Natural Abstractions: Key Claims, Theorems, and Critiques (Ω, 48 karma, 3y, 0 comments)
Natural Abstractions: Key Claims, Theorems, and Critiques (Ω, 246 karma, 3y, 26 comments)
Andrew Huberman on How to Optimize Sleep (38 karma, 3y, 6 comments)