LESSWRONG

Max Niederman

maxniederman.com

Comments
Should you make stone tools?
Max Niederman · 1mo

This seems plausible given that virtually every knowledge worker I know fantasizes to some extent about working with their hands.

Caleb Biddulph's Shortform
Max Niederman · 1mo

I suspect that this method will only work well on tasks where the model needs to reason explicitly in order to cheat. So, e.g., if the model needs to reason out some trait of the user in order to flatter them, the prompt will likely kick in and get it to self-report its cheating; but if the model can learn to flatter the user on the fly, without reasoning, the prompt probably won't do anything. By analogy, if I instruct a human to tell me whenever they use hand gestures to communicate something, they will have difficulty, because their hand gestures are automatic and not normally promoted to conscious attention.

Max Niederman's Shortform
Max Niederman · 1mo

There's the atom transformer in AlphaFold-like architectures, although the embeddings it operates on do encode 3D positioning from earlier parts of the model so maybe that doesn't count.

Max Niederman's Shortform
Max Niederman · 1mo

Transformers do not natively operate on sequences.

This was a big misconception I had, because so much of the discussion around transformers is oriented around predicting sequences. However, it's more accurate to think of transformers in general as operating on unordered sets of tokens. A notion of sequence only arises if you add a positional embedding to tell the transformer how the tokens are ordered, and possibly a causal mask to force attention to flow in only one direction.
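A minimal sketch in plain NumPy (my illustration, not from the post) of the point: a single self-attention layer with no positional embedding is permutation-equivariant, so permuting the input tokens just permutes the outputs the same way — the layer sees a set, not a sequence.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over token embeddings X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    # Row-wise softmax over attention scores.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 5, 8
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

perm = rng.permutation(n)
out = self_attention(X, Wq, Wk, Wv)
out_permuted = self_attention(X[perm], Wq, Wk, Wv)

# Permuting the inputs permutes the outputs identically:
# attention(P @ X) == P @ attention(X).
assert np.allclose(out[perm], out_permuted)
```

With a positional embedding added to X before the layer, this equivariance breaks, which is exactly what gives the model its sense of token order.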

Max Niederman's Shortform
Max Niederman · 1mo

The Money Stuff column mentioned AI alignment, rationality, and the UK AISI today:

Here is a post from the UK AI Security Institute looking for economists to “find incentives and mechanisms to direct strategic AI agents to desirable equilibria.” One model that you can have is that superhuman AI will be terrifying in various ways, but extremely rational. Scary AI will not be an unpredictable lunatic; it will be a sort of psychotic pursuing its own aims with crushing instrumental rationality. And arguably that’s where you need economists! The complaint people have about economics is that it tries to model human behavior based on oversimplified assumptions of rationality. But if super AI is super-rational, economists will be perfectly suited to model it. Anyway if you want to design incentives for AI here’s your chance.

What are non-obvious class markers?
Max Niederman · 1mo

@samuelshadrach (currently ratelimited) sent me the following document on the difference between elite and knowledge-class social norms. It's not per se about economic class, which is what I'm primarily interested in, and it's more about differing social norms than subtle markers, but it's relevant enough to link here:

Elite Social Norms

What are non-obvious class markers?
Max Niederman · 1mo

Skiing is an interesting one. I never thought about it in those terms since I grew up in Alaska where downhill skiing was relatively accessible (like CO/UT). I also wouldn't be surprised if outdoor activities in general are correlated with class, even when they're not necessarily expensive (e.g. hiking).

Sam Marks's Shortform
Max Niederman · 1mo

I don't think it's correct to describe the optimization social media companies do as Goodharting. They're optimizing for exactly what they want: money. It's not that they want what's truly best for their users and are mistaking engagement for that — I think it's pretty clear at this point that social media companies don't care at all about their users' wellbeing.

Open Thread - Summer 2025
Max Niederman · 2mo

FWIW, I don't think the site looks significantly worse on dark mode, although I can understand the desire not to have to optimize for two colorschemes.

Open Thread - Summer 2025
Max Niederman · 2mo

Is there a reason that LessWrong defaults to light mode rather than automatically following the browser's setting? I personally find it a bit annoying to have to select auto every time I have a new localStorage and it's not clear to me what the upside is.

Posts

14 · What are non-obvious class markers? [Q] · 1mo · 14 comments
54 · Where are the AI safety replications? [Q] · 2mo · 5 comments
20 · Fullrank: Bayesian Noisy Sorting · 2mo · 2 comments
4 · Max Niederman's Shortform · 2mo · 9 comments
166 · Read the Pricing First · 3mo · 14 comments
11 · Maximal Curiousity is Not Useful · 3mo · 0 comments
31 · Grading on Word Count · 2y · 9 comments