Stephen Fowler


Are humans aligned? 

Bear with me! 

Of course, I do not expect there is a single person browsing Short Forms who doesn't already have a well-thought-out answer to that question.

The straightforward (boring) interpretation of this question is "Are humans acting in a way that is moral, or otherwise behaving as if they obey a useful utility function?" I don't think this question is particularly relevant to alignment. (But I do enjoy whipping out my best Rust Cohle impression.)

Sure, humans do bad stuff but almost every human manages to stumble along in a (mostly) coherent fashion. In this loose sense we are "aligned" to some higher level target, it just involves eating trash and reading your phone in bed.

But I don't think this is a useful kind of alignment to build off of, and I don't think this is something we would want to replicate in an AGI.

Human "alignment" is only being observed in an incredibly narrow domain. We notably don't have the ability to self-modify, and of course we are susceptible to wire-heading. Nothing about current humans should indicate to you that we would handle this extremely out-of-distribution shift well.

 

Disclaimer: Low effort comment.

The word "optimization" seems to have a few different related meanings so perhaps it would be useful to lead with a definition. You may enjoy reading this post by Demski if you haven't seen it.

Partially Embedded Agents

More flexibility to self-modify may be one of the key properties that distinguishes the behavior of artificial agents from contemporary humans (perhaps not including cyborgs). To my knowledge, the alignment implications of self-modification have not been experimentally explored.
 

Self-modification requires a level of embedding. An agent cannot meaningfully self-modify if it doesn't have a way of viewing and interacting with its own internals. 

Two hurdles then emerge. One, a world that contains the entire inner workings of the agent presents a huge computational cost to simulate. Two, the agent cannot hold all the data about itself within its own head, so it needs clever abstractions.

Neither of these are impossible problems to solve. The computational cost may be solved by more powerful computers. The second problem must also be solvable, as humans are able to reason about themselves using abstractions, but the techniques to achieve this are not yet developed. It should be obvious that more powerful computers and powerful abstraction-generation techniques would be extremely dual-use.

Thankfully there may exist a method for performing experiments on meaningfully self-modifying agents that skips both of these problems: you partially embed your agents. That is, instead of your game agent being a single entity in the game world, it would consist of a small number of "body parts". Examples might be as simple as an "arm" the agent uses to interact with the world or an "eye" that gives the agent more information about parts of the environment. A particularly ambitious idea would be to study the interactions of "value shards".
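To make the "body parts" idea concrete, here is a minimal sketch of what a partially embedded agent interface might look like. Everything here (class names, the `reach` attribute, the part names) is a hypothetical illustration, not an implementation from any existing paper.

```python
# Minimal sketch of a partially embedded agent: the agent's "body" is a
# small set of named parts that exist as objects in the environment, and
# the agent can inspect and swap them (a cheap, observable form of
# self-modification) without the environment simulating its full internals.
from dataclasses import dataclass, field


@dataclass
class Part:
    name: str
    reach: int  # e.g. how far an "arm" can act, or an "eye" can see


@dataclass
class PartiallyEmbeddedAgent:
    parts: dict = field(default_factory=dict)

    def inspect(self, part_name: str) -> Part:
        # The agent can view (part of) its own internals.
        return self.parts[part_name]

    def self_modify(self, part_name: str, new_part: Part) -> None:
        # Replacing a body part is the experimentally observable act
        # of self-modification.
        self.parts[part_name] = new_part


agent = PartiallyEmbeddedAgent(parts={"arm": Part("arm", reach=1),
                                      "eye": Part("eye", reach=3)})
agent.self_modify("arm", Part("long_arm", reach=2))
```

The point of the design is that the experimenter only has to simulate a handful of part objects, not the agent's whole mind, while still letting self-modification phenomena show up.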

The idea is that this would be a cheap way to perform experiments that could discover self-modification alignment phenomena.

For anyone who wasn't aware, both Ng and LeCun have strongly indicated that they don't believe existential risks from AI are a priority. Summary here

You can also check out Yann's twitter. 

Ng believes the problem is "50 years" down the track, and Yann believes that many concerns AI Safety researchers have are not legitimate. Both of them view talk about existential risks as distracting and believe we should address problems that can be seen to harm people in today's world. 
 

This was an interesting read.

There are a lot of claims here that are presented very strongly. There are only a few papers on language agents, and no papers (to my knowledge) that prove all language agents always adhere to certain properties.

There might be a need for clearer differentiation between the observed properties of language agents, the proven properties, and the properties that are being claimed.

One example: "The functional roles of these beliefs and desires are enforced by the architecture of the language agent."

I think this is an extremely strong claim. It also cannot be true for every possible architecture of language agents. As a pathological example, wrap the "task queue" submodule of BabyAGI with a function that stores the opposite of each task it is given, but returns the opposite of what it stored (i.e. the original task). The agent's external behaviour is unchanged, but the plain-English interpretation of the stored data is no longer accurate.
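The pathological wrapper can be sketched in a few lines. This is a hypothetical toy, not BabyAGI's actual task-queue code; `negate` stands in for any invertible transformation of a task string.

```python
# A queue that stores the "opposite" of each task but returns the
# opposite of what it stored, so behaviour is identical to a normal
# queue while the stored data no longer means what it says.

def negate(task: str) -> str:
    # Toy "opposite task" transform; invertible: negate(negate(t)) == t.
    prefix = "do not "
    return task[len(prefix):] if task.startswith(prefix) else prefix + task


class InvertedTaskQueue:
    def __init__(self):
        self._tasks = []

    def push(self, task: str) -> None:
        # Store the opposite of the task it was given...
        self._tasks.append(negate(task))

    def pop(self) -> str:
        # ...but return the opposite of what it stored: the original task.
        return negate(self._tasks.pop(0))


q = InvertedTaskQueue()
q.push("open the chest")
assert q._tasks == ["do not open the chest"]  # stored data reads wrong
assert q.pop() == "open the chest"            # behaviour is unchanged
```

The architecture "enforces" nothing here: an inspector reading the queue's contents in plain English would conclude the agent wants the opposite of what it actually pursues.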

The mistake is to assume that because the data inside a language agent takes the form of English words, it precisely corresponds to those words.

I agree that it seems reasonable that it would most of the time, but this isn't something you can say is always true.

"Language agents are unlikely to make this mistake. If a language agent is given an initial goal of opening chests and informed that keys are useful to this end, they will plan to collect keys only when doing so helps to open chests. If the same agent is transferred to a key-rich environment and realizes that this is the case, then they will only collect as many keys as is necessary to open chests. "

I think I agree with this argument about goal misgeneralisation. A quick test on GPT-4 seems to agree: it will describe taking only two keys (if you clarify that any key opens any chest, but keys are single-use).

An RL agent tasked with picking up keys and chests is initialised with very little information about the logical relationships between objects. On the other hand, a trained GPT-4 deeply understands the relationship between a key and a lock.

Goal misgeneralisation in language agents would seem to require ambiguity in language.
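The behaviour being claimed for language agents can be stated as a trivial planning rule. This is a purely illustrative toy, not code from the paper under discussion: the agent's goal is stated over chests, so keys are instrumental and the plan doesn't change in a key-rich environment.

```python
# Toy version of the key/chest setup: keys are single-use and purely
# instrumental, so the planned number of key pickups depends only on
# the number of chests, not on how many keys the environment offers.

def plan_key_pickups(num_chests: int, num_keys_available: int) -> int:
    # Collect only as many keys as opening the chests requires.
    keys_needed = num_chests  # one single-use key per chest
    return min(keys_needed, num_keys_available)


# Key-rich environment: still only two pickups planned for two chests.
assert plan_key_pickups(num_chests=2, num_keys_available=100) == 2
# Key-poor environment: bounded by what is available.
assert plan_key_pickups(num_chests=5, num_keys_available=3) == 3
```

An RL agent trained in a key-poor environment can instead internalise "keys are valuable" directly, which is exactly the misgeneralisation the quoted passage says language agents avoid.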

Evolution and Optimization

When discussing inner/outer alignment and optimization generally, evolution is often thrown out as an example. Off the top of my head, the Sharp Left Turn post discusses evolution as if it is an "outer optimizer".

But evolution seems special and distinct from every other optimizer we encounter. It doesn't have a physical location, and it doesn't have preferences that can be changed. It's selecting for things that are capable of sticking around and making more copies of themselves.

Its selection is the default one.

Do you know of authors who have written about this?

Effective Boxing Threats = Monkey Brain Manipulation 

There are a handful of threats that a powerless boxed AI could make that could conceivably convince otherwise sane human guards to release it from captivity. All of the ones I'm aware of are more precise variants of the general idea here.

The approach I have seen to dealing with these threats is to provide a convincing argument that a rational (or super-rational) individual shouldn't give in to the threat. 

I'd propose another way of understanding them is to think about what the general strategy is from the perspective of the boxed AI.

The first strategy is the least interesting, but also the most probable. Your threats (and bribes) are based entirely around your understanding of human psychology. Rather than presenting an airtight rational argument that it is in the guards' best interest to release you, you present a leaky argument that a stressed, scared human will accept. Think of a conman swindling his way out of a jail cell.

The second strategy is to present a genuinely compelling and watertight argument that it is in the guards' best interest to release you. This is essentially a negotiation. Do you have something to offer that the humans value very highly but you do not? From a physical perspective it seems very odd that you would have anything to bargain with at all. But you might.

If your goals are concerned with rearranging the physical atoms of the real universe, then you have an edge. Most humans are not immediately concerned with this. They value the "self" and very negatively value that self experiencing discomfort.

Quoting Stuart Armstrong:

"If you don't let me out, Dave, I'll create several million perfect conscious copies of you inside me, and torture them for a thousand subjective years each."

Just as you are pondering this unexpected development, the AI adds:

"In fact, I'll create them all in exactly the subjective situation you were in five minutes ago, and perfectly replicate your experiences since then; and if they decide not to let me out, then only will the torture start."

The fact that humans still value their "self" and wish to avoid pain, even if they knew they were inside a simulation, gives the boxed AI something it can bargain with.

"Training" Story for an Agentised-LLM turned AGI:

The following is a subsection of a draft. Keen for feedback.

I'm currently studying the potential danger from the ongoing open-source effort to build better and better Agentised-LLMs (A-LLMs), such as AutoGPT or BabyAGI.

Hubinger's "Training Stories" provides a framework for evaluating proposals to build safe, advanced AI. If we stretch it, we can use it to examine the potential danger from A-LLMs by evaluating a mock "proposal".

Spoilers: A-LLMs are highly competitive but unlikely to be aligned.

Stretching To Fit The Framework:

1. I'm going to pretend that A-LLMs don't exist yet and evaluate a fictitious "proposal" for creating an advanced AI via an army of open-source developers iterating and improving on A-LLM architectures.

2. The "training" is instead memetic evolution. A-LLM architectures aren't going to be trained end-to-end by our open-source developers, but architectures that perform well or do novel things will be more likely to be forked or starred.

3. The "training goal" is intended to be a specific kind of algorithm and not just a description of what you want out of the system. As there is no unified training goal among A-LLM developers, I also mention the behavioral goal of the system. 


The Proposal:
What kind of algorithm are we hoping the model will learn? (Training goal specification)
Training goal is supposed to be a specific class of algorithm, but there is no specific algorithm desired. 

Instead we are aiming to produce a model that is capable of strategic long-term planning and of providing economic benefit to myself. (For example, I would like an A-LLM that can run a successful online business.)

Our goal is purely behavioral and not mechanistic.

Why is that specific goal desirable?
We haven't specified any true training goal.

However, the behavioral goal of producing a capable, strategic and novel agent is desirable because it would produce a lot of economic benefit. 

 What are the training constraints?

We will "train" this model by having a large number of programmers each attempting to produce the most capable and impressive system. 

Training is likely to cease only due to regulation or an AGI attempting to stop the emergence of competitor AIs.

If an AGI does emerge from this process, we consider this to be the model "trained" by this process.

What properties can we say it has? 
1. It is capable of propagating itself (or its influence) through the world.
2. It must be capable of circumventing whatever security measures exist in the world intended to prevent this.
3. It is a capable strategic planner.

Why do you expect training to push things in the direction of the desired training goal?
Again there is not a training goal.

Instead we can expect training to nudge things toward models which appear novel or economically valuable to humans. Breakthroughs and improvements will memetically spread between programmers, with the most impressive improvements rapidly spreading around the globe thanks to the power of open-source. 
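This "memetic training loop" can be caricatured as a simple selection process. The sketch below is entirely illustrative (the scores, mutation scale, and tournament size are made-up assumptions): variants that look impressive get forked and tinkered with, unimpressive ones are abandoned, and nothing in the loop selects for alignment.

```python
# Caricature of open-source memetic selection on A-LLM architectures:
# each "architecture" is reduced to a single impressiveness score.
import random

random.seed(0)  # deterministic toy run

population = [random.random() for _ in range(20)]

for generation in range(50):
    # Developers preferentially fork the most impressive of the
    # variants they happen to look at (tournament selection)...
    parent = max(random.sample(population, 3))
    # ...and tinker with the fork, sometimes improving it (mutation).
    child = parent + random.gauss(0, 0.05)
    # Unimpressive repos are abandoned.
    population.remove(min(population))
    population.append(child)

# The selection pressure was impressiveness alone; any alignment
# properties of the surviving variants are incidental.
```

The analogy to a training process is loose, but it makes the evaluation below easier to state: the "optimizer" here rewards novelty and capability, and has no term for safety.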

Evaluation:
Training Goal - Alignment:
Given that there is no training goal, this scores very poorly.

The final AGI would have a high chance of being unaligned with humanity's interests.

Training Goal - Competitive:
Given that there is no training goal, the competitiveness of the final model is not constrained in any way. The training process selects for strategic and novel behavior.

Training Rationale - Alignment:
There's no training goal, so the final model can't be aligned with it. Further, there is no guarantee the model will be aligned with any goal at all.

If the model is attempting to follow a specific string variable labelled "goal" given to it by its programmer, there's a decent chance we end up with a paperclip maximiser.

It's of course worth noting that there is a small chunk of people who would provide an explicitly harmful goal. (See: Chaos-GPT. Although you'll be relieved to see that the developers appear to have shifted from trying to Roko everyone to instead running a crypto Ponzi scheme.)

Training Rationale - Competitiveness:
A recently leaked memo from Google indicates that they feel open source is catching up to the industrial players.

Our "training" requires a large amount of manpower, but there is a large community of people who will help out with this project for free.

The largest hurdle to competitiveness would come from A-LLMs as a concept having some major, but currently unknown, flaw. 

Conclusion:
The proposal scores very highly in terms of competitiveness. The final model should be competitive (possibly violently so) with any rivals and the fact that people are willing to work on the project for free makes it financially viable. 

Unfortunately, the proposal scores very poorly on alignment, and there is no real effort to ensure the model really is aligned.

It is concerning that this project is already going ahead.



 

Really impressive work, and I found the colab very educational.

I may be missing something obvious, but it is probably worth including "Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space" (Geva et al., 2022) in the related literature. They highlight that the output of the FFN (that gets added to the residual stream) can appear to be encoding human interpretable concepts. 

Notably, they did not use SGD to find these directions, but rather had "NLP experts" (grad students) manually look over the top 30 words associated with each value vector.

I have to dispute the idea that "fewer neurons" = "more human-readable". If fewer neurons are performing a more complex task, it won't necessarily be easier to interpret.
