LESSWRONG
Stephen Fowler

Masters student in Physics at the University of Queensland.

I am interested in Quantum Computing, physical AI Safety guarantees and alignment techniques that will work beyond contemporary models.

3Stephen Fowler's Shortform
3y
135
Mikhail Samin's Shortform
Stephen Fowler5d0-5

I think we should show some solidarity to people committed to their beliefs and making a personal sacrifice, rather than undermining them by critiquing their approach. 

Given that they're both young men and it is occurring in a first world country, it seems unlikely anyone will die. But it does seem likely they or their friends will read this thread.

Beyond that, the hunger strike is only on day 2 and has already received a small amount of media coverage. Should they go viral, this one action alone will have a larger differential impact on reducing existential risk than most safety researchers will achieve in their entire careers.

https://www.businessinsider.com/hunger-strike-deepmind-ai-threat-fears-agi-demis-hassabis-2025-9

 

Von Neumann's Fallacy and You
Stephen Fowler16d81

One must imagine Sisyphus happy.

Von Neumann might have been driven by a feeling of inadequacy, but that doesn't mean it was necessary for his success. One can imagine Von NewOutlook-Mann who took the same actions in life but viewed them as working towards a positive goal rather than needing to prove himself.

[Anthropic] A hacker used Claude Code to automate ransomware
Stephen Fowler17d1-12

It strikes me that Anthropic's blog post is engaging in a bit of double-speak in saying they are "disrupting" the operations of cybercriminals. 

What they are describing is retroactively taking action after a crime has occurred.

Buck's Shortform
Stephen Fowler20d*248

The following illustration from 2015 by Tim Urban seems like a decent summary of how people interpreted this and other statements.
 

Cole Wyeth's Shortform
Stephen Fowler26d20

I've thrown on some limit orders if anyone is strongly pro-Kokotajlo.

Training a Reward Hacker Despite Perfect Labels
Stephen Fowler1mo30

That's awesome to hear.

(On a side note, your hyperlink currently includes a spurious full stop, which means the link 404s.)

Training a Reward Hacker Despite Perfect Labels
Stephen Fowler1mo*30

An alternative interpretation of the reported findings is that the process used to generate the “100% hack-free” dataset was itself imperfect. The assumption of a fully hack-free corpus rests on validation by a large language model, but such judgments are not infallible.

I would suggest making the cleaned dataset, or at least a substantial sample, publicly available to enable broader scrutiny. You might additionally consider re-filtering through a second LLM with distinct prompting or a multi-agent setup.

Tomás B.'s Shortform
Stephen Fowler1mo*1512

"I think society has weird memes about balding and male beauty in general. Stoically accepting a disfigurement isn't particularly noble"

I think calling natural balding "disfigurement" is in line with the weird memes around male beauty.

Not having hair isn't harmful.

Disclaimer: I may go bald.

Shortform
Stephen Fowler2mo50

I think you're extrapolating too far from your own experiences. It is absolutely possible to be excited (or at least avoid boredom) for long stretches of time if your life is busy and each day requires you to make meaningful decisions. 

23To what extent is the UK Government's recent AI Safety push entirely due to Rishi Sunak?
Q
2y
Q
4
20What are the best published papers from outside the alignment community that are relevant to Agent Foundations?
Q
2y
Q
4
4Ateliers: But what is an Atelier?
2y
2
7Ateliers: Motivation
2y
0
34Scaffolded LLMs: Less Obvious Concerns
Ω
2y
Ω
15
4What do beneficial TDT trades for humanity concretely look like?
Q
2y
Q
0
6Requisite Variety
2y
0
29Ng and LeCun on the 6-Month Pause (Transcript)
2y
7
14No Summer Harvest: Why AI Development Won't Pause
2y
17
8100 Dinners And A Workshop: Information Preservation And Goals
2y
0