maddi — LessWrong

academic foundations in cog sci, currently interested in AI alignment and interpretability work (and lots of other things!). i'm newish to this space so welcome constructive feedback (and suggested resources) when i'm wrong or missing something :)

maddi's Shortform

Response to Introspective Awareness research

This is a rewrite of a comment I originally crafted in response to Anthropic's recent research on introspective awareness, with edits and expanded reflections. Abstract from the original research: > We investigate whether large language models can introspect on their internal states. It is difficult to answer this question through...

Dec 19, 2025 • 6

Who is AGI for, and who benefits from AGI?

Disclaimer: these ideas are not new, just my own way of organizing and elevating what feels important to pay better attention to in the context of alignment work. All uses of em dashes are my own! LLMs were occasionally used to improve word choice or clarify expression. One of...

Dec 5, 2025 • 2