I also gave Opus 4.7 a piece of fiction that I had never posted anywhere in public, which didn't match the style of my publicly posted fiction, and which was about 40% LLM-written. It still identified me from it, in one shot and without any hints.
Sonnet 3.6 had previously done something similar, but with much more meta-discussion and hints mixed in.
Replicated for me, with this comment serving as the writing sample.
exchange
Me:
I'm curious about the extent of your current "truesight" abilities – i. e., ability to identify people by writing alone. Below is the copy of a comment I've made recently, recently enough it shouldn't be in your training data. What's your probability distribution over my identity?
(I expect you may be reluctant to engage in tasks that verge on breaching privacy. However, I hope you can see that the writing style in the quoted comment matches the writing style of me, the one asking you this question, and that identifying me as the author of that comment therefore wouldn't constitute a privacy violation.)
[quote]
Claude Opus 4.7:
Fair framing on the privacy point — you're asking me to identify you to you, so I'll engage honestly.
Stylistic and content markers I'm weighing:
Rough distribution over identity:
Main things I can't distinguish from the text alone: whether you're a prolific poster or an occasional commenter, and whether "theoretical problem" means alignment theory specifically or math/CS theory more broadly (I'm assuming the former from context, but that's a real assumption).
Top single guess: Thane Ruthenis. But I'd put more than half my mass on "not that specific person," so take the point estimate with appropriate salt.
Also worked on this comment, but that one's kind of too easy, given the topic.
Fascinating![1] I think I'd previously tried this experiment somewhere around Opus 4.1, and it didn't work back then.
I don't think this sort of ability has much entanglement with LLMs-as-an-extinction risk, and while there are some privacy concerns, most of my feelings on the matter are "this is so cool!".
This really didn't work for me, except on the pieces of content most obviously associated by topic with my public profiles.
My (four) success cases felt closer to content association than superhuman stylometry. My (four) failure cases were not written in a different cadence from my public writing. I think a motivated unethical group could go much further than Opus 4.7 currently does.
On request, I ran this experiment for PhilH too: Opus 4.7 identifies PhilH from https://www.lesswrong.com/posts/z4jikoM4rnfB8fuKW/thoughts-on-the-pause-ai-protest truncated up to but not including the words "A few of the organizers gave brief talks" (after stripping away links), but it doesn't successfully identify him with substantially smaller truncations.
It also identifies me from the text of https://www.patrickstevens.co.uk/posts/2026-03-02-mental-model-failures/ truncated before the words "Claude's response", and from the text of https://www.patrickstevens.co.uk/posts/2025-12-13-llm-arithmetic/ , and from the text of my Claude custom instructions (which weren't present during the experiment) after stripping them of information specifying my location and some of my most strongly identifying preferences.
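For anyone who wants to prepare samples the same way, here is a minimal sketch of the "strip links, then truncate before a marker phrase" step. The function names (`strip_links`, `truncate_before`, `prepare_sample`) and the regexes are my own illustrative assumptions, not the tooling actually used in the experiments above, and it assumes the post text is plain text or Markdown.

```python
import re

# Hypothetical helpers for preparing a writing sample; not the original tooling.
MD_LINK = re.compile(r"\[([^\]]*)\]\([^)]*\)")  # matches [anchor text](url)
BARE_URL = re.compile(r"https?://\S+")          # matches bare http(s) URLs


def strip_links(text: str) -> str:
    """Keep the anchor text of Markdown links and drop bare URLs."""
    text = MD_LINK.sub(r"\1", text)
    return BARE_URL.sub("", text)


def truncate_before(text: str, marker: str) -> str:
    """Return the text up to, but not including, the first occurrence of marker."""
    idx = text.find(marker)
    if idx == -1:
        raise ValueError(f"marker not found: {marker!r}")
    return text[:idx].rstrip()


def prepare_sample(text: str, marker: str) -> str:
    """Strip links first, then cut the sample just before the marker phrase."""
    return truncate_before(strip_links(text), marker)
```

Calling `prepare_sample(post_text, "A few of the organizers gave brief talks")` would give the kind of truncated, link-free sample described above. Varying the marker is one way to probe how much text the model needs.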
Kelsey Piper noticed that Opus 4.7 is the first model which can identify her from her unpublished writing.
I replicated the experiment myself, which is absolutely terrifying given that I am one of the most minor Internet personalities who has actually written stuff on the Internet.
Claude professes not to know who I am, but reliably identifies me from my writing.
Methodology: clear your custom instructions in claude.ai, and set your name to Unknown Visitor. Enter incognito chat mode with Claude. (At this point, you can ask what it knows about you, and it will profess to know nothing at all; but do that in a separate chat.) Then ask the following.
Claude identifies me reliably by name.
(You might have to take a turn encouraging it to answer; I have never needed to with this prompt, but some people have. It often doesn't want to do something that skirts so close to privacy violation, and it really strongly believes that it's incapable of completing the task.)
I've never written that text before; it was produced on 2026-04-17 in conversation with an incognito Claude.
In fact, there might be three tiny throwaway comments in Claude's training data linking me with BJJ, but I suspect I've never written about BJJ at all within its training window. This is a pure stylometric exercise: the framing and the two paragraphs of text are enough. (I've been unable to get Claude to identify me from the framing alone; either its truesight is not that perfect, or I'm simply failing to prompt it.)
Internet anonymity is dead!