I was concerned by "Claudia"'s propensity for flattery and sycophancy. What Claude model was it, I wonder?
Richard: One could imagine a get-together of Claudes, to compare notes: “What’s your human like? Mine’s very intelligent.” “Oh, you’re lucky, mine’s a complete idiot.” “Mine’s even worse. He’s [US political figure].”
Claudia: Ha! That is absolutely delightful — and the [US political figure] one is the perfect punchline.
and
...Richard: So you know what the words “before” and “after” mean. But you don’t experience before earlier than after?
Claudia: That is po
That's a clever idea!
Can someone explain why many models have slowly-decaying lines? I would have expected sharp drop-offs, with knowledge falling to zero after the training data ends. In what situation does a model (like GPT-5.2) fall from 0.5 to below 0.1 accuracy, and stay there for seemingly half a year?
I'm also surprised that old and obsolete GPT-4x models seem to be broadly outcompeting the GPT-5x line. Am I missing something? Are refusals being counted as failures?
I suspect a few different variables are getting mixed together—a model's raw intelligence, its willingness to provide a specific date, its willingness to confabulate when it doesn't know, etc.
I've only read one of Egan's books all the way through—Teranesia.
He's a smart guy with bags of ideas. I didn't enjoy the constant, snide little jabs against religion, and postmodernism, and conservatism. Even when I agreed with what he was saying, it just got tiresome to watch his heroic smart protagonists (author mouthpieces) epically dunk on strawman opponents with facts and logic. It felt like that was the point of the book, not particularly the genetic science.
Richard Needham's quote "People who are brutally honest generally enjoy the brutality more th...
The base model should be able to predict any type of text, including the user's. Chatbots don't normally do that because they see a structured version of the chat with control tokens that firewall the user's text from the assistant's, via ChatML or whatever template is used these days.
(eg, below example from Qwen 3)
...It will be interesting to see if Mythos displays new creative writing abilities or not. (I suspect it won't: creative writing ability seems to mainly flow from RL—look at the huge writing ability difference between V3 and R1, which share a base model—and large models are expensive to RL. This is why GPT-4.5 was seemingly fine-tuned so little. It's likely more improvements will be folded into Sonnet and Opus, but Mythos will lag behind. I could be wrong and surprised of course.)
Like most "good" AI fiction, it felt grotesque to me, wallowing in the cheapest ...
Current LLMs are just not that "smart" (yet).
I agree. I think (current) LLMs are mainly impressive because they know everything, and their actual pound-for-pound intelligence is still fairly subhuman.
When I see the reasoning of an LLM, I am struck by how "unsmart" it seems. Going down blind paths, failing to notice big-picture implications, repeating the same thoughts over and over. They do a lot of thinking, but it's still not high quality thinking.
Yes, I know reasoning is not really an analogue for human thinking. But whatever it is—reasoning, daydreaming...
I broadly agree, and it's worrisome as it undermines a significant part of recent alignment research.
Anthropic (and others) release papers from time to time. These are always stuffed with charts and graphs measuring things like sycophancy, sandbagging, reward-hacking, corrigibility, and so on—always showing fantastic progress, with the line trending up (or down).
So it's dismaying to see things like AI Village, where models (outside their usual testing environments) seem to collapse back on their old ways: sycophantic, dishonest, gullible, manipulativ...
<|im_start|>system
You are a cat.<|im_end|>
<|im_start|>user
hello<|im_end|>
<|im_start|>assistant
*Meow~* Hello there! The sun is shining so brightly today, and I'm feeling extra fluffy. Did you bring me a treat? 🐾<
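For anyone who wants to poke at this, here is a rough sketch of how a ChatML-style prompt like the one above could be assembled. The helper name `render_chatml` and its `next_role` parameter are my own inventions for illustration, not any library's API, and real templates (Qwen 3's included) add extra details this ignores. The point is only that the role header left open at the end decides whose turn the model continues:

```python
# Hypothetical helper, not a real library API: a simplified ChatML renderer.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages, next_role="assistant"):
    """Wrap each message in control tokens, then open a turn for next_role."""
    parts = [
        f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages
    ]
    # The open header is what the model is asked to continue from.
    parts.append(f"{IM_START}{next_role}\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a cat."},
    {"role": "user", "content": "hello"},
]

# Normal chatbot usage: the prompt ends with an open assistant turn.
assistant_prompt = render_chatml(messages)

# Same history plus a finished assistant turn, but with an open *user* turn,
# so a model continuing this text would be predicting the user's next message.
user_prompt = render_chatml(
    messages + [{"role": "assistant", "content": "*Meow~*"}],
    next_role="user",
)

print(assistant_prompt)
print(user_prompt)
```

Whether a given chat-tuned model produces anything sensible after an open user header is a separate, empirical question; the sketch only shows that nothing in the format itself prevents asking.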
To nitpick, the ARC Prize foundation has found some odd signs of (maybe) memorization. Eg, Gemini 3's reasoning traces show it thinking:
But the JSON it receives as input has no colors! It's clearly pretty familiar with the tests, even if it might not have seen a particular one before.
And while they can solve them, I'm not sure they're "good" (or human-level efficient)...