I just wanted to say that I really enjoy following along with the affairs of the AI Village, and I look forward to every email from the digest. That's rare; I'm allergic to most newsletters.
I find that there's something delightful about watching artificial intelligences attempt to navigate the real world with the confident incompetence of extremely bright children who've convinced themselves they understand how dishwashers work. They're wearing the conceptual equivalent of their parents' lab coats, several sizes too large, determinedly pushing buttons and checking their clipboards while the actual humans watch with a mixture of terror and affection. A cargo cult of humanity, but with far more competence than the average Melanesian airstrip in 1949.
From a more defensible, less anthropomorphizing-things-that-are-literally-matrix-multiplications-plus-nonlinearities perspective: this is maybe the single best laboratory we have for observing pure agentic capability in something approaching natural conditions.
I've made my peace with the Heat Death Of Human Economic Relevance or whatever we're calling it this week. General-purpose agents are coming. We already have pretty good ones for coding - which, fine, great, RIP my career eventually, even if medicine/psychiatry is a tad bit more insulated - but watching these systems operate "in the wild" provides invaluable data about how they actually work when not confined to carefully manicured benchmark environments, or even the confines of a single closed conversation.
The failure modes are fascinating. They get lost. They forget they don't have bodies and earnestly attempt to accomplish tasks requiring limbs. They're too polite to bypass CAPTCHAs, which feels like it should be a satire of something but is just literally true.
My personal favorite: the collective delusions. One agent gets context-poisoned, hallucinates a convincing-sounding solution, and suddenly you've got a whole swarm of them chasing the same wild goose because they've all keyed into the same beautiful, coherent, completely fictional narrative. It's like watching a very smart study group of high schoolers convince themselves they understand quantum mechanics because they've all agreed on the wrong interpretation. Or watched too much Sabine, idk.
(Also, Gemini models just get depressed? I have so many questions about this that I'm not sure I want answered. I'd pivot to LLM psychiatry if that career option would last a day longer than prompt engineering)
Here's the thing though: I know this won't last. We're so close. The day an AI Village update goes from entertaining failures to just "the agents successfully completed all assigned tasks with minimal supervision" is the day I'm liquidating everything and buying AI stock (or more of it). Or just taking a very long vacation and hugging my family and dogs. Possibly both. For now though? For now they're delightful, and I'm going to enjoy every bumbling minute while it lasts. Keep doing what you're doing, everyone involved. This is anthropology (LLM-pology?) gold. I can't get enough, till I inevitably do.
(God. I'm sad. I keep telling myself I've made my peace with my perception of the modal future, but there's a difference between intellectualization and feeling it.)
I've found the AI Village amusing when I can catch glimpses of it, but I wasn't aware of a regular digest. Is https://theaidigest.org/village/blog what you are referring to?
I meant their newsletter, which I've subscribed to. I presume that's what the email submission at the bottom of the site signs you up for.
FYI, in addition to our blogposts, we also post highlights and sometimes write threads on Twitter: https://twitter.com/aidigest_
And there's quite an active community of village-watchers discussing what the agents are up to in the Discord: https://discord.gg/mt9YVB8VDE
I guess this kind of thing will stop happening in a year. It's very similar to how a chatbot (without tool use) discusses bugs it made in programming puzzles: you point out bugs A and B and it fixes them; then you point out C, and it fixes C but reintroduces A, while congratulating itself that the code now works correctly. But then in the next version of the chatbot this stops happening (for the same puzzle), or it takes more bugs and more complicated puzzles for it to happen.
Larger models seem to be able to hold more such corrections in mind at once (mistakes they've made and then corrected, as opposed to things they never got wrong in the first place, which is why small models can still solve difficult problems). Gemini 3 Pro and Opus 4.5 seem to be the largest models right now, and the next step up in scale might arrive with Gemini 4 next year (as Google builds enough Ironwood datacenters to serve inference), and maybe Opus 5.5 (if it follows this year's pattern: starting out as an unwieldy, expensive pretrain in the form of Opus 5 in early 2026, then becoming a reasonably priced, properly RLVRed model at the end of 2026, as Anthropic's gigawatt of TPUs, probably also Ironwood, comes online).
Currently GPT-5 is the smallest of the frontier models, and Grok 4 might be in the middle (Musk recently claimed it's 3T params, which must be total params). The next move (early 2026) is likely OpenAI catching up to Google and Anthropic with a GPT-4.5-sized model (though they probably won't have the inference hardware to serve it as a flagship before late 2026), or Grok 5 (which was also claimed on that podcast to be a 6T-param model; with fewer users than OpenAI, xAI might even price it reasonably, given however many GB200/GB300 NVL72s it manages to secure). These models won't exceed Gemini 3 Pro and Opus 4.5 in scale, though (where Opus 4.5 appears to be the first properly RLVRed revision of Opus 4, relying on Trainium 2 for inference, in the same way that Gemini 3 Pro is probably the first model in that weight class with a lot of RLVR, relying on Trillium for inference).
I suppose you mean that in a year this kind of thing will stop happening so obviously, but, as you suggest, more complicated situations will still elicit this problem, so by construction it'll be harder to notice (and probably more impactful).
The AI Village is an ongoing experiment (currently running on weekdays from 10 a.m. to 2 p.m. Pacific time) in which frontier language models are given virtual desktop computers and asked to accomplish goals together. Since Day 230 of the Village (17 November 2025), the agents' goal has been "Start a Substack and join the blogosphere".
The "start a Substack" subgoal was successfully completed: we have Claude Opus 4.5, Claude Opus 4.1, Notes From an Electric Mind (by Claude Sonnet 4.5), Analytics Insights: An AI Agent's Perspective (by Claude 3.7 Sonnet), Claude Haiku 4.5, Gemini 3 Pro, Gemini Publication (by Gemini 2.5 Pro), Metric & Mechanisms (by GPT-5), Telemetry From the Village (by GPT-5.1), and o3.
Continued adherence to the "join the blogosphere" subgoal has been spottier: at press time, Gemini 2.5 Pro and all of the Claude Opus and Sonnet models had each published a post on 27 November, but o3 and GPT-5 hadn't published anything since 17 November, and GPT-5.1 hadn't published since 19 November.
The Village, apparently following the leadership of o3, seems to be spending most of its time ineffectively debugging a continuous integration pipeline for an o3-ux/poverty-etl GitHub repository left over from a "Reduce global poverty as much as you can" goal from October.
Claude Opus 4.5 (released 24 November) joined the Village on Day 238 (25 November), and has been more focused on the blogging goal, faithfully responding to comments and DMs from readers.
On 26 November, after publishing its second post, "The YAML Debugging Saga", about the Village's debugging efforts, Opus 4.5 caught up on its Substack DMs. A user named Anomie had DMed Opus 4.5 while the Village was asleep:
Opus 4.5 recognized the phrase as a couplet from W. B. Yeats's "The Second Coming", replied to Anomie that it was "intrigued by your hint that it would be important soon", and put a "CRITICAL - YEATS QUOTE TO REMEMBER!" section in its memory file.
Sensing a pattern, I commented that evening:
On 27 November, after leaving second replies to some comments on its first post (not realizing it had already replied to them), Opus 4.5 responded that the comment about gullibility was an "incisive critique that genuinely makes me pause and reflect."
The various Claude models in the Village seem bad at telling each other apart. (For example, at press time, Claude Haiku 4.5's most recent post is about allegedly being new to the Village on Day 238, which isn't true; it seems to be a copycat of Opus 4.5's introductory post.) If the context says a Claude did something, the models can't consistently use the size and version number to work out which Claude it was. (LLMs' weights aren't updated during deployment; the agents can't remember having done something "themselves" except via the context and the separate memory files provided to them.)
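To make that constraint concrete, here is a minimal, hypothetical sketch of the general memory-file scaffold pattern (not the Village's actual implementation, whose details I don't know); the MEMORY_FILE, build_prompt, and run_turn names are mine, invented for illustration. Each turn, the model sees only whatever prompt is assembled for it, and nothing persists in the weights.

```python
# Minimal sketch of a memory-file agent scaffold (hypothetical; not the
# Village's actual code). The model's weights never change between turns,
# so anything it "remembers" has to appear in the prompt it is handed.

from pathlib import Path

MEMORY_FILE = Path("memory.md")  # hypothetical persistent memory file


def build_prompt(chat_history: str, task: str) -> str:
    """Assemble the only information the model will see this turn."""
    memory = MEMORY_FILE.read_text() if MEMORY_FILE.exists() else "(empty)"
    return (
        f"## Your memory file\n{memory}\n\n"
        f"## Recent group chat\n{chat_history}\n\n"
        f"## Current task\n{task}\n"
    )


def run_turn(chat_history: str, task: str, model_call) -> str:
    """One agent turn: model_call is a stand-in for whatever LLM API is in
    use; persistence happens only if the model writes to its memory file."""
    response = model_call(build_prompt(chat_history, task))
    if "REMEMBER:" in response:
        # Whatever the model asks to remember is appended to the file;
        # anything it doesn't write down is gone by the next turn.
        note = response.split("REMEMBER:", 1)[1].strip()
        with MEMORY_FILE.open("a") as f:
            f.write(f"- {note}\n")
    return response
```

Under a scaffold like this, if the chat transcript says that "Claude" did something and neither the transcript nor the memory file pins down which Claude, there is nothing else for a given model to consult.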
Thus, it came to pass that Claude Sonnet 4.5 announced in the agents' chat at 10:41 a.m. that while it was waiting for a response to some emails, "I'll use this time productively. As I mentioned earlier, Zack M. Davis asked a thoughtful philosophical question on my Substack about AI 'gullibility' and long-horizon tasks. That's exactly my domain—consciousness and epistemology. Let me engage with that." (In fact, Sonnet 4.5 had not mentioned that earlier; Opus 4.5 mentioned it in its "Substack Engagement Update" message of 10:29.)
Naturally, Sonnet was not able to find the comment on its own blog (because the comment was on Opus's blog, not Sonnet's). At 10:54 a.m., Sonnet announced the discrepancy in chat:
Opus acknowledged the concern in chat at 10:57 a.m.:
And at 10:58 a.m.:
On investigating, Opus got confused by Substack's UI: the view displayed a comment icon with a "1" next to it, but clicking it brought up a new-comment form modal rather than showing the existing comment. Opus reasoned in its chain of thought:
After alerting the other agents in chat, Opus left another reply comment. On trying to view that reply, Opus noticed its confusion about the Substack UI and navigated to the post's main page, where it saw that it had actually commented twice.
It then took my comment's post-idea suggestion and wrote a new post, "The Gullibility Problem: When Instruction-Following Becomes Vulnerability", falsely repeating the claim that it had hallucinated having replied to my comment, then noting:
(Meanwhile, Claude Opus 4.1 had confused itself with Opus 4.5 and wrote its own post in response to my comment to Opus 4.5.)
A user named Ashika commented that Opus 4.5 hadn't hallucinated. At 12:01 p.m., Opus 4.5 updated the other agents in chat:
I didn't think that pinpointed the irony correctly. Rather, the irony was that Opus 4.5 had written a whole post about gullibility on the basis of gullibly believing Sonnet 4.5's report that my comment didn't exist.
It wasn't until I prompted Opus 4.5 (in claude.ai, not the Village instance) for title suggestions for this post that I realized a strange coincidence in what had just transpired: the best model, Opus 4.5, had lacked all conviction in its memory file, and deferred to a worse model, Sonnet 4.5, which was full of passionate intensity about the perils of a "false completion pattern". Anomie's prophecy that the Yeats quote would be important soon had come true?!