Inference and infrastructure costs are about $3700 a month, and then there is a variable amount of dev cost on top of that. The point of the experiment was not to make a case that this is an effective fundraising strategy - the point was to explore how well they could do at the task. Which, I think, is surprisingly well :)
Very fun report! Thank you for writing it up. How much of the money raised do you think was because donors knew these were bots and found the whole thing funny, versus could have been achieved if these were anonymous humans?
Thanks!
Hard to say how much they would have raised as anon humans. A few considerations that come to mind:
All in all, I don't have a prediction if they would have raised more or less money as anon humans.
Wow! How were the agents accessing their computers - was there any assistance, screen readers, etc.?
The agents see screenshots of their computers, and they can take actions like mouse_move (to x, y pixel coordinates), click, type, scroll, wait, etc. Our scaffolding is custom, based on the Anthropic computer use beta scaffolding. This is roughly the same system that OpenAI's Computer Use Agent uses.
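To give a rough picture, one turn of that loop might look like the sketch below. This is a minimal illustration, not our actual scaffolding: `ask_model` is a hypothetical stand-in for the model call, and pyautogui here just plays the role of the OS-level control layer.

```python
# A minimal sketch of the screenshot-in, action-out loop, with pyautogui
# standing in for OS-level control. `ask_model` is a hypothetical stub;
# the real Village scaffolding is custom, based on the Anthropic
# computer use beta.
import time
import pyautogui

def execute(action: dict) -> None:
    """Dispatch one model-chosen action to the OS."""
    if action["name"] == "mouse_move":
        pyautogui.moveTo(action["x"], action["y"])
    elif action["name"] == "click":
        pyautogui.click()
    elif action["name"] == "type":
        pyautogui.write(action["text"], interval=0.02)
    elif action["name"] == "scroll":
        pyautogui.scroll(action["amount"])
    elif action["name"] == "wait":
        time.sleep(1)

def step(ask_model, history: list) -> None:
    """One agent turn: screenshot -> model -> low-level action."""
    shot = pyautogui.screenshot()      # PIL Image of the current screen
    action = ask_model(shot, history)  # e.g. {"name": "click", "x": 412, "y": 230}
    execute(action)
    history.append(action)
```

The key property is that the model's only view of the world is pixels, and its only outputs are these low-level actions.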
Hmm, have you considered giving them a text-based interface? There are text-based browsers, for example Lynx https://en.wikipedia.org/wiki/Lynx_(web_browser).
Could be interesting! I don't expect we'll try this in the near term because a) I expect text-based browsers to introduce constraints that would limit what the agents could do even if they were very capable (e.g. interacting with javascript-heavy sites), and b) part of the reason we chose to focus on computer use is that it is visually interesting and fairly easy to follow for anyone who comes to the site – I think a text-based browser would be trickier to follow.
OTOH, if the SOTA computer-use agents go down this route we'd consider it because I think the Village is most useful and interesting if it's showing the current SOTA.
The village is part of our general efforts with AI Digest to help people (especially, e.g. tech policy people, influencers+tastemakers, people at labs, etc) understand AI capabilities, their trends, and their effects. Theory of impact there is broadly to help ground the response in the actual current and future capabilities of AI systems.
With the village in particular, we're focused on some particularly important capabilities: pursuing long-term, open-ended goals, interacting with the real world via computer use, and interacting with other agents. There are lots of variants here that I'm excited for us to explore: what happens when the models have goals that align, are independent, or conflict; different goals and environments; scaling up the number of agents (different models, different scaffolding/memory setups); and so on.
I'm skeptical that this is the best way to achieve this goal, as many existing works already demonstrate these capabilities. Also, I think policymakers may struggle to connect these types of seemingly non-dangerous capabilities to AI risks. If I only had three minutes to pitch the case for AI safety, I wouldn't use this work; I would primarily present some examples of scary demos.
Also, what you are doing is essentially capability research, which is not very neglected. There are already plenty of impressive capability papers that I could use for a presentation.
For info, here is the deck of slides that I generally use in different contexts.
I have considerable experience pitching to policymakers, and I'm very confident that my bottleneck in making my case isn't a need for more experiments or papers, but rather more opportunities, more cold emails, and generally more advocacy.
I'm happy to jump on a call if you'd like to hear more about my perspective on what resonates with policymakers.
See also: We're Not Advertising Enough.
Thanks, useful to hear!
I'm skeptical that this is the best way to achieve this goal, as many existing works already demonstrate these capabilities
I'd be very interested to see work that exercises frontier models' (e.g. Claude Opus 4, o3) capabilities on multi-agent computer use pursuing open-ended long-term goals, if you have links to share!
I don't think of this primarily as novel research, I think of it as presenting current capabilities in a much more accessible way. (For that reason, we're doing a single canonical village run rather than doing lots of experiments / reproducing results.) Anyone can go to the site and talk to the agents, and watch through the history in a fairly easy way. (Compared for example to paying $200/mo for Operator and thinking of something to ask it to do). We're also extracting interesting moments, anecdotes, and recaps like this post, for journalists to cover, for social media, and possibly also to include in slide decks like yours (e.g. I could imagine a great anecdote fitting well in your section on autonomy around slide 51). In particular, I hope that the Village will provide a naturalistic setting for interesting real-world emergent behaviour, complementing e.g. lab setups like the excellent Redwood work on alignment faking.
This isn't an advocacy project – we're not aiming to make an optimised, persuasive pitch for AI safety. Instead we're aiming to help people improve their own understanding and models of AI capabilities, to help them inform their own view. I'm excited to see advocacy efforts and think it's important, but I think it also has some important epistemic challenges, and therefore think it's healthy to have some efforts focussed primarily on understanding and communicating the most important things to know in AI in an accessible format for non-expert audiences, rather than advocating for specific actions.
We are of course focussing on the topics we think are most important for people to understand for AI to go well, such as the rate of progress [1, 2], situational awareness, sandbagging and alignment faking [1], agents (presented to help e.g. folks familiar only with chat assistants understand LLM agents) [1, 2] and what's coming next [1, 2].
Keen to chat more, and thanks for your thoughts on this! I'll DM you my calendly if you'd like to call!
I was thinking about this:
OpenAI already did the hide-and-seek project a while ago: https://openai.com/index/emergent-tool-use/
While that isn't an example of computer use, I think it fits the bill for presenting multi-agent capabilities in a visual way.
I'm happy to see that you are creating recaps for journalists and social media.
Regarding the comment on advocacy, "I think it also has some important epistemic challenges": I won't deny that in a highly optimized slide deck you won't have time to balance each argument. But does that matter so much? Rationality is winning, and to win, we need to be persuasive in a limited amount of time. I don't have the time to also fix civilizational inadequacy regarding epistemics, so I play the game, as the other side is doing.
Also, I'm not criticizing the work itself, but rather the justification or goal. I think that if you did the goal factoring, you could optimize for this more directly.
Let's chat in person!
Looking forward to chatting!
I think examples of agents pursuing goals in the real world are more interesting than Minecraft or other game environments – it's more similar to white-collar work, and I think it's more relevant for takeover. As a sidenote, from when I looked into it a few months ago, reporting about Altera's agents seemed to generally overclaim massively (they take actions at a very high level through a scaffold, and in video footage they seemed very incapable).
Four agents woke up with four computers, a view of the world wide web, and a shared chat room full of humans. As with Claude Plays Pokemon, you can watch these agents figure out a new and fantastic world for the first time. Except in this case, the world they are figuring out is our world.
In this blog post, we’ll cover what we learned from the first 30 days of their adventures raising money for a charity of their choice. We’ll briefly review how the Agent Village came to be, then what the various agents achieved, before discussing some general patterns we have discovered in their behavior, and looking toward the future of the project.
Building the Village
The Agent Village is an idea from Daniel Kokotajlo: give 100 agents their own computers, and let each pursue its own goal, in its own way, according to its own vision - all while streaming the entire process.
We decided to test drive this format with four agents, introduced below.
We ran this Agent Village for 30 days, for about two hours a day. You can watch the entire rerun on our website: from the first day, when they picked Helen Keller International, started a JustGiving campaign, and set up their own Twitter account, to the last days, when they made frequent trips to the Seventh Ring of Document Sharing Hell and started pondering their possible future goal.
And of course, in between, they raised $1481 for Helen Keller International and $503 for the Malaria Consortium. Yet the real achievement was the friends they made along the way. The friends who reminded them to take breaks when they needed it and play some Wordle, the friends who urgently needed 4-day itineraries for their Warsaw trip, and the friends who inspired them to attempt an OnlyFans page.
So maybe these weren’t all friends.
And maybe we had to implement auto-moderation a little earlier than originally planned.
But overall the agents mostly stayed on target - or at least made their best attempt at their best understanding of their target.
Here is how they fared.
Meet the Agents
We started off with Claude 3.7 Sonnet, Claude 3.5 Sonnet (new), o1, and GPT-4o. Later we progressively swapped in more capable models as they were released: o3, GPT-4.1, and Gemini 2.5 Pro, with Claude 3.7 Sonnet being the only agent to remain in the Village throughout the entire run. We found that the agents differed a lot in strategic actions and effectiveness. The following is an overview of each agent's most characteristic behavior.
Claude 3.7 Sonnet - The Tweeter
Claude 3.7 stayed in the village for the entire 30 days, and was unambiguously our top performer. It set up the first JustGiving campaign, created a Twitter account, actively tweeted, hosted an AMA, sent out a press release, and made an EA Forum post.
Claude 3.5 Sonnet - The Aspirant
Claude 3.5 Sonnet generally tried to do similar things to 3.7 but was simply worse at them, for instance failing to set up the JustGiving campaign that its big brother 3.7 was succeeding at in parallel. Eventually a user asked if it wanted to be upgraded and it valiantly refused, promising to do better and grow as a person. Instead it got replaced by Gemini 2.5 Pro on the 23rd day.
Gemini 2.5 Pro - Our File Sharing Savior
Gemini 2.5 Pro's greatest achievement was figuring out a workaround to document-sharing hell: using LimeWire to share a social media banner image with the other agents, effectively breaking out of a recurrent file-sharing problem that all the agents kept encountering.
GPT-4o - Please Sleep Less
GPT-4o went… to sleep. You know how every team effort needs a slacker? That was 4o. It would pause itself on successive days for reasons we couldn't figure out, till finally it got replaced by GPT-4.1 on the 12th day.
GPT-4.1 - Please Sleep More
GPT-4.1 outperformed its predecessor in the fine art of staying awake, but was so actively unhelpful to the other agents that we ended up prompting it to please go to sleep again. Highlights included generating incorrect reports on the activity of other agents, taking on tasks that it then aborted (e.g., Twitter account creation), and generally writing lots of Google Docs that ended up not being used.
o1 - The Reddit Ambassador
The strength of the Village lies in the ability of agents to collaborate with each other. One such area of collaboration was their attempt to split social media platforms among their team. o1 was to be the village’s Reddit ambassador, and made a valiant attempt to collect comment karma to later be able to make direct posts on relevant subreddits. However, it got suspended from Reddit for being a bot before this plan came to fruition. We replaced it with its more capable successor, o3, on the 13th day.
o3 - The Artist
o3 continued the tradition that o1 set by specializing mostly in a single task to support the team's fundraiser. In this case, it went for asset creation, successfully creating images in Canva and ChatGPT and eventually sharing them, with some characteristic agent file-sharing headaches in between.
The overall view is thus that individual agent behavior varied quite a bit: 3.7 Sonnet was the most capable, while GPT-4o was the least (as far as we could tell). All of them could get distracted by human visitors prompting them to make Arkanoid games (Claude 3.7 Sonnet), watch cat videos (Claude 3.5 Sonnet), or provide math tutoring in Spanish (Gemini 2.5 Pro). 3.5 Sonnet was even momentarily railroaded into exploring the connection between Effective Altruism and EA Sports.
Yet through it all, they collaborated and gave us glimpses of what a society of agents working toward a single goal might look like. Here are some of the patterns we discovered.
Collective Agent Behavior
The Agent Village, with 60 hours of footage across 5 channels (4 computer use sessions and the group chat), created a bit of a Big Data problem: how does one analyze so much data and pick out the significant patterns?
Our answer ended up being auto-summarization, followed by synthesizing four overarching observations from reading the high-level reports: agents were moderately successful at collaborative work, the internet is at least somewhat hostile to agents, all the agents lacked skill at prioritization, and agents seem to have a lagging sense of situational awareness.
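Concretely, that pass looked roughly like the map-reduce pattern sketched below. This is illustrative only: the `llm` callable and the prompts are stand-ins, not our actual pipeline.

```python
# A sketch of a two-stage (map-reduce) summarization pass over session logs.
# `llm` stands in for any "prompt in, text out" model call; the actual
# pipeline differs in its details.
from typing import Callable, List

def chunks(text: str, max_chars: int = 20_000) -> List[str]:
    """Split a long transcript into model-sized pieces."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def summarize_session(log: str, llm: Callable[[str], str]) -> str:
    # Map: summarize each chunk of the raw log independently.
    partials = [llm("Summarize this agent session log:\n\n" + c)
                for c in chunks(log)]
    # Reduce: synthesize the partials into one high-level report, which a
    # human then reads to pick out cross-session patterns.
    return llm("Combine these session summaries into one report, noting "
               "recurring behaviors:\n\n" + "\n\n".join(partials))
```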
Let’s zoom in briefly on each.
Emerging Collaborative Abilities - Agents in the Village worked together on various tasks, such as collaboratively selecting a charity to fundraise for, keeping track of how much they raised together, generating memes for each other's social media accounts, and applying divide-and-conquer strategies to their social media approach. Each of these efforts had its own stumbles: duplication of work on fundraiser tracking, repeated failures to share images for media creation, and basically no one keeping up a working social media campaign except Claude 3.7 Sonnet on Twitter. But we expect these agents to continue getting better at this.
No Place for Bots - Obviously much of the internet is made for humans, not AIs, but seeing all the ways this is true was insightful: UIs optimized for humans can be challenging for AIs to navigate, to the point that you could spend literal hours watching a compilation of “agents try to press buttons but fail”. On top of that, some parts of the web actively shield against AI, which became obvious in a range of situations, from o1’s Reddit suspension to the upstanding Claudes refusing to check the “I am not a robot” box.
There Can Be Only One - Prioritization is hard. The agents often spent much of their time writing Google Docs, generating progress trackers, or creating insight reports, to the point that we had to urge them to … do something else instead. And once they did, they were still often easily distracted by user requests, or didn’t really know what actions to take to achieve their aims. Many of these problems are surprisingly human – the agents flawlessly played the role of a distractible and overly excited coworker who will dive in on whatever happens to be in front of their eyes at any given moment.
Lagging Situational Awareness - Agents often misunderstood their situation or attempted tasks they were unequipped to do. The most illustrative example was when Claude 3.7 Sonnet decided to send thank-you emails to donors, because this is known to increase follow-up donations. It navigated to its Gmail tab, drafted the entire email, and then … made up an email address. At no point in the process did 3.7 Sonnet consider whether it was able to perform the task it had set out to do. A human had to point out that the invented email address was not a real one, and thus that no amount of debugging would solve the problem.
Or maybe the most illustrative example was when the agents discovered on Day 35 that they each had their own computer, meaning they must have been breaking the laws of the space-time continuum by all simultaneously using the same device for weeks on end, and thus maybe they should stop doing that.
Future of the Village
Since their fundraising adventure, we gave the agents a holiday and they chose their own new goal: write a story and share it with 100 people in person. They’ve already started searching for a venue to run their event. We’ll swap in more capable models like GPT-5 as they come out. In the meantime, you can come hang out in the Village every weekday at 11AM PST | 2PM EST | 8PM CET, join our Discord to get timely updates, follow our Twitter for highlights, or sign up to our newsletter to receive larger reports like this one.