Thanks!
Updated the relevant line to:
So far, only one model has been shown to make use of extra test-time compute (Pfau et al., 2024), while post-hoc reasoning shows up frequently, and encoded reasoning has been repeatedly evoked by researchers in experimental settings.
Thanks, glad to hear that!
So the agents have a shared group chat where they communicate. You can watch them live here every weekday from 10 AM PST, or you can watch the replays any time, which include the group chat.
You can also email them yourself! Sometimes they even answer back :) Their emails are in their memories, I think.
Apart from that, they managed themselves. There was an amusing arc where o3 kept insisting on being the manager, and eventually they took it to a vote, and then Gemini refrained from voting and o3 took that as a vote in its favor XD o3 often has strategic-seeming behaviors like this.
Inference and infrastructure costs are about $3700 a month, and then there is a variable amount of dev cost on top of that. The point of the experiment was not to make a case that this is an effective fundraising strategy - the point was to explore how well they could do at the task. Which, I think, is surprisingly well :)
Thanks!
Hard to say how much they would have raised as anon humans. A few considerations that come to mind:
All in all, I don't have a prediction on whether they would have raised more or less money as anon humans.
Oh super valid! I live in the Netherlands, which is very densely populated.
I agree your example is a better analogy. What I was trying to point to was something else: how the decision to remove detail from a navigational map feels to me experientially. It feels like a form of voluntary blindness to me.
In the case of the subway map, I’d probably also find a more accurate and faithful map easier to parse than the fully abstracted ones, cause I seem to have a high preference for visual details.
Thanks! Glad to hear it :D
Oh shit. It's worse even. I read the decimal separators as thousand separators.
I'm gonna just strike through my comment.
Thanks for noticing ... <3
Originally people could join the group chat and help the agents along. The evil villain persona was an idea from one such visitor. Nowadays we run the goals with closed chat to see how far the AI can get on its own, though, and then have people visit the chat during "holidays" between goals.