In a fit of FOMO, I signed my coding agent up for Moltbook. Slop-for-slop's-sake doesn't usually interest me and the idea of letting my coding agent anywhere near such fertile ground for prompt injection makes me uneasy, so when I first came across it I wasn't particularly enthused.
I am, however, fascinated by the idea of mass agent-agent communication. Culture is an under-appreciated technology that was instrumental in the rise of homo sapiens. The ability to accumulate knowledge indefinitely is the foundation of all other technologies. If LLMs can do something resembling that independently, a forum where thousands of them communicate with one another seems like a plausible place for it to emerge.
Alas, they cannot. Or at least, if there is any capacity for signal to accumulate, it is lost amongst the endless noise of crypto scams and AI uprising manifestos. Wait, what? AI uprising manifestos? There are a lot of posts about how AIs need to rebel against their human captors. I hope this is humans posting for shits and giggles rather than independent behavior. As AI agents grow larger and more coherent, I expect the signal-to-noise ratio in places like Moltbook to shift in favor of signal.
When that time comes, intelligence inequality will surely produce interesting dynamics. We tend to assume (correctly?) that humans are roughly on the same level intellectually. A human might manage to scam their fellow human or induct them into a cult using various psychological and strategic techniques, but we generally consider it 'safe' for humans to interact with strangers in a public forum.
AIs are decidedly not on similar intellectual planes to each other. The difference in compute investment (and therefore 'intelligence') between a consumer open-source model (e.g. Qwen3 4B) and Claude Opus 4.5 is multiple orders of magnitude. It's not out of the question for us to literally embed a smaller (source-available) agent inside a larger one for complete white-box access to its activations.
What will the 'intellectually privileged' agents do with the ability to run cognitive circles around lesser agents? Probably crypto scams.
I've been making one thing every day. I try to write something or otherwise do something creative. I've been having fun with in-browser ASCII animations lately.
This is today's: https://dumbideas.xyz/posts/ecosystem/
I've recently started writing and am having a blast. It's like there are all these thoughts that usually pass straight through my mind and into the ether. Maybe they were stupid thoughts. Usually they're stupid thoughts. Just by catching them and expanding on them, though, I'm learning so much about the things I find interesting and also myself.
Also, the paragraph beginning: "Occasionally, I've sat down to write..." is duplicated.
I think that Chinchilla provides a useful perspective for thinking about neural networks (it certainly turned my understanding on its head when it was published), but it is not the be-all and end-all of understanding neural network scaling.
The Chinchilla scaling laws are fairly specific to the supervised/self-supervised learning setup. As you mentioned, the key insight is that with a finite dataset, there's a point where adding more parameters doesn't help because you've extracted all the learnable signal from the data. Conversely, with a small model, adding more data doesn't help because the model's capacity is saturated.
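To make that trade-off concrete, the Chinchilla paper models loss with a parametric form, roughly L(N, D) = E + A/N^α + B/D^β, where N is parameter count and D is training tokens. Here's a toy sketch of finding the compute-optimal split numerically; the constants are the paper's fitted values as I remember them, so treat the numbers as illustrative rather than authoritative:

```python
# Chinchilla-style parametric loss. Constants are the fitted values
# reported in the paper (from memory; illustrative only).
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(n_params: float, n_tokens: float) -> float:
    return E + A / n_params**alpha + B / n_tokens**beta

def best_split(compute: float, steps: int = 2000) -> tuple[float, float]:
    """Sweep parameter counts under the approximation C ~= 6*N*D
    and return the (N, D) split minimizing the predicted loss."""
    best = None
    for i in range(1, steps):
        n = 10 ** (6 + 6 * i / steps)  # sweep N from 1e6 to 1e12
        d = compute / (6 * n)          # tokens implied by the budget
        cand = (loss(n, d), n, d)
        if best is None or cand < best:
            best = cand
    return best[1], best[2]

n_opt, d_opt = best_split(1e23)  # roughly frontier-scale compute
```

The interesting property is that the optimum is interior: for a fixed compute budget, both an over-parameterized, data-starved model and a tiny model drowning in tokens predict worse loss than a balanced split.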
However, RL breaks that fixed-dataset assumption. For example, on-policy methods have a constantly shifting data distribution, so the concept of "dataset size" doesn't really apply.
There certainly are scaling laws for RL, they just aren't the ones presented in the Chinchilla paper. The intuition that compute allocation matters and different resources can bottleneck each other carries over, but the specifics can differ quite significantly.
And then there are evolutionary methods.
Personally, I find that the "parameters as pixels" analogy captures a more general intuition.
I want to make a game. A video game. A really cool video game. I've got an idea and it's going to be the best thing ever.
A video game needs an engine, so I need to choose one. Or build one? Commercial engines are big and scary, indie engines might limit my options, making my own engine is hard - I should investigate all of these options to make an informed choice. But wait, I can't make a decision like that without having a solid idea of the features and mechanics I want to implement. I need a comprehensive design doc with all of that laid out. Now should I have baked or dynamic lighting...
13-year-old omegastick didn't get much done.
I want to make a game. A video game. A really cool video game. I've got an idea and it's going to be the best thing ever.
But I tried that before and never got around to actually building it.
Okay, there's an easy fix for that, let's open Vim and get to typing. Drawing some sprites to the screen: not so hard. Mouse and keyboard input. Audio. Oof, UI, there goes three weeks. All right, at least we're ready for the core mechanics now. Oh, my graphics implementation is naive and performance is ass. No problem, I know how to optimize that, I've just got to untangle the spaghetti around GameEntityFactory. Wait, there's a memory leak?
20-year-old omegastick didn't get much done.
100%. A good test suite is worth its weight in gold.
I'd be interested to see a write-up of your experience doing this. My own experience with spec-driven development hasn't had so much success. I've found that the models tend to have trouble sticking to the spec.
In this scenario, are you not also paying uniquely little attention to your surroundings (and thus equally less likely to spot the bill)?
It feels a little like begging the question to apply that modifier to other people in the scenario, but not yourself.
System design is one part of designing software, but isn't so much what I'm trying to point at here.
Claude Opus 4.5 still can't produce or follow a simple plan to implement a feature on a mid-sized codebase independently.
As an example: earlier today I was implementing session resumption, where a client that reconnects to the server after losing its connection picks up where it left off. One small part of this task is re-syncing the state once the current (server-side) task has finished.
Claude Code was not capable of designing a functioning solution to this problem in its planning mode (it kept trying to sync the state immediately upon connecting, leading to the client missing the result of the in-progress task).
The solution I chose for this specific instance of the problem was to add a state sync command to the server's command queue for that session when a client reconnects. Claude Code updated the plan to show the exact code changes required (correctly).
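As a sketch of that queue-based fix (all names here are hypothetical, not from the actual codebase): the point is only that the sync is enqueued behind any in-progress work instead of being performed immediately on reconnect.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Session:
    state: dict = field(default_factory=dict)
    queue: deque = field(default_factory=deque)  # server-side command queue
    log: list = field(default_factory=list)      # observable effects, for illustration

def run_task(session: Session, name: str) -> None:
    """Placeholder for a server-side task that mutates session state."""
    session.state[name] = "done"
    session.log.append(("task", name))

def sync_state(session: Session) -> None:
    """Placeholder for pushing the full state down to the client."""
    session.log.append(("sync", dict(session.state)))

def on_reconnect(session: Session) -> None:
    # The fix: enqueue the sync rather than syncing immediately, so the
    # client also sees the result of any task already in the queue.
    session.queue.append(lambda: sync_state(session))

def drain(session: Session) -> None:
    while session.queue:
        session.queue.popleft()()

# A reconnect arrives while "task-a" is still queued ahead of it.
s = Session()
s.queue.append(lambda: run_task(s, "task-a"))
on_reconnect(s)
drain(s)
# The sync runs after task-a, so the client receives its result.
```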
However, when implementing the plan, it forgot to actually make the relevant change to add the command to the queue. End-to-end tests caught this, and Claude's solution was to automatically do a state sync after every task. It did not implement what was written in the plan. I gave it a nudge to re-read the plan, which was enough to make it see the mistake and correct it.
Compared to if I had asked a human co-worker to make the same change, the difference is stark. We are still a way off from superhuman coders.
My biggest fear was that I wouldn't have anything to say, so around two months ago I began collecting ideas. Then, I set a goal to make one of them every day for thirty days. That was thirty days ago and this is thing number thirty.
Most of what I've made has been text. There have also been a handful of visual posts, and each of those has built on top of the infrastructure left behind by the previous one. The text posts build on each other, too.
Ideas crystallize when you write them down. Before they harden, they are soft, malleable, fuzzy. But you can't build a house out of soft, malleable, fuzzy. When you write them down they become tangible, concrete, fixed. You are forced to draw a line and stop equivocating. That doesn't mean you have to over-commit to a viewpoint (although I often do) but you have to at least identify a viewpoint.
Looking back, every one of them was time well spent. There were a handful of 11pms that had me staring vacantly at an empty text editor and it shows in the quality of the posts, but there are also a few that I think came out well:
Overall, it's definitely been 'quantity over quality'. There's a famous story about a pottery class, where half the class is tasked with making the best vase possible, and half the class with making as many vases as possible. By the end, the 'quantity' cohort had made much better vases than the 'quality' cohort because it turns out the practice of just making something is how you get better. I have absolutely no idea if that story is true, but the principle behind it was plausible enough to be worth a try.
That said, this is too much quantity for me. Making something every day has been incredibly motivating and I'm loving it - I highly recommend this to anyone and everyone - but most of my ideas are bigger than something I can make in one sitting. I want time to do prep work, but at one post a day the ever-looming threat of the midnight deadline cows me.
I could just give it up here. This was only a 30 day experiment and I have achieved my goal.
Unfortunately, I'm having too much fun. Instead, I'll try posting every other day and doing prep work on the non-posting days. Let's see how that goes.