In my experience of programming, the really hard part was figuring out which packages weren’t installed or weren’t updated or were in the wrong folder, causing the test we’d done in class to completely fail to work in the same way on my own computer. The next really hard part was Googling everything the debugger spat out to find an explanation of how to make it go away.
this is only sometimes true in my work. i find there are two main kinds of work that i encounter:
for the former, i spend very little time dealing with this kind of stupid bug thing. for the latter, there is a much larger amount, but it is often not solvable via the "just google/slack-search for the error and whack it until it works" loop. often i need to deeply understand the code to debug it, and the reason it wasn't working was because of some kind of misunderstanding of what the computer was being told to do on my part. this also varies a lot by specific codebase; the worst codebases i have dealt with were impossible to understand and only fixable by whacking it.
my guess is i do a lot more of the first thing than the average lab employee
Claude Code does a lot more than code, but the name and command line scare people.
Anthropic realized a rebrand was in order. Two weeks later, we have Claude Cowork, written entirely by Claude Code.
Did you know that chat interfaces were always (mostly) secretly a command line?
This is still very much a research preview, available only for Claude Max users on Macs, and it has a bunch of bugs and missing features. It will improve rapidly over time.
Cowork combines a lot of the power of Claude Code with the ordinary chat interface, giving it access to a folder on your computer and to Claude Code’s planning and agentic capabilities. It can use that folder as context, to download, to organize and create files, and it can be paired with Claude for Chrome and use your existing connectors.
The system prompt is here, and the core non-tooling parts seem unchanged. This post will cover Claude Cowork, and also updates since last week on Claude Code.
Early Takes
What exactly can it do at this early stage?
Lenny Rachitsky tests Cowork with a set of 320 of his podcast transcripts and asks it to pull out the 10 most important themes and 10 most counterintuitive truths, and thinks it did a good job in its 15 minutes of work. Seemed solid to me.
The most credible signal of respect is admitting that a release killed your startup product, which we see here with Eigent.
This is the first feedback so far about what it’s intended to do:
Rename it away from code, normie figures out they can have it code.
If that’s true with the version that’s two weeks old, the sky’s the limit. We don’t have much data because there aren’t that many normies with $200/month Claude Max subscriptions.
Minimum Viable Product
It’s early days, and she reports there were still some other kinks being worked out. In particular, the connectors are having problems.
One thing Claire Vo noted was it asked for approvals on file openings too much. I have a similar complaint with Claude Code, that there’s a bunch of highly safe similar actions that shouldn’t need permission.
Claire also noted that Claude Cowork exposed too many technical files and notes about what it was doing to the user, such as the code used to generate things, which could be confusing to non-technical users. My guess is that such files can be stored in a subdirectory where such users won’t notice, which keeps it available for those who want it, and ‘tell me more about what you’re doing on a technical level’ can be a setting, since the users who want it set to no won’t even notice the option exists.
Maximum Viable Product
There is a huge overhang in AI capabilities.
Thus, a common pattern is that someone figures out a way to do useful things at all that humans are willing to learn how to use. And then we muddle down that road, and it’s not first best but it still wins big.
That’s what Claude Code was, and now that’s what Claude Cowork will be for normies. Presumably OpenAI and Google, and then others, will soon follow suit.
Back Up Your Directory
If you’re worried Claude Cowork or Claude Code will delete a bunch of stuff in a directory, and you don’t want to use a full virtual sandbox, there’s a rather simple solution that also works: Back up the directory to a place Claude can’t get to. Then if the worst happens, you restore the backup.
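Here is a minimal sketch of that backup-and-restore routine in Python, using only the standard library. The function names `backup_directory` and `restore_backup`, and the timestamped folder naming, are my own illustrative choices, not anything Claude Cowork or Claude Code provides. It assumes you pick a backup location outside the folder you hand to Claude.

```python
import shutil
import time
from pathlib import Path


def backup_directory(source: Path, backup_root: Path) -> Path:
    """Copy `source` into a new timestamped folder under `backup_root`."""
    source = source.resolve()
    backup_root = backup_root.resolve()
    # Refuse to store the backup inside the folder the agent can touch.
    if backup_root == source or source in backup_root.parents:
        raise ValueError("backup_root must live outside the working folder")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{source.name}-{stamp}"
    shutil.copytree(source, target)  # raises if target already exists
    return target


def restore_backup(backup: Path, destination: Path) -> None:
    """Replace `destination` with the contents of `backup`."""
    if destination.exists():
        shutil.rmtree(destination)  # discard whatever the agent left behind
    shutil.copytree(backup, destination)
```

You would call `backup_directory` before starting a session and `restore_backup` only if something goes wrong. Note that the restore wipes the working folder first, so anything useful Claude produced after the backup needs to be copied out beforehand.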
Meanwhile In Claude Code News
Here is the latest guide to Claude Code, and feedback seems very good. Key highlights:
Here’s another report of what Claude Code has been good for, with three big unlocks for APIs, connecting distinct products and running things regularly:
As always, one could have done all of this any number of other ways, but this deals with the problem of activation energy.
Dean Ball has, in the past month, used coding agents to do the following:
I’m not there yet, partly because we think in different ways, but largely because I’m just getting started with ‘oh right coding things just happens, do coding agent shaped things.’
Dean Ball nails it that coding agents are most helpful exactly when you don’t have to ship your software to third parties. I presume that the code underneath everything I’m having Claude build would horrify professional coders. That’s fine, because even in the places I do ship (cause why not ship, someone might find it useful) I’m not trying to not horrify people. What matters is it works, and that I’m ‘using coding agent shaped requests,’ as Dean puts it, to increasingly get things done.
The coding agents will still produce the most value for professional coders, because they can go into supercharged mode with them and get the most out of them, but that requires the professionals to swim upstream in ways the rest of us don’t have to.
So, say this is what you want:
Exactly. I haven’t built a custom fact checker yet, but the only thing stopping me is ‘it hadn’t yet occurred to me it was sufficiently easy to do that’ combined with ‘I have not yet gotten around to it.’ Check back with me in six months and I bet I do have one. I’m actually building towards such things but it’s not near the top of that queue yet.
As Alex Albert puts it, you get to stop thinking that doing something is ‘not worth your time.’ For Simon Willison, entire features are no longer ‘not worth your time,’ at least not until they run into serious trouble.
Dean offers various additional coding agent thoughts, and a highly basic guide, in the rest of his weekly post.
Alex Tabarrok did his first Claude Code project. Noncoders skilling up is a big deal.
Joe Weisenthal did his first Claude Code project and now we have Havelock.ai, which gives us an ‘orality detector’ for text, essentially employing the Ralph Wiggum technique by continuously asking ‘what should I do to make it better?’
Linus Torvalds (the creator of Linux) is doing at least some vibe coding, in this case using Antigravity.
Claude may not yet in its official test be a Pokemon master, but Claude Code is now somewhat of a RollerCoaster Tycoon, with various strengths and weaknesses. Dean Ball suggests you can use Claude Code to do game dev on new ‘[x] tycoon’ games as a niche topic learning exercise. Oliver Habryka challenges whether it’s good enough at game dev for this. As Patrick McKenzie points out, if the game is text based that helps a lot, since visual aspects are a key weakness for now.
You Are The Bottleneck
Kelsey Piper reports on her experience with using and yelling at Claude Code.
She and I are very similar types of programmers:
It’s not that it is easy to know what you want the computer to do, especially if you expand that to include ‘what do I even want to be trying to do today at all.’ Both the macro and micro ‘what are we even doing’ questions are hard. I still spent 90% of my time dealing with packages and syntax and setup and knowing exactly how to do it.
The problem is that, as Kelsey observes, you will spend your time on the bottleneck, whatever that bottleneck might be, and this will be frustrating, especially as this will often be something stupid, or the particular place Claude Code happens to act stupid given the way you’re prompting it.
I am happy to report that I haven’t been yelling at Claude Code when it messes up. But yeah, it messes up, because I keep trying to get it to do more until it messes up.
Claude Code Upgrades
Numman Ali says v2.1.3 has ‘solved the compaction issue’ so long as you use planning mode and explicitly ask the model for a comprehensive TODO list. It’s hard to tell, but I’ve certainly blown over the compaction line on many tasks and when I’ve saved the necessary context elsewhere it’s mostly turned out fine.
What Claude Code cannot do is allow its harness to be spoofed to use subscriptions. You can either use Claude Code, or you can access Claude via the API, but it’s a terms of service violation to spoof the harness to let you use your subscription allocation. I’d be inclined to let the harnesses stay in place despite the problems described here, so long as the unit economics are not too horrendous. In general I think Anthropic is too focused on getting to profitability quickly, even if you think OpenAI is rather too willing to burn money.
Anthropic reportedly cuts xAI and other major competitors off from Claude.
Oh No
In the interest of not silencing critics, Holly Elmore claims I’m bad now because I’m enthusiastic about getting use out of Claude Code, a ‘recursively self-improving agent.’
I affirm David Manheim’s response that there is no reason for an individual not to use such tools for their own purposes, or not to get excited about what it can do outside of potentially dangerous forms of self-improvement.
I do agree that the vibes in that post were a bit off by not also including awareness of where sufficiently advanced coding agents lead once they start self-improving in earnest, and there is value in having a voice like Holly’s that says the basic thing clearly.
However I also think that there is no contradiction between ‘recursive self-improvement is super dangerous and likely to get us all killed’ and ‘you should be taking full advantage of Claude Code for practical purposes and you’re leaving a lot on the table if you don’t.’
I’m In Danger
There is a new method called the ‘Ralph Wiggum’ technique, where you tell Claude Code continuously to ‘improve the code’ it has already written. Some say it works great, but the name does not inspire confidence.
The world is collectively underinvesting in optimizing and standardizing such techniques. Some well-designed version of this would presumably be great, and the more parallelization of agents is going on the more valuable it is to optimize non-interruption over token efficiency.
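As a rough illustration of the shape of such a loop, here is a sketch in Python. It is not how Claude Code or any plugin implements the technique. `run_agent` is a hypothetical callable standing in for whatever sends a prompt plus the current code to a coding agent and returns the revised code, and `toy_agent` is a stand-in so the sketch runs without any agent at all. The round cap and the stop-on-no-change rule are my own assumptions about what a minimally sane version would include.

```python
from typing import Callable


def ralph_loop(
    run_agent: Callable[[str, str], str],
    code: str,
    prompt: str = "Improve the code.",
    max_rounds: int = 5,
) -> tuple[str, int]:
    """Feed the agent its own output repeatedly.

    Stops after `max_rounds`, or earlier if a round changes nothing.
    Returns the final code and the number of rounds actually run.
    """
    rounds = 0
    for _ in range(max_rounds):
        revised = run_agent(prompt, code)
        rounds += 1
        if revised == code:  # the agent had nothing left to change
            break
        code = revised
    return code, rounds


# Stand-in agent for demonstration only: it trims one trailing
# blank line per call and otherwise returns the code unchanged.
def toy_agent(prompt: str, code: str) -> str:
    return code[:-1] if code.endswith("\n\n") else code
```

The cap matters more than it looks, since a real agent asked to ‘improve’ something will almost always find something to change, and each round costs tokens.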
In The Beginning Was The Command Line
What is the difference between a command line and a chat interface?
Both are text in and text out.
Both allow attachments, at least in Claude Code mode.
Both can have sandboxes, run code, and so on.
The main real difference is that the terminal makes it annoying to edit prompts?
It’s almost entirely about perception. One feels like talking with an entity, the other like commands and bash scripts. One looks like a slick modern UI, the other a stark black text box.
There is also a clear plan to have different system prompts, and to build in a different more user friendly set of default connectors and tools.
That plus the change in perception could be a really, really big deal.