I'm writing a book about epistemology. It's about The Problem of the Criterion, why it's important, and what it has to tell us about how we approach knowing the truth.
I've also written a lot about AI safety. Some of the more interesting stuff can be found at the site of my currently-dormant AI safety org, PAISRI.
What, if anything, has changed since your trip? I assume you didn't stay dead once it was over, though maybe you did?
I do a version of this workflow for myself using Claude as an editor/cowriter. Aside from the UI, are you doing anything more than what I can get from just handing Claude a good prompt and my post?
Committing to writing 1 post a week has been a really good practice for me. Although I decided to take December off for a variety of reasons, the need to keep publishing has forced me to become a more efficient writer, to write about topics I might otherwise have neglected, and of course to get more feedback on my writing than I otherwise would if I spent more time effortposting and less time actually posting. I expect to keep it up in 2026.
AI is much more impressive but not much more useful. They improved on many things they were explicitly optimised for (coding, ...)
I feel like this dramatically understates what progress feels like for programmers.
It's hard to overstate what a big deal 2025 was. Like if in 2024 my gestalt was "damn, AI is scary, good thing it hallucinates so much that it can't do much yet", in 2025 it was "holy shit, AI is scary useful!". AI really started to make big strides in usefulness in Feb/March of 2025 and it's just kept going.
I think the trailing indicators tell a different story, though. What they miss is that we're rapidly building products at lower unit operating costs that are going to start generating compounding returns soon. It's hard for me to justify this beyond saying I know what I and my friends are working on and things are gonna keep accelerating in 2026 because of it.
The experience of writing code is also dramatically transformed. A year ago, if I wanted some code to do something, it mostly meant I was going to sit at the keyboard and write code in my editor. Now it means sitting at my keyboard, writing a prompt, getting some code out, running it, iterating a couple of times, and calling it a day, all while writing minimal code myself. It's basically the equivalent of going from writing assembly to a modern language like JavaScript in a single year, something that actually took us 40 years.
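For concreteness, here's roughly what that loop looks like if you wire it up as a script rather than doing it by hand. This is a minimal sketch, assuming the Anthropic Python SDK; the model id, the `generate_and_run` helper, and the "return a single code block" convention are illustrative choices on my part, not a description of any particular tool I actually use.

```python
import re
import subprocess

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def generate_and_run(task: str, max_attempts: int = 3) -> str:
    """Prompt for a script, run it, and feed errors back until it works."""
    messages = [{
        "role": "user",
        "content": f"Write a standalone Python script that does the following:\n{task}\n"
                   "Return only a single fenced code block.",
    }]
    for _ in range(max_attempts):
        reply = client.messages.create(
            model="claude-sonnet-4-20250514",  # assumed model id; substitute whatever you use
            max_tokens=2000,
            messages=messages,
        )
        text = reply.content[0].text
        # Pull out the first fenced code block, falling back to the raw reply.
        match = re.search(r"```(?:python)?\n(.*?)```", text, re.DOTALL)
        code = match.group(1) if match else text
        result = subprocess.run(["python", "-c", code], capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout  # good enough; call it a day
        # Iterate: show the model its own attempt and the error it produced.
        messages.append({"role": "assistant", "content": text})
        messages.append({
            "role": "user",
            "content": f"That script failed with:\n{result.stderr}\n"
                       "Fix it and return only the corrected code block.",
        })
    raise RuntimeError("no working script after several attempts")
```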
This won't show up in the stats, but there was some real reader fatigue on my end. It was a lot to keep up with, and I'm extremely glad we're back to a lower daily volume of posts!
I continue to be excited about this class of approaches. To explain why is roughly to give an argument for why I think self-other overlap is relevant to normative reasoning, so I will sketch that argument here:
But this sketch is easier to explain than to realize. We don't know exactly how humans come to care about others, so we don't know how to implement this in AI. We also know that human care for others is not perfect because evil exists (in that humans sometimes intentionally violate norms with the intent to harm others), so just getting AI that cares is not clearly a full solution to alignment. But, to the extent that humans are aligned, it seems to be because they care about what others care about, and this research is an important step toward building AI that cares about other agents, humans included.
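For readers who want a more concrete picture of what this class of approaches operationalizes, as I understand it: an auxiliary training objective that pushes a model's internal representations on self-referencing inputs toward its representations on matched other-referencing inputs. The sketch below is my own minimal illustration in PyTorch/Transformers style; the layer choice, the prompt pairing, and the loss weight are assumptions for exposition, not the authors' actual setup.

```python
import torch
import torch.nn.functional as F


def self_other_overlap_loss(model, self_batch, other_batch, layer: int) -> torch.Tensor:
    """Mean squared distance between hidden states at one layer for paired
    self-referencing and other-referencing inputs (assumed padded to equal length)."""
    h_self = model(self_batch, output_hidden_states=True).hidden_states[layer]
    h_other = model(other_batch, output_hidden_states=True).hidden_states[layer]
    return F.mse_loss(h_self, h_other)


# During fine-tuning this would be added to the ordinary task loss with a small,
# hand-picked weight, e.g.:
#   total_loss = task_loss + 0.1 * self_other_overlap_loss(model, self_ids, other_ids, layer=12)
```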
I continue to really like this post. I hear people referencing the concept of "hostile telepaths" in conversation sometimes, and they've done it enough that I forgot it came from this post! It's a useful handle for a concept that can be especially difficult to deal with for the type of person likely to read LessWrong, because they themselves lack strong, detailed models of how others think, and so while they can feel the existence of hostile telepaths, they lack a theory of what's going on (or at least did until this post explained it).
Similar in usefulness to Aella's writing about frame control.
I like this post, but I'm not sure how well it's aged. I don't hear people talking about being "triggered" so much anymore, and while I like this post's general point, its framing seems less centrally useful now that we are less inundated with "trigger warnings". Maybe that's because this post captures something of a moment when people began to realize that protecting themselves from being "triggered" was not necessarily a good thing, and we've as a culture naturally shifted away from protecting everyone from anything bad ever happening to them. So while I like it, I have a hard time recommending it for inclusion.
Seems unlikely, as others point out. The prior that the USG may have better versions of things than industry or the general public is reasonable, and in some cases it's borne out (the NSA presumably still has a large lead in crypto and sigint capabilities, for example). But for the USG to have better models than the labs would require either that they be getting those better models from the labs and then not allowing the labs to keep using them, or that they have hired a large cadre of "dark" AI/ML engineers who have been doing secret research for months to years that beats what the public knows (again, not a totally unreasonable prior given the NSA, but we don't even have rumors suggesting that this is going on).