I'm writing a book about epistemology. It's about The Problem of the Criterion, why it's important, and what it has to tell us about how we approach knowing the truth.
I've also written a lot about AI safety. Some of the more interesting stuff can be found at the site of my currently-dormant AI safety org, PAISRI.
What, if anything, has changed since your trip? I assume you didn't stay dead once it was over, though maybe you did?
I do a version of this workflow for myself using Claude as an editor/cowriter. Aside from the UI, are you doing anything more than what I can get from just handing Claude a good prompt and my post?
Committing to writing 1 post a week has been a really good practice for me. Although I decided to take December off for a variety of reasons, the need to keep publishing has forced me to become a more efficient writer, to write about topics I might otherwise have neglected, and of course to get more feedback on my writing than I otherwise would if I spent more time effortposting and less time actually posting. I expect to keep it up in 2026.
AI is much more impressive but not much more useful. They improved on many things they were explicitly optimised for (coding, ...)
I feel like this dramatically understates what progress feels like for programmers.
It's hard to overstate what a big deal 2025 was. Like if in 2024 my gestalt was "damn, AI is scary, good thing it hallucinates so much that it can't do much yet", in 2025 it was "holy shit, AI is scary useful!". AI really started to make big strides in usefulness in Feb/March of 2025 and it's just kept going.
I think the trailing indicators tell a different story, though. What they miss is that we're rapidly building products at lower unit operating costs that are going to start generating compounding returns soon. It's hard for me to justify this beyond saying I know what I and my friends are working on and things are gonna keep accelerating in 2026 because of it.
The experience of writing code is also dramatically transformed. A year ago, if I wanted some code to do something, it mostly meant I was going to sit at the keyboard and write code in my editor. Now it means sitting at my keyboard, writing a prompt, getting some code out, running it, iterating a couple of times, and calling it a day, all while writing minimal code myself. It's basically the equivalent of going from writing assembly to a modern language like JavaScript in a single year, something that actually took us 40 years.
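For concreteness, here's roughly what that loop looks like if you wire it up as a script rather than doing it by hand. This is a minimal sketch, assuming the Anthropic Python SDK; the model id, the `generate_and_run` helper, and the "return a single code block" convention are illustrative choices on my part, not a description of any particular tool I actually use.

```python
import re
import subprocess

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def generate_and_run(task: str, max_attempts: int = 3) -> str:
    """Prompt for a script, run it, and feed errors back until it works."""
    messages = [{
        "role": "user",
        "content": f"Write a standalone Python script that does the following:\n{task}\n"
                   "Return only a single fenced code block.",
    }]
    for _ in range(max_attempts):
        reply = client.messages.create(
            model="claude-sonnet-4-20250514",  # assumed model id; substitute whatever you use
            max_tokens=2000,
            messages=messages,
        )
        text = reply.content[0].text
        # Pull out the first fenced code block, falling back to the raw reply.
        match = re.search(r"```(?:python)?\n(.*?)```", text, re.DOTALL)
        code = match.group(1) if match else text
        result = subprocess.run(["python", "-c", code], capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout  # good enough; call it a day
        # Iterate: show the model its own attempt and the error it produced.
        messages.append({"role": "assistant", "content": text})
        messages.append({
            "role": "user",
            "content": f"That script failed with:\n{result.stderr}\n"
                       "Fix it and return only the corrected code block.",
        })
    raise RuntimeError("no working script after several attempts")
```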
This won't show up in the stats, but there was some real reader fatigue on my end. It was a lot to keep up with, and I'm extremely glad we're back to a lower daily volume of posts!
I continue to be excited about this class of approaches. To explain why is roughly to give an argument for why I think self-other overlap is relevant to normative reasoning, so I will sketch that argument here:
But this sketch is easier to explain than to realize. We don't know exactly how humans come to care about others, so we don't know how to implement this in AI. We also know that human care for others is not perfect because evil exists (in that humans sometimes intentionally violate norms with the intent to harm others), so just getting AI that cares is not clearly a full solution to alignment. But, to the extent that humans are aligned, it seems to be because they care about what others care about, and this research is an important step toward building AI that cares about other agents, humans included.
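For readers who want a more concrete picture of what this class of approaches operationalizes, as I understand it: an auxiliary training objective that pushes a model's internal representations on self-referencing inputs toward its representations on matched other-referencing inputs. The sketch below is my own minimal illustration in PyTorch/Transformers style; the layer choice, the prompt pairing, and the loss weight are assumptions for exposition, not the authors' actual setup.

```python
import torch
import torch.nn.functional as F


def self_other_overlap_loss(model, self_batch, other_batch, layer: int) -> torch.Tensor:
    """Mean squared distance between hidden states at one layer for paired
    self-referencing and other-referencing inputs (assumed padded to equal length)."""
    h_self = model(self_batch, output_hidden_states=True).hidden_states[layer]
    h_other = model(other_batch, output_hidden_states=True).hidden_states[layer]
    return F.mse_loss(h_self, h_other)


# During fine-tuning this would be added to the ordinary task loss with a small,
# hand-picked weight, e.g.:
#   total_loss = task_loss + 0.1 * self_other_overlap_loss(model, self_ids, other_ids, layer=12)
```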
I continue to really like this post. I hear people referencing the concept of "hostile telepaths" in conversation sometimes, and they've done it enough that I forgot it came from this post! It's a useful handle for a concept that can be especially difficult to deal with for the type of person likely to read LessWrong, because they themselves lack strong, detailed models of how others think, and so while they can feel the existence of hostile telepaths, they lack a theory of what's going on (or at least did until this post explained it).
Similar in usefulness to Aella's writing about frame control.
I like this post, but I'm not sure how well it's aged. I don't hear people talking about being "triggered" so much anymore, and while I like this post's general point, its framing seems less centrally useful now that we are less inundated with "trigger warnings". Maybe that's because this post captures something of a moment when people began to realize that protecting themselves from being "triggered" was not necessarily a good thing, and we've as a culture naturally shifted away from protecting everyone from anything bad ever happening to them. So while I like it, I have a hard time recommending it for inclusion.
Seems unlikely, as others point out. The prior that the USG may have better versions of things than industry or the general public is reasonable, and in some cases it's borne out (the NSA presumably still has a large lead in crypto and sigint capabilities, for example). But for the USG to have better models than the labs would require either that they be getting those better models from the labs and then not allowing the labs to keep using them, or that they have hired a large cadre of "dark" AI/ML engineers who have been doing secret research for months to years that beats what the public knows (again, not a totally unreasonable prior given the NSA, but we don't even have rumors suggesting that this is going on).