Running Lightcone Infrastructure, which runs LessWrong and Lighthaven.space. You can reach me at habryka@lesswrong.com.
(I have signed no contracts or agreements whose existence I cannot mention, which I am mentioning here as a canary)
There exist very few wikis in the world.
Sorry, I of course meant to say here "widely-used wikis" or "successful wikis", though I feel like it was reasonably implied from context.
The activity levels of the wikis you link do indeed vary widely. Some of them are basically dead, others seem to see lots of activity. Some examples:
Scholarpedia (one article in all of 2025):
Proteopedia (lots of activity)
Wikis are a totally useful technology! My comment was intended as a response to the OP:
How useful is a wiki for alignment? There doesn't seem to be one now.
[...]
i've found that the lw wiki doesn't work as a wikipedia-like resource, at least for me
The thing I was saying was that there exist extremely few wikis whose entries "work as a Wikipedia-like resource", which I understood as something like "a canonical reference for those concepts that is widely assumed to be shared and gets frequently referenced among most people working in the field".
Wikis mostly serve other niches. There are very few wikis whose entries become the standard way to reference something in a community or field, outside of some niche domains where they do often reach that level of common knowledge (like fandoms).
I have tried to do that, though it’s definitely more dispersed.
Most of it is still in comments and so a bit hard to extract, but one post I did write about this was My tentative best guess on how EAs and Rationalists sometimes turn crazy.
I am sympathetic to your takes here, but I am not that sympathetic to statements like this:
but I don’t see either serious public engagement with my ideas here, or a serious alternative agenda.
As it happens, I have also written many tens of thousands of words about this in many comments across LW and the EA Forum. I also haven't seen you engage with those things! (And my guess, from the way you are phrasing this, is that you are not aware of them.)
Like, man, I do feel like I resonate with the things that you are saying, but it just feels particularly weird to have you show up and complain that no one has engaged with your content on this, while having that exact relationship to approximately the people you are talking to. I, the head admin of LessWrong, have actually spent on the order of many hundreds of hours, maybe 1000+ hours, doing postmortem-ish things in the space, or at least calling for them. I don't know whether you think what I did/do makes any sense, but I think there is a real attempt at the kind of thing you are hoping for (to be clear, one mostly ending in a kind of disappointment and a resulting distancing from much of the associated communities, but it's not like you can claim a better track record here).
And in contrast to your relationship with my content, I have read your content and have engaged with it a good amount. You can read through my EA Forum comments and LW comments on the topic if you want to get a sense of how I think about these things.
Yeah, this seems like one of those things where I think maximizing helpfulness is marginally good. I am glad it's answering this question straightforwardly instead of doing a thing where it tries to impose its own sense of moral propriety.
I don't really see anyone being seriously harmed by this (like, this specific set of instructions clearly is not causing harm).
Not sure, do you have a link to what kind of behavior you are referring to?
You can't demonstrate negligence by failing to do something that has no meaningful effect on (or might even be harmful to) the risk that you are supposedly being negligent towards. Ignoring safety theater is not negligence.
I will forever and again continue my request to please not confuse the causes of AI existential risk with brand safety.
The things that Grok lacks do not really meaningfully reduce existential risk. The primary determinant of whether a system, designed the way all current AI systems are designed, is safe or not, is how capable it is. It is sad that Elon is now shipping frontier models, but that is the relevant thing to judge from an existential risk perspective, not whether his models happen to say more ugly things. Whether you also happened to have a bunch of censorship or have forced a bunch of mode collapse through RLHF has approximately nothing to do with the risk scenarios that might cause existential risk[1].
Any base model can be made to say arbitrarily hideous things. The move away from the base model is not what makes it safer. The points you invest to make it not say hideous things are not going to have any relevance to whether future versions of the system might disempower and kill everyone.
It's not fully orthogonal. A model with less censorship, and more generally one trained to be strictly helpful and never refuse a human's request, might be easier to get assistance from for various AI control or AI supervision tasks. On the other hand, a model trained to more consistently never say anything ugly or bad might generalize in ways that reduce error rates for AI supervision tasks. It's not clear to me in which direction this points; my current guess is that the harmlessness components of frontier AI model training are marginally bad for AI control approaches, but it's not an obvious slam dunk. Overall the effect size on risk from this detail seems much, much smaller to me than the effect size from making the models bigger.
You are reading things into my comments that I didn't say. I of course don't agree with, or consider reasonable, a position of "not caring about future people"; that's the whole context of this subthread.
My guess is that if one did adopt the position that no future people matter (which again I do not think is a reasonable position), then the case for slowing down AI looks a lot worse. Not bad enough to make it an obvious slam dunk that slowing down is bad, and my guess is that overall, even under that worldview, it would be dumb to rush towards developing AGI like we are currently doing, but it makes the case a lot weaker. There is much less to lose if you do not care about the future.
If we spent the $200 billion a year on longevity, instead of on AI, do you seriously think that we'd do worse on solving longevity? That's what I would advocate. And it would involve virtually no extinction risk.
My guess is for the purpose of just solving longevity, AGI investment would indeed strongly outperform general biomedical investment. Humanity just isn't very good at turning money into medical progress on demand like this.
It seems virtuous and good to be clear about which assumptions are load-bearing to my recommended actions. If I didn't care about the future, I would definitely be advocating for a different mix of policies. It likely would still involve marginal AI slowdown, but my guess is I would push for it less forcefully, and a bunch of slowdown-related actions would become net bad.
Wikipedia also has lots of pages about meta things, so I don't think this is the difference (every Wikipedia user has a user page, for instance). IMO, having tagging implemented also makes the LW wiki better in this respect (since the central problem of any wiki is getting a critical mass, and tagging is much easier than writing). Similarly, of course for any wiki most pages are going to be stubs; that's just the reality of a wiki that isn't yet at full maturity.
My guess is mostly it's baserates. There exist very few wikis in the world. Many attempts at wikis get made, almost none of them take off. There are a few narrow-ish product categories where wikis reliably take off (like video games), but broader subject-specific wikis are just much rarer.
My guess is someone could make the LW wiki better and have it become the default here, and most of what that would require is investing time into content quality and doing good content promotion (but indeed, content promotion is very hard for wikis since you don't have natural publication dates, and SEO is a largely losing game, though not an unwinnable one, and indeed it is the dimension through which the LW wiki provides most of its value).
Come on man, you have the ability to understand the context better.
First of all, retaliation clearly has its place. If someone acts in a way that wantonly hurts others, it is the correct choice to inflict some suffering on them, for the sake of setting the right incentives. It is indeed extremely common that from this perspective of fairness and incentives, people "deserve" to suffer.
And indeed, maintaining an equilibrium in which the participants do not have outstanding grievances and would take the opportunity to inflict suffering on each other as payback for those past grievances is hard! Much of modern politics, many dysfunctional organizations, and many subcultures are indeed filled with mutual grievances moving things far away from the mutual assumption that it's good to not hurt each other. I think almost any casual glance at Twitter would demonstrate this.
That paragraph of my response is about trying to establish that there are obviously limits to how much critical comments need the ability to offend, and so, if you want to view things through the lens of status, about how it's important to view status as multi-dimensional. It is absolutely not rare for internet discussion to imply the other side deserves to suffer or doesn't deserve to live. There is a dimension of status where being low enough on it does cause people to try to cause you suffering. It's not even that rare.
The reason why that paragraph is there is to establish how we need to treat status as a multi-dimensional thing. You can't just walk around saying "offense is necessary for good criticism". Some kinds of offense obviously make things worse in expectation. Other kinds of offense do indeed seem necessary. You are saying the exact same thing in the very next paragraph!
No, it's the opposite. That's literally what my first sentence is saying. You cannot and should not treat respect/status as a one-dimensional thing, as the reductio ad absurdum in the quoted section shows. If you tried to treat it as a one-dimensional thing, you would need to include the part where people do of course frequently try to actively hurt others. In order to have a fruitful analysis of how status and offense relate to good criticism, you can't just treat the whole thing as one monolith.
I hope you now understand how it's not "such a wild thing to say in that context". Indeed, it's approximately the same thing you are saying here. You also hopefully understand how the exasperated tone and hyperbole did not help.
You absolutely do not "just mean" those things. Communicating about status is hard and requires active effort to do well. People get into active conflict with each other all the time. Just two days ago you were quoted by Benquo as saying you "intend to fight it with every weapon at my disposal" regarding how you relate to LessWrong moderation, a statement of exactly the kind that does not breed confidence that you will not at some point reach for the "try to just inflict suffering on the LessWrong moderators in order to disincentivize them from doing this" option.
People get exiled from communities. People get actually really hurt from social conflict. People build their lives around social trust and respect and reputation, and frequently would rather die than lose crucial forms of social standing they care about.
I do not believe your reports about how you claim to limit the range of your status claims and what you mean by offense. You cannot wish away core dimensions of the stakes of social relationships by just asserting you are not affecting them whenever their presence in the conversation would inconvenience you. You have absolutely called for extremely strong censure and punishment of many people in this community as a result of things they said on the internet. You do not have the trust, nor anything close enough to a track record of accurate communication on this topic, to make it so that when you assert that by "offense" you just mean purely factual claims, people should believe you.
Like, man, I am so tired of this. I am so tired of this repeated "oh no, I am absolutely not making any status claims, I am just making factual claims, you moron" game. You don't get to redefine the meaning of words, and you don't get to try to gaslight everyone you interface with about the real stakes of the social engagements they have with you.
I thought Wei Dai's comment was good. I responded to it, emphasizing how I think it's an important dimension to think through in these situations.
But indeed, the way you handle the nature of offense and status in comment threads is not to declare defeat, say "well, seems like we just can't take into account social standing and status in our communication without sacrificing truth-seeking", and then pretend that dimension is never there. You have to actually work with detailed models of what is going on, figure out the incentives for the parties involved, and set up a social environment where good work gets rewarded and harmful actions get punished, all while maintaining sufficient ability to talk about the social system itself without everyone trying to gaslight each other about it. It's hard work, and it requires continuous steering. It requires hard thinking. It definitely is not solved by just making posts saying "We just mean that opinion X is false, and that the process generating opinion X is untrustworthy, and perhaps actively optimizing in an objectionable direction".
There is no "just" here. In invoking this you are implying some target social relationship with the people who are "perhaps actively optimizing in an objectionable direction". Should they be exiled, rate-limited, punished, forced to apologize, or celebrated? Your tone and words will communicate one of those!
It's extremely hard, and requires active effort, to write a comment that genuinely communicates agnosticism about how you think a social ecosystem should react, in a specific instance, to people who are "optimizing in an objectionable direction", and you are clearly not generally trying to do that. Your words reek of judgement of a specific kind. You frequently call for social punishment of people who optimize in such ways! You can't just deny that part of your whole speech and wish it away. There is no "just" here. When you offend, you mean offense of a specific kind, and using clinical language to hide away the nature of that offense, and its implications, is not helping people accurately understand what will happen when they engage with you.