Jacob Watts

I love you!

Don't take me too seriously or trust me too much lol. Plus, there's a real chance that I already changed my mind on anything I've said in the past haha. 

Bit of an EA/rat if I do say so myself. Non violent anarchy is cool; I don't really care for big, opaque authoritarian control structures lol.

Intellectual interests: all sorts of shit lol, but the current shortlist includes: 

  • modern physics (looking for good resources)
  • xxx_hacker-shit_xxx (compassionate, non-violent)
  • technological self-determinism and managing mass externalities
  • meta x-risk mitigation, PauseAI stuff
  • pwning tyranny



My current donation portfolio: there might be a link here at some point

Projects we could collab on: there might be a link here at some point

Art and stuff I'm into rn: there might be a link here at some point

Big intellectual influences: there might be a link here at some point

Wish list / gift list: there might be a link here at some point



A lot of good people are doing a lot of bad things that they don't enjoy doing all the time. That seems weird. They even say stuff like "I don't want to do this". But then they recite some very serious sounding words or whatever and do it anyways.


Lol, okay, on review that reads as privileged. Easy for rectangle-havers to say. 

There is underlying violence keeping a lot of people "at work" and doing the things they don't want to do. An authoritarian violence keeping everyone in place.

The threat is to shelter, food, security, even humanity past a certain point. You don't "go along", we grind you into the ground. Or, "allow you to be ground by the environment we cocreated". 

Many people "do the thing they don't want" because they are under much greater threat of material scarcity or physical violence than I am at present and I want to respect that. 



For specific activities, I would suggest doubling down on activities that you already like to do or have interest in, but which you implicitly avoid "getting into" because they are considered low status. For example: improve your masturbation game, improve your drug game (as in plan fun companion activities or make it a social thing; not just saying do more/stronger drugs), get really into that fringe sub-genre that ~only you like, experiment with your clothes/hair style, explore your own sexual orientation/gender identity, just straight up stop doing any hobbies that you're only into for the status, etc. 

Answer by Jacob Watts

I think the best way to cash in on the fun side of the fun/status tradeoff is probably mostly rooted in adopting a disposition and outlook that allows you to. I think most people limit themselves like crazy to promote a certain image and that if you're really trying to extract fun-bang for your status-buck, then dissolving some of that social conditioning and learning to be silly is a good way to go. Basically, I think there's a lot of fun to be had for those who are comfortable acting silly or playful or unconventional. If you can unlock some of that as your default disposition, or even just a mode you can switch into, then I think practically any given activity will be marginally more fun. 

I think that most people have a capacity to be silly/playful in a way that is really fun, but that they stifle it mostly for social reasons and over time this just becomes a habitual part of how they interact with the world. I don't expect this point to be controversial. 

One of the main social functions of things like alcohol and parties seems to be to give people a clear social license to act silly, playful, and even ~outrageous without being judged harshly. I think that if one is able to overcome some of the latent self-protective psychological constraints that most people develop and gain a degree of genuine emotional indifference towards status, then they can experience much more playfulness and joy than most people normally permit themselves.

I know this isn't really a self contained "Friday night activity" in itself, but I think that general mindset shifts are probably the way to go if you're not terribly status-concerned and looking for ways to collect fun-rent on it. I think there's a lot to be said for just granting yourself the permission to be silly and have fun in general.

While I agree that there are notable differences between "vegans" and "carnists" in terms of group dynamics, I do not think that necessarily disagrees with the idea that carnists are anti-truthseeking. 

"carnists" are not a coherent group, not an ideology, they do not have an agenda (unless we're talking about some very specific industry lobbyists who no doubt exist). They're just people who don't care and eat meat.

It seems untrue that because carnists are not an organized physical group that has meetings and such, they are thereby incapable of having shared norms or ideas/memes. I think in some contexts it can make sense/be useful to refer to a group of people who are not coherent in the sense of explicitly "working together" or having shared newsletters based around a subject or whatever. In some cases, it can make sense to refer to those people's ideologies/norms.

Also, I disagree with the idea that carnists are inherently neutral on the subject of animals/meat. That is, they don't "not care". In general, they actively want to eat meat and would be against things that would stop this. That's not "not caring"; it is "having an agenda", just not one that opposes the current status quo. The fact that being pro-meat and "okay with factory farming" is the more dominant stance/assumed default in our current status quo doesn't mean that it isn't a legitimate position/belief that people could be said to hold. There are many examples of other memetic environments throughout history where the assumed default may not have looked like a "stance" or an "agenda" to the people who were used to it, but nonetheless represented certain ideological claims.

I don't think something only becomes an "ideology" when it disagrees with the current dominant cultural ideas; some things that are culturally common and baked into people from birth can still absolutely be "ideology" in the way I am used to using it. If we disagree on that, then perhaps we could use a different term? 

If nothing else, carnists share the ideological assumption that "eating meat is okay". In practice, they often also share ideas about the surrounding philosophical questions and attitudes. I don't think it is beyond the pale to say that they could share norms around truth-seeking as it relates to these questions and attitudes. It feels unnecessarily dismissive and perhaps implicitly status quoist to assume that: as a dominant, implicit meme of our culture "carnism" must be "neutral" and therefore does not come with/correlate with any norms surrounding how people think about/process questions related to animals/meat.

Carnism comes with as much ideology as veganism even if people aren't as explicit in presenting it or if the typical carnist hasn't put as much thought into it. 

I do not really have any experience advocating publicly for veganism and I wouldn't really know which specific epistemic failure modes are common among carnists in these sorts of conversations, but I have seen plenty of people bend themselves out of shape preserving their own comfort and status quo, so it really doesn't seem like a stretch to imagine that epistemic maladies may tend to present among carnists when the question of veganism comes up.

For one thing, I have personally seen carnists respond in intentionally hostile ways towards vegans/vegan messaging on several occasions. Partly this is because they see it as a threat to their ideas or their way of life, and partly it is because veganism is a designated punching bag that you're allowed to insult in a lot of places. Oftentimes, these attacks draw on shared ideas about veganism/animals/morality that are common among "carnists". 

So, while I agree that there are very different group dynamics, I don't think it makes sense to say that vegans hold ideologies and are capable of exhibiting certain epistemic behaviors, but that carnists, by virtue of not being a sufficiently coherent collection of individuals, could not have the same labels applied to them. 

Thanks! I haven't watched, but I appreciated having something to give me the gist!

Hotz was allowed to drive discussion. In debate terms, he was the con side, raising challenges, while Yudkowsky was the pro side defending a fixed position.

This always seems to be the framing, which seems unbelievably stupid given the stakes on each side of the argument. Still, it seems to be the default; I'm guessing this is status quo bias and the historical tendency of everything to stay relatively the same year by year (less so once technology really started happening). I think AI safety outreach needs to break out of this framing or it's playing a losing game. I feel like, in terms of public communication, whoever's playing defense has mostly already lost. 

The idea that poking a single hole in EY's reasoning settles the whole question is also a really broken norm around these discussions that we are going to have to move past if we want effective public communication. In particular, the combination of "tell me exactly what an ASI would do" and "if anything you say sounds implausible, then AI is safe" is just ridiculous. Any conversation implicitly operating on that basis is operating in bad faith and borderline not worth having. It's not a fair framing of the situation. 

9. Hotz closes with a vision of ASIs running amok

What a ridiculous thing to be okay with?! Is this representative of his actual stance? Is this stance taken seriously by anyone besides him?

not going to rely on a given argument or pathway because although it was true it would strain credulity. This is a tricky balance, on the whole we likely need more of this.

I take it this means not using certain implausible seeming examples? I agree that we could stand to move away from the "understand the lesson behind this implausible seeming toy example"-style argumentation and more towards an emphasis on something like "a lot of factors point to doom and even very clever people can't figure out how to make things safe". 

I think it matters that most of the "technical arguments" point strongly towards doom, but I think it's a mistake for AI safety advocates to try to do all of the work of laying out and defending technical arguments when it comes to public facing communication/debate. If you're trying to give all the complicated reasons why doom is a real possibility, then you're implicitly taking on a huge burden of proof and letting your opponent get away with doing nothing more than cause confusion and nitpick. 

Like, imagine having to explain general relativity in a debate to an audience who has never heard about it. Your opponent continuously just stops you and disagrees with you; maybe misuses a term here and there and then at the end the debate is judged by whether the audience is convinced that your theory of physics is correct. It just seems like playing a losing game for no reason.

Again, I didn't see this and I'm sure EY handled himself fine, I just think there's a lot of room for improvement in the general rhythm that these sorts of discussions tend to fall into.

I think it is okay for AI safety advocates to lay out the groundwork, maybe make a few big-picture arguments, maybe talk about expert opinion (since that alone is enough to perk most sane people's ears and shift some of the burden of proof), and then mostly let their opponents do the work of stumbling through the briars of technical argumentation if they still want to nitpick whatever thought experiment. In general, a leaner case just argues better and is more easily understood. Thus, I think it's better to argue the general case than to attempt the standard shuffle of a dozen different analogies; especially when time/audience attention is more acutely limited.

Would the prize also go towards someone who can prove it is possible in theory? I think some flavor of "alignment" is probably possible and I would suspect it more feasible to try to prove so than to prove otherwise.

I'm not asking to try to get my hypothetical hands on this hypothetical prize money, I'm just curious if you think putting effort into positive proofs of feasibility would be equally worthwhile. I think it is meaningful to differentiate "proving possibility" from alignment research more generally and that the former would itself be worthwhile. I'm sure some alignment researchers do that sort of thing right? It seems like a reasonable place to start given an agent-theoretic approach or similar.

I appreciate the attempt, but I think the argument is going to have to be a little stronger than that if you're hoping for the 10 million lol.

Aligned ASI doesn't mean "unaligned ASI in chains that make it act nice", so the bits where you say:

any constraints we might hope to impose upon an intelligence of this caliber would, by its very nature, be surmountable by the AI


overconfidence to assume that we could circumscribe the liberties of a super-intelligent entity

feel kind of misplaced. The idea is less "put the super-genius in chains" and more so to get "a system smarter than you that wants the sort of stuff you would want a system smarter than you to want in the first place".

From what I could tell, you're also saying something like ~"Making a system that is more capable than you act only in ways that you approve of is nonsense because if it acts only in ways that you already see as correct, then it's not meaningfully smarter than you/generally intelligent." I'm sure there's more nuance, but that's the basic sort of chain of reasoning I'm getting from you. 

I disagree. I don't think it is fair to say that just because something is more cognitively capable than you, it's inherently misaligned. I think this is conflating some stuff that is generally worth keeping distinct. That is, "what a system wants" and "how good it is at getting what it wants" (cf. Hume's guillotine, orthogonality thesis).

Like, sure, an ASI can identify different courses of action/consider things more astutely than you would, but that doesn't mean it's taking actions that go against your general desires. Something can see solutions that you don't see yet pursue the same goals as you. I mean, people cooperate all the time even with asymmetric information and options and such. One way of putting it might be something like: "system is smarter than you and does stuff you don't understand, but that's okay cause it leads to your preferred outcomes". I think that's the rough idea behind alignment.

For reference, I think the way you asserted your disagreement came off kind of self-assured and didn't really demonstrate much underlying understanding of the positions you're disagreeing with. I suspect that's part of why you got all the downvotes, but I don't want you to feel like you're getting shut down just for having a contrarian take. 👍

The doubling time for AI compute is ~6 months



In 5 years compute will scale 2^(5÷0.5)=1024 times


This is a nitpick, but I think you meant 2^(5*2)=1024
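For what it's worth, here is a quick sanity check of that arithmetic. It's a minimal sketch that assumes the doubling time stays fixed at 6 months for the whole 5 years; the name `growth_factor` is just mine, for illustration. Dividing by 0.5 and multiplying by 2 both give 10 doublings, so the two ways of writing the exponent agree and the difference is purely notational.

```python
# Assumption: a constant doubling time of 0.5 years (6 months), as in the quoted claim.
DOUBLING_TIME_YEARS = 0.5
HORIZON_YEARS = 5


def growth_factor(years, doubling_time):
    """Multiplicative growth after `years`, given a fixed doubling time."""
    return 2 ** (years / doubling_time)


# Exponent written as years divided by doubling time: 2^(5 / 0.5)
by_division = growth_factor(HORIZON_YEARS, DOUBLING_TIME_YEARS)

# Exponent written as years times doublings per year: 2^(5 * 2)
by_doublings_per_year = 2 ** (HORIZON_YEARS * 2)

print(by_division, by_doublings_per_year)
```

Swapping in a different doubling time (say 0.75 years) shows how sensitive the 5-year figure is to that one assumption.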


In 5 years AI will be superhuman at most tasks including designing AI


This kind of clashes with the idea that AI capabilities gains are driven mostly by compute. If "moar layers!" is the only way forward, then someone might say this is unlikely. I don't think this is a hard problem, but I think it's a bit of a snag in the argument.


An AI will design a better version of itself and recursively loop this process until it reaches some limit

I think you'll lose some people on this one. The missing step here is something like "the AI will be able to recognize and take actions that increase its reward function". There is enough of a disconnect between current systems and systems that would actually take coherent, goal-oriented actions that the point kind of needs to be justified. Otherwise, it leaves room for something like a GPT-X to just kind of say good AI designs when asked, but which doesn't really know how to actively maximize its reward function beyond just doing the normal sorts of things it was trained to do. 

Such any AI will be superhuman at almost all tasks, including computer security, R&D, planning, and persuasion

I think this is a stronger claim than you need to make and might not actually be that well-justified. It might be worse than humans at loading the dishwasher bc that's not important to it, but if it was important, then it could do a brief R&D program in which it quickly becomes superhuman at dishwasher-loading. Idk, maybe the distinction I'm making is pointless, but I guess I'm also saying that there's a lot of tasks it might not need to be good at if it's good at things like engineering and strategy.

Overall, I tend to agree with you. Most of my hope for a good outcome lies in something like the "bots get stuck in a local maximum and produce useful superhuman alignment work before one of them bootstraps itself and starts 'disempowering' humanity". I guess that relates to the thing I said a couple paragraphs ago about coherent, goal-oriented actions potentially not arising even as other capabilities improve.

I am less and less optimistic about this as research specifically designed to make bots more "agentic" continues. In my eyes, this is among some of the worst research there is.

Personally, I found it obvious that the title was being playful and don't mind that sort of tongue-in-cheek thing. I mean "utterly perfect" is kind of a give away that they're not being serious.

Great post!

As much as I like LessWrong for what it is, I think it's often guilty of a lot of the negative aspects of conformity and coworking that you point out here, i.e. killing good ideas in their cradle. Of course, there are trade-offs to this sort of thing and I certainly appreciate brass-tacks and hard-nosed reasoning sometimes. There is also a need for ingenuity, non-conformity, and genuine creativity (in all of its deeply anti-social glory).

Thank you for sharing this! It helped me feel LessWeird about the sorts of things I do in my own creative/explorative processes and it gave me some new techniques/mindset-things to try.

I suspect there is some kind of internal asymmetry between how we process praise and rejection, especially when it comes to vulnerable things like our identities and our ideas. Back when I used to watch more "content creators" I remember they would consistently gripe that they could read 100 positive comments and still feel most affected by the one or two negative ones.

Well, cheers to not letting our thinking be crushed by the status quo! Nor by critics, internal or otherwise!
