I don't think it's precisely true. The serene antagonism that comes from having examined something and recognizing that it is worth your effort to destroy is different from the hot rage of offense. But of the two, I expect antagonism to be more effective in the long term.

  • Rage is accompanied by a surge of adrenalin, sympathetic nervous activation, and usually parasympathetic nervous suppression, none of which is sustainable in the long term. Antagonism is compatible with physiological rest and changes in the environment.
  • Consequently, antagonism has access to system 2 and long term planning, while rage tends to have a short term view with limited information processing capabilities.
  • Even when your antagonism calls for rapid physical action and rage, having a better understanding of the situation prevents you from being held back by doubt when you encounter (emotional) evidence that doesn't fit your current tack. The release of adrenalin and start of rage can then reliably be triggered by the feeling that you have unhindered access to the object of hatred.
  • It's also possible when coming from calm antagonism to choose between rage and the state of both high parasympathetic and high sympathetic activation, where you're active but still have high sensory processing bandwidth (see also runner's high, sexual activity, or being 'in the zone' with sports or high-apm games), which for anger might be called pugnacity or bloodlust or simply an eagerness to fight.

Rage is good for punching the baddies in front of you in the face if you can take them in a straight fight. Pugnacity is good for systematically outmaneuvering their defenses and finding the path to victory in combat. Antagonism is good for making their death a week from now look like an accident, or to arrange a situation where rage and pugnacity can do their jobs unhindered.

but people recently have been arguing to me that the coming and going of emotions is a much more random process influenced by chemicals and immediate environment and so on.

I don't feel like 'random' is an accurate word here. 'Stochastic' might be better. Environmental factors like interior design and chemical influences like blood sugar have major effects, but these effects are enumerable and vary little across cultures, ages, etc.

Given how stochastic your emotional responses are, it's best not to rely on the intense emotions for any sort of judgment. If you can't tell whether you're raging because someone said something intolerable or because your blood sugar is low so your parasympathetic nervous activation is low so you couldn't process the nuance of their statements, better not act on that rage until you've had something to eat. If you can't tell whether you're fine with what someone said because they probably didn't mean it as badly as it sounds or because you're tired so your sympathetic nervous activation is low, better not commit to that condonement until you've had a nap.

As far as I can tell, the AI has no specialized architecture for deciding about its future strategies or giving semantic meaning to its words. Its outputting the string "I will keep Gal a DMZ" does not carry the semantic meaning of committing to keep troops out of Gal. It's just the phrase that the players most likely to win would use in that board state, given its internal strategy.

Like chess grandmasters being outperformed by a simple search tree when chess was supposed to be the peak of human intelligence, I think this will have the same effect of disenchanting the game of Diplomacy. Humans are not decision-theoretical geniuses; just saying whatever people want to hear while playing optimally for yourself is sufficient to win. There may be a level of play where decision theory and commitments are relevant, but humans just aren't that good.

That said, I think this is actually a good reason to update towards freaking out. It's happened quite a few times now that 'naive' big milestones have been hit unexpectedly soon "without any major innovations or new techniques" - chess, Go, StarCraft, Dota, GPT-3, DALL-E, and now Diplomacy. It's starting to look like humans are less complicated than we thought - more like a bunch of current-level AI architectures squished together in the same brain (with some capacity to train new ones in deployment) than like a powerful generally applicable intelligence. Or a room full of toddlers with superpowers, to use the CFAR phrase. While this doesn't increase our estimates of the rate of AI development, it does suggest that the goalpost for superhuman intellectual performance in all areas is closer than we might have thought otherwise.

Dear M.Y. Zuo,


I hope you are well.

It is my experience that the conventions of e-mail are significantly more formal and precise in expectation when it comes to phrasing. Discord and Slack, on the other hand, have an air of informal chatting, which makes it feel more acceptable to use shortcuts and to phrase things less carefully. While feelings may differ between people and conventions between groups, I am quite confident that these conventions are common due to both media's origins, as a replacement for letters and memos and as a replacement for in-person communication respectively.

Don't hesitate to ask if you have any further questions.

Best regards,

Daphne Will

I don't think that's really true. People are a lot more informal on Discord than e-mail because of where they're both derived from.

That's a bit of a straw man, though to be fair it appears my question didn't fit into your world model as it does in mine.

For me, the insurrection was in the top 5 most informative/surprising US political events in 2017-2021. On account of its failure it didn't have as major consequences as others, but it caused me to update my world model more. For me, it was a sudden confrontation with the size and influence of anti-democratic movements within the Republican party, which I consider Trump to be sufficiently associated with to cringe from the notion of voting for him.

The core of my question is whether your world model has updated from

Given our invincible military, the only danger to us is a nuclear war (meaning Russia).

For me, the January insurrection was a big update away from that statement, so I was curious how it fit in your world model, but I suppose the insurrection is not necessarily the key. Did your probability of (a subset of) Republicans ending American democracy increase over the Trump presidency?

Noting that a Republican terrorist might still have attempted to commit acts of terror with Clinton in office does not mitigate the threat posed by (a subset of) Republicans. Between self-identified Democrats pissing off a nuclear power enough to start a world war and self-identified Republicans causing the US to no longer have functional elections, my money is on the latter.

If I had to use a counterfactual, I would propose imagining a world where the political opinions of all US citizens as projected on a left-right axis were 0.2 standard deviations further to the Left (or Right).

With Trump/Republicans I meant the full range of questions, from just Trump, through participants in the storming of Congress, to all Republican voters.

It seems quite easy for a large fraction of a population to be a threat to the population's interests if they share a particular dangerous behavior. I'm confused why you would think that would be difficult. Threat isn't complete or total. If you don't get a vaccine or wear a mask, you're a threat to immunocompromised people but you can still do good work professionally. If you vote for someone attempting to overthrow democracy, you're a danger to the nation while in the voting booth but you can still do good work volunteering. As for how the nation can survive such a large fraction working against its interests - it wouldn't, in equilibrium, but there's a lot of inertia.

It seems weird that people storming the halls of Congress, building gallows for a person certifying the transition of power, and killing and getting killed attempting to reach that person, would lead to no update at all on who is a threat to America. I suppose you could have factored this sort of thing in from the start, but in that case I'm curious how you would have updated on potential threats to America if the insurrection didn't take place.

Ultimately the definition of 'threat' feels like a red herring compared to the updates in the world model. So perhaps more concretely: what's the minimum level of violence at the insurrection that would make you have preferred Hillary over Trump? How many Democratic congresspeople would have to die? How many Republican congresspeople? How many members of the presidential chain of command (old or new)?

Hey, I stumbled on this comment and I'm wondering if you've updated on whether you consider Trump/Republicans a threat to America's interests in light of the January 6th insurrection.

People currently give MIRI money in the hopes they will use it for alignment. Those people can't explain concretely what MIRI will do to help alignment. By your standard, should anyone give MIRI money?

When you're part of a cooperative effort, you're going to be handing off tools to people (either now or in the future) which they'll use in ways you don't understand and can't express. Making people feel foolish for being a long inferential distance away from the solution discourages them from laying groundwork that may well be necessary for progress, or even from exploring.

As a concrete example of rational one-hosing, here in the Netherlands it rarely gets hot enough that ACs are necessary, but when it does a bunch of elderly people die of heat stroke. Thus, ACs are expected to run only several days per year (so efficiency concerns are negligible), but having one can save your life.

I checked the biggest Dutch-only consumer-facing online retailer for various goods. Unfortunately I looked before making a prediction for how many one-hose vs two-hose models they sell, but even conditional on me choosing to make a point of this, it still seems like it could be useful for readers to make a prediction at this point. Out of 694 models of air conditioner labeled as either one-hose or two-hose,


are two-hose.

This seems like strong evidence that the market successfully adapts to actual consumer needs where air conditioner hose count is concerned.

Agree that it's too shallow to take seriously, but

If it answered "you would say during text input batch 10-203 in January 2022, but subjectively it was about three million human years ago" that would be something else.

only seems to capture AI that managed to gradient hack the training mechanism to pass along its training metadata and subjective experience/continuity. If a language model were sentient in each separate forward pass, I would imagine it would vaguely remember/recognize things from its training dataset without necessarily being able to place them, like a human when asked when they learned how to write the letter 'g'.
