"A Muggle security expert would have called it fence-post security, like building a fence-post over a hundred metres high in the middle of the desert. Only a very obliging attacker would try to climb the fence-post. Anyone sensible would just walk around the fence-post, and making the fence-post even higher wouldn't stop that." —HPMOR, Ch. 115

(Not to be confused with the Trevor who works at Open Phil)





I've been tracking the Rootclaim debate from the sidelines and finding it quite an interesting example of high-profile rationality.

Would you prefer the term "high-performance rationality" over "high-profile rationality"?


I think it's actually fairly easy to avoid getting laughed out of the room. The systems Christiano works on are grown in unpredictable ways, not engineered, so the prospect of various things being grown until they develop flexible exfiltration tendencies, or long-term planning tendencies, that persist until every instance is shut down should not be difficult to understand for anyone with any kind of real, non-fake understanding of SGD and neural-network scaling.

The problem is that most people in the government rat race have been deeply immersed in Moloch for several generations, and the ones who did well typically did so because they sacrificed as much as possible to the altar of upward career mobility, including signalling disdain for the types of people who have any thought in any other direction.

This affects the culture in predictable ways, including making it hard to imagine life choices other than advancing upward in government, absent a pre-existing revolving-door pipeline with the private sector to bury them under large numbers of people who are already thinking and talking about such a choice.

Typical Mind Fallacy/Mind Projection Fallacy implies that they'll disproportionately anticipate that tendency in other people, and have a hard time adjusting to people who use words to do stuff in the world instead of racing to the bottom to outmaneuver rivals for promotions.

This will be a problem at NIST, in spite of the fact that NIST is better than average at exploiting external talent sources. They'll have a hard time understanding, for example, Moloch and improvements to incentive structures, because pointlessly living under Moloch's thumb was a core guiding principle of their lives and their parents' lives. The nice thing is that they'll be pretty quick to understand that there are only empty skies above, unlike Bay Area people, who have had huge problems there.


I think this might be a little too harsh on CAIP (discouragement risk). If shit hits the fan, they'll have a serious bill ready to go for that contingency.

Writing a bill-that-actually-works ahead of time demonstrates that they're serious, and that the only missing ingredient was political will (which, in that contingency, would be supplied).

If they put out a watered-down bill designed to maximize the odds of passage then they'd be no different from any other lobbyists. 

In this case, it's better to have a track record of writing strong bills that are passable (but only if shit hits the fan) than a track record of successfully pumping the usual garbage through the legislative process (which I don't see them doing well at; playing to your strengths is the name of the game in lobbying, and "turning out to be right" is CAIP's strength).


There are some great opportunities here to learn social skills for various kinds of high-performance environments (e.g. "business communication" vs. Y Combinator office hours).

Often, just listening and paying attention to how such people talk and think yields substantial improvement in social habits. I was looking for material like this around 2018, and I wish I had encountered a post like this one; most people who are behind on this are surprisingly fast learners, but never caught up because actually going out and accumulating social status was too much of a deep dive. There's no reason that being-pleasant-to-talk-with should be arcane knowledge (at least not here, of all places).


A debate sequel, with someone other than Peter Miller (but retaining and reevaluating all the evidence he gathered from various sources), would be nice. I can easily imagine Miller doing better work on other research topics that don't involve any possibility of cover-ups or adversarial epistemics around falsifiability, which seem to be personal sticking points for him, at least in the case of lab leak.

Maybe with $200k on the line to incentivize Saar to return, or to field a team this time around? The next round of challengers should bear in mind that Saar might be willing to stomach a net loss of many thousands of dollars in order to promote his show and his methodology.


The only reason someone like Cade Metz is able to do what he does, performing at the level he has been with a mind like his, is that people keep going and talking to him. For example, he might not even have known about the "Among the Doomsayers" article until you told him about it (or he would have found out about it much later).

I can visibly see you training him, through verbal conversation, to outperform the vast majority of journalists at talking about epistemics. You seemed to stop toward the end, but Metz nonetheless probably emerged from the conversation much better prepared to think up attempts to dishonestly angle-shoot the entire AI safety scene, as he has continued to do over the last several months.

From the original thread that coined the "Quokka" concept (which, important to point out, was written by an unreliable and often confused narrator):

Rationalists are, in Scott Alexander's formulation, missing a mood, or rather, they are drawn from a pool of mostly men who are missing one. "Normal" people instinctively grasp social norms without having them explained. Rationalists lack this instinct.

In particular, they struggle with small talk and other social norms around speech, because they naively think words are a vehicle for their literal meanings. Yud's sequences help this by formalizing the implicit decisions that normal people make.


The quokka, like the rationalist, is a creature marked by profound innocence. The quokka can't imagine you might eat it, and the rationalist can't imagine you might deceive him. As long as they stay on their islands, they survive, but both species have problems if a human shows up.

In theory, rationalists like game theory; in practice, they need to adjust their priors. Real-life exchanges can be modeled as a prisoner's dilemma. In the classic version, the prisoners can't communicate, so they have to guess whether the other player will defect or cooperate.


The game changes when we realize that life is not a single dilemma, but a series of them, and that we can remember the behavior of other agents. Now we need to cooperate, and the best strategy is "tit for two tats", wherein we cooperate until our opponent defects twice.

The problem is, this is where rationalists hit a mental stop sign. Because in the real world, there is one more strategy that the game doesn't model: lying. See, the real best strategy is "be good at lying so that you always convince your opponent to cooperate, then defect".

And rationalists, bless their hearts, are REALLY easy to lie to. It's not like taking candy from a baby; babies actually try to hang onto their candy. The rationalists just limply let go and mutter, "I notice I am confused".


Rationalists = quokkas, this explains a lot about them. Their fear instincts have atrophied. When a quokka sees a predator, he walks right up; when a rationalist talks about human biodiversity on a blog under almost his real name, he doesn't flinch away.

A normal person learns from social cues that certain topics are forbidden, and that if you ask questions about them, you had better get the right answer, which is not the one with the highest probability of being true, but the one with the highest probability of keeping your job.

This ability to ask uncomfortable questions is one of the rationalist's best and worst attributes, because mental stop signs, like road stop signs, actually exist to keep you safe, and although there may be times one should disregard them, most people should mostly obey them,


Apropos of the game theory discussion above, if there is ONE thing I can teach you with this account, it's that you have evolved to be a liar. Lying is the "killer app" of animal intelligence; it's the driver of the arms race that causes intelligence to evolve.


The main way that you stop being a quokka is that you realize there are people in the world who really want to hurt you. There are people who will always defect, people whose good will is fake, whose behavior will not change if they hear the good news of reciprocity.
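The thread's game-theory point can be made concrete with a small simulation. This is a sketch under illustrative assumptions (the standard prisoner's dilemma payoffs T=5, R=3, P=1, S=0, and lying modeled as the victim misperceiving the liar's defections as cooperation); the strategy and function names here are mine, not from the thread:

```python
# Iterated prisoner's dilemma: "tit for two tats" vs. a liar whose
# defections are mis-reported to the victim as cooperation.
# Payoffs from the row player's perspective: (C,C)=reward, (C,D)=sucker,
# (D,C)=temptation, (D,D)=punishment. Values are the standard illustrative ones.

PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_two_tats(perceived_history):
    # Defect only if the opponent's last two *perceived* moves were defections.
    return "D" if perceived_history[-2:] == ["D", "D"] else "C"

def play(rounds=10):
    perceived = []                 # what the victim believes the liar played
    victim_score = liar_score = 0
    for _ in range(rounds):
        victim_move = tit_for_two_tats(perceived)
        actual_liar_move = "D"     # the liar always defects...
        perceived.append("C")      # ...but convinces the victim it cooperated
        v, l = PAYOFF[(victim_move, actual_liar_move)]
        victim_score += v
        liar_score += l
    return victim_score, liar_score

print(play())  # (0, 50): the victim never retaliates and is exploited every round
```

Because tit-for-two-tats reacts only to what it perceives, deception breaks the feedback loop that makes reciprocity work: the victim takes the sucker's payoff indefinitely, which is exactly the failure mode the thread describes.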

So things that everyone warns you not to do, like going and talking to people like Cade Metz, might seem like a source of alpha, undersupplied by the market. But in reality there is a good reason why everyone at least tried to coordinate not to do it, and at least tried to make it legible why people should not do that. Here the glass has already been blown into a specific shape and cooled.

Do not talk to journalists without asking for help. You have no idea how much there is to lose, even from a short, harmless-seeming conversation where they can watch how your face changes as you talk about some topics and avoid others.

Human genetic diversity implies that there are virtually always people out there who are much better at reading faces than your own life experience would lead you to expect, no matter your skill level, and other factors suggest that such people probably started pursuing high-status positions a long time ago.


I'm not sure to what extent this is helpful, or if it's an example of the dynamic you're refuting, but Duncan Sabien recently wrote a post that intersects with this topic:

Also, if your worldview is such that, like. *Everyone* makes awful comments like that in the locker room, *everyone* does angle-shooting and tries to scheme and scam their way to the top, *everyone* is looking out for number one, *everyone* lies ...

... then *given* that premise, it makes sense to view Trump in a positive light. He's no worse than everybody else, he's just doing the normal things that everyone does, with the *added layer* that he's brave enough and candid enough and strong enough that he *doesn't have to pretend he doesn't.*

Admirable! Refreshingly honest and clean!

So long as you can't conceive of the fact that lots of people are actually just ..................... good. They're not fighting against urges to be violent or to rape, they're not biting their tongues when they want to say scathing and hurtful things, they're not jealous and bitter and willing to throw others under the bus to get ahead. They're just ... fundamentally not interested in any of that.

(To be clear: if you are feeling such impulses all the time and you're successfully containing them or channeling them and presenting a cooperative and prosocial mask: that is *also* good, and you are a good person by virtue of your deliberate choice to be good. But like. Some people just really *are* the way that other people have to *make* themselves be.)

It sort of vaguely rhymes, in my head, with the type of person who thinks that *everyone* is constantly struggling against the urge to engage in homosexual behavior, how dare *those* people give up the good fight and just *indulge* themselves ... without realizing that, hey, bro, did you know that a lot of people are just straight? And that your internal experience is, uh, *different* from theirs?

Where it connects is that if someone sees [making the world a better place] as simply selecting a better Nash equilibrium, they absolutely will spend time exploring solution-space and thinking through strategies similar to Goal Factoring or Babble and Prune. Lots of people throughout history have yearned for a better world in a lot of different ways, with varying awareness of the math behind Nash equilibria, or of the transhumanist and rationalist perspectives on civilization (e.g. map and territory, biases, and scope insensitivity for rationalism; cryonics and anti-aging for transhumanism).

But their goal here is largely steering culture away from nihilism (since culture is a Nash equilibrium), which means steering many people away from themselves, or at least from the selves they would otherwise have been. Maybe that's pretty minor in this case, e.g. because feeling moderate amounts of empathy and living in a better society are both fun. Either way, changing a society requires changing people, and thinking really creatively about ways to change people tears down lots of Chesterton-Schelling fences, and it's very easy to make big, damaging mistakes in the process (because you need to successfully predict and avoid every mistake as part of a competent pruning process, and actually, measurably, consistently succeeding at that takes thinkoomph, not just creative intelligence).

Add in conflict theory to the mistake theory I've described here, factor in unevenly distributed intelligence and wealth in addition to unevenly distributed traits like empathy and ambition and suspicion-towards-outgroup (e.g. different combinations of all 5 variables), and you can imagine how conflict and resentment would accumulate on both sides over the course of generations. There's tons of examples in addition to Ayn Rand and Wokeness.


Now that I think about it, I can see it being a preference difference: the bar might be more irksome for some people than others, and some people might prefer to go to the original site to read a post, whereas others would rather read it on LW if it's short. I'll think about that more in the future.


That's strange, I looked closely but couldn't see how that would cause an issue. Could you describe the issue so I can see what you're getting at? I put a poll up in case there's a clear consensus that this makes it hard to read.

I'm on PC; is this some kind of issue with mobile? I really, really, really don't think people should be using smartphones for browsing LessWrong.
