Latest Posts

1 | [Event] Dublin SSC Meetup - Death and Self | 61 Capel Street, Dublin | Dec 14th
35 | [Event] Bay Area Winter Solstice 2019 | 10000 Skyline Boulevard, Oakland | Dec 16th
12 | [Event] Pre-Solstice Unconference 2019 | 4799 Shattuck Avenue, Oakland | Dec 14th
17 | [Event] Catalyst: a collaborative biosecurity summit | 3359 26th Street, San Francisco | Feb 22nd

Recent Discussion

Lately I've come to think of human civilization as largely built on the backs of intelligence and virtue signaling. In other words, civilization depends very much on the positive side effects of (not necessarily conscious) intelligence and virtue signaling, as channeled by various institutions. As evolutionary psychologist Geoffrey Miller says, "it’s all signaling all the way down."

A question I'm trying to figure out now is, what determines the relative proportions of intelligence vs virtue signaling? (Miller argued that intelligence signaling can be considered a kind of virtue signaling, but

... (Read more)

Geoffrey Miller explained in a talk about Virtue signaling and effective altruism (which I saw after writing this post) how things can go wrong when there is too much intelligence signaling:

dangers of runaway intelligence signaling

5 Wei_Dai 8h: In case some people are not convinced of this, Geoffrey Miller argued in How did language evolve? that language itself evolved to allow our ancestors to signal intelligence:
1 interstice 6h: Do you agree that signalling intelligence is the main explanation for the evolution of language? To me, it seems like coalition-building is a more fundamental driving force (after all, being attracted to intelligence only makes sense if intelligence is already valuable in some contexts, and coalition politics seems like an especially important domain). Miller has also argued that sexual signalling is a main explanation of art and music, which Will Buckingham has a good critique of here.
4 romeostevensit 9h: Related. I like to think of signaling as dialects that communities use to communicate social coordination information (who should be paid attention to, who should receive praise or blame, etc.). I think about them in terms of the Buddhist realms:
  • who is good/bad? victims and oppressors, hell realm
  • who controls resources? territory, animal realm
  • who deserves resources? zero sum competitions, hungry ghost realm
  • which achievements are laudable? prestige, titan realm
  • which sorts of enjoyments are available/acceptable/admired? god realm
  • which models hold sway over the group's decision making? understanding and intellect, human realm
Side note: the earliest uses of the term virtue signaling I'm aware of are the PUA community circa ~2011
Embedded World-Models Ω
90 | 1y | 1 min read | Ω 24

(A longer text-based version of this post is also available on MIRI's blog here, and the bibliography for the whole sequence can be found here)

(Edit: This post had 15 slides added on Saturday 10th November.)

The next post in this sequence, 'Embedded Agency', will come out on Sunday, November 4th.

Tomorrow’s AI Alignment Forum sequences post will be 'The easy goal inference problem is still hard' by Paul Christiano, in the sequence 'Value Learning'.

In order to do this, the agent needs to be able to reason approximately about the results of their own computations, which is where logical uncertainty comes in

Expertise Exchange
51 | 2y | 1 min read

There are multiple fields of knowledge that are very hard to learn about and where it's hard to find a rational person to give you good answers about the field of knowledge. 

I want to start this thread to give people the opportunity to request expertise in certain domains that they want to know more about but have a hard time finding good sources on. 

4 ChristianKl 16h: It seems to me like you consider conscious proprioception something important that has little description in mainstream writings and then mix all kinds of different perspectives on the subject together. From my perspective there are a bunch of different traditions that have their own views on the subject and I don't think it's useful to muddy all the different ways together.
1 leggi 5h: When I ask myself what I consider important ... It is getting the message across that:
  • Physically balancing the body is the key to better health, physically and mentally.
  • Working with the 5 main muscles of movement is the method to get there.
  • The 'Base-Line' pelvic floor and rectus abdominis muscles are central to the process.
The first comment I received on LW suggested my intro. post to body alignment was missing a "hook" but everything after the anatomy - including what I write about conscious proprioception - is just words. I'm trying to explain something that I feel, trying to hook a few willing to think about their body and how they move. A simple framework but it requires participation. It's all about the anatomy and building a connection between mind and muscles. From my perspective, this is the underlying anatomy that clarifies much muddiness in many traditions.

It seems to me like you judge traditions to be muddy without knowing anything about them and their developed bodies of knowledge.

Why should I believe that the concepts that you came up with are less muddy than, let's say, the concepts of Rolfing about how to create physical alignment that were refined over decades by Ida Rolf, who learned from Korzybski about how to use language well in Esalen, and later by other people in other places?

Decision Theory Ω
97 | 1y | 1 min read | Ω 24

(A longer text-based version of this post is also available on MIRI's blog here, and the bibliography for the whole sequence can be found here.)

The next post in this sequence, 'Embedded Agency', will come out on Friday, November 2nd.

Tomorrow’s AI Alignment Forum sequences post will be 'What is Ambitious Value Learning?' in the sequence 'Value Learning'.

1 Liam Donovan 1h: Why does being updateless require thinking through all possibilities in advance? Can you not make a general commitment to follow UDT, but wait until you actually face the decision problem to figure out which specific action UDT recommends taking?

Consider the following program:

    def f(n):
        # Base case: the recursion stops at n == 0.
        if n == 0:
            return 1
        return n * f(n-1)

Let’s think about the process by which this function is evaluated. We want to sketch out a causal DAG showing all of the intermediate calculations and the connections between them (feel free to pause reading and try this yourself).

Here’s what the causal DAG looks like:

Each dotted box corresponds to one call to the function f. The recursive call in f becomes a symmetry in the causal diagram: the DAG consists of an infinite sequence of copies of the same subcircuit.
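To make the "copies of the same subcircuit" claim concrete, here is a minimal sketch that unrolls the calls of f into an explicit parent map. The node names (`arg`, `is_zero`, `out`), the helper names, and the choice of exactly three nodes per call are assumptions made for illustration, not the post's own diagram. The sketch also unrolls only the finitely many calls made for a concrete n, whereas the post's DAG is described as an infinite sequence.

```python
import re


def unroll_factorial_dag(n):
    """Return {node: [parent nodes]} for the calls f(n), f(n-1), ..., f(0).

    Depth 0 is the outermost call; depth k is the call that receives n - k.
    """
    dag = {}
    for depth in range(n + 1):
        arg = f"arg[{depth}]"        # argument seen by this call
        test = f"is_zero[{depth}]"   # result of the n == 0 check
        out = f"out[{depth}]"        # value returned by this call
        # A nested call's argument is computed from its caller's argument.
        dag[arg] = [] if depth == 0 else [f"arg[{depth - 1}]"]
        dag[test] = [arg]
        # The return value depends on the check, the argument, and (except in
        # the base case) the nested call's return value.
        dag[out] = [test, arg] + ([f"out[{depth + 1}]"] if depth < n else [])
    return dag


def relative_shape(dag, depth):
    """Describe one call's subcircuit with depths written relative to that call."""
    def rel(name):
        kind, d = re.fullmatch(r"(\w+)\[(\d+)\]", name).groups()
        return kind, int(d) - depth

    return {
        rel(node)[0]: sorted(rel(p) for p in parents)
        for node, parents in dag.items()
        if rel(node)[1] == 0
    }


dag = unroll_factorial_dag(4)
# Every interior call has the same relative shape: that is the symmetry.
print(relative_shape(dag, 1) == relative_shape(dag, 2) == relative_shape(dag, 3))
```

Only the outermost call (no incoming argument edge) and the base case (no nested call) differ from the repeated pattern; every call in between is structurally identical once depths are written relative to the call.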

More generally, we can represent any Tu

... (Read more)
3 johnswentworth 17h: I generally agree with this thinking, although I'll highlight that the brain and a hypothetical AI might not use the same primitives - they're on very different hardware, after all. Certainly the general strategy of "start with a few primitives, and see if they can represent all these other things" is the sort of strategy I'm after. I currently consider causal DAGs with symmetry the most promising primitive. It directly handles causality and recursion, and alongside a suitable theory of abstraction, I expect it will allow us to represent things like spatial/temporal relations, hierarchies, analogies, composition, and many others, all in a unified framework.

Sounds exciting, and I wish you luck and look forward to reading whatever you come up with! :-)

More Dakka
146 | 2y | 12 min read

Epistemic Status: Hopefully enough Dakka

Eliezer Yudkowsky’s book Inadequate Equilibria is excellent. I recommend reading it, if you haven’t done so. Three recent reviews are Scott Aaronson’s, Robin Hanson’s (which inspired You Have the Right to Think and a great discussion in its comments) and Scott Alexander’s. Alexander’s review was an excellent summary of key points, but like many he found the last part of the book, ascribing much modesty to status and prescribing how to learn when to trust yourself, less convincing.

My posts, including Zeroing Out and Leaders of Men have been attempts to ext... (Read more)

I'm reading this again now because I remember liking it and wanted to link it in something I'm writing, however:

Yes, some countries printed too much money and very bad things happened, but no countries printed too much money because they wanted more inflation. That’s not a thing.

That is absolutely a thing that some governments do. Even if we disregard hyperinflation, when a government's tax brackets, spending commitments and sovereign debt are denominated in nominal currency and it needs more money for stuff, the polit... (Read more)

Elevator pitch: Bring enough light to simulate daylight into your home and office.

This idea has been shared in Less Wrong circles for a couple years. Yudkowsky wrote Inadequate Equilibria in 2017 where he and his wife invented the idea, and Raemon wrote a playbook in 2018 for how to do it yourself. Now I and at least two other friends are trying to build something similar, and I suspect there's a bigger-than-it-looks market opportunity here because it's one of those things that a lot of people would probably want, if they knew it existed and could experience it. And it's only recently become c

... (Read more)

Oh, somewhere on Google.

I'm looking to get oriented in the space of "AI policy": interventions that involve world governments (particularly the US government) and existential risk from strong AI.

When I hear people talk about "AI policy", my initial reaction is skepticism, because (so far) I can think of very few actions that governments could take that seem to help with the core problems of AI ex-risk. However, I haven't read much about this area, and I don't know what actual policy recommendations people have in mind.

So what should I read to start? Can people link to plans and pr... (Read more)

Answer by Shri | Dec 09, 2019 | 1

I'll post the obvious resources:

80k's US AI Policy article

Future of Life Institute's summaries of AI policy resources

AI Governance: A Research Agenda (Allan Dafoe, FHI)

Allan Dafoe's research compilation: Probably just the AI section is relevant, some overlap with FLI's list.

The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation (2018). Brundage and Avin et al.: One of the earlier "large collaboration" papers I can recall, probably only the AI Politics and AI Ideal Governance sections are rel... (Read more)

2 Answer by rohinmshah 4h: What? I feel like I must be misunderstanding, because it seems like there are broad categories of things that governments can do that are helpful, even if you're only worried about the risk of an AI optimizing against you. I guess I'll just list some, and you can tell me why none of these work:
  • Funding safety research
  • Building aligned AIs themselves
  • Creating laws that prevent races to the bottom between companies (e.g. "no AI with >X compute may be deployed without first conducting a comprehensive review of the chance of the AI adversarially optimizing against humanity")
  • Monitoring AI systems (e.g. "we will create a board of AI investigators; everyone making powerful AI systems must be evaluated once a year")
I don't think there's a concrete plan that I would want a government to start on today, but I'd be surprised if there weren't such plans in the future when we know more (both from more research, and the AI risk problem is clearer). You can also look at the papers under the category "AI strategy and policy" in the Alignment Newsletter database.

There’s an essay that periodically feels deeply relevant to a situation:

Someday I want to write a self-help book titled “F*k The Karate Kid: Why Life is So Much Harder Than We Think”.

Look at any movie with a training montage: The main character is very bad at something, then there is a sequence in the middle of the film set to upbeat music that shows him practicing. When it's done, he's an expert.

It seems so obvious that it actually feels insulting to point it out. But it's not obvious. Every adult I know--or at least the ones who are depressed--continually suffers from something like sticker

... (Read more)

(Site meta: it would be useful if there was a way to get a notification for this kind of mention)

Some thoughts about specific points:

the whole point of this sequence is to go "Yo, guys, it seems like we should actually be able to be good at this?"

This is true for the sequence overall, but this post and some others you've written elsewhere follow the pattern of "we don't seem to be able to do the thing, therefore this thing is really hard and we shouldn't beat ourselves up about not being able to do it" that seems to come ... (Read more)

2 romeostevensit 9h: Right, that first 20 hours gets you to the 80th-90th percentile and it takes another 200 to get to the 99th. But important cognitive work seems multiplicative more than additive, so getting to the 80-90th percentile in the basics makes a really big difference.
2 romeostevensit 9h: 1. that's not a reasonable standard for the thing? 2. he actually came closer than almost anyone else?
1 TAG 4h: Whether solving politics is reasonable depends on where you are coming from. There's a common assumption round here that Aumann's agreement applies to real life, that people should be able to reach agreement and solve problems, and if they can't, that's an anomaly that needs explanation. The OP is suggesting that the explanation for arch-rationalists such as Hanson and Yudkowsky being unable to agree is lack of skill, whereas I am suggesting that Aumann's theorem doesn't apply to real life, so lack of skill is not the only problem.
28 | 1d | 1 min read

When someone asks me why I did or said something I usually lie because the truthful answer is "I don't know". I literally don't know why I make >99% of my decisions. I think through none of these decisions rationally. It's usually some mixture of gut instinct, intuition, cultural norms, common sense and my emotional state at the time.

Instead, I make up a rational-sounding answer on the spot. Even when writing a mathematical proof I'll start with an answer and then rationalize it post hoc. If I'm unusual it's because I knowingly confabulate. Most humans unknowingly confabulate. This is well-

... (Read more)
3 romeostevensit 9h: Sentence stems for finding more of these: usually your first few answers are cached bullshit. If you answer multiple times quickly enough you'll wind up saying something uncomfortable.
3 mr-hire 7h: These are fairly hard to read with image compression, can you copy paste or share the doc?
[Event] Dublin SSC Meetup - Death and Self
1 | Dec 14th | 61 Capel Street, Dublin

We got into some really interesting discussions last time, so I think we'll continue those this time (at least until we go off on some random tangent).

Topic 1: Death

Topic 2: Self

If anyone has a really great article on either of these topics, please share it here so we can discuss it at the meetup. 

Hope to see you all on Saturday!


It's been said before that your personal habits of thought may be in large part influenced by the way you conduct conversations. The machinery underlying these might just be the same. If you're embedded in a culture that has good epistemic norms for conversations, the quality of your thoughts might also improve.

It's been said before that most disagreement stems from a failure to properly operationalize what you're talking about. That is, we often assume that we understand what our conversational partner is saying, while we also think that they don't understand wha... (Read more)

We will consider a special version of the Smoking Lesion where there is 100% correlation between smoking and cancer, i.e. if you have the lesion, then you smoke and have cancer; if you don't have the lesion, then you don't smoke and don't have cancer. We'll also assume the predictor is perfect in the version of Newcomb's we are considering. Further, we'll assume that the Lesion is outside of the "core" part of your brain, which we'll just refer to as the brain, and that it affects the brain by sending hormones to it.


Notice how similar the ... (Read more)

Oh, I more or less agree :P

If there was one criticism I'd like to repeat, it's that framing the smoking lesion problem in terms of clean decisions between counterfactuals is already missing something from the pre-mathematical description of the problem. The problem is interesting because we as humans sometimes have to worry that we're running on "corrupted hardware" - it seems to me that mathematization of this idea requires us to somehow mutilate the decision theories we're allowed to consider.

To look at this from another ang... (Read more)

3 shminux 12h: Well, I went by the setup presented in the FDT paper (which is terrifyingly vague in most of the examples while purporting to be mathematically precise), and it clearly says that only those with the lesion love smoking. Again, if the setup is different, the numbers would be different. Smoking does not increase the chances of the lesion in this setup! From the FDT paper:
2 Chris_Leong 12h: My issue is that you are doing implicit pre-processing on some of these problems and sweeping it under the rug. Do you actually have any kind of generalised scheme, including all pre-processing steps?
2 shminux 12h: I... do not follow. Unlike the FDT paper, I try to write out every assumption. I certainly may have missed something, but it is not clear to me what. Can you point out something specific? I have explained the missing $1000 checkup cost: it has no bearing on decision making because the cosmic ray strike making one somehow do the opposite of what they intended and hence go and get examined can happen with equal (if small) probability whether they take $1 or $100. If the cosmic ray strikes only those who take $100, or if those who take $100 while intending to take $1 do not bother with the checkup, this can certainly be included in the calculations.

This essay was originally posted in 2007.

Frank Sulloway once said: “Ninety-nine per cent of what Darwinian theory says about human behavior is so obviously true that we don’t give Darwin credit for it. Ironically, psychoanalysis has it over Darwinism precisely because its predictions are so outlandish and its explanations are so counterintuitive that we think, Is that really true? How radical! Freud’s ideas are so intriguing that people are willing to pay for them, while one of the great disadvantages of Darwinism is that we feel we know it already, because, in a sense, we do.”

Suppose you find... (Read more)

Aesthetic preferences are a huge part of our personalities, who would agree to any enhancement that would destroy them? And as long as they’re present, a transhuman will be even more effective at making everything look, sound, smell etc. beautiful — in some form or another (maybe in a simulation if it’s detailed enough and if we decide there’s no difference), because a transhuman will be more effective at everything.

If you’re talking about the human body specifically, I don’t think a believable LMD (with artificial... (Read more)

Q&A with Shane Legg on risks from AI
46 | 8y | 3 min read

[Click here to see a list of all interviews]

I am emailing experts in order to raise and estimate the academic awareness and perception of risks from AI.

Below you will find some thoughts on the topic by Shane Legg, a computer scientist and AI researcher who has been working on theoretical models of super intelligent machines (AIXI) with Prof. Marcus Hutter. His PhD thesis Machine Super Intelligence was completed in 2008. He was awarded the $10,000 Canadian Singularity Institute for Artificial Intelligence Prize.

Publications by Shane Legg:

  • Solomonoff Induction thesis
  • Universal Intelligence
... (Read more)
5 Liam Donovan 8h: Well, it's been 8 years; how close are ML researchers to a "proto-AGI" with the capabilities listed? (embarrassingly, I have no idea what the answer is)

As far as I know no one's tried to build a unified system with all of those capacities, but we do seem to have rudimentary learned versions of each of the capacities on their own.

Affordance Widths
147 | 2y | 2 min read

This article was originally a post on my tumblr. I'm in the process of moving most of these kinds of thoughts and discussions here.

Okay. There’s a social interaction concept that I’ve tried to convey multiple times in multiple conversations, so I’m going to just go ahead and make a graph.

I’m calling this concept “Affordance Widths”.

Let’s say there’s some behavior {B} that people can do more of, or less of. And everyone agrees that if you don’t do enough of the behavior, bad thing {X} happens; but if you do too much of the behavior, bad thing {Y} happens.

Now, let’s say we have five differ... (Read more)

3 Raemon 15h: Nod. And in particular, I saw this post as something like "taking the concept of 'privilege', and fleshing out the gears of one particular facet of it." (Privilege also being a concept that's interwoven with some broader narratives or political maneuvering that I don't fully endorse, but that I have nonetheless found quite useful)

Yes, I didn't frame the post in those terms but you doing so made a bunch of things click for me.

One of the conversations I had recently made me realize my affordance widths with risk taking for money were much different from others', because I don't need tons of money for health issues and my parents can and will support me when worst comes to worst (and I don't have to accept something like abuse to get their help).

This made me really conscious of my privilege around money.

Truth and the Liar Paradox
6 | 5y | 4 min read

Related: The map is not the territory, Unresolved questions in philosophy part 1: The Liar paradox

A well-known brainteaser asks about the truth of the statement "this statement is false". If the statement is true, then the sentence must be false, but if it is false then the sentence must be true. This paradox, far from being just a game, illustrates a question fundamental to understanding the nature of truth itself.

A number of different solutions have been proposed to this paradox (and the closely related Epimenides paradox and Pinocchio paradox). One approach is to reject the principle of... (Read more)

The sentence’s structure is true. The meaning behind the words has no value.

Ungendered Spanish
19 | 2d | 1 min read

Spanish has grammatical gender in a way English doesn't:

una amiga ruidosa — a loud (female) friend
un amigo ruidoso — a loud (male) friend
unas amigas ruidosas — some loud (female) friends
unos amigos ruidosos — some loud (not-all-female) friends

I remember when I was studying Spanish, learning the rule that even if you had a hundred girls and one boy you would use the male plural. My class all thought this was very sexist and unfair, but our teacher told us we were misapplying American norms and intuitions.

It's been interesting, ~twenty years later, following the developm... (Read more)

3 paul ince 15h: Would words like jefe (boss) have to be changed to jefo to specify a male boss? Currently, el jefe is a male boss or a boss without their gender being specified... except that it kinda does specify because it is not la jefa, female boss.

I suspect it would be "le jefe" / "les jefes", with no changes to existing gendered forms.

3 renato 16h: Portuguese uses the same vowel terminations for genders, but our articles are a simple 'a' or 'o' (instead of 'la' 'lo') and we also use 'e' as the "and" connector. It means that the vowels 'i' and 'u' would still be free for the third gender, but we do some vocal accommodation orally (I'm not sure about the correct linguistic term) and often the sound 'e' becomes 'i' and 'o' becomes 'u' (it does not happen the other way). Because of that, all of our vowels are already "taken" with just two genders. I found it fascinating that it works so well in Spanish and not at all in Portuguese, even with both languages being very similar (I feel that Portuguese is slightly more gendered than Spanish).
The Paradox of Robustness Ω
21 | 11h | 1 min read | Ω 12

This post builds on 2-D robustness by identifying a conflicting dynamic present in the decomposition of robustness. Vladimir Mikulik illustrates that robustness can be decomposed into two variables, each varying freely on two axes: robustness in capabilities, and robustness in alignment.

In fact, these quantities are not always orthogonal to each other. Sometimes, getting more of one necessarily means getting less of the other. Hence, the "paradox." Let me explain.

Human DNA is considered robust in the following sense: if you mutate a single gene in any given person's genome, the ... (Read more)

LessWrong is currently doing a major review of 2018 — looking back at old posts and considering which of them have stood the test of time. There are three phases:

  • Nomination (completed)
  • Review (ends Dec 31st)
  • Voting on the best posts (ends January 7th)

We’re now in the Review Phase, and there are 75 posts that got two or more nominations. The full list is here. Now is the time to dig into those posts, and for each one ask questions like “What did it add to the conversation?”, “Was it epistemically sound?” and “How do I know these things?”... (Read more)
