Recent Discussion

Moloch feeds on opportunity
26 · 1d · 2 min read

This is very likely my most important idea, and I've been trying to write about it for years, but it's too important to risk writing about badly, so I've never mustered the courage. I guess I'll just do a quick test round.

It starts with this "hierarchy of needs" model, first popularized by Maslow, that we humans tend to focus on one need at a time.

  • If you're hungry, you don't care about aesthetics that much
  • If you are competing for status, you can easily be tempted to defect on coordination problems
  • etc

I like to model this roughly as an ensemble of subag... (Read more)

But here's the kicker: in this globalist hyper-connected century, we don't really run out of perceived opportunity anymore. What does happen is that we're perpetually stuck with motivations that people in the past would have perceived as morally depraved.

Agreed, but it's also worth noting that this can run the other way too. This globalist hyper-connected century can also provide us with motivations that seem unusually noble. Part of the 80000 Hours schtick is the idea that we're uniquely advantaged to do extremely good things for... (Read more)

G Gordon Worley III (8h): This is an aside, but I want to think about what this means in terms of predictive processing. What would it mean, in PP terms, to have an experience that could be reified as "running out of perceived opportunity"? The running out part seems straightforward: that's having succeeded in hitting a setpoint (and this fits with the satiation explanation you've been using). So that would make the setpoint perceived opportunity, but what does that mean?

I think the trick is to understand it as expectation of getting what we want. That is, an opportunity is an expectation to minimize deviation from some setpoint, in this case let's say for status (keeping in mind that at a neurological level there is almost certainly not a single control system for status on a single variable; it is instead a thing made up of many little parts that get combined together in correlated ways that allow us to reasonably lump them together as "status"). Thus it seems this phenomenon of status satisficing is explainable and would be predicted by PP, contingent on there being neurological encoding of status via setpoints, and status is such a robust phenomenon in humans that it seems unlikely that this would not be the case.
toonalfrink (11h): You're good. You're just confessing something that is true for most of us anyway. Up to a point. It is certainly true that status motivations have led to great things, and I'm personally also someone who is highly status-driven but manages to mostly align that drive with at least neutral things, but there's more.

The other great humanist psychologist besides Maslow was Carl Rogers. His thinking can be seen as an expansion on this "subagent motivation is perceived opportunity" idea. He proposed an ideal vs. an actual self. The ideal self is what you imagine you could and should be. Your actual self is what you imagine you are. The difference between ideal self and actual self, he said, was the cause of suffering. I believe that Buddhism backs this up too.

I'd like to expand on that and say that the difference between your ideal self (which seems like a broader class of things that includes perceived opportunity but also social standards, the conditions you're used to, biological hardwiring, etc.) and your actual self is the thing that activates your subagents. The bigger the difference, the more your subagents are activated by it. Furthermore, the level of activation of your subagents causes cognitive dissonance (a.k.a. akrasia), i.e. one or multiple of your subagents not getting what they want even though they're activated. And THAT is my slightly-more-gears-level model of where suffering comes from.

So here's what I think is actually going on with you: you're torn between multiple motivations until the status subagent comes along and pulls you out of your deadlock because it's stronger than everything else. So now there's less cognitive dissonance and you're happy that this status incentive came along. It cut your Gordian knot. However, I think it's also possible to resolve this dissonance in a more constructive way, i.e. untie the knot. In some sense the status incentive pushes you into a local optimum.

I realise that I'm probably hard to follow. The
Bucky (5h): I think that's a good explanation. I agree that the solution to akrasia I describe is kind of hacked together and is far from ideal. If you have a better solution to this I would be very interested and it would change my attitude to status significantly. I suspect that this is the largest inferential gap you would have to cross to get your point across to me, although as I mentioned I'm not sure how central I am as an example. I'm not sure suffering is the correct frame here - I don't really feel like akrasia causes me to suffer. If I give in then I feel a bit disappointed with myself, but the agent which wants me to be a better person isn't very emotional (which I think is part of the problem). Again there may be an inferential gap here.
Eggy Waffles
8 · 27m · 1 min read

A few months ago I posted about making eggy crepes as a way to get the kids to eat more protein. Lately they've been interested in waffles, and I wanted to figure out how to make something similar. It turns out if you search for [eggy waffles] you tend to find people making waffles that have only slightly more eggs than usual. We can do better than that:

  • 3 eggs
  • 1/4 cup full-fat Greek yoghurt
  • 2 tbsp butter
  • 2 tbsp flour
As with the crepes, they taste a bit like eclairs, but that's not a bad thing and the kids like them a lot.

It's possible that they could use a bit less flour, but if... (Read more)

In How I do research, TurnTrout writes:

[I] Stare at the problem on my own, ignoring any existing thinking as much as possible. Just think about what the problem is, what's confusing about it, what a solution would look like. In retrospect, this has helped me avoid anchoring myself. Also, my prior for existing work is that it's confused and unhelpful, and I can do better by just thinking hard.

The MIRI alignment research field guide has a similar sentiment:

It’s easy to fall into a trap of (either implicitly or explicitly) conceptualizing “research” as “first studying and learning what’s al

... (Read more)
Answer by rohinmshah (2h): I basically disagree with the recommendation almost always, including for AI alignment. I do think that I often see the sentiment, "I'm going to learn linear algebra, probability theory, computational complexity, machine learning and deep RL, and then I'll have the prerequisites to do AI safety". (Possible reasons for this: the 80K AI safety syllabus, CHAI's bibliography, a general sense that you have to be an expert before you can do research.) This sentiment seems wrong to me; you definitely can and should think about important questions before learning everything that could potentially be considered "background".

The advice sounds to me like "when you feel like existing research would be useful, then go ahead and look at it, but don't feel like it's necessary", whereas I would say "as soon as you have questions, which should be almost immediately, one of the first things you should do is find the existing research and read it". The justification for this is the standard one -- people have already done a bunch of work that you can take advantage of.

The main disadvantage of this approach is that you lose the opportunity to figure things out from first principles. When you figure things out from first principles, you often find many branches that don't work out, which helps build intuitions about why things are the way they are; you don't get that nearly as well by reading about research, and you can't go back and figure things out from first principles once you already know the answer. But this first-principles reasoning is extremely expensive (in time), and is almost never worthwhile. Another potential disadvantage is that you might be incorrectly convinced that a technique is good, because you don't spot the flaws in it when reading existing research, even though you could have figured it out from first principles. My preferred solution is to become good at

I often see the sentiment, "I'm going to learn linear algebra, probability theory, computational complexity, machine learning and deep RL, and then I'll have the prerequisites to do AI safety".

Yeah, that feels like a natural extension of "I'm not allowed to have thoughts on this yet, so let me get enough social markers to be allowed to think for myself." Or "be allowed a thinking license."

Matthew Barnett (1h): See also my shortform post about this.
Answer by FactorialCode (4h): I think this is mainly a function of how established the field is and how much time you're willing to spend on the subject. The point of thinking about a field before looking at the literature is to avoid getting stuck in the same local optima as everyone else. However, making progress by yourself is far slower than just reading what everyone has already figured out. Thus, if you don't plan to spend a large amount of time in a field, it's far quicker and more effective to just read the literature. However, if you're going to spend a large amount of time on the problems in the field, then you want to be able to "see with fresh eyes" before looking at what everyone else is doing. This prevents everyone's approaches from clustering together.

Likewise, in a very well established field like math or physics, we can expect everyone to have already clustered around the "correct answer". It doesn't make as much sense to try to look at the problem from a new perspective, because we already have a very good understanding of the field. This reasoning breaks down once you get to the unsolved problems in the field. In that case, you want to do your own thinking to make sure you don't immediately bias your thinking towards solutions that others are already working on.

I’m confused why people are so bad at dating. It seems to me like there are tons of $20 bills lying on the ground which no one picks up.

For example, we know that people systematically choose unattractive images for their dating profiles. Sites like PhotoFeeler cheaply (in some cases, freely) resolve this problem. Since photo quality is one of the strongest predictors of number of matches, you would think people would be clamoring to use these sites. And yet, not many people use them.

In the off-line dating world, it surprises me how few self-help books are about dating. Right now, zer... (Read more)

Maximizing proportion of time spent in an enjoyable relationship seems to be the dominant metric for success at dating. It predicts a wide range of behaviors related to dating:

  • Partners rarely break up unless one of the following factors is met
    • They don't enjoy their relationship
    • They believe they can quickly go from their current relationship to another enjoyable relationship (i.e. cheating, breaking up, and then dating their paramour)
  • When people are single they often attempt to get into an enjoyable relationship
  • People express that they don't want
... (Read more) (a)

By the architect of Brexit, who is now one of Boris Johnson's main advisors ("Special Adviser to the Prime Minister").

Cummings has a good amount of intellectual overlap with rationalist writers. (Check out his blogroll, in the sidebar on the right!)

From the post's introduction:

This blog looks at an intersection of decision-making, technology, high performance teams and government. It sketches some ideas of physicist Mic
... (Read more)

Reflecting on this more, I'm realizing that the most important strategist of the current UK government is a straight-up rationalist (e.g. his post on AGI safety (a)).


I have come to a realisation a bit later than I should have. Although I am still quite young and definitely have time to act on this realisation now, I wish I had started sooner.

I am studying to become a teacher, and I hope to go into education policy later, with quite some large ambition in mind. And yet, my social skills are quite poor, and I have hardly any charisma. I seek to change this. I know that much of the cause of my poor social skills is never having created or found opportunities to develop them in the natural developmental path of a child/teenager.

And so I take to reading books i... (Read more)

This was one of the most thought-provoking posts I read this month. Mostly because I spent a really large number of hours of my life sleeping, and also significantly increased the amount that I've been sleeping over the past three years, and this has me seriously considering reducing that number again. 

The opening section of the article: 


Matthew Walker (a) is a professor of neuroscience and psychology at the University of California, Berkeley, where he also leads the Center for Human Sleep Science.

His book Why We Sleep (a) was published in September 2017. Part survey of sleep r

... (Read more)

I appreciate any debunking, at least a little.

I found this of interest, if for no other reason than to trust claims made by the author of the 'debunked' book less. The details about the specific claims debunked were also (mildly) informative.

Rule Thinkers In, Not Out
102 · 10mo · 3 min read

Imagine a black box which, when you pressed a button, would generate a scientific hypothesis. 50% of its hypotheses are false; 50% are true hypotheses as game-changing and elegant as relativity. Even despite the error rate, it’s easy to see this box would quickly surpass space capsules, da Vinci paintings, and printer ink cartridges to become the most valuable object in the world. Scientific progress on demand, and all you have to do is test some stuff to see if it’s true? I don’t want to devalue experimentalists. They do great work. But it’s appropriate that Einstein i... (Read more)

experimental proof that hidden variables is wrong (through the EPR experiments)

Local hidden variable theories were disproved. But that is not at all surprising given that QM is IMHO non-local, as per Einstein's "spooky nonlocality".

It is interesting that often even when Einstein was wrong, he was fruitful. His biggest mistake, as he saw it, was the cosmological constant, now referred to as dark energy. Nietzsche would have approved.

On QM his paper led to Bell's theorem and real progress. Even though his claim was wrong.

Dr_Manhattan (10h): I suspect this might be a subtler point? [] suggests really valuable contributions are more bottlenecked on obsession than on being good at directing attention in a "valuable" direction.
Dr_Manhattan (10h): I think in part these could be "lessons relevant to Sarah", a sort of philosophical therapy that can't be completely taken out of context. Which is why some of these might seem of low relevance or obvious.
Dr_Manhattan (10h): [] seems to be promoting a similar idea.

This is a link post for:

I'm not going to quote the content at the link itself. [Should I?]

David Chapman – the author of the linked post – claims that "meta-rational" methods are necessary to 'reason reasonably'. I, and I think a lot of other people broadly part of the greater rationalist community, have objected to that general distinction. I still stand by that objection, at least terminologically.

But with this post and other previous recent posts/'pages' that he's posted at his site Meaningness, I think I'm better understanding

... (Read more)

One of the most pleasing things about probability and expected utility theory is that there are many coherence arguments that suggest that these are the “correct” ways to reason. If you deviate from what the theory prescribes, then you must be executing a dominated strategy. There must be some other strategy that never does any worse than your strategy, but does strictly better than your strategy with certainty in at least one situation. There’s a good explanation of these arguments here.

We shouldn’t expect mere humans to be able to notice any failures of coherence in a superintelligent agent,... (Read more)
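The "vacuous constraint" point in the ensuing discussion can be made concrete with a toy sketch (hypothetical illustration, not code from the post): for any arbitrary policy, we can always construct a utility function under which that policy is an expected-utility maximizer.

```python
# Toy demonstration: ANY policy can be framed as maximizing the
# expectation of SOME utility function.

observations = ["A", "B", "C"]
actions = [0, 1]

# An arbitrary policy, chosen with no rhyme or reason.
policy = {"A": 0, "B": 1, "C": 0}

def utility(obs, action):
    # Rationalizing utility: 1 exactly when the agent does what the
    # policy says, 0 otherwise.
    return 1.0 if policy[obs] == action else 0.0

# Under this utility, the policy's action is optimal for every observation,
# so the policy "maximizes expected utility" -- vacuously.
for obs in observations:
    best = max(actions, key=lambda a: utility(obs, a))
    assert best == policy[obs]
print("arbitrary policy is an expected-utility maximizer")
```

This is why coherence arguments alone, without some further assumption like goal-directedness, place no observable constraint on behavior.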

Vanessa Kosoy (10h), Review: In this essay, Rohin sets out to debunk what ey perceive as a prevalent but erroneous idea in the AI alignment community, namely: "VNM and similar theorems imply goal-directed behavior". This is placed in the context of Rohin's thesis that solving AI alignment is best achieved by designing AI which is not goal-directed. The main argument is: "coherence arguments" imply expected utility maximization, but expected utility maximization does not imply goal-directed behavior. Instead, it is a vacuous constraint, since any agent policy can be regarded as maximizing the expectation of some utility function.

I have mixed feelings about this essay. On the one hand, the core argument that VNM and similar theorems do not imply goal-directed behavior is true. To the extent that some people believed the opposite, correcting this mistake is important. On the other hand, (i) I don't think the claim Rohin is debunking is the claim Eliezer had in mind in the sources Rohin cites, and (ii) I don't think that the conclusions Rohin draws, or at least implies, are the right conclusions.

The actual claim that Eliezer was making (or at least my interpretation of it) is: coherence arguments imply that if we assume an agent is goal-directed then it must be an expected utility maximizer, and therefore EU maximization is the correct mathematical model to apply to such agents. Why do we care about goal-directed agents in the first place? The reason is, on the one hand, goal-directed agents are the main source of AI risk, and on the other hand, goal-directed agents are also the most straightforward approach to solving AI risk. Indeed, if we could design powerful agents with the goals we want, these agents would protect us from unaligned AIs and solve all other problems as well (or at least solve them better than we can solve them ourselves).

Conversely, if we want to protect ourselves from unaligned AIs, we need to generate very sophisticated long-term plans of action in the physical world, possi
This is placed in the context of Rohin's thesis that solving AI alignment is best achieved by designing AI which is not goal-directed.
I remain skeptical about Rohin's thesis that we should dispense with goal-directedness altogether.

Hmm, perhaps I believed this when I wrote the sequence (I don't think so, but maybe?), but I certainly don't believe it now. I believe something more like:

  • Humans have goals and want AI systems to help them achieve them; this implies that the human-AI system as a whole should be goal-directed.
  • One particula
... (Read more)

This is the online counterpart to the in-person event that will be running simultaneously at the Bay Area Solstice Unconference. You, the lucky few who see this post in the next 24 hours, are invited!

Bookmark a link to this Google doc.

The doc will open for editing on Saturday, December 14, 2019, at 2:30PM Pacific Standard Time (22:30 UTC), and the activity will end 30 minutes later. At the end we will vote on whether the final product should be posted on LessWrong.

Noticing the Taste of Lotus
229 · 2y · 3 min read

Recently I started picking up French again. I remembered getting something out of Duolingo a few years ago, so I logged in.

Since the last time I was there, they added an “achievements” mechanic:

I noticed this by earning one. I think it was “Sharpshooter”. They gave me the first of three stars for something like doing five lessons without mistakes. In the “achievements” section, it showed me that I could earn the second star by doing twenty lessons in a row flawlessly.

And my brain cared.

I watched myself hungering to get the achievements. These arbitrary things that someone had just stuck on the... (Read more)

This resonates with me, especially having purchased a manual transmission vehicle specifically so that I would not succumb to the temptation to hurl my tons-heavy machine at everything that isn't an iPhone Retina display and thus goes unseen.

Examples of Causal AbstractionΩ
13 · 1d · 4 min read · Ω 6

I’m working on a theory of abstraction suitable as a foundation for embedded agency and specifically multi-level world models. I want to use real-world examples to build a fast feedback loop for theory development, so a natural first step is to build a starting list of examples which capture various relevant aspects of the problem.

These are mainly focused on causal abstraction, in which both the concrete and abstract model are causal DAGs with some natural correspondence between counterfactuals on the two. (There are some exceptions, though.) The list isn’t very long; I’ve... (Read more)

G Gordon Worley III (7h): Somewhat related to the electrical circuits example, there might be something similar in software engineering, with levels being something like (depending on the programming paradigm):

  • CPU instructions
  • byte code, op code, or assembly
  • AST
  • programming language instructions
  • statements
  • functions
  • modules and classes
  • patterns and DSLs
  • processes
  • applications/products

Yes definitely. I've omitted examples from software and math because there's no "fuzziness" to it; that kind of abstraction is already better-understood than the more probabilistically-flavored use-cases I'm aiming for. But the theory should still apply to those cases, as the limiting case where probabilities are 0 or 1, so they're useful as a sanity check.

Author's Note: This post is a bunch of mathy research stuff with very little explanation of context. Other posts in this sequence will provide more context, but you might want to skip this one unless you're looking for mathy details.

Suppose we have a medical sensor measuring some physiological parameter. The parameter has a constant true value, and the sensor takes measurements over a short period of time. Each measurement has IID error (so the measurements are conditionally independent given the true value). In the end, the measurements are averaged together, and there's a ... (Read more)

Hazard (7h): I enjoyed this! I had to read through the middle part twice; is the idea basically "it depends on what the distributions are, but there is another simple stat you can compute from the Mi, which combined with their average, gives you all the info you need"? I liked that this was a simple example of how choices in the way you abstract do or don't lose different information.
"it depends on what the distributions are, but there is another simple stat you can compute from the Mi, which combined with their average, gives you all the info you need"

Yes, assuming it's a maximum entropy distribution (e.g. normal, dirichlet, beta, exponential, geometric, hypergeometric, ... basically all the distributions we typically use as fundamental building blocks). If it's not a maximum entropy distribution, then the relevant information can't be summarized by a simple statistic; we need to keep around the whole distrib... (Read more)
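The sensor setup above can be sketched in a few lines (a hypothetical toy version, not code from the post; the noise distribution and parameters are assumptions for illustration). With Gaussian noise, a maximum-entropy family, a couple of summary statistics carry everything the full list of measurements tells us about the true value:

```python
import random
import statistics

# Simulate the sensor: a constant true value observed through IID
# Gaussian noise (a maximum-entropy distribution).
random.seed(0)
true_value = 3.7
measurements = [true_value + random.gauss(0, 0.5) for _ in range(1000)]

# For max-entropy noise, (mean, sample variance) summarize all the
# information in the measurements about the true value.
mean = statistics.fmean(measurements)
var = statistics.variance(measurements)
print(f"mean={mean:.3f}, sample variance={var:.3f}")
```

For a non-maximum-entropy noise distribution, no such small set of statistics suffices, and the whole empirical distribution has to be kept around, as the comment above notes.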

Matthew Barnett's ShortformΩ
7 · 4mo · 1 min read · Ω 2

I intend to use my shortform feed for two purposes:

1. To post thoughts that I think are worth sharing that I can then reference in the future in order to explain some belief or opinion I have.

2. To post half-finished thoughts about the math or computer science thing I'm learning at the moment. These might be slightly boring and for that I apologize.

Should effective altruists be praised for their motives, or their results?

It is sometimes claimed, perhaps by those who recently read The Elephant in the Brain, that effective altruists have not risen above the failures of traditional charity, and are every bit as mired in selfish motives as non-EA causes. From a consequentialist view, however, this critique is not by itself valid.

To a consequentialist, it doesn't actually matter what one's motives are as long as the actual effect of their action is to do as much good as possible. This is the pri... (Read more)

Sorry for the flashy headline, but I genuinely feel this might be the best idea I ever had.

After the invention of the horseless carriage, it supposedly took people years to realize that without reins to hold, the driver could now sit inside the cabin. Change of the core technology allowed a rethink of the entire product (the vehicle) but that rethink was hard.

With autonomous cars, we already have a rethink. Autonomous vehicles can be called to a person who needs it, so they should not be owned (and sit idle while the owner doesn't need to move) but should be taxis. Lots of people, includi... (Read more)

jmh (10h): Not wrong, I think, as most others seem to agree. But I think it's better seen (as others note) as a good insight that has a place in the larger mix, rather than as some complete alternative to current thinking. That seems to be a fairly normal trajectory for innovations -- they need to work at the edges. So autonomous vehicles get started in places like mining, where it's a more controlled and limited setting, or factory floors. Slowly they can move to broader application: truck fleets (and, while I've not heard of them, I would think something like patrol boats for customs would be a no-brainer) and then into things like moving people via taxis.

I think the idea of separating the engine from the compartment just works into that a bit more slowly as the applications (and standards for coupling) get worked out. The coupling aspect might be the area to start working on, as it doesn't seem to be on anyone's radar (though perhaps research in mining and rail yard areas might show something) but might dovetail well with existing plans, such as taxis. For instance, with urban travel (and perhaps even more so suburban, not sure) it is likely that trips originating at one location, or even along a common path, could be more efficiently serviced even when the end destinations are vastly different: one taxi engine handles part of the trip, then one or more of the passenger containers is disconnected at some point where another taxi picks it up for the next leg. Good logistic support could probably be found in rail management tools for starting points.
chaosmage (12h): In a car park? But they will be way more densely packed than cars in car parks, because no humans need access. The cabins get placed there and retrieved from there by autonomous engines.

Good answer!

I was thinking about people living in detached homes in residential neighborhoods, i.e. places where I would expect local politics to prevent car parks ('parking lots' in my colloquialisms) from being built at all.

Arguing about housing
17 · 1mo · 2 min read

Somerville, like a lot of popular areas, has a problem that there are many more people who want houses than there are houses. In the scheme of things this is not a bad problem to have; mismatches in the other direction are probably worse. But it's still a major issue that is really hurting our community. I've been getting into a lot of discussions, and here are some ideas I find myself saying a lot:

  • With the level of housing crisis we have right now I'm going to be in favor of basically any proposal that builds more bedrooms. Affordable housing, market rate housing, public housing,

... (Read more)

There's probably (at least) something to that idea. I imagine commercial construction is constrained similarly to residential. It's pretty common to hear that commercial rents are high in the places where residential rents are too.

An1lam's Short Form Feed
14 · 1y · 1 min read

In light of reading Hazard's Shortform Feed -- which I really enjoy -- based on Raemon's Shortform feed, I'm making my own. There be thoughts here. Hopefully, this will also get me posting more.

Link post for a short post I just published describing my way of understanding Simpson's Paradox.

Bíos brakhús
1 · 3mo

Models that really pop. All models are bullshit, but some are useful. Or, all models are true in some regimes but not others, and some are true in a large regime. A model that is true in a large regime is threatening to humans because someone might See Like A State with it. Pointing out regimes where a model isn't true is often taken as an attack (and indeed it might be an attack, i.e. its intended purpose and/or one of its main effects might be to dispel any political will that may have been swirling around to understand the model and make it part of the

... (Read more)

Consider the following program:

    def f(n):
        if n == 0:
            return 1
        return n * f(n-1)

Let’s think about the process by which this function is evaluated. We want to sketch out a causal DAG showing all of the intermediate calculations and the connections between them (feel free to pause reading and try this yourself).

Here’s what the causal DAG looks like:

Each dotted box corresponds to one call to the function f. The recursive call in f becomes a symmetry in the causal diagram: the DAG consists of an infinite sequence of copies of the same subcircuit.
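The unrolling described above can be sketched in code (a hypothetical trace, not from the post; the node and edge representation here is one arbitrary choice). Each node ("f", k) holds the intermediate value f(k), and each edge records the recursive-call dependence between one "dotted box" and the next:

```python
# Evaluate f while recording the causal DAG of the computation.
def f_traced(n, nodes=None, edges=None):
    if nodes is None:
        nodes, edges = {}, []
    if n == 0:
        nodes[("f", 0)] = 1  # base case: a source node in the DAG
        return 1, nodes, edges
    sub, nodes, edges = f_traced(n - 1, nodes, edges)
    nodes[("f", n)] = n * sub
    # The recursive call is a direct cause of this call's result:
    edges.append((("f", n - 1), ("f", n)))
    return nodes[("f", n)], nodes, edges

value, nodes, edges = f_traced(3)
print(value)   # 6
print(edges)   # [(('f', 0), ('f', 1)), (('f', 1), ('f', 2)), (('f', 2), ('f', 3))]
```

Every call produces a structurally identical subgraph, which is the symmetry the post is pointing at: the finite program plus the recursion rule stands in for an infinite DAG of copies.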

More generally, we can represent any Tu

... (Read more)
Causal DAGs with symmetry are how we do this for Turing-computable functions in general. They show the actual cause-and-effect process which computes the result; conceptually they represent the computation rather than a black-box function.

This was the main interesting bit for me.
