Recent Discussion

(Disclaimer: Philip Trammell is planning to rewrite this blogpost and make it clearer and more precise, but I think there's something good in this direction. Sharing it to see what the rest of the community thinks.)

Some smart people, including some of my friends, believe that advanced AI poses a serious threat to human civilization in the near future, and that AI safety research is therefore one of the most valuable uses, if not the very most valuable use, of philanthropic talent and money. But most smart people, as far as I can judge their behavior—including some, like Mark Zuckerberg and R

... (Read more)
The majority of those who best know the arguments for and against thinking that a given social movement is the world's most important cause, from pro-life-ism to environmentalism to campaign finance reform, are presumably members of that social movement.

This seems unlikely to me, given my reactions to talking to people in other movements, including the ones you mentioned. The idea that what they're arguing for is "the world's most important cause" hasn't explicitly been considered by most of them, and of those who have considered it, few have done any sort of rigorous analysis.

By contrast, part of the big sell of EA is that it actively searches for the world's biggest causes.

Pattern (39m): Evidence itself is agnostic, and not an argument.

Matthew Barnett (1h): This makes sense. However, I'd still point out that this is evidence that the arguments weren't convincing, since otherwise they would have used the same arguments, even though they are different people.

Wei_Dai (2h): May I beseech you to be more careful about using "optimism" and words like it in the future? I'm really worried about strategy researchers and decision-makers getting the wrong impression from AI safety researchers about how hard the overall AI risk problem is. For some reason I keep seeing people say that they're "optimistic" (or other words to that effect) when they mean optimistic about some sub-problem of AI risk rather than AI risk as a whole, without making that clear. In many cases it's pretty predictable that people outside technical AI safety research (or even inside, as in this case) will misinterpret that as optimism about AI risk.

In the last few months, I've gotten increasingly alarmed by leftist politics in the US, and the epistemic conditions that it operates under and is imposing wherever it gains power. (Quite possibly the conditions are just as dire on the right, but they are not as visible or salient to me, because most of the places I can easily see, either directly or through news stories, i.e., local politics in my area, academia, journalism, large corporations, seem to have been taken over by the left.)

I'm worried that my alarmism is itself based on confirmation bias, tribalism, catastrophizing, or any number

... (Read more)
habryka (10m): That's roughly correct. The important caveat is that we do want to avoid the site being dominated by discussion of politics, and so we are likely to reduce the visibility of that discussion somewhat, in order to compensate for the natural tendency of those topics to consume everything (I am not yet sure precisely how we would go about that, since it hasn't been an issue so far), and also because I really want to avoid newcomers first encountering all the political discussion (and selecting for newcomers who come for the political discussion).

Wei_Dai (2h): (I was waiting for a mod to chime in so I don't have to, but ...) I believe this is one of the reasons for confining political topics to "personal blogposts", which are not shown by default on the front page. My understanding is that the mods are prepared to impose further measures to reduce engagement with political discussions if they start to get out of hand. I guess (speaking just for myself) that if worst comes to worst we can always impose a hard ban on political topics. (By "worst comes to worst" I mean political discussions getting out of hand on LW. A worse problem, which I worry about more, is LW getting "canceled" by outsiders, in which case even banning political topics may be too late. I think we may want to pre-emptively impose more safeguards for that reason, like maybe making object-level political posts visible only to users over some karma threshold?)

Oops, looks like we commented at the same time. You basically said the same thing I did, so I am glad we're on the same page.

ryan_b (3h): Yes, I think it was comparably serious, although because the landscape was different the consequences were also different; in general I expect modern events to be higher variance.

The first similarity is the primacy of cultural products, in particular media and the arts. This was the period when there was a concerted effort to destroy the fantasy genre, and pressure was brought to bear to cancel TV shows and concerts which were deemed too occult. The influence on academia was negligible as far as I can tell, but I suggest taking another look at government: among other things, it seriously distorted a fraction of the justice system, because it became common for the public to worry about whether there was a satanic cult present, which diverted resources into investigating things that weren't there (like cults) or focused attention on suspects for nonsensical reasons, like whether they owned metal albums.

This was also the period that gave us the modern system of censorship: film ratings, adult-content warning stickers on CDs, etc. While censorship wasn't driven solely by the panic, the people swept up in it did work hard to capitalize on these mechanisms to further their aims.

I find it helps to view this kind of cultural event from the perspective of the institutions that make up its center of gravity. For example, one way to make sense of the differences between these movements is that the Red Scare was centered on the federal government and national media outlets; the Satanic Panic was centered on local churches and local government institutions like law enforcement, schools, and libraries; and the social justice movement is centered on universities and the internet.

That being said, your point about communism being a real thing that made sense to be concerned about is a good one: there never was a nation-spanning web of devil-worshipping cults that conducted ritual murder and sought to brainwash the youth of America.
By contrast there is a nation-spann
Moral public goods
113 · 3d · 4 min read

Automatically crossposted

Suppose that a kingdom contains a million peasants and a thousand nobles, and:

  • Each noble makes as much as 10,000 peasants put together, such that collectively the nobles get 90% of the income.
  • Each noble cares about themselves about as much as they do about all peasants put together.
  • Each person’s welfare is logarithmic in their income.

Then it’s simultaneously the case that:

  1. Nobles prefer to keep money for themselves rather than donate it to peasants—money is worth 10,000x as much to a peasant, but a noble cares 1,000,000 times less about the peasant’s welfare.
  2. Nobles pref
... (Read more)
paulfchristiano (3h): For consequentialists, the gap between "charitable enough to give" and "charitable enough to support redistribution" seems to be more than a million-fold; if so, I don't think it warrants that "just barely" modifier.

I'm confused about the 'million-fold' claim; I thought that if a noble dialed up their "caring about peasants" by 100x, so that the per-peasant weight went from 1e-6 to 1e-4, then the noble would have utility ln(10^4) ≈ 9.2 and marginal utility 1e-4 from their own income, a peasant would have utility ln(1) = 0 and marginal utility 1 from theirs, and so the noble would be indifferent between holding onto a dollar and giving it away, and any increase above 100x (like 101x) would cause some gifts to happen.

(Like, this is where the <1% in your post comes from, right?)

I don't think it warrants that "

... (read more)
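The arithmetic being debated above can be checked with a minimal sketch (assuming natural-log utility, incomes measured in peasant-income units, and the numbers from the post; the function names here are mine, purely for illustration):

```python
# Sketch of the noble/peasant indifference point from the post:
# noble income = 10,000 peasant-units, peasant income = 1,
# baseline per-peasant caring weight = 1e-6, log utility.
noble_income = 10_000.0
peasant_income = 1.0

def marginal_utility(income):
    # d/dx ln(x) = 1/x
    return 1.0 / income

def noble_prefers_giving(caring_weight):
    # A noble gives a marginal dollar iff the caring-weighted marginal
    # utility to a peasant exceeds the noble's own marginal utility.
    own = marginal_utility(noble_income)
    peasant = caring_weight * marginal_utility(peasant_income)
    return peasant > own

assert not noble_prefers_giving(1e-6)    # baseline: keeps the dollar
assert not noble_prefers_giving(1e-4)    # 100x caring: exactly indifferent
assert noble_prefers_giving(1.01e-4)     # just above 100x: gifts begin
```

This matches the "100x dial-up" reading: the indifference point sits exactly at a per-peasant weight of 1e-4, and any caring level strictly above it triggers giving.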
jmh (6h): You set the stage to call redistribution a moral public good based on a representative agent for each class. I don't think that gets you there, so essentially you assume the moral-public-good conclusion and then draw out some implications. I question whether the argument you offered implies that redistribution can behave like a public good in a real-world setting. It might, but I don't think your approach gets you there. If so, then the rest of the post is just musing about possible implications, if one simply assumes redistribution dynamics have some similarity with those of public goods. As I think Dagon points out, that might not even be the case; it may purely be threshold funding that is driving the behavior, in which case applying the logic of public goods may lead one astray. Similarly, I'm not sure that adding "moral" is actually doing any real work. It seems more of a mood-setting rhetorical device.
Wei_Dai (16h): I don't think such a comparison would make sense, since different public goods have different room for funding. For example, the World Bank has a bigger budget than the WHO, but development/anti-poverty has a lot more room for funding (or less diminishing returns) than preventing global pandemics. My sense that there's little effort at coordination on global poverty comes from this kind of comparison: US unilateral foreign aid (not counting private charitable donations), versus the US donation to the World Bank (which is apparently determined by negotiation among the members) in 2007: $3.7 billion (this covers 2 years, I believe). Lanrian mentioned an effort to coordinate foreign aid (ODA), but the effort seems very weak compared to other public-good coordination efforts, because there is no enforcement mechanism (not even public shaming; when was the last time you heard anything about this?). According to this document: I guess "public goods" is part of what's happening, given that some non-zero level of coordination exists, but it seems like a relatively small part, and I'm not sure that it explains what you want it to explain, or even what it is that you want to explain (since you didn't answer the question I asked previously about this). ETA: I added a statement to my top-level comment to correct "I don't think I've ever heard of any efforts to coordinate internationally on foreign aid."

This post was written for Convergence Analysis by Michael Aird, based on ideas from Justin Shovelain and with ongoing guidance from him. Throughout the post, “I” will refer to Michael, while “we” will refer to Michael and Justin or to Convergence as an organisation.

Epistemic status: High confidence in the core ideas on an abstract level. Claims about the usefulness of those ideas, their practical implications, and how best to concretely/mathematically implement them are more speculative; one goal in writing this post is to receive feedback on those things. I’m quite new to many of the concepts

... (Read more)

Two scenarios:

  • I take a vision or language model which was cutting edge in 2000, and run it with a similar amount of compute/data to what's typically used today.
  • I take a modern vision or language model, calculate how much money it costs to train, estimate the amount of compute I could have bought for that much money in 2000, then train it with that much compute

In both cases, assume that the number of parameters is scaled to available compute as needed (if possible), and we generally adjust the code to reflect scalability requirements (while keeping the algorithm itself the same).

Which of t... (Read more)

Pattern (2h): Will the experiment be run? What is the experiment? What is the question?

Guess A: Is the difference (between 2000 and today) modern compute?
Guess B: Is the difference (between 2000 and today) modern compute costs?

But the experiment doesn't seem to be about A or B. More likely it's about both. Which is more important (to modern ML performance (in what domain?*)):

  • Typical compute (today versus then)?
  • Or typical compute cost (today versus then)?

(Minor technical note: if you're comparing results from the past to results today, while it might be impossible to go back in time and test these things for a control group, rather than taking 'things weren't as good back then' for granted, this should be tested as well for comparison. (Replicate earlier results.**) This does admit other hypotheses. For example, 'the difference between 2020 and 2000 is that training took a long time, and if people set things up wrong, they didn't get feedback for a long time. Perhaps modern compute enables researchers to set ML programs up correctly despite the code not being written right the first time.')

A and B can be rephrased as:

  • Do we use more compute today, but spend 'the same amount'?
  • Do we spend 'more' on compute today?

*This might be intended as a more general question, but the post asks about:
**The most extreme version would be getting/recreating old machines and then re-running old ML stuff on them.

The underlying question I want to answer is: ML performance is limited by both available algorithms and available compute. Both of those have (presumably) improved over time. Relatively speaking, how taut are those two constraints? Has progress come primarily from better algorithms, or from more/cheaper compute?

Steelmanning Inefficiency
18 · 6y · 16 min read

When considering writing a hypothetical apostasy or steelmanning an opinion I disagreed with, I looked around for something worthwhile, both for me to write and others to read. Yvain/Scott has already steelmanned Time Cube, which cannot be beaten as an intellectual challenge, but probably didn't teach us much of general use (except in interesting dinner parties). I wanted something hard, but potentially instructive.

So I decided to steelman one of the anti-sacred cows (sacred anti-cows?) of this community, namely inefficiency. It was interesting to find that it was a little easier tha... (Read more)


I'm currently writing a guide on "how to use steelmanning". I've got a few strategies that I apply to force myself to find the best version of others' arguments, but I would like to know if you have some (for example: "trying to explain the argument to someone else, to help convince myself that it's an argument that can be supported").

N. M.

PS: very interesting post, btw.

There's been a long history of trying to penalise an AI for having a large impact on the world. To do that, you need an impact measure. I've designed some myself, back in the day, but they only worked in narrow circumstances and required tricks to get anything useful at all out from them.

A more promising general method is attainable utility. The idea is that, as an agent accumulates power in the world, they increase their ability to affect a lot of different things, and could therefore achieve a lot of different goals.

So, if an agent starts off unable to achieve many goals, but suddenly it can

... (Read more)
TurnTrout (3h): I think it's really great to have this argument typed up somewhere, and I liked the images. There's something important going on with how the agent can make our formal measurement of its power stop tracking the actual power it's able to exert over the world, and I think solving this question is the primary remaining open challenge in impact measurement. The second half of Reframing Impact (currently being written and drawn) will discuss this in detail, as well as proposing partial solutions to this problem. The agent's own power plausibly seems like a thing we should be able to cleanly formalize in a way that's robust when implemented in an impact measure. The problem you've pointed out somewhat reminds me of the easy problem of wireheading, in which we are fighting against a design choice rather than value specification difficulty.

How is A getting reward for SA being on the blue button? I assume A gets reward whenever a robot is on the button? Is the +1 a typo?

Depends on how much impact is penalized compared to normal reward. This isn't necessarily true. Consider R as the reward function class of all linear functionals over camera pixels, or even the max-ent distribution over observation-based reward functions. I claim that this doesn't look like 20 billion Q's.

ETA: I'd also like to note that, while implicitly expanding the action space in the way you did (e.g. "A can issue requests to SA, and also program arbitrary non-Markovian policies into it") is valid, I want to explicitly point it out.

The impact measure is something like "Don't let the expected value of R change, under the assumption that A will be an R-maximiser".

The addition of the subagent transforms this, in practice, to either "Don't let the expected value of R change", or to nothing. These are ontologically simpler statements, so it can be argued that the initial measure failed to properly articulate "under the assumption that A will be an R-maximiser".

Stuart_Armstrong (3h): Yes. If A needs to be there in person, then SA can carry it there (after suitably crippling it). Yes; I've rewritten it to be Ωγ^(k+1). Yep. That's a subset of "It can use its arms to manipulate anything in the eight squares around itself.", but it's worth pointing it out explicitly.
Hedonic asymmetries
80 · 3d · 1 min read

Automatically crossposted

Creating really good outcomes for humanity seems hard. We get bored. If we don't get bored, we still don't like the idea of joy without variety. And joyful experiences only seem good if they are real and meaningful (in some sense we can't easily pin down). And so on.

On the flip side, creating really bad outcomes seems much easier, running into none of the symmetric “problems.” So what gives?

I’ll argue that nature is basically out to get us, and it’s not a coincidence that making things good is so much harder than making them bad.

First: some other explanations

Two commo

... (Read more)

Perhaps hobbies are areas where people understand this about themselves, albeit narrowly.

edoarad (4h): I may be mistaken. I tried reversing your argument, and I bolded the part that doesn't feel right. So I think that maybe there is inherently an asymmetry between reward and punishment when dealing with maximizers. But my intuition comes from somewhere else: if the difference between pessimism and optimism is given by a shift by a constant, then it ought not to matter for a utility maximizer. But your definition looks at errors conditional on the actual outcome, which should perhaps behave differently.
paulfchristiano (3h): I think this part of the reversed argument is wrong: even if the behaviors are very rare, and have a "normal" reward, then the agent will seek them out and so miss out on actually good states.
Pattern (2h): But there are behaviors we always seek out. Trivially: eating and sleeping.

If Van der Waals was a neural network

At some point in history a lot of thought was put into obtaining the equation:

PV = nRT

The ideal gas equation we learn in kindergarten, which uses the magic number R in order to make predictions about how n moles of an "ideal gas" will change in pressure, volume, or temperature, given that we can control two of those factors.

This law approximates the behavior of many gases with a small error and it was certainly useful for many o' medieval soap volcano party tricks and Victorian steam engine designs.
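As a toy illustration of the law in use (with assumed, illustrative values, not numbers from the post): one mole of gas at room temperature in a 24.8-litre container should sit near atmospheric pressure.

```python
# Ideal gas law PV = nRT, solved for pressure. Values are illustrative.
R = 8.314          # molar gas constant, J/(mol*K)
n = 1.0            # amount of gas, mol
T = 298.15         # temperature, K (25 C)
V = 0.0248         # volume, m^3 (24.8 L)

P = n * R * T / V  # pressure in Pa; comes out near 1 atm (~101,325 Pa)
```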

But, as is often the case in scienc... (Read more)


Assortative mating is when similar people marry and have children. Some people worry about assortative mating in Silicon Valley: highly analytical tech workers marry other highly analytical tech workers. If highly analytical tech workers have more autism risk genes than the general population, assortative mating could put their children at very high risk of autism. How concerned should this make us?

Methods / Sample Characteristics

I used the 2020 Slate Star Codex survey to investigate this question. It had 8,043 respondents selected for being interested in a highly analytical blog a... (Read more)

Rationalist prepper thread
7 · 7h · 1 min read

There are different estimates of the possible severity of the current coronavirus outbreak. One estimate is based on straight extrapolation of the exponential growth in the number of infected people, which doubles every two days. This implies that the whole population of Earth will be ill in March. Another view takes into account that many mild cases are not included in the statistics, so lethality is small and probably not everybody will become ill. We just don't know yet.
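The first extrapolation is easy to reproduce; here is a sketch, with an assumed (purely illustrative) starting case count:

```python
import math

# Naive exponential extrapolation: doubling every two days, how long
# until the case count exceeds the world population? The starting count
# is an illustrative assumption, not a figure from the post.
world_population = 7.8e9
start_cases = 10_000
doubling_time_days = 2

doublings_needed = math.log2(world_population / start_cases)
days = doublings_needed * doubling_time_days
# ~19.6 doublings, i.e. roughly 39 days: hence "everyone by March"
# under straight-line exponential extrapolation.
```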

How should we act personally in this situation?

Firstly, we should act in the way that would be good if everybody ... (Read more)

It is also good to invest in improving one's immune system through healthy food, vitamins, and light therapy, as it is our best protection against the virus. Evacuation to a cold country house would weaken the immune system.

How much can one "improve" one's immune system by these methods in a short time? Is there any data to back this up?

In general I agree with the rest. In a worst-case scenario, the ability to self-isolate for a while ("bugging in", in prepper lingo) seems worthwhile.

gbear605 (5h): It has been doubling in this time frame, but that's because of its unique circumstances. Many other illnesses have had a similar rise when they first appeared, but illnesses tend to peak and then cycle down. Perhaps this may change, but the data does not indicate that yet.

ChristianKl (6h): It seems like you only picked the time frame where there was international attention on the issue. I would expect that this attention raised the discovery rate.

avturchin (6h): I picked just recent numbers, but the exponential two-day doubling trend in infections and deaths is visible in the wiki table from 16 January, i.e. for around 5-6 doublings. Total growth over 12 days is around 100 times.
Embedded Agency via AbstractionΩ
34 · 5mo · 10 min read · Ω 13

Claim: problems of agents embedded in their environment mostly reduce to problems of abstraction. Solve abstraction, and solutions to embedded agency problems will probably just drop out naturally.

The goal of this post is to explain the intuition underlying that claim. The point is not to defend the claim socially or to prove it mathematically, but to illustrate why I personally believe that understanding abstraction is the key to understanding embedded agency. Along the way, we’ll also discuss exactly which problems of abstraction need to be solved for a theory of embedded agency.

What do we m

... (Read more)
VojtaKovarik (9h): A side-note: I can't remember the specific reference, but imperfect-information game theory has some research on abstractions. Naturally, an object of interest is "optimal" abstractions, i.e., ones that are as small as possible for a given accuracy, or as accurate as possible for a given size. However, there are typically some negative results, stating that finding (near-)optimal abstractions is at least as expensive as finding the (near-)optimal solution of the full game. Intuitively, I would expect this to be a recurring theme for abstractions in general. The implication is that all the goals should implicitly carry the caveat that the maps have to be "not too expensive to construct". (This is intended as a side-note, not advocacy to change the formulation. The one you have there is accessible and memorable :-).)

Thanks for the pointer, sounds both relevant and useful. I'll definitely look into it.

I have been thinking about Stuart Armstrong's preference synthesis research agenda, and have long had the feeling that there's something off about the way it is currently framed. In this post I try to describe why. I start by describing my current model of human values and how I interpret Stuart's implicit assumptions to conflict with it, and then talk about my confusion over reconciling the two views.

The two-layer/ULM model of human values

In Player vs. Character: A Two-Level Model of Ethics, Sarah Constantin describes a model where the mind is divided, in game terms,... (Read more)

The player seems to value emotional states, while the character values specific situations it can describe? Does that seem right?

I know a fair number of people who put in a lot of effort into things like emotional healing, digging up and dealing with buried trauma, meditative and therapy practices, and so on. (I count myself in this category.)

And I think that there’s a thing that sometimes happens when other people see all of this, which is that it all seems kinda fake. I say this because even I have this thought sometimes. The core of the thought is something like, “if all of this stuff really worked, shouldn’t you be finished sometime? You claim that practice X was really beneficial, so why have you now been talking a... (Read more)

On the topic of persistent ongoing work vs occasional epiphanies, I highly recommend "The Holy Sh!t Moment" by James Fell. It's a bit poppy for my taste, but it dives deeper into that aspect of things and ties really well into the model presented in Unlocking the Emotional Brain.

gjm (6h): If those things are multiplicative rather than additive, then improving one of them by 10% does make your whole life 10% better. Obviously real life is more complicated than either a simple additive model or a simple multiplicative model, but I'd expect there to be things that operate multiplicatively. E.g., suppose you have a vitamin deficiency that means your energy levels are perpetually low; that might mean that you're doing literally everything in your life 10% worse than if that problem were solved. (Obvious conclusion, if the above is anything like right: it's worth putting some effort into figuring out which of your problems affect everything else, so that making them 10% better makes everything 10% better, and which are independent of everything else, so that making them 10% better makes everything 0.01% better.)
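The additive-versus-multiplicative distinction can be sketched numerically (toy numbers of my own, assuming ten equally-sized areas of life):

```python
import math

# Ten areas of life, each contributing a factor (or term) of 1.0.
factors = [1.0] * 10
improved = [1.1] + [1.0] * 9  # one area made 10% better

# Multiplicative model: overall life quality is the product of factors,
# so a 10% gain in one factor is a 10% gain overall.
product_gain = math.prod(improved) / math.prod(factors) - 1  # ~0.10

# Additive model: overall life quality is the sum of terms,
# so the same 10% gain in one of ten terms is only a 1% gain overall.
sum_gain = sum(improved) / sum(factors) - 1  # ~0.01
```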

I notice that when I write for a public audience, I usually present ideas in a modernist, skeptical, academic style; whereas, the way I come up with ideas is usually in part by engaging in epistemic modalities that such a style has difficulty conceptualizing or considers illegitimate, including:

  • Advanced introspection and self-therapy (including focusing and meditation)
  • Mathematical and/or analogical intuition applied everywhere with only spot checks (rather than rigorous proof) used for confirmation
  • Identity hacking, including virtue ethics, shadow-eating, and applied performativity theory
  • A
... (Read more)
gjm (6h): (I guess that when you wrote "piano" you meant "violin".) I agree: skills are acquired and preserved in a different way from factual knowledge, and there are mental skills as well as physical, and they may be highly relevant to figuring out what's true and what's false; e.g., if I present Terry Tao with some complicated proposition in (say) the theory of partial differential equations and give him 15 seconds to guess whether it's true or not, then I bet he'll be right much more frequently than I would even if he doesn't do any explicit reasoning at all, because he's developed a good sense for what's true and what isn't. But he would, I'm pretty sure, still classify his opinion as a hunch or guess or conjecture, and wouldn't call it knowledge. I'd say the same about all varieties of mental metis (but cautiously, because maybe there are cases I've failed to imagine right). Practice (in various senses of that word) can give you very good hunches, but knowledge is a different thing and harder to come by.

One possible family of counterexamples: for things that are literally within yourself, it could well be possible to extend the range of things you are reliably aware of. Everyone can tell you, with amply justified confidence, whether or not they have toothache right now. Maybe there are ways to gain sufficient awareness of your internal workings that you have similar insight into whether your blood pressure is elevated, or whether you have higher than usual levels of cortisol in your bloodstream, etc. But I don't think this is the kind of thing jessicata is talking about here.

[EDITED to add:] I wouldn't personally tend to call what-a-skilled-violin-player-has-but-can't-transfer-verbally "knowledge". I would be happy saying "she knows how to play the violin well", though. (Language is complicated.) I also wouldn't generally use the word "ideas".
So (to whatever extent jessicata's language use is like mine, at least) the violin player may provide a useful analogy for

I think a person who has trained awareness of their own cortisol levels is likely to have some useful knowledge about cortisol.

They might have hundreds of experiences where they did X and then they noticed their cortisol rising. If you talk with them about stress they might have their own ontology that distinguishes activities in stressful and not-stressful based on whether or not they raise their own cortisol level. I do think that such an ontology provides fruitful knowledge.

A decade ago plenty of psychologists ran around and claimed that willpower ... (read more)

ChristianKl (6h): Yes, I changed the instrument and forgot to change all instances. I corrected it.

What I mean by the "usual good actions" is notable good deeds you might do a few times every week. Some examples:

  • Help a friend end an abusive relationship.
  • Go out of your way to help a stranger.
  • Do something extra nice and thoughtful for your partner.
  • Show up in person to support your friend's endeavor.
  • Procure a Burning Man ticket for your friend.
  • Refrain from calling your co-founder's idea "the dumbest thing I've heard this year" and do your best to listen.

What I mean by "altruistic impact" is probably QALYs, but I guess it's harder to use that m... (Read more)

These actions are mostly low-impact (in comparison with saving lives, preventing environmental catastrophe, etc.) but also low-effort and frequently-occurring. The right measure might be something like "impact per unit input" or "impact per person-year", and I suspect they then look less negligible by comparison with big-ticket effective altruism activity.

They also tend to affect people close to us about whom we care a lot. It's not at all clear what the best ways of balancing such "near" interests against those of distan... (read more)

American football has some properties that make it useful for practicing making predictions:

  • A game consists of a sequence of discrete plays with well-defined outcomes
  • The plays are fast enough that you get quick feedback on each prediction, and there are enough plays in a game that you can make enough predictions to see how well calibrated you are
  • You go into each prediction with at least some information about what you're predicting

I've created a forecasting practice exercise that you can do while watching a game, either on paper or using this spreadsheet.

For those with no idea how footba

... (Read more)
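One way to score such a prediction exercise afterwards is the Brier score (mean squared error between stated probabilities and 0/1 outcomes). This is my own suggestion for the scoring step, not something the post specifies:

```python
# Brier score: mean squared error between forecast probabilities and
# binary outcomes. Lower is better; always answering 0.5 scores 0.25.
def brier_score(predictions, outcomes):
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# Hypothetical example: three plays, predicting P(pass) = 0.7, 0.4, 0.9;
# the actual plays were pass, rush, pass.
preds = [0.7, 0.4, 0.9]
actual = [1, 0, 1]
score = brier_score(preds, actual)  # (0.09 + 0.16 + 0.01) / 3 ≈ 0.087
```

Tallying this over a full game gives quick, concrete feedback on calibration, which is the point of the exercise.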

I think for this approach to be beneficial, the predictions would need to be much more narrowly defined than described here. Things along the lines of: offence will rush or pass, or even more fine-grained, such as a rush to the short side of the field with the tackle pulling, or a rush with a sweep and the backup of a pitchback. For passing, some idea about receiver type (wide or one of the backs) as well as how deep. For punts and kick-offs, perhaps things like which receiver is targeted, or taking away the option to return, or for punts, various fake-play scenarios.

If we just say "... (read more)

Sleeping Beauty Resolved?
39 · 2y · 9 min read

[This is Part 1. See also Part 2.]


The Sleeping Beauty problem has been debated ad nauseam since Elga's original paper [Elga2000], yet no consensus has emerged on its solution. I believe this confusion is due to the following errors of analysis:

  • Failure to properly apply probability theory.
  • Failure to construct legitimate propositions for analysis.
  • Failure to include all relevant information.

The only analysis I have found that avoids all of these errors is in Radford Neal's underappreciated technical report on anthropic reasoning [Neal2007]. In this note I'll discuss ho... (Read more)