All of Pattern's Comments + Replies

bvbvbvbvbvbvbvbvbvbvbv's Shortform
i.e. it's one way to find out how much you're privileged

You described using it for 'bubble evaluation'. I've also heard of stuff like that to measure bias.

any way to quantify (even naively like my system) this kind of thing

Which thing, and what kind of thing?

What other problems would a successful AI safety algorithm solve?
reverse engineering the entire human mind from scratch!

That might not necessarily be required for AGI, though that does seem to be what figuring out how to program values is.

1 MSRayne 2d: The latter is more what I was pointing to.
Taleuntum's Shortform
Do you know of a real world example where the first intervention on the proxy raised the target value, but the second, more extreme one, did not (or vice versa)?

Here's a fictional story:

You decide to study more. Your grades go up. You like that, so you decide to study really really hard. You get burnt out. Your grades go down. (There's also an argument here that the metric - grades - isn't necessarily ideal, but that's a different thing.)*

*There might be a less extreme version involving 'you stay up late studying', and 'because you get less sleep it has l... (read more)

bvbvbvbvbvbvbvbvbvbvbv's Shortform

I wouldn't say there are flaws in the reasoning, just that multiple comparisons are more likely to have issues, it's only a proxy, etc.

It's an interesting idea.

1 bvbvbvbvbvbvbvbvbvbvbv 4d: Thanks. I am not aware of any way to quantify (even naively like my system) this kind of thing and I am very eager to hear about other ways people have found.
Am I anti-social if I get vaccinated now?

Your second argument seems to imply social neutrality, rather than pro- or anti-. It's not strong enough to match the claim above (although it is following a conditional).

1 samshap 5d: Fair. Since it's been better answered elsewhere, I withdrew the comment.
Taleuntum's Shortform

If you keep increasing P, the connection might break.

3 Taleuntum 5d: You are right, that is also a possibility. I only considered cases with one intervention, because the examples I've heard given for Goodhart's law only contain one (I'm thinking of UK monetary policy, Soviet nail factory and other cases where some "manager" introduces an incentive toward a proxy to the system). However, multiple intervention cases can also be interesting. Do you know of a real world example where the first intervention on the proxy raised the target value, but the second, more extreme one, did not (or vice versa)? My intuition suggests that in the real world those types of causal influences are rare and also, I don't think we can say that "P causes V" in those cases. Do you think that is too narrow of a definition?
Why do patients in mental institutions get so little attention in the public discourse?

Other possibilities that spring to mind are:

  • The difficulty of them becoming your voters
  • The opportunity has been overlooked. (The market is not all knowing.)
  • It conflicts with other interests already secured.
Why do patients in mental institutions get so little attention in the public discourse?

The question is why does the attic work so well. Why does no one talk about the attic?

2 Slider 4d: An attic runner likely has a very personal connection to the detainee. They also have better information control and likely include a lot more persuasion in eliciting cooperation from the detainee. It is easier to contain a person if they think it would be shameful for them to run loose.
Alcohol, health, and the ruthless logic of the Asian flush

Someone dies and you get sued. (All it takes is one allergic reaction, or someone who already had asthma, and you're a murderer.)

1 Daniel V 2d: I was joking ;) But the distinction between prophylaxis and treatment I think is useful because even if "it doesn't work" as one or both, it could work for the other and still be helpful.
Alcohol, health, and the ruthless logic of the Asian flush

Combine it with getting entrance to a place. It doesn't have to last too long, just long enough.

Alcohol, health, and the ruthless logic of the Asian flush

One day at work you discover a protein that crosses the blood-brain barrier and causes crippling migraine headaches if someone's attention drifts while driving.

Seems way too specific. This is going to go off under at least some other condition.


If these genes really are an adaptation, it shows how ruthless evolution can be. If you implanted a device in your kid that mildly poisoned them every time they drank, you'd be a monster. But evolution basically did that.

It doesn't make them get drunk faster?


No one cares about my freedom to rob convenienc

... (read more)
Is ("Chemical Imbalance" => Depression) an example of fake causality?
To what extent would said research be more difficult to do without a working hypothesis?

You would have to poke around, with no idea what you're looking for.

By what sort of process does the existence of a working hypothesis enable research?

The working hypothesis says you should try poking around over there, which narrows things down a little bit, but not very much.

To the extent that a working hypothesis is used in public communication with non-scientists about a given topic, why is it so?

People like having an explanation. Even if it tells you very little in... (read more)

The Generalized Product Rule

Is this just 'expected value follows some of the same rules as probability' or is there more to it?

1 aysajan 7d: It can be any real-valued measurement of objects, as long as we can reasonably assume the three assumptions are satisfied.
ChristianKl's Shortform

Do the transposons ever have positive benefits?

Why is your population all connected?

2 ChristianKl 7d: Transposons increase the mutation rate, so the fitness of organisms changes more when there are transposons in my model. When it comes to that, I treat every transposon equally. Otherwise, each transposon has a rate of self-replication. Aside from that, transposons have no positive benefits, but they self-reproduce. If the mutation rate is what's useful, then there should be pressure for transposons with low self-replication rates, which I don't see. Transposons work similarly to the gene drive ideas for killing off malaria-causing mosquitoes. However, the body does have some defenses. Both the transposons and the defenses evolve, and in nature there's an equilibrium. Finding the right parameters that lead to the equilibrium might produce a model that does predict aging purely based on the fact that transposons exist. Having such a theory would back up [] . From what I read, evolutionary models generally don't need group selection to work. If transposons would kill every species were it not for group selection, that would be a major scientific finding. There's the belief that the minimum number of individuals for a species is 500 over longer timeframes.
How do you keep track of your own learning?
It can't surveil your activities and see how much you've been studying.

It tries.

Game-theoretic Alignment in terms of Attainable Utility

That moment when the AI takes a treacherous turn

because it wasn't aligned up to affine transformations.

Covid 6/10: Somebody Else’s Problem

One of your links is broken:

Probably broken by twitter though, so...

Also, at this point I have zero faith that if we decided on reasonable precautions that were actually reasonable if followed, that those procedures would get followed, even by those who said they were following them. There would also be those who saw this as permission to do the research without even saying they would use the precautions. Either you ban this, or you don’t.

1984 style solution: the research is carried out and live-str... (read more)

3 Zvi 7d: Yep, that message means it's not broken, it got deleted.
Reply to Nate Soares on Dolphins


A dictionary definition is just a convenient pointer to help people pick out "the same" natural abstraction in their own world-model. Unambiguous discrete features make for better word definitions than high-dimensional statistical regularities, even if most of the everyday inferential utility of using the word comes from fuzzy high-dimensional[ ]statistical correlates, because discrete features are more useful as a simple membership test that can function as common knowledge to solve the coordination problem of matching up the meanings in different pe
... (read more)
Bayeswatch 3: A Study in Scarlet
"Weather is subject to the butterfly effect," said Vi.

The interesting question is:

  • would the red paint make the change?
  • is the desired change made by the satellite and missile launched in response to the red paint job?
  • Or is it tired of making incorrect predictions and ensuring its own destruction to that end?
Qria's Shortform

Two versions of a goal:

World Peace

Preventing a war you think is going to happen

The 2nd may have a (close) deadline; the 1st might have a distant deadline, like when the sun burns out, or something closer, like before you die or when 'an AGI revolution (like the industrial revolution) starts' (assuming you think AGI will happen before the sun burns out).

Five Whys

Why aren’t you exercising?

  • Because it’s difficult to stop mindlessly browsing the web in the evening to start exercising.
    • Possible solution:

Maybe I should get up early and exercise.

Often, enemies really are innately evil.

TL;DR: I was talking about selection bias from you still being alive (I assume).

My point was that, given that the protagonist of Worm almost died, most people probably won't have experienced that level of bullying (unless we count dead people among 'people who have experienced' it), because there's a selection effect from being alive. Conditioning on survival* probabilistically selects against more extreme torture, and towards none at all. At the limit, no one survives, and thus everyone who is alive has experienced such things with probability zero.

*... (read more)
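
The selection effect described above can be sketched numerically. This is only an illustration: the severity levels, the prior, and the survival probabilities are made-up numbers, and `severity_given_survival` is a hypothetical helper, not anything from the thread.

```python
from fractions import Fraction

def severity_given_survival(prior, survival):
    """Return P(severity | survived) for each discrete severity level."""
    # Joint probability of each severity level AND surviving it.
    joint = {level: prior[level] * survival[level] for level in prior}
    p_survived = sum(joint.values())
    # Bayes' rule: condition on having survived.
    return {level: joint[level] / p_survived for level in joint}

# Hypothetical numbers, chosen only to show the direction of the effect.
prior = {"none": Fraction(1, 2), "mild": Fraction(3, 10), "extreme": Fraction(1, 5)}
survival = {"none": Fraction(1), "mild": Fraction(1), "extreme": Fraction(1, 2)}

among_survivors = severity_given_survival(prior, survival)
print(among_survivors["extreme"])  # smaller than the prior of 1/5
```

With these assumed numbers the share of "extreme" cases falls from 1/5 in the whole population to 1/9 among survivors, and it falls to zero if the survival probability for that level is set to zero, which is the limiting case mentioned above.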

Often, enemies really are innately evil.
No bullying I or anyone else I know has experienced was that bad, but the point is, bullies can go far beyond name-calling or even hitting.

Selection bias much?

1 Andrew Vlahos 10d: Not quite, since although it never went that far, there was a legitimate concern that I could get killed. Also, I needed to show a specific example of a bully taking the extra effort to do extra harm, and giving a real example would be, well, problematic.
Often, enemies really are innately evil.
Don't think this study is big enough to be representative?

How big is the study?

What to optimize for in life?

Patrick Collins might not think that is the only thing to optimize for - just one that is underrated.

2 adamzerner 10d: I assume he meant it as a heuristic. It's hard to weigh hundreds of variables at once, but when you optimize for speed, good things follow from that.
"How to Talk About Books You Haven't Read", by Pierre Bayard
So if the underlying message of this argument is “it’s ok to shoot the shit,” I agree. If it’s “sometimes stories and ideas can be conveyed by texts other than the original,” that’s trivially true. If it’s “you can make assumptions about the contents of a given book, then opine on the book itself,” that seems very wrong to me.
  • Prior + Evidence = Posteriors*
  • “you can make assumptions about the contents of a given book, then opine on [your model of] the book”
  • Is there a specific book you haven't read? Why?

*(Technically it's P(X | Evidence) = P(Evidence | X)*P(X)/P(Evidence).)
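
As a rough illustration of the formula in the footnote, here is a minimal Python sketch. The `posterior` helper and all the numbers are hypothetical; X could stand for something like "this book argues for Y" and the evidence for something like "a review mentions Y".

```python
from fractions import Fraction

def posterior(prior, likelihood, likelihood_if_not):
    """Return P(X | E) given P(X), P(E | X) and P(E | not X)."""
    # P(E) by the law of total probability.
    p_evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    # Bayes' rule: P(X | E) = P(E | X) * P(X) / P(E).
    return likelihood * prior / p_evidence

# Hypothetical numbers, for illustration only.
result = posterior(
    prior=Fraction(1, 5),              # P(X)
    likelihood=Fraction(9, 10),        # P(E | X)
    likelihood_if_not=Fraction(3, 10), # P(E | not X)
)
print(result)  # 3/7
```

Under these assumed numbers, a prior of 1/5 moves to a posterior of 3/7: the evidence shifts the estimate a long way without settling the question.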

An Intuitive Guide to Garrabrant Induction


However, even if you did know the source code, you might still be ignorant about what it would do.

The Halting Problem.

As a simple example, suppose I violate the axiom that P(Heads)+P(Not Heads)=1 by having P(Not Heads)=P(Heads)=1/3. Given my stated probabilities, I think a 2:1 bet that the coin is Heads is fair and a 2:1 bet that the coin is Not Heads is fair; this combination of bets is guaranteed to lose me $1, making me Dutch-bookable.

It's not clear why you would think that bet is fair.
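
One possible reading of the quoted example, sketched in Python. The assumption here (not stated in the quote) is that I am willing to sell, at what I consider the fair price, a ticket that pays $3 if the event happens; at a stated probability of 1/3 that price is $1, which corresponds to 2:1 odds. `dutch_book_net` is a hypothetical helper.

```python
from fractions import Fraction

def dutch_book_net(p_heads, p_not_heads, payout=3):
    """Net result of selling one ticket on each outcome at my own 'fair' price."""
    # The price I consider fair for a ticket paying `payout` is probability * payout.
    income = p_heads * payout + p_not_heads * payout
    # Exactly one of the two tickets pays out, whichever way the coin lands.
    return {outcome: income - payout for outcome in ("Heads", "Not Heads")}

incoherent = dutch_book_net(Fraction(1, 3), Fraction(1, 3))
coherent = dutch_book_net(Fraction(1, 3), Fraction(2, 3))
print(incoherent["Heads"], incoherent["Not Heads"])
print(coherent["Heads"], coherent["Not Heads"])
```

With the incoherent probabilities I collect $2 and pay out $3 whatever happens, a sure loss of $1. With probabilities that sum to 1, the same pair of sales nets zero in both outcomes.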

Solomonoff induction is an example of an ideal empirical in
... (read more)
1 flodorner 12d: Re neural networks: All one billion parameter networks should be computable in polynomial time, but there exist functions that are not expressible by a one billion parameter network (perhaps unless you allow for an arbitrary choice of nonlinearity)
Rogue AGI Embodies Valuable Intellectual Property


A naive story for how humanity goes extinct from AI: Alpha Inc. spends a trillion dollars to create Alice the AGI. Alice escapes from whatever oversight mechanisms were employed to ensure alignment by uploading a copy of itself onto the internet. Alice does not have to pay an alignment tax, and so outcompetes Alpha and takes over the world.
On its face, this story contains some shaky arguments. In particular, Alpha is initially going to have 100x-1,000,000x more resources than Alice. Even if Alice grows its resources faster, the alignment tax would ha
... (read more)
Selection Has A Quality Ceiling


Combine searching and training to make the task not impossible. Use/make groups that have more skills than exist in an individual (yet). Do we 'basically understand paradigm changes/interdisciplinary efforts?' If you need a test you don't have, maybe you should make that test. Pay attention to growth - if you want someone (or a group) better than the best in the world, you need someone who is growing, or can grow, past that point. Maybe you'll have to create a team that's better than the best (that currently exists) in the world - possibly people who are currentl... (read more)

The Cost of Convenience....
In this piece, I argue that by making a convenient world, we have made less meaning in the world.

Is it also convenient relative to other goals like 'having (desired) inconvenience'?

Networks of Trust vs Markets
This post could be read as an introduction to a (hypothetical) sequence about using and scaling networks of trust. If there is interest, I might write another post detailing my observations so far. Any thoughts?

I'd be interested in that.

Open and Welcome Thread - May 2021

That's a great post by the way. I loved it.

Networks of Trust vs Markets
That immediately raised two questions.
1. How can I find more hippies?
2. Why are markets so expensive?
Let's look at the second one.

Based on the name of this piece, I'm not surprised you went there, but the first question sounds like it might change your life.

3 Henrik Karlsson 15d: To be clear, by hippie I mean any highly trustworthy person that is willing to do non-market transactions with an extended network, or something like that. But yeah, actively looking for more people like that is having some interesting effects on my life. This post could be read as an introduction to a (hypothetical) sequence about using and scaling networks of trust. If there is interest, I might write another post detailing my observations so far. Any thoughts? Thinking in public like this has been great - several of the comments have helped me clarify my thinking and pinpoint new opportunities and potential pitfalls.
Power dynamics as a blind spot or blurry spot in our collective world-modeling, especially around AI
any good graduate education in mathematics will teach you that for the purpose of understanding something confusing, it’s always best to start with the simplest non-trivial example.  

While that comment is meant as a metaphor, I'd say it's always best to start with a trivial example. Seriously, start with the number of dimensions d, and turn it all the way down to 0* and 1, solve those cases, and draw a line all the way to where you started, and check if it's right.

Reflective Oracles (fallenstein2015reflective) are another c
... (read more)
2 habryka 16d: We had a bit of discussion in the tagger Slack about it. It mostly felt like it was better captured by some more specific tags, and it ended up without many posts in it after existing for multiple weeks. A lot of stuff was better captured by well-being, or longevity, or productivity, and then what was left didn't seem above critical mass for having its own tag, though totally plausible we should have one if we get more health-related posts on the site (or if someone wants to put in the effort to actually find all of them and curate them, in which case people should feel free to create one again).
Zen and Rationality: Continuous Practice
Scott Alexander wrote that rationality is a habit to be cultivated. As such, cultivation of that habit requires ongoing work, which he captured with the phrase "constant vigilance".

I thought that showed up in the sequences first, though that might have just been methods?

Forecasting Newsletter: May 2021
Probability theory does not extend logic (predicate calculus). In particular, freely mixing logical quantifiers (∀, ∃) and probability statements gets messy fairly quickly, and the tools to disambiguate their meaning may not be found solely in probability theory (but perhaps in statistical inference or in the study of causality.)

The original article made it sound like that was an area of unfinished research (at the time it was written). If that's been solved, I imagine the original writer might want to know about it.

1 NunoSempere 14d: No, this hasn't been solved. But I imagine that mixing logical quantifiers and probability statements would be less messy if one e.g., knows the causal graph of the events to which the statements refer. This is something that the original post didn't mention, but which I thought was interesting.
For Better Commenting, Take an Oath of Reply.
Committing to reply to any comment seems like [a bad idea].

Then don't. It could be 'at least one', or 'First', or something. There could also be something like 'if no one posts any comments on this, then (after a week) I will'.

There's also the option of including an 'unless I think you're a troll' clause.

I also want to give my commenters a chance to talk to each other without me interrupting.

The oath could be conditional on being invoked?

Like, 'I will respond to the first 5 questions

a) about this p... (read more)

For Better Commenting, Take an Oath of Reply.
For this post, my Oath of Reply is to respond to top-level comments at least once through August 2021.

Top level comments on what? This post?

Testing The Natural Abstraction Hypothesis: Project Intro
The natural abstraction hypothesis can be split into three sub-claims, two empirical, one mathematical:

The third one:

Convergence: a wide variety of cognitive architectures learn and use approximately-the-same summaries.

Couldn't this be operationalized as empirical if a wide variety...learn and give approximately the same predictions and recommendations for action (if you want this, do this), i.e. causal predictions?

Human-Compatibility: These summaries are the abstractions used by humans in day-to-day thought/language.

This seems contingent on 'the... (read more)

4 johnswentworth 17d: Very good question, and the answer is no. That may also be a true thing, but the hypothesis here is specifically about what structures the systems are using internally. In general, things could give exactly the same externally-visible predictions/actions while using very different internal structures. You are correct that this is a kind of convergence claim. It's not claiming convergence in all intelligent systems, but I'm not sure exactly what the subset of intelligent systems is to which this claim applies. It has something to do with both limited computation and evolution (in a sense broad enough to include stochastic gradient descent).
A.D&D.Sci May 2021 Evaluation and Ruleset
In general, one would define cooperation in games as strategies that lead to better overall gains, and ignore effort involved in thinking up the strategy.

You should change your username to 'one' then.*

Imagine a game where the 'optimal strategy' is more difficult to calculate than the optimal strategy in chess. Or, suppose you're playing a chess game. You know how to calculate the optimal strategy. Unfortunately, it will take 10 years to calculate on your supercomputer, and you can't take 10 years to make the first move. To ne... (read more)

Covid 5/27: The Final Countdown
Should we update to give more credence to other things that are labeled as ‘conspiracy theory’? That’s tricky. I don’t think this was a ‘grand conspiracy’ or anything, nor do I think those suppressing the theory had any knowledge of whether or not the virus leaked from a lab. My model says this is how the system works by default, with all who form the system instinctively moving to implement the suppression of such speculations, without any need to coordinate. 

A synchronized theory.

It’s important to not
... (read more)
How likely is our view of the cosmos?
In terms of what we see in the night sky, are we a statistical anomaly compared to the average star system?

The night sky:

Earth's moon might also be a bit unusual.

Other than that:

We've got one star (say, as opposed to two). I'm not sure what the threshold for statistical anomaly is, but it's less common. I'm also not sure how common it is for a star to have planets orbiting it, as opposed to having none.

6 Callmesalticidae 13d: It's also unusual that our moon is just the right size that, during a solar eclipse, we can see the solar corona. If the moon were much bigger, the corona would be obscured, and if it were smaller, too much of the rest of the sun would be visible for us to look at it.
Antifragility in Games of Chance, Research, and Debate
Your argument is fragile when you argue in soldier mindset, aiming for victory. A bad-faith argument risks defeat even if you win. Even if you persuade your opponent, you risk transmitting a bad idea from you to them. If you care at all about accuracy or making the world a better place, this should scare you.

bad idea, or false/misleading idea

If you argue with an open mind, you might be right or wrong. If you're right, then you stand a chance of helping your debate partner come to a more accurate worldview. If you're wrong, then it is you who will
... (read more)
A.D&D.Sci May 2021 Evaluation and Ruleset

Returns on time aside (I meant that question seriously - plotting out a returns on compute versus compute (time) curve sounds interesting***):

It* requires less effort because 'cooperation' reduces effort, while 'competition' increases it**.

(This is also measurable in the split between the traveler and the players.)

*The strategy


***In particular, getting a sense for something like the marginal returns on time invested, and then comparing it across problems.

-1 simon 21d: In general, one would define cooperation in games as strategies that lead to better overall gains, and ignore effort involved in thinking up the strategy. In this case, there was an easy cooperative strategy, but that's not true in general; for example, in the Darwin Game [], designing a cooperative strategy was more complicated than a simple 3-bot defect strategy. 3-bot didn't do well but possibly could have if there were a lot of non-punishing simulators submitted (there weren't). Also, even in this particular case, you could have had better results if you had taken the effort to get more to follow the same strategy. The rules did not explicitly forbid coordination, even by non-Lesswrongers, so you could have recruited a horde of acquaintances to spam 1-bids. (That might have been against the spirit of the rules, but you could have asked abstractapplic about it first, I guess.)
A.D&D.Sci May 2021 Evaluation and Ruleset
we're trying to have a competition here!).

How much time did you spend coming up with that strategy?

1 simon 23d: Good point. I should have anticipated strategies that require less effort to be more popular.
A.D&D.Sci May 2021 Evaluation and Ruleset
(2-bidders win in the 1-bidder-filled environment)


  • are the 2-bidders stable against 'defection'?
  • There weren't any 2-bidders.
2 simon 23d: Of course not, they lose to 3-bidders. I wouldn't consider that "defection" in the same way though, since the 1-bidding is presumably an attempt at coordination and the 2-bidding would be exploiting that coordination and not directly a coordination attempt. Sure, but if 1-bidding were to become popular in similar problems, there would start to be 2-bidders.