Nominated Posts for the 2019 Review

Posts need at least 2 nominations to continue into the Review Phase.
Nominate posts that you have personally found useful and important.

2019 Review Discussion

Paul Graham has a new essay out, The Lesson to Unlearn, on the desire to pass tests. It covers the basic points made in Hotel Concierge's The Stanford Marshmallow Prison Experiment. But something must be missing from the theory, because what Paul Graham did with his life was start Y Combinator, the apex predator of the real-life Stanford Marshmallow Prison Experiment. Or it's just false advertising. 

As a matter of basic epistemic self-defense, the conscientious reader will want to read the main source texts for this essay before seeing what I do to them:

  1. The Lesson to Unlearn
  2. The Stanford Marshmallow Prison Experiment
  3. Sam Altman's Manifest Destiny
  4. Black Swan Farming
  5. Sam Altman on Loving Community, Hating Coworking, and the Hunt for Talent

The first four are recommended on their own merits as well. For...

I did the first 20 from each column of your spreadsheet, and got a different result. I hid your answers before writing mine. My rubric was different; instead of focusing on social value, I focused on what type of business relations a company has. You can see my answers here. These are all very noisy, and I'm not entirely confident I didn't have rating-drift between when I did the YC ones and when I did the S&P ones, but I got a slightly higher score for S&P companies.

In my rubric, things that mean low scores:

  • You have a Compliance department
  • A signif
... (read more)

Based on a comment I made on this EA Forum Post on Burnout.

Related links: Sabbath hard and go home, Bring Back the Sabbath

That comment I made generated more positive feedback than usual (in that people seemed to find it helpful to read and found themselves thinking about it months after reading it), so I'm elevating it to a LW post of its own. Consider this an update to the original comment.

Like Ben Hoffman, I stumbled upon and rediscovered the Sabbath (although my implementation seems different from both Ben and Zvi). I was experiencing burnout at CFAR, and while I wasn't able to escape the effects entirely, I found some refuge in the following distinction between Rest Days and Recovery Days.

Recovery Days

A Recovery Day is where...

"Free Day", while perhaps not the best option overall, has the merit that these days involving freeing the part of you that communicatess through your gut (and through what you feel like doing). During much of our working (and non-working) week, that part is overridden by our mind's sense of what we have to do. 

By contrast, in OP's Recovery Days this part is either:

(a) doing the most basic recharging before it can do things it positively feels like and enjoys, or

(b) overridden or hijacked by addictive behaviours that it doesn't find as roundly rewarding as Free Day activities.

Addiction can also be seen as a lack of freedom. 

1 tog 2d: I agree about the names. 'Rest' days are particularly confusing, since recovery days involve a lot of rest. A main characteristic of 'rest' days instead seems to be doing what you feel like and following your gut.

(This is a comment that has been turned into a post.)

From Chris_Leong’s post, “Making Exceptions to General Rules”:

Suppose you make a general rule, i.e. “I won’t eat any cookies”. Then you encounter a situation that legitimately feels exceptional: “These are generally considered the best cookies in the entire state”. This tends to make people torn between two threads of reasoning:

  1. Clearly the optimal strategy is to make an exception this one time and then follow the rule the rest of the time.

  2. If you break the rule this one time, then you risk dismantling the rule and ending up not following it at all.

How can we resolve this? …

This is my answer:

Consider even a single exception to totally undermine any rule. Consequently, only follow rules with no exceptions.[1]

4 Unnamed 2d Review: It seems like the core thing that this post is doing is treating the concept of "rule" as fundamental. If you have a general rule plus some exceptions, then obviously that "general rule" isn't the real process that is determining the results. And noticing that (obvious once you look at it) fact can be a useful insight/reframing. The core claim that this post is putting forward, IMO, is that you should think of that "real process" as being a rule, and aim to give it the virtues of good rules such as being simple, explicit, stable, and legitimate (having legible justifications). An alternative approach is to step outside of the "rules" framework and get in touch with what the rule is for - what preferences/values/strategy/patterns/structures/relationships/etc. it serves. Once you're in touch with that purpose, then you can think about both the current case, and what will become of the "general rule", in that light. This could end up with an explicitly reformulated rule, or not. It seems like treating the "real process" as a rule is more fitting in some cases than others, a better fit for some people's style of thinking than for other people's, and also something that a person could choose to aim for more or less. I think I'd find it easier to think through this topic if there was a long, diverse list of brief examples.

This comment discusses a class of situations where what you say seems likely to be true.

In most other cases, I think the sort of attitude you describe is likely to be a way to avoid admitting (to yourself or others) what the “real rules” are. Once you start saying stuff like…

… treating the “real process” as a rule is more fitting in some cases than others, a better fit for some people’s style of thinking than for other people’s, and also something that a person could choose to aim for more or less.

… then the usefulness of the concept/approach described... (read more)

1 Ericf 3d: It sounds like the actual meta-meta-rule is not "Real rules have no exceptions," but rather "the full set of all rules, and their relative priorities, fully determines (compliant) behavior, without any specific case exceptions".


Internal Family Systems (IFS) is a psychotherapy school/technique/model which lends itself particularly well to being used alone or with a peer. For years, I had noticed that many of the kinds of people who put a lot of work into developing their emotional and communication skills, some within the rationalist community and some outside it, kept mentioning IFS.

So I looked at the Wikipedia page about the IFS model, and bounced off, since it sounded like nonsense to me. Then someone brought it up again, and I thought that maybe I should reconsider. So I looked at the WP page again, thought “nah, still nonsense”, and continued to ignore it.

This continued until I participated in CFAR mentorship training last September, and we had a class on CFAR’s...

The back-and-forth (here and elsewhere) between Kaj & pjeby was an unusually good, rich, productive discussion, and it would be cool if the book could capture some of that. Not sure how feasible that is, given the sprawling nature of the discussion.

Related to: Asymmetric Justice, Privacy, Blackmail

Previously (Paul Christiano): Epistemic Incentives and Sluggish Updating

The starting context here is the problem of what Paul calls sluggish updating. Bob is asked to predict the probability of a recession this summer. He said 75% in January, and now believes 50% in February. What to do? Paul sees Bob as thinking roughly this:

If I stick to my guns with 75%, then I still have a 50-50 chance of looking smarter than Alice when a recession occurs. If I waffle and say 50%, then I won’t get any credit even if my initial prediction was good. Of course if I stick with 75% now and only go down to 50% later then I’ll get dinged for making a bad prediction right now—but that’s little worse than what


This post seems to me to be misunderstanding a major piece of Paul's "sluggish updating" post, and clashing with Paul's post in ways that aren't explicit.

The core of Paul's post, as I understood it, is that incentive landscapes often reward people for changing their stated views too gradually in response to new arguments/evidence, and Paul thinks he has often observed this behavioral pattern which he called "sluggish updating." Paul illustrated this incentive landscape through a story involving Alice and Bob, where Bob is thinking through his optimal strat... (read more)

“I didn’t get much done last week because it was so hot.”

It wasn’t the first time a client said this to me, and I was curious. “Have you considered getting an air conditioner if it’s that bad?”

“No,” he replied (let’s call him Philip). “An open window is usually enough. It’s just that the heatwave this week was particularly bad.” When we discussed it a bit more, Philip said it didn’t seem worth the hassle since the AC only really felt necessary for a few weeks during the summer.

So, was Philip right? Is it worth buying an AC as a productivity hack?

How much is your time worth?

Most people don’t think about air conditioners when evaluating their productivity, because it’s not within the normal...

ACs don't "completely" solve the problem, because they:

a) Have time costs for purchase, installation and maintenance, none of which are included here.
b) Make noise, and at least some of us work less efficiently while subject to noise.

Followup to: What Evidence Filtered Evidence?

In "What Evidence Filtered Evidence?", we are asked to consider a scenario involving a coin that is either biased to land Heads 2/3rds of the time, or Tails 2/3rds of the time. Observing Heads is 1 bit of evidence for the coin being Heads-biased (because the Heads-biased coin lands Heads with probability 2/3, the Tails-biased coin does so with probability 1/3, the likelihood ratio of these is $\frac{2/3}{1/3} = 2$, and $\log_2 2 = 1$), and analogously and respectively for Tails.
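The bits-of-evidence calculation in the paragraph above can be sketched in a few lines. This is only an illustration of the arithmetic: the function name `bits_of_evidence` is hypothetical, and the 2/3 and 1/3 probabilities are the ones given in the text.

```python
import math


def bits_of_evidence(p_obs_given_h: float, p_obs_given_alt: float) -> float:
    """Log base 2 of the likelihood ratio P(obs | H) / P(obs | alternative)."""
    return math.log2(p_obs_given_h / p_obs_given_alt)


# One observed Heads, as evidence for "Heads-biased" over "Tails-biased".
heads_bits = bits_of_evidence(2 / 3, 1 / 3)

# One observed Tails, scored against the same hypothesis (the sign flips).
tails_bits = bits_of_evidence(1 / 3, 2 / 3)

print(f"Heads: {heads_bits:+.1f} bit(s) toward Heads-biased")
print(f"Tails: {tails_bits:+.1f} bit(s) toward Heads-biased")
```

A positive value favors the Heads-biased hypothesis and a negative value favors the Tails-biased one, so independent flips can be combined by adding their bits.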

If such a coin is flipped ten times by someone who doesn't make literally false statements, who then reports that the 4th, 6th, and 9th flips came up Heads, then the update to our beliefs about the coin depends on what algorithm the not-lying[1] reporter used to...

Crucial. I definitely remember reading this and thinking it was one of the most valuable posts I'd seen all year. Good logical structure.

But it's hard to read? It has a jarring, erratic rhetorical flow; succinct where elaboration is predictably needed, and verbose where it is redundant. A mathematician's scratch notes, I think.

Cross-posted from Putanumonit.


Imagine that tomorrow everyone on the planet forgets the concept of training basketball skill.

The next day everyone is as good at basketball as they were the previous day, but this talent is assumed to be fixed. No one expects their performance to change over time. No one teaches basketball, although many people continue to play the game for fun.

Geneticists explain that some people are born with better hand-eye coordination and are able to shoot a basketball accurately. Economists explain that highly-paid NBA players have a stronger incentive to hit shots, which explains their improved performance. Psychologists note that people who take more jump shots each day hit a higher percentage and theorize a principal factor of basketball affinity that influences both desire and skill at...

The Fosbury flop is a good analogy. Where I think it falls short is that rationality is indeed a much more complex thing than jumping. You would need more than just the invention and application of a technique by one person for a paradigm shift: it would at least also require distilling the technique well, learning how to teach it well, and changing the rationality canon in light of it.

I think a paradigm shift would happen when a new rationality canon is created and adopted that outperforms the current sequences (very likely also containing new tec... (read more)

Followup to: Where to Draw the Boundary?

Figuring where to cut reality in order to carve along the joints—figuring which things are similar to each other, which things are clustered together: this is the problem worthy of a rationalist. It is what people should be trying to do, when they set out in search of the floating essence of a word.

Once upon a time it was thought that the word "fish" included dolphins ...

The one comes to you and says:

The list: {salmon, guppies, sharks, dolphins, trout} is just a list—you can't say that a list is wrong. You draw category boundaries in specific ways to capture tradeoffs you care about: sailors in the ancient world wanted a word to describe the swimming finned creatures that they saw in

2 abramdemski 6d: The point of the thought experiment is that, for the alien, all of that is totally mundane (ie scientific) knowledge. So why can't that observation count as scientific for us? IE, just because we have control over a thing doesn't -- in my ontology -- indicate that the concept of map/territory correspondence no longer applies. It only implies that we need to have conditional expectations, so that we can think about what happens if we do one thing or another. (For example, I know that if I think about whether I'm thinking about peanut butter, I'm thinking about peanut butter. So my estimate "am I thinking about peanut butter?" will always be high, when I care to form such an estimate.) And how is the temporal point at which something comes into existence relevant to whether we need to track it accurately in our map, aside from the fact that things temporally distant from us are less relevant to our concerns? Your reply was very terse, and does not articulate very much of the model you're coming from, instead mostly reiterating the disagreement. It would be helpful to me if you tried to unpack more of your overall view, and the logic by which you reach your conclusions. I know that you have a concept of "pre-existing reality" which includes rocks and not money, and I believe that you think things which aren't in pre-existing reality don't need to be tracked by maps (at least, something resembling this). What I don't see is the finer details of this concept of pre-existing reality, and why you think we don't need to track those things accurately in maps. The point of my rock example is that the smashed rock did not exist before we smashed it. Or we could say "the rock dust" or such. In doing so, we satisfy your temporal requirement (the rock dust did not exist until we smashed it, much like money did not exist until we conceived of it).
We also satisfy the requirement that we have complete control over it (we can make the rock dust, just like we can invent gay marr

The point of the thought experiment is that, for the alien, all of that is totally mundane (ie scientific) knowledge. So why can’t that observation count as scientific for us?

The point is that the rule "if it is not in the territory it should not be in the map" does not apply in cases where we are constructing reality, not just reflecting it.

If you are drafting a law to introduce gay marriage, it isn't an objection to say that it doesn't already exist.

IE, just because we have control over a thing doesn’t—in my ontology—indicate that the concept of map/t

... (read more)
2 abramdemski 6d: It seems like part of our persistent disagreement is:
  • I see this as one of very few pathways, and by far the dominant pathway, by which beliefs can be beneficial in a different way from useful-for-prediction
  • You see this as one of many many pathways, and very much a corner case
I frankly admit that I think you're just wrong about this, and you seem quite mistaken in many of the other pathways you point out. The argument you quoted above was supposed to help establish my perspective, by showing that there would be no reason to use gerrymandered concepts unless there was some manipulation going on. Yet you casually brush this off as a very particular set of problems. As a general policy, I think that yes, frequently pointing out subtler inaccuracies in language helps practice specificity and gradually refines concepts. For example, if you keep pointing out that tomatoes are fruit, you might eventually be corrected by someone pointing out that "vegetable" is a culinary distinction rather than a biological one, and so there is no reason to object to the classification of a tomato as a vegetable. This could help you develop philosophically, by providing a vivid example of how we use multiple overlapping classification systems rather than one; and further, that scientific-sounding classification criteria don't always take precedence (IE culinary knowledge is just as valid as biology knowledge). In what you quoted, I was trying to point out the distinction between speaking a certain way vs thinking a certain way. My overall conversational strategy was to try to separate out the question of whether you should speak a specific way from the question of whether you should think a specific way. This was because I had hoped that we could more easily reach agreement about the "thinking" side of the question.
More specifically, I was pointing out that if we restrict our attention to how to think, then (I claim) the cost of using concepts for non-epistemic reasons is
4 Raemon 6d: (I say all of this largely agreeing with the thrust of what the post and your (Abram's) comments are pointing at, but feeling like something about the exact reasoning is off. And it feeling consistently off has been part of why I've taken a while to come around to the reasoning)

In discussions of algorithm bias, the COMPAS scandal has been too often quoted out of context. This post gives the facts, and the interpretation, as quickly as possible. See this for details.

The fight

The COMPAS system is a statistical decision algorithm trained on past statistical data on American convicts. It takes as inputs features of the convict and outputs a "risk score" that indicates how likely the convict is to reoffend if released.

In 2016, the organization ProPublica claimed that COMPAS is clearly unfair to blacks in one way. Northpointe replied that it is approximately fair in another way. ProPublica rebutted with many statistical details that I didn't read.

The basic paradox at the heart of the contention is very simple and is not a simple "machines are biased because it learns


All evidence is "biased compared to our prior". That is what evidence is.

2 IlyaShpitser 5d: 90% of the work ought to go into figuring out what fairness measure you want and why. Not so easy. Also not really a "math problem." Most ML papers on fairness just solve math problems.
3 BrianChristian 7d:
>> False negative rate = (false negative)/(actual positive)
>> False positive rate = (false positive)/(actual negative)
Correct me if I’m mistaken, but isn’t it:
False negative rate = (false negative)/(false negative + actual positive)
False positive rate = (false positive)/(false positive + actual negative)
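For reference, here is a minimal sketch of the standard confusion-matrix definitions of these two rates. It assumes that "actual positive" means every truly positive case (true positives plus false negatives) and likewise for "actual negative". The counts below are hypothetical and are not COMPAS data.

```python
def false_negative_rate(true_pos: int, false_neg: int) -> float:
    """FN / (all actually positive cases) = FN / (TP + FN)."""
    actual_positive = true_pos + false_neg
    return false_neg / actual_positive


def false_positive_rate(false_pos: int, true_neg: int) -> float:
    """FP / (all actually negative cases) = FP / (FP + TN)."""
    actual_negative = false_pos + true_neg
    return false_pos / actual_negative


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
tp, fn, fp, tn = 30, 20, 10, 40

fnr = false_negative_rate(tp, fn)  # 20 of the 50 actual positives were missed
fpr = false_positive_rate(fp, tn)  # 10 of the 50 actual negatives were flagged

print(f"FNR = {fnr:.2f}, FPR = {fpr:.2f}")
```

Under this reading the denominator already includes the false negatives (or false positives), so adding them a second time would double-count them.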

Cross posted from the EA Forum.

Epistemic Status: all numbers are made up and/or sketchily sourced. Post errs on the side of simplistic poetry – take seriously but not literally.

If you want to coordinate with one person on a thing about something nuanced, you can spend as much time as you want talking to them – answering questions in realtime, addressing confusions as you notice them. You can trust them to go off and attempt complex tasks without as much oversight, and you can decide to change your collective plans quickly and nimbly.

You probably speak at around 100 words per minute. That's 6,000 words per hour. If you talk for 3 hours a day, every workday for a year, you can communicate 4.3 million words worth...
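The arithmetic above can be checked with a quick sketch. The figure of 240 workdays per year is an assumption (the post does not state it); it is simply a round number that reproduces the quoted total of roughly 4.3 million words.

```python
WORDS_PER_MINUTE = 100      # from the post
HOURS_PER_DAY = 3           # from the post
WORKDAYS_PER_YEAR = 240     # assumption: not stated in the post

words_per_hour = WORDS_PER_MINUTE * 60
words_per_day = words_per_hour * HOURS_PER_DAY
words_per_year = words_per_day * WORKDAYS_PER_YEAR

print(f"{words_per_hour:,} words per hour")
print(f"{words_per_day:,} words per day")
print(f"{words_per_year:,} words per year")
```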

Not about Gendlin, but following the trail of relating chunks to other things: I wonder if propaganda or cult indoctrination can be described as a malicious chunking process.

I've weighed in against taking the numbers literally elsewhere, but following this thread I suddenly wondered if the work that using few words was doing isn't delivering the chunk, but rather screening out any alternative chunk. If what we are interested in is common knowledge, it isn't getting people to develop a chunk per se that is the challenge; rather everyone has to agree on exac... (read more)

2 ryan_b 6d: This part points pretty directly at research debt and inferential distance, where the debt is how many of these chunks need to be named and communicated as chunks, and the distance is how many re-chunking steps need to be done.

Reply to: Decoupling vs Contextualising Norms

Chris Leong, following John Nerst, distinguishes between two alleged discursive norm-sets. Under "decoupling norms", it is understood that claims should be considered in isolation; under "contextualizing norms", it is understood that those making claims should also address potential implications of those claims in context.

I argue that, at best, this is a false dichotomy that fails to clarify the underlying issues—and at worst (through no fault of Leong or Nerst), the concept of "contextualizing norms" has the potential to legitimize derailing discussions for arbitrary political reasons by eliding the key question of which contextual concerns are genuinely relevant, thereby conflating legitimate and illegitimate bids for contextualization.

Real discussions adhere to what we might call "relevance norms": it is almost universally "eminently reasonable to expect

2 Raemon 8d: Hmm. I don't think I want to commit to a huge discussion of it. I'm happy to continue doing async LW comments about it. I'm busier than usual this month. There might turn out to be a day I had a spare hour or two to chat in more detail but don't think I want to spend cognition planning around that. I think I've mostly said my main piece and am fairly happy with "LW members can read what Matt and Ray have said so far and vote accordingly." If you raise specific points on specific posts I (and others) might change their vote for those posts.
2 Matt Goldenberg 8d: Yeah so I think my thought on this is that it's often impossible to point at these sorts of missing frames or implicit assumptions in a single post. In my review of Liron's post I was able to pull out a bunch of quotes pointing to some specific frames, but that's because it was unusually dense with examples. In the case of this post, if I were to do the same thing, I think I'd have to pull out quotes from at least 3-4 of the posts in the sequence to point to this underlying straw man (in this case I didn't actually do that and just sort of hoped others could do it on their own through reading my review).
4 Raemon 8d: That seems true, but I think it still makes sense to concentrate the discussion on particular posts. (Zack specifically disavowed this post and the meta-honesty response, so I think it makes most sense to concentrate on Where To Draw The Boundaries and Heads I Win, Tails Never Heard Of Her) I think it's reasonable to bring up "this post seems rooted in a wrong frame" on both of those, linking to other examples. But my own voting algorithm for those posts will personally be asking "does this single post have a high overall mix of 'true' and 'important'?" I think most posts in the review, even the top posts, have something wrong with them, and in some cases I disagree with the author about which things are wrong-enough-to-warrant-fixing. I do feel that the overall review process isn't quite solid enough for me to really endorse the Best Of book as a statement of "The LessWrong Community fully endorses this post", and I think that's a major problem to be fixed for next year. But meanwhile I think it makes more sense to accept that some posts will have flaws.

Zack specifically disavowed this post and the meta-honesty response, so I think it makes most sense to concentrate on Where To Draw The Boundaries and Heads I Win, Tails Never Heard Of Her

Ahh, I didn't realize that, definitely would not have reviewed this post if I realized this was the case.

But my own voting algorithm for those posts will personally be asking "does this single post have a high overall mix of 'true' and 'important'?"

Yeah I think this is reasonable. I'm worried about things that are wrong in subtle, non-obvious ways with certain frames or as... (read more)

Related and required reading in life (ANOIEAEIB): The Copenhagen Interpretation of Ethics

Epistemic Status: Trying to be minimally judgmental

Spoiler Alert: Contains minor mostly harmless spoiler for The Good Place, which is the best show currently on television.

The Copenhagen Interpretation of Ethics (in parallel with the similarly named one in physics) is as follows:

The Copenhagen Interpretation of Ethics says that when you observe or interact with a problem in any way, you can be blamed for it. At the very least, you are to blame for not doing more. Even if you don’t make the problem worse, even if you make it slightly better, the ethical burden of the problem falls on you as soon as you observe it. In particular, if you interact with a problem and benefit from it, you


I really like this post. I think it points out an important problem with intuitive credit-assignment algorithms which people often use. The incentive toward inaction is a real problem which is often encountered in practice. While I was somewhat aware of the problem before, this post explains it well.

I also think this post is wrong, in a significant way: asymmetric justice is not always a problem and is sometimes exactly what you want. In particular, it's how you want a justice system (in the sense of police, judges, etc) to work.

The book Law's Order explai... (read more)

A few years ago, the rationalsphere was small, and it was hard to get funding to run even one organization. Spinning up a second one with the same focus area might have risked killing the first one.

By now, I think we have the capacity (financial, coordinational and human-talent-wise) that that's less of a risk. Meanwhile, I think there are a number of benefits to having more, better, friendly competition.

Reasons competition seems good

Diversity of worldviews is better.

Two research orgs might develop different schools of thought that lead to different insights. This can lead to more ideas as well as avoiding the tail risks of bias and groupthink.

Easier criticism.

When there's only one org doing A Thing, criticizing that org feels sort of like criticizing That Thing. And...

2 Raemon 7d: I ended up chatting with Habryka (who had originally inspired this post, and now wasn't sure he agreed with it). One key additional point here is "gee, doing anything at all is really goddamn hard, and you might not want people to feel any additional disincentive from doing anything, including building monopolies." There's the frustrating "consider reversing all advice you hear, because you maybe filter yourself to hear advice that reinforces your own biases" thingy. I think I endorse the specific phrasings used in this post (which I think were properly caveated). But, I wouldn't want the takeaway to be people being overly worried about ensuring they have competitors when they themselves haven't gotten off the ground. There's an additional confusing issue where... 1. it's really bad if a mediocre organization is a monopoly, or if organizations stretch beyond their capacity. 2. it... actually can be good if a highly competent organization does a lot of vertical integration. And the problem is that figuring out "am I competent or not?" is one of the hardest things you can try to figure out.

Note: I don't think objectively figuring out "am I competent or not?" is that hard of a question. It's just one that the people who are incompetent will very likely get wrong in a highly predictable direction, so building norms that start with "if you think you are competent do X, if you don't do Y" are hard to make work. 

Followup/distillation/alternate-take on Duncan Sabien's Dragon Army Retrospective and Open Problems in Group Rationality.

There's a particular failure mode I've witnessed, and fallen into myself:

I see a problem. I see, what seems to me, to be an obvious solution to the problem. If only everyone Took Action X, we could Fix Problem Z. So I start X-ing, and maybe talking about how other people should start X-ing. Action X takes some effort on my part but it's obviously worth it.

And yet... nobody does. Or not enough people do. And a few months later, here I'm still taking Action X and feeling burned and frustrated.

Or –

– the problem is that everyone is taking Action Y, which directly causes Problem Z. If only everyone would stop Y-ing, Problem Z would go...

Self Review.

I still endorse the broad thrusts of this post. But I think it should change at least somewhat. I'm not sure how extensively, but here are some considerations:

Clearer distinctions between Prisoner's Dilemma and Stag Hunts

I should be clearer about what game-theoretic distinctions I'm actually making between Prisoner's Dilemma and Stag Hunt. I think Rob Bensinger rightly criticized the current wording, which equivocates between "stag hunting is meaningfully different" and "'hunting rabbit' has nicer aesthetic properties than 'defect'"... (read more)
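One standard textbook way to state the difference between the two games is through their pure-strategy Nash equilibria. The sketch below uses hypothetical payoff numbers chosen only to have the usual orderings, and it is not necessarily the distinction the author has in mind. With these payoffs, mutual defection is the only equilibrium of the Prisoner's Dilemma, while the Stag Hunt has two: everyone hunts stag, or everyone hunts rabbit.

```python
from itertools import product

# payoffs[(row_action, col_action)] = (row_payoff, col_payoff)
# All numbers are hypothetical illustrations.
PRISONERS_DILEMMA = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}
STAG_HUNT = {
    ("stag", "stag"): (4, 4),
    ("stag", "rabbit"): (0, 3),
    ("rabbit", "stag"): (3, 0),
    ("rabbit", "rabbit"): (3, 3),
}


def pure_nash_equilibria(payoffs):
    """Return action pairs where neither player gains by deviating alone."""
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        # Row player cannot do better by switching rows, column held fixed.
        row_ok = all(payoffs[(r, c)][0] >= payoffs[(alt, c)][0] for alt in rows)
        # Column player cannot do better by switching columns, row held fixed.
        col_ok = all(payoffs[(r, c)][1] >= payoffs[(r, alt)][1] for alt in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


print("Prisoner's Dilemma:", pure_nash_equilibria(PRISONERS_DILEMMA))
print("Stag Hunt:", pure_nash_equilibria(STAG_HUNT))
```

In the Prisoner's Dilemma, defecting is better whatever the other player does. In the Stag Hunt, hunting stag is the best response when the other player also hunts stag, so the problem is one of coordination and trust.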

This post grew out of a conversation with Laurent Orseau; we were initially going to write a paper for a consciousness/philosophy journal of some sort, but that now seems unlikely, so I thought I'd post the key ideas here.

A summary of this post can be found here - it even has some diagrams.

The central idea is that, by thinking in terms of an AI or similar artificial agent, we can get some interesting solutions to old philosophical problems, such as the Mary's room/knowledge problem. In essence, simple agents exhibit similar features to Mary in the thought experiments, so (most) explanations of Mary's experience must also apply to simple artificial agents.

To summarise:

  • Artificial agents can treat certain inputs as if the input were different from mere information.
  • This analogises loosely to how
2 Stuart_Armstrong 9d: Added a summary post here.
4 Chris_Leong 9d Review: I've already written a comment with a suggestion that this post needs a summary so that you can benefit from it, even if you don't feel like wading through a bunch of technical material.
4 Stuart_Armstrong 9d: Added a summary post here.

Summary: I think it’s important for surveys about the future of technology or society to check how people's predictions of the future depend on their beliefs about what actions or responsibilities they and others will take on. Moreover, surveys should also help people to calibrate their beliefs about those responsibilities by collecting feedback from the participants about their individual plans. Successive surveys could help improve the group's calibration as people update their responsibilities upon hearing from each other. Further down, I’ll argue that not doing this — i.e. surveying only for predictions but not responsibilities — might even be actively harmful.

An example

Here's an example of the type of survey question combination I'm advocating for, in the case of a survey to AI researchers about...

I agree it would be good to add a note about push polling, but it's also good to note that the absence of information is itself a choice! The most spare possible survey is not necessarily the most informative. The question of what is a neutral framing is a tricky one, and a question about the future that deliberately does not draw attention to responsibilities is not necessarily less push-poll-y than one that does.

I’ve recently been spending some time thinking about the rationality mistakes I’ve made in the past. Here’s an interesting one: I think I have historically been too hasty to go from “other people seem very wrong on this topic” to “I am right on this topic”.

Throughout my life, I’ve often thought that other people had beliefs that were really repugnant and stupid. Now that I am older and wiser, I still think I was correct to think that these ideas were repugnant and stupid. Overall I was probably slightly insufficiently dismissive of things like the opinions of apparent domain experts and the opinions of people who seemed smart whose arguments I couldn’t really follow. I also overrated conventional wisdom about factual claims about how the world worked,


One good idea to take out of this is that other people's ability to articulate their reasons for their belief can be weak—weak enough that it can distract from the strength of evidence for the actual belief. (More people can catch a ball than explain why it follows the arc that it does).

This post covers the set-up and results from our exploration in amplifying generalist research using predictions, in detail. It is accompanied by a second post with a high-level description of the results, and more detailed models of impact and challenges. For an introduction to the project, see that post.


The rest of this post is structured as follows.

First, we cover the basic set-up of the exploration.

Second, we share some results, in particular focusing on the accuracy and cost-effectiveness of this method of doing research.

Third, we briefly go through some perspectives on what we were trying to accomplish and why that might be impactful, as well as challenges with this approach. These are covered more in-depth in a separate post.

Overall, we are very interested...

2 Raemon 11d: It's unclear to me whether I should think of the forecasters as more replaceable than Elizabeth. If they're all generalist researchers, having "a bunch of generalist researchers do generalist research for the same amount of time as the original researcher" doesn't seem obviously scalable. (That said, my current belief is that this work was pretty interesting and important overall)

The forecasters were only quite loosely selected for "some forecasting experience". Some of them I know are very able forecasters, others are people much less experienced, and who I don't think are affiliated that much with the rationality or effective altruism communities. 

Previously: More Dakka

Epistemic Status: The Dakka Files

Smartphones are wonderful things.

Get a second one. And get a lot of chargers.

I did this a week ago. My phone (a Google Pixel 2) had some sort of ink leak onto its screen, so I purchased a second one (a Google Pixel 3a) while I attempted to repair the old one. When the repair attempt was successful, but I hadn’t finished transferring some features and data across, I tried carrying around both.

It was clear by the end of the day that the second phone was a huge improvement.

On reflection, this was a basic case of More Dakka. A substantial portion of the people one interacts with worry about running out of phone battery, or running out of phone storage space, or needing to...

Agreeing that just the final paragraph would be a good idea to include; otherwise, I don't think this passes my bar for "worth including as best-of."

6 Gunnar_Zarncke 9d: I just saw this in the 2019 review and was surprised I didn't see it earlier. I've been using two phones for a while now, and it has just been standard advice for executives for ages (2014?). Just google "executive two phones".
4 Raemon 9d: The first google result is called "most executives agree: one phone is enough" (but it has a list of testimonials where at least some do advocate two phones for the reasons Zvi describes).
2 Mati_Roy 9d: Do you still use 2 phones? edit: oops, I saw your answer in your review; nvm