Reviews 2019

Things To Take Away From The Essay

First and foremost: Yudkowsky makes absolutely no mention whatsoever of the VNM utility theorem. This is neither an oversight nor a simplification. The VNM utility theorem is not the primary coherence theorem. It's debatable whether it should be considered a coherence theorem at all.

Far and away the most common mistake when arguing about coherence (at least among a technically-educated audience) is for people who've only heard of VNM to think they know what the debate is about. Looking at the top-voted comments on this ess... (read more)

This review is mostly going to talk about what I think the post does wrong and how to fix it, because the post itself does a good job explaining what it does right. But before we get to that, it's worth saying up-front what the post does well: the post proposes a basically-correct notion of "power" for purposes of instrumental convergence, and then uses it to prove that instrumental convergence is in fact highly probable under a wide range of conditions. On that basis alone, it is an excellent post.

I see two (related) central problems, from which various o... (read more)

1. Manioc poisoning in Africa vs. indigenous Amazonian cultures: a biological explanation?

Note that while Joseph Henrich, the author of TSOOS, correctly points out that cassava poisoning remains a serious public health concern in Africa, he doesn't supply any evidence that it wasn't also a public health issue in Amazonia. One author notes that "none of the disorders which have been associated with high cassava diets in Africa have been found in Tukanoans or other indigenous groups on cassava-based diets in Amazonia."

Is this because Tukanoans have superior p... (read more)

I wrote up a longer, conceptual review. But I also did a brief data collection, which I'll post here as others might like to build on or go through a similar exercise. 

In 2019 YC released a list of their top 100 portfolio companies ranked by valuation and exit size, where applicable.

So I went through the top 50 companies on this list, and gave each company a ranking ranging from -2 for "Very approval-extracting" to 2 for "Very production-oriented".  

To decide on that number, I asked myself questions like "Would growth of this company seem cancero... (read more)

Looking back, I have quite different thoughts on this essay (and the comments) than I did when it was published. Or at least much more legible explanations; the seeds of these thoughts have been around for a while.

On The Essay

The basketballism analogy remains excellent. Yet searching the comments, I'm surprised that nobody ever mentioned the Fosbury Flop or the Three-Year Swim Club. In sports, from time to time somebody comes along with some crazy new technique and shatters all the records.

Comparing rationality practice to sports practice, rationality has ... (read more)

I do not like this post. I think it gets most of its rhetorical oomph from speaking in a very moralizing tone, with effectively no data, and presenting everything in the worst light possible; I also think many of its claims are flat-out false. Let's go through each point in order.

1. You can excuse anything by appealing to The Incentives

No, seriously—anything. Once you start crying that The System is Broken in order to excuse your actions (or inactions), you can absolve yourself of responsibility for all kinds of behaviors that, on paper, should raise red f

... (read more)

(I reviewed this in a top-level post: Review of 'But exactly how complex and fragile?'.)

I've thought about (concepts related to) the fragility of value quite a bit over the last year, and so I returned to Katja Grace's But exactly how complex and fragile? with renewed appreciation (I'd previously commented only a very brief microcosm of this review). I'm glad that Katja wrote this post and I'm glad that everyone commented. I often see private Google docs full of nuanced discussion which will never see the light of day, and that makes me sad, and I'm happy ... (read more)

I strongly oppose collation of this post, despite thinking that it is an extremely well-written summary of an interesting argument on an interesting topic. I do so because I believe it represents a substantial epistemic hazard, owing to the way it was written and the source material it comes from. I think this is particularly harmful because both justifications for nominations amount to "this post was key in allowing percolation of a new thesis unaligned with the goals of the community into community knowledge," which is a justificatio... (read more)

Connection to Alignment

One of the main arguments in AI risk goes something like:

  • AI is likely to be a utility maximizer (or goal-directed in some other sense)
  • Goodhart, instrumental convergence, etc make powerful goal-directed agents dangerous by default

One common answer to this is "ok, how about we make AI which isn't goal-directed"?

Unconscious Economics says: selection effects will often create the same effect as goal-directedness, even if we're trying to build a non-goal-directed AI.

Discussions around CAIS are one obvious application. Paul's "you get what... (read more)

In “Why Read The Classics?”, Italo Calvino proposes many different definitions of a classic work of literature, including this one:

A classic is a book which has never exhausted all it has to say to its readers.

For me, this captures what makes this sequence and corresponding paper a classic in the AI Alignment literature: it keeps on giving, readthrough after readthrough. That doesn’t mean I agree with everything in it, or that I don’t think it could have been improved in terms of structure. But when pushed to reread it, I found again and again that I had m... (read more)

I think this post, as promised in the epistemic status, errs on the side of simplistic poetry. I see its core contribution as saying that the more people you want to communicate to, the less you can communicate to them, because the marginal people aren't willing to put in work to understand you, and because it's harder to talk to marginal people who are far away and can't ask clarifying questions or see your facial expressions or hear your tone of voice. The numbers attached (e.g. 'five' and 'thousands of people') seem to not be super precise.

That being sa... (read more)

I <3 Specificity

For years, I've been aware of myself "activating my specificity powers" multiple times per day, but it's kind of a lonely power to have. "I'm going to swivel my brain around and ride it in the general→specific direction. Care to join me?" is not something you can say in most group settings. It's hard to explain to people that I'm not just asking them to be specific right now, in this one context. I wish I could make them see that specificity is just this massively under-appreciated cross-domain power. That's why I wanted this sequence to... (read more)

It was interesting to re-read this article 2 years later.  It reminds me that I am generally working with a unique subset of the population, which is not fully representative of human psychology.  That being said, I believe this article is misleading in important ways, which should be clarified.  The article focused too much on class, and it is hard to see it as anything but classist. While I wrote an addendum at the end, this really should have been incorporated into the entire article and not tacked on, as the conclusions one would re... (read more)

Selection vs Control is a distinction I always point to when discussing optimization. Yet these are not the two takes on optimization I generally use. My favored ones are internal optimization (which is basically search/selection), and external optimization (optimizing systems from Alex Flint’s The ground of optimization). So I do without control, or at least without Abram’s exact definition of control.

Why? Simply because the internal structure vs behavior distinction mentioned in this post seems more important than the actual definitions (which seem constra... (read more)

ETA 1/12: This review is critical and at times harsh, not because I want to harshly criticize the post or the author, but because I did not consider harshness of criticism when writing. I still think the post is positive-net-value, and might even vote it up in the review. I especially want to emphasize that I do not think it is in any way useful to blame or punish the author for the things I complain about below; this is intended as a "pointing out a problematic habit which a lot of people have and society often encourages" criticism, not a "bad thing must... (read more)

I really like this post. I think it points out an important problem with intuitive credit-assignment algorithms which people often use. The incentive toward inaction is a real problem which is often encountered in practice. While I was somewhat aware of the problem before, this post explains it well.

I also think this post is wrong, in a significant way: asymmetric justice is not always a problem and is sometimes exactly what you want. In particular, it's how you want a justice system (in the sense of police, judges, etc) to work.

The book Law's Order explai... (read more)

“Phase change in 1960’s” - first claim is California’s prison pop went from 5k to 25k. According to Wikipedia this does seem to happen… but then it’s immediately followed by a drop in prison population between 1970 and 1980. It also looks like the growth is pretty stable starting in the 1940s.

According to this prison pop in California was a bit higher than 5k historically, 6k-8k, and started growing in 1945 by about 1k/year fairly consistently until 1963. It was then fairly steady, even dropping a bit, until 1982 when it REALLY exploded, more than doubling... (read more)

What's the type signature of goals?

The type signature of goals is the overarching topic to which this post contributes. It can manifest in a lot of different ways in specific applications:

  • What's the type signature of human values?
  • What structure types should systems biologists or microscope AI researchers look for in supposedly-goal-oriented biological or ML systems?
  • Will AI be "goal-oriented", and what would be the type signature of its "goal"?

If we want to "align AI with human values", build ML interpretability tools, etc, then that's going to be pretty to... (read more)

This post states the problem of gradient hacking. It is valuable in that this problem is far from obvious, and if plausible, very dangerous. On the other hand, the presentation doesn’t go into enough detail, and so leaves gradient hacking open to attacks and confusion. Thus instead of just reviewing this post, I would like to clarify certain points, while interweaving my criticisms of the way gradient hacking was initially stated, and explaining why I consider this problem so important.

(Caveat: I’m not pretending that any of my objections are unknown to E... (read more)

The only way to get information from a query is to be willing to (actually) accept different answers. Otherwise, conservation of expected evidence kicks in. This is the best encapsulation of this point, by far, that I know about, in terms of helping me/others quickly/deeply grok it. Seems essential.
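
A minimal worked statement of the conservation-of-expected-evidence point above. The notation is an assumption of this sketch rather than something taken from the post: H is the hypothesis, and E and ¬E are the two possible answers to the query.

```latex
% Law of total probability: the prior is the expectation of the posterior.
P(H) = P(H \mid E)\,P(E) + P(H \mid \lnot E)\,P(\lnot E)

% If you would hold the same belief whichever answer comes back,
% i.e. P(H \mid E) = P(H \mid \lnot E) = q, then
P(H) = q\,\bigl(P(E) + P(\lnot E)\bigr) = q

% so the posterior equals the prior and the query moved nothing.
```

Read this way, a query can only shift your belief if the two possible answers would leave you with different posteriors, one above the prior and one below it.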

Reading this again, the thing I notice most is that I generally think of this point as being mostly about situations like the third one, but most of the post's examples are instead about internal epistemic situations, where someone can't confidently conclude or ... (read more)

The discussion around It's Not the Incentives, It's You was pretty gnarly. I think at the time there were some concrete, simple mistakes I was making. I also think there were 4-6 major cruxes of disagreement between me and some other LessWrongers. The 2019 Review seemed like a good time to take stock of that.

I've spent around 12 hours talking with a couple people who thought I was mistaken and/or harmful last time, and then 5-10 writing this up. And I don't feel anywhere near done, but I'm reaching the end of the timebox so here goes.

Core Claims

I think th... (read more)

I wrote this post about a year ago.  It now strikes me as an interesting mixture of

  1. Ideas I still believe are true and important, and which are (still) not talked about enough
  2. Ideas that were plausible at the time, but are much less so now
  3. Claims I made for their aesthetic/emotional appeal, even though I did not fully believe them at the time

In category 1 (true, important, not talked about enough):

  • GPT-2 is a source of valuable evidence about linguistics, because it demonstrates various forms of linguistic competence that previously were only demonstrated
... (read more)

I think this post is incredibly useful as a concrete example of the challenges of seemingly benign powerful AI, and makes a compelling case for serious AI safety research being a prerequisite to any safe further AI development. I strongly dislike part 9, as painting the Predict-o-matic as consciously influencing others' personalities at the expense of short-term prediction error seems contradictory to the point of the rest of the story. I suspect I would dislike part 9 significantly less if it was framed in terms of a strategy to maximize predictive accuracy.... (read more)

Rereading this post, I'm a bit struck by how much effort I put into explaining my history with the underlying ideas, and motivating that this specifically is cool. I think this made sense as a rhetorical move--I'm hoping that a skeptical audience will follow me into territory labeled 'woo' so that they can see the parts of it that are real--and also as a pedagogical move (proofs may be easy to verify, but all of the interesting content of how they actually discovered that line of thought in concept space has been cleaned away; in this post, rather than hid... (read more)

The parent-child model is my cornerstone of healthy emotional processing. I'd like to add that a child often doesn't need much more than your attention. This is one analogy of why meditation works: you just sit down for a while and you just listen.

The monks in my local monastery often quip about "sitting in a cave for 30 years", which is their suggested treatment for someone who is particularly deluded. This implies a model of emotional processing which I cannot stress enough: you can only get in the way. Take all distractions away from someone and t... (read more)

Self Review.

I still endorse the broad thrusts of this post. But I think it should change at least somewhat. I'm not sure how extensively, but here are some considerations.

Clearer distinctions between Prisoner's Dilemma and Stag Hunts

I should be clearer about what game-theoretic distinctions I'm actually making between Prisoner's Dilemma and Stag Hunt. I think Rob Bensinger rightly criticized the current wording, which equivocates between "stag hunting is meaningfully different" and "'hunting rabbit' has nicer aesthetic properties than 'defect'". ... (read more)

I revisited this post a few months ago, after Vaniver's review of Atlas Shrugged.

I've felt for a while that Atlas Shrugged has some really obvious easy-to-articulate problems, but also offers a lot of value in a much-harder-to-articulate way. After chewing on it for a while, I think the value of Atlas Shrugged is that it takes some facts about how incentives and economics and certain worldviews have historically played out, and propagates those facts into an aesthetic. (Specifically, the facts which drove Rand's aesthetics presumably came from growing up i... (read more)

This post seems excellent overall, and makes several arguments that I think represent the best of LessWrong self-reflection about rationality. It also spurred an interesting ongoing conversation about what integrity means, and how it interacts with updating.

The first part of the post is dedicated to discussions of misaligned incentives, and makes the claim that poorly aligned incentives are primarily to blame for irrational or incorrect decisions. I’m a little bit confused about this, specifically that nobody has pointed out the obvious corollary: the peop... (read more)

There are two aspects of this post worth reviewing: as an experiment in a different mode of discourse, and as a description of the procession of simulacra, a schema originally advanced by Baudrillard.

As an experiment in a different mode of discourse, I think this was a success on its own terms, and a challenge to the idea that we should be looking for the best blog posts rather than the behavior patterns that lead to the best overall discourse.

The development of the concept occurred over email quite naturally without forceful effort. I would have written ... (read more)

The material here is one seed of a worldview which I've updated toward a lot more over the past year. Some other posts which involve the theme include Science in a High Dimensional World, What is Abstraction?, Alignment by Default, and the companion post to this one Book Review: Design Principles of Biological Circuits.

Two ideas unify all of these:

  1. Our universe has a simplifying structure: it abstracts well, implying a particular kind of modularity.
  2. Goal-oriented systems in our universe tend to evolve a modular structure which reflects the structure of the u
... (read more)

I notice I am confused.

I feel as though these types of posts add relatively little value to LessWrong; however, this post has quite a few upvotes. I don’t think novelty is a prerequisite for a high-quality post, but I feel as though this post was both not novel and not relevant, which worries me. I think that most of the information presented in this article is (a) not actionable, (b) not related to LessWrong, and (c) easily replaceable with a Wikipedia or similar search. This would be my totally spitballed test for a topical post: at least one of these 3 must... (read more)

I've stepped back from thinking about ML and alignment the last few years, so I don't know how this fits into the discourse about it, but I felt like I got important insight here and I'd be excited to include this. The key concept that bigger models can be simpler seems very important. 

In my words, I'd say that when you don't have enough knobs, you're forced to find ways for each knob to serve multiple purposes slash combine multiple things, which is messy and complex and can be highly arbitrary, whereas with lots of knobs you can do 'the thing you na... (read more)

There is a joke about programmers, that I picked up long ago, I don't remember where, that says: A good programmer will do hours of work to automate away minutes of drudgery. Some time last month, that joke came into my head, and I thought: yes of course, a programmer should do that, since most of the hours spent automating are building capital, not necessarily in direct drudgery-prevention but in learning how to automate in this domain.

I did not think of this post, when I had that thought. But I also don't think I would've noticed, if that joke had crosse... (read more)

The notion of specificity may be useful, but to me its presentation in terms of tone (beginning with the title "The Power to Demolish Bad Arguments") and examples seemed rather antithetical to the Less Wrong philosophy of truth-seeking.

For instance, I read the "Uber exploits its drivers" example discussion as follows: the author already disagrees with the claim as their bottom line, then tries to win the discussion by picking their counterpart's arguments apart, all the while insulting this fictitious person with asides like "By sloshing around his mental ... (read more)

I've alluded to this in other comments, but I think worth spelling out more comprehensively here.

I think this post makes a few main points:

  1. Categories are not arbitrary. You might need different categories for different purposes, but categories are for helping you think about the things you care about, and a category that doesn't correspond to the territory will be less helpful for thinking and communicating.
  2. Some categories might sort of look like they correspond to something in reality, but they are gerrymandered in a way optimized for deception. 
  3. You
... (read more)

This came out in April 2019, and bore a lot of fruit especially in 2020. Without it, I wouldn't have thought about the simulacra concept and developed the ideas, and without those ideas, I don't think I would have made anything like as much progress understanding 2020 and its events, or how things work in general. 

I don't think this was an ideal introduction to the topic, but it was highly motivating regarding the topic, and also it's a very hard topic to introduce or grok, and this was the first attempt that allowed later attempts. I think we should reward all of that.

This is a self-review, looking back at the post after 13 months.

I have made a few edits to the post, including three major changes:
  1. Sharpening my definition of what counts as "Rationalist self-improvement" to reduce confusion. This post is about improved epistemics leading to improved life outcomes, which I don't want to conflate with some CFAR techniques that are basically therapy packaged for skeptical nerds.
  2. Addressing Scott's "counterargument from market efficiency" that we shouldn't expect to invent easy self-improvement techniques that haven't be... (read more)

This is my post. It is fundamentally a summary of an overview paper, which I wrote to introduce the concept to the community, and I think it works for that purpose. In terms of improvements there are a few I would make; I would perhaps include the details about why people choose megaprojects as a venue, for completeness' sake.  It might have helped if I provided more examples in the post to motivate engagement; these are projects like powerplants, chip fabs, oil rigs and airplanes, or in other words the fundamental blocks of modern civilization.

I cont... (read more)

This is a cogent, if sparse, high-level analysis of the epistemic distortions around megaprojects in AI and other fields.

It points out that projects like the human brain project and the fifth generation computer systems project made massive promises, raised around a billion dollars, and totally flopped. I don't expect this was a simple error; I expect there were indeed systematic epistemic distortions involved, perpetuated at all levels.

It points out that similar scale projects are being evaluated today involving various major AI companies globally, and po... (read more)

(Self-review.) I've edited the post to include the calculation as footnote 10.

The post doesn't emphasize this angle, but this is also more-or-less my abstract story for the classic puzzle of why disagreement is so prevalent, which, from a Bayesian-wannabe rather than a human perspective, should be shocking: there's only one reality, so honest people should get the same answers. How can it simultaneously be the case that disagreement is ubiquitous, but people usually aren't outright lying? Explanation: the "dishonesty" is mostly in the form... (read more)

I think I agree with the thrust of this, but I think the comment section raises caveats that seem important. Scott's acknowledged that there's danger in this, and I hope an updated version would put that in the post.

But also...

Steven Pinker is a black box who occasionally spits out ideas, opinions, and arguments for you to evaluate. If some of them are arguments you wouldn’t have come up with on your own, then he’s doing you a service. If 50% of them are false, then the best-case scenario is that they’re moronically, obviously false, so that you can reject

... (read more)

One factor no one mentions here is the changing nature of our ability to coordinate at all. If our ability to coordinate in general is breaking down rapidly, which seems at least highly plausible, then that will likely carry over to AGI, and until that reverses it will continuously make coordination on AGI harder, same as everything else.

In general, this post and the answers felt strangely non-"messy" in that sense, although there's also something to be said for the abstract view. 

In terms of inclusion, I think it's a question that deserves more thought, but I didn't feel like the answers here (in OP and below) were enlightening enough to merit inclusion. 

I chose this particular post to review because I think it does a great job of highlighting some of the biases and implicit assumptions that Zack makes throughout the rest of the sequence. Therefore this review should be considered not just a review of this post, but also all subsequent posts in Zack's sequence.

Firstly, I think the argument Zack is making here is reasonable. He's saying that if a fact is relevant to an argument it should be welcome, and if it's not relevant to an argument it should not be.

Throughout the rest of the sequence, he continues to ... (read more)

This post introduces a potentially very useful model, both for selecting problems to work on and for prioritizing personal development. This model could be called "The Pareto Frontier of Capability". Simply put:

  1. By an efficient markets-type argument, you shouldn't expect to have any particularly good ways of achieving money/status/whatever - if there was an unusually good way of doing that, somebody else would already be exploiting it.
  2. The exception to this is that if only a small number of people can exploit an opportunity, you may have a shot. So you s
... (read more)

This post makes a straightforward analytic argument clarifying the relationship between reason and experience. The popularity of this post suggests that the ideas of cultural accumulation of knowledge, and the power of reason, have been politicized into a specious Hegelian opposition to each other. But for the most part neither Baconian science nor mathematics (except for the occasional Ramanujan) works as a human institution except by the accumulation of knowledge over time.

A good follow-up post would connect this to the ways in which modernist ideology p... (read more)

If this post is selected, I'd like to see the followup made into an addendum—I think it adds a very important piece, and it should have been nominated itself.

Self-review: Looking back, this post is one of the first sightings of a simple, very useful concrete suggestion to have chargers ready to go literally everywhere you might want them, and that is a remarkably large life improvement that got through to many people and that I'm very happy I realized.

However, that could easily be more than all of this post's value, because essentially no one embraced the central concept of Dual Wielding the phones themselves. And after a few months, I stopped doing so as well, in favor of not getting confused about which p... (read more)

This post surprised me a lot. It still surprises me a lot, actually. I've also linked it a lot of times in the past year. 

The concrete context where this post has come up is in things like ML transparency research, as well as lots of theories about what promising approaches to AGI capabilities research are. In particular, there is a frequently recurring question of the type "to what degree do optimization processes like evolution and stochastic gradient descent give rise to understandable modular algorithms?". 

I'm trying out making some polls about posts for the Review (using the predictions feature). You can answer by hovering over the scale and clicking a number to indicate your agreement with the claim. 

Making more land out of the about 50mi^2 shallow water in the San Francisco Bay, South of the Dumbarton Bridge, would... 

... (read more)

This seems to me like a valuable post, both on the object level, and as a particularly emblematic example of a category ("Just-so-story debunkers") that would be good to broadly encourage.

The tradeoff view of manioc production is an excellent insight, and is an important objection to encourage: the original post and book (haven't read in the entirety) appear to have leaned too heavily on what might be described as a special case of a just-so story: the phenomenon, a behavior difference, is explained as an absolute by using a post-hoc framework, and then doe... (read more)

In a field like alignment or embedded agency, it's useful to keep a list of one or two dozen ideas which seem like they should fit neatly into a full theory, although it's not yet clear how. When working on a theoretical framework, you regularly revisit each of those ideas, and think about how it fits in. Every once in a while, a piece will click, and another large chunk of the puzzle will come together.

Selection vs control is one of those ideas. It seems like it should fit neatly into a full theory, but it's not yet clear what that will look like. I revis... (read more)

To effectively extend on Raemon's commentary:

I think this post is quite good, overall, and adequately elaborates on the disadvantages and insufficiencies of the Wizard's Code of Honesty beyond the irritatingly pedantic idiomatic example. However, I find the implicit thesis of the post deeply confusing (that EY's post is less "broadly useful" than it initially appears). As I understand them, the two posts are saying basically identical things, but are focused in slightly different areas, and draw very different conclusions. EY's notes the issues with the wi... (read more)

Post is very informal. It reads like, well, a personal blog post. A little in the direction of raw freewriting. It's fluid. Easy to read and relate to.

That matters, when you're trying to convey nuanced information about how minds work. Relatable means the reader is making connections with their personal experiences; one of the most powerful ways to check comprehension and increase retention. This post shows a subtle error as it appears from the inside. It doesn't surprise me that this post sparked some rich discussion in the comments.

To be frank, I'd be ve... (read more)

As has been mentioned elsewhere, this is a crushingly well-argued piece of philosophy of language and its relation to reasoning. I will say this post strikes me as somewhat longer than it needs to be, but that's also my opinion on much of the Sequences, so it is at least traditional.

Also, this piece is historically significant because it played a big role in litigating a community social conflict (which is no less important for having been (being?) mostly below the surface), and set the stage for a lot of further discussion. I think it's very important tha... (read more)

I had not read this post until just now. I think it is pretty great.

I also already had a vague belief that I should consume more timeless content. But, now I suddenly have a gearsy model that makes it a lot more intuitive why I might want to consume more timeless content. I also have a schema of how to think about "what is valuable to me?"

I bounced off this post the first couple times because, well, it opens with math and math makes my eyes glaze over and maybe it shouldn't cause that but it is what it is. I suspect it would be worth rewriting this post (or writing an alternate version), that puts the entire model in verbal-english front and center.

Looking back, this all seems mostly correct, but missing a couple of assumed steps.

I've talked to one person since about their mild anxiety talking to certain types of people; I found two additional steps that helped them.

  1. Actually trying to become better
  2. Understanding that their reaction is appropriate for some situations (like the original trauma), but it's overgeneralized to actually safe situations.

These steps are assumed in this post because, in my case, it's obvious I'm overreacting (there's no drone) and I understand PTSD is common and treat... (read more)

This is an excellent post, with a valuable and well-presented message. This review is going to push back a bit, talk about some ways that the post falls short, with the understanding that it's still a great post.

There's this video of a toddler throwing a tantrum. Whenever the mother (holding the camera) is visible, the child rolls on the floor and loudly cries. But when the mother walks out of sight, the toddler soon stops crying, gets up, and goes in search of the mother. Once the toddler sees the mother again, it's back to rolling on the floor crying.

A k... (read more)

For the Review, I'm experimenting with using the predictions feature to poll users for their opinions about claims made in posts. 

The first two cite Scott almost verbatim, but for the third I tried to specify further.

Feel free to add your predictions above, and let me know if you have any questions about the experienc... (read more)

As mentioned in my comment, this book review overcame some skepticism from me and explained a new mental model about how inner conflict works. Plus, it was written with Kaj's usual clarity and humility. Recommended.

This review is more broadly of the first several posts of the sequence, and discusses the entire sequence. 

Epistemic Status: The thesis of this review feels highly unoriginal, but I can't find where anyone else discusses it. I'm also very worried about proving too much. At minimum, I think this is an interesting exploration of some abstract ideas. Considering posting as a top-level post. I DO NOT ENDORSE THE POSITION IMPLIED BY THIS REVIEW (that leaving immoral mazes is bad), AND AM FAIRLY SURE I'M INCORRECT.

The rough thesis of "Meditations on Moloch"... (read more)

Biorisk - well, wouldn't it be nice if we'd all been familiar with the main principles of biorisk before 2020? I certainly regretted sticking my head in the sand.

> If concerned, intelligent people cannot articulate their reasons for censorship, cannot coordinate around principles of information management, then that itself is a cause for concern. Discussions may simply move to unregulated forums, and dangerous ideas will propagate through well intentioned ignorance.

Well. It certainly sounds prescient in hindsight, doesn't it?

Infohazards in particular cro... (read more)

One year later, I remain excited about this post, from its ideas, to its formalisms, to its implications. I think it helps us formally understand part of the difficulty of the alignment problem. This formalization of power and the Attainable Utility Landscape have together given me a novel frame for understanding alignment and corrigibility.

Since last December, I’ve spent several hundred hours expanding the formal results and rewriting the paper; I’ve generalized the theorems, added rigor, and taken great pains to spell out what the theorems do and do not ... (read more)

This sort of thing is exactly what Less Wrong is supposed to produce. It's a simple, straightforward and generally correct argument, with important consequences for the world, which other people mostly aren't making. That LW can produce posts like this—especially with positive reception and useful discussion—is a vindication of this community's style of thought.

It seems like the core thing that this post is doing is treating the concept of "rule" as fundamental. 

If you have a general rule plus some exceptions, then obviously that "general rule" isn't the real process that is determining the results. And noticing that (obvious once you look at it) fact can be a useful insight/reframing.

The core claim that this post is putting forward, IMO, is that you should think of that "real process" as being a rule, and aim to give it the virtues of good rules such as being simple, explicit, stable, and legitimate (having... (read more)

This post is even-handed and well-reasoned, and explains the issues involved well. The strategy-stealing assumption seems important, as a lot of predictions are inherently relying on it either being essentially true, or effectively false, and I think the assumption will often effectively be a crux in those disagreements, for reasons the post illustrates well.

The weird thing is that Paul ends the post saying he thinks the assumption is mostly true, whereas I thought the post was persuasive that the assumption is mostly false. The post illustrates that the u... (read more)

The author does a good job articulating his views on why Buddhist concentration and insight practices can lead to psychological benefits. As somebody who has spent years on these practices and engaging with various types of (Western) discourse about them, I find the author's psychological claims plausible to a point. He does not offer a compelling mechanism for why introspective awareness of sankharas should lead to diminishing them. He also offers no account for why if insight does dissolve psychological patterns, it would preferentially dissolve ne... (read more)

This post seems to me to be misunderstanding a major piece of Paul's "sluggish updating" post, and clashing with Paul's post in ways that aren't explicit.

The core of Paul's post, as I understood it, is that incentive landscapes often reward people for changing their stated views too gradually in response to new arguments/evidence, and Paul thinks he has often observed this behavioral pattern which he called "sluggish updating." Paul illustrated this incentive landscape through a story involving Alice and Bob, where Bob is thinking through his optimal strat... (read more)

This post is hard enough to get through that the original person who nominated it didn't make it, and also I tried and gave up in order to look at more other things instead. I agree that it's possible there is something here, but we didn't build upon it, and if we put it in the book people are going to be confused as to what the hell is going on. I don't think we should include. 

This feels like an extremely important point. A huge number of arguments devolve into exactly this dynamic because each side only feels one of (the Rock|the Hard Place) as a viscerally real threat, while agreeing that the other is intellectually possible. 

Figuring out that many, if not most, life decisions are "damned if you do, damned if you don't" was an extremely important tool for me to let go of big, arbitrary psychological attachments which I initially developed out of fear of one nasty outcome.

I love this post, it's a really healthy way of exploring assumptions about one's goals and subagents. I think it's really hard to come up with simple diagrams that communicate key info, and I am impressed by choices such as changing the color of the path over time. I also find it insightful in matters relating to what a distracted agent looks like, or how adding subgoals can improve things.

It's the sort of thing I'd like to see more rationalists doing, and it's a great read, and I feel very excited about more of this sort of work on LessWrong. I hope it inspires more LessWrongers to build on it. I expect to vote it at somewhere between +5 and +7.

I specifically endorse the "literally just include the paragraph about buying lots of chargers" idea that Zvi suggested.

Echoing previous reviews (it's weird to me the site still suggested this to review anyway, seems like it was covered already?) I would strongly advise against including this. While it has a useful central point - that specificity is important and you should look for and request it - I agree with other reviewers that the style here is very much the set of things LW shouldn't be about, and LWers shouldn't be about, but that others think LW-style people are about, and it's structuring all these discussions as if arguments are soldiers and the goal is to win w... (read more)

Partial Self Review:

There's an obvious set of followup work to be done here, which is to ask "Okay, this post was vague poetry meant to roughly illustrate a point. But, how many words do you actually precisely have?" What are the in-depth models that let you predict precisely how much nuance you have to work with?

Less obvious to me is whether this post should become a longer, more rigorous post, or whether it should stay its short, poetic self, and have those questions get explored in a different post with different goals.

Also less obvious to me is ... (read more)

This was a profoundly impactful post and definitely belongs in the review. It prompted me and many others to dive deep into understanding how emotional learnings have coherence and to actually engage in dialogue with them rather than insisting they don't make sense. I've linked this post to people more than probably any other LessWrong post (50-100 times) as it is an excellent summary and introduction to the topic. It works well as a teaser for the full book as well as a standalone resource.

The post makes both conceptual and pragmatic claims. I haven't exa... (read more)

How do you review a post that was not written for you? I’m already doing research in AI Alignment, and I don’t plan on creating a group of collaborators for the moment. Still, I found some parts of this useful.

Maybe that’s how you do it: by taking different profiles, and running through the most useful advice for each profile from the post. Let’s do that.

Full time researcher (no team or MIRIx chapter)

For this profile (which is mine, by the way), the most useful piece of advice from this post comes from the model of transmitters and receivers. I’m convinced... (read more)

This post proposes 4 ideas to help build gears-level models from papers that already passed the standard epistemic check (statistics, incentives):

  • Look for papers which are very specific and technical, to limit the incentives to overemphasize results and present them in a “saving the world” light.
  • Focus on data instead of on interpretations.
  • Read papers on different aspects of the same question/gear
  • Look for mediating variables/gears to explain multiple results at once

(The second section, “Zombie Theories”, sounds more like epistemic check than gears-level ... (read more)

Concise. The post briefly sums up the fields and directions where rationality has been developed on the site, then asks users to list the big open questions that are still left to answer.

  • The post is mostly useful to 1) people wishing to continue their training in rationality after they went through the recommendations and are looking for what they should do next and 2) continue the conversation on how to improve rationality systematically. The post itself lists a few of the fields that have been developed and are being developed, in the answers there
... (read more)

I want to have this post in a physical book so that I can easily reference it.

It might actually work better as a standalone pamphlet, though. 

After reading this, I went back and also re-read Gears in Understanding (https://www.lesswrong.com/posts/B7P97C27rvHPz3s9B/gears-in-understanding) which this is clearly working from. The key question to me was, is this a better explanation for some class of people? If so, it's quite valuable, since gears are a vital concept. If not, then it has to introduce something new in a way that I don't see here, or it's not worth including.

It's not easy to put myself in the mind of someone who doesn't know about gears. 

I think the original Gears in Understandin... (read more)

This is a very important point to have intuitively integrated into one's model, and I charge a huge premium to activities that require this kind of reliability. I hope it makes the cut.

I also note that someone needs to write The Costs of Unreliability and I authorize reminding me in 3 months that I need to do this.

Echoing Raemon that this has become one of my standard reference points and I anticipate linking to this periodically for a long time. I think it's important. 

I'm also tagging this as something I should build upon explicitly some time soon, when I have the bandwidth for that, and I'm tagging Ben/Raemon to remind me of this in 6 months if I haven't done so yet, whether or not it makes the collection.

These issues are key ones to get right, involve difficult trade-offs, and didn't have a good descriptor that I know about until this post. 

Consider this as two posts.

The first post is Basketballism. That post is awesome. Loved it. 

The second post is the rest of the post. That post tries to answer the question in the title, but doesn't feel like it makes much progress to me. There's some good discussion that goes back and forth, but mostly everyone agrees on what should be clear to all: No, rationalism doesn't let you work miracles at will, and we're not obviously transforming the world or getting key questions reliably right. Yes, it seems to be helpful, and generally the people who do i... (read more)

I like this post a lot.

I'm noticing an unspoken assumption: that Amish culture hasn't changed "much" since the 1800s. If that's not the case... it's not that anything here would necessarily be false, but it would be an important omission.

Like, taking this post as something that it's not-quite but also not-really-not, it uses the Amish as an example in support of a thesis: "cultural engineering is possible". You can, as a society, decide where you want your society to go and then go there. The Amish are an existence proof, and Ray bounces from them to askin... (read more)

I stand by this piece, and I now think it makes a nice complement to discussions of GPT-3. In both cases, we have significant improvements in chunking of concepts into latent spaces, but we don't appear to have anything like a causal model in either. And I've believed for several years that causal reasoning is the thing that puts us in the endgame.

(That's not to say either system would still be safe if scaled up massively; mesa-optimization would be a reason to worry.)

I trust past-me to have summarized CAIS much better than current-me; back when this post was written I had just finished reading CAIS for the third or fourth time, and I haven't read it since. (This isn't a compliment -- I read it multiple times because I had a lot of trouble understanding it.)

I've put in two points of my own in the post. First:

(My opinion: I think this isn't engaging with the worry with RL agents -- typically, we're worried about the setting where the RL agent is learning or planning at test time, which can happen in learn-to-learn and on

... (read more)

This has been one of the most useful posts on LessWrong in recent years for me personally. I find myself often referring to it, and I think almost everyone underestimates the difficulty gap between critiquing others and proposing their own, correct, ideas.

I believe this is an important gears-level addition to posts like hyperbolic growth, long-term growth as a sequence of exponential modes and an old Yudkowsky post I am unable to find at the moment.

I don't know how closely these texts are connected, but Modeling the Human Trajectory picks up one year later, creating two technical models: one stochastically fitting and extrapolating GDP growth; the other providing a deterministic outlook, considering labor, capital, human capital, technology and production (and, in one case, natural resources). Roodman arriv... (read more)

A good example of crucial arguments, in the wild.

I'm not sure I like it. It looks like a lot of talking past each other. Very casually informative of different perspectives without much direct confrontation. Relatively good for an internet argument, but still not as productive as one might hope for between experts debating a serious topic. I'm glad for the information; I strongly value concretely knowing that sometimes arguments play out like this.

But I still don't like it.

(To be fair, this is comments on a facebook link post. I feel Ben misleads with technical truths when he describes this as an "actual debate" occurring in "a public space".)

The central point here seems strong and important. One can, as Scott notes, take it too far, but mostly yes one should look where there are very interesting things even if the hit rate is not high, and it's important to note that. Given the karma numbers involved and some comments sometimes being included I'd want assurance that we wouldn't include any of that with regard to particular individuals. 

That comment section, though, I believe has done major harm and could keep doing more even in its current state, so I still worry about bringing more focus... (read more)

The problem with evaluating a post like this is that long post is long and slow and methodical, and making points that I (and I'm guessing most others who are doing the review process) already knew even at the time it was written in 2017. So it's hard to know whether the post 'works' at doing the thing it is trying to do, and also hard to know whether it is an efficient means of transmitting that information. 

Why can't the post be much shorter and still get its point across? Would it perhaps even get the point across better if it was much shorter, bec... (read more)

I really dislike the central example used in this post, for reasons explained in this article. I hope it isn't included in the next LW book series without changing to a better example.

So I reread this post, found I hadn't commented... and got a strong desire to write a response post until I realized I'd already written it, and it was even nominated. I'd be fine with including this if my response also gets included, but very worried about including this without the response. 

In particular, I felt the need to emphasize the idea that Stag Hunts frame coordination problems as going against incentive gradients and as being maximally fragile and punishing, by default. 

If even one person doesn't get with the program, for any reason, ... (read more)

I don't know whether it was this post, or maybe just a bunch of things I learned while trying to build LessWrong, but this feels like it has become a pretty important part of my model of how organizations work, and also what kind of things I pay attention to in my personal development. 

Some additional consequences of things that I believe that feel like they extend on this post: 

  • Automating is often valuable because it frequently replaces tasks that were really costly because they had to be executed reliably
  • I am very hesitant to start projects in
... (read more)

THIS. TIMES 1000.

I want more people to know this about the Amish! More people should have the concept of "distributed community with intentional norms that has a good relationship with the government and can mostly run on their own legal system" floating around in their head.

For followups, I'd want to see

  1. discussing the issues without proposing solutions
  2. a review of the history + outcomes of similar attempts to take the benedict option
    • this could turn out to be a terrible idea in practice, and if so I want to know that so I can start harping on about my next
... (read more)

At first when I read this, I strongly agreed with Zack's self-review that this doesn't make sense to include in context, but on reflection and upon re-reading the nominations, I think he's wrong and it would add a lot of value per page to do so, and it should probably be included. 

The false dichotomy this dissolves, where either you have to own all implications, so it's bad to say true things that imply things that are true but focus upon would have unpleasant consequences, or it has to be fine to ignore all the extra communication that's involved in ... (read more)

[NB: this is a review of the paper, which I have recently read, not of the post series, which I have not]

For a while before this paper was published, several people in AI alignment had discussed things like mesa-optimization as serious concerns. That being said, these concerns had not been published in their most convincing form in great detail. The two counterexamples that I’m aware of are the posts What does the universal prior actually look like? by Paul Christiano, and Optimization daemons on Arbital. However, the first post only discussed the issue i... (read more)

  • Olah’s comment indicates that this is indeed a good summary of his views.
  • I think the first three listed benefits are indeed good reasons to work on transparency/interpretability. I am intrigued but less convinced by the prospect of ‘microscope AI’.
    • The ‘catching problems with auditing’ section describes an ‘auditing game’, and says that progress in this game might illustrate progress in using interpretability for alignment. It would be good to learn how much success the auditors have had in this game since the post was published.
    • One test of ‘microscope
... (read more)

I would probably include this post in the review as-is if I had to. However, I would quite prefer the post to change somewhat before putting it in the Best Of Book.

Most importantly, I think, is the title and central handle. It does an important job, but it does not work that well in the wild among people who don't share the concept handle. Several people have suggested alternatives. I don't know if any of them are good enough, but I think now is a good time to reflect on a longterm durable name.

I'd also like to see some more explicit differentiation of "as... (read more)

I stand by my nomination. This was the most serious attempt I am aware of to set up straightforward amplification of someone's reasoning in this way, it was competently executed, the diagrams showing the results are awesome, and I am proud that this sort of work is on LessWrong. It's only a baby step, but I think this step is exciting and I hope it encourages others to run further with it.

I like this post a lot and it feels fairly foundational to me. But... I don't have a strong impression that the people I most wanted to take heed of it really did. 

In my mind this post pairs with my followup "Can you eliminate memetic scarcity instead of fighting?", which also didn't seem to take off as a key tool for conflict resolution.

I feel like there's some core underlying problem where even acknowledging this sort of problem feels a bit like ceding ground, and I don't know what to do about it. To be fair I also think this argument can be used as... (read more)

I think this post and the Gradient Hacking post caused me to actually understand and feel able to productively engage with the idea of inner-optimizers. I think the paper and full sequence were good, but I bounced off of them a few times, and this helped me get traction on the core ideas in the space.

I also think that some parts of this essay hold up better as a core abstraction than the actual mesa-optimizer paper itself, though I am not at all confident about this. But I just noticed that when I am internally thinking through alignment problems relate... (read more)

I got an email from Jacob L. suggesting I review my own post, to add anything that might offer a more current perspective, so here goes...

One thing I've learned since writing this is that counterfactualizing, while it doesn't always cause akrasia, is definitely an important part of how we maintain akrasia: what some people have dubbed "meta-akrasia".

When we counterfactualize that we "should have done" something, we create moral license for our past behavior. But also, when we encounter a problem and think, "I should [future action]", we are often licen... (read more)

This is a more kludgy dense read than some of Kaj's other writing. I think I'm mostly only making sense of it because I'm familiar with similar ideas already. Some of those from Kaj's later posts! I guess I'm not that interested in an overview of a particular book? I can't tell if I read this post before, or if the same points were repeated in other writing. But I'm getting stuck on some clinical wordiness. 

Doesn't seem... foundational? It's a starting-to-build on literature and other posts. I'm not sure how someone else would build on it.

If anything,... (read more)

There are factual claims in this section:

The point is, I know of a few people, acquaintances of mine, who, even when asked to try to find flaws, could not detect anything weird or mistaken in the GPT-2-generated samples.

There are probably a lot of people who would be completely taken in by literal “fake news”, as in, computer-generated fake articles and blog posts. This is pretty alarming. Even more alarming: unless I make a conscious effort to read carefully, I would be one of them.

I'm a little uncertain of how I would test this since it seems ... (read more)

So first off... I'd forgotten this existed. That's obviously a negative indication in terms of how much it guided my thinking over the past two years! It also meant I got to see it with fresh eyes two years later. 

I think the central point the post thinks it is making is that, extending on the original econ paper, search effectiveness can rapidly become impossible to improve by expanding the size of one's search, if those you are searching understand they are in competition. To improve results further, one must instead improve average quality in the searc... (read more)

As someone who was involved in the conversations, and who cares about and focuses on such things frequently, this continues to feel important to me, and seems like one of the best examples of an actual attempt to do the thing being done, which is itself (at least partly) an example of the thing everyone is trying to figure out how to do. 

What I can't tell is whether anyone who wasn't involved is able to extract the value. So in a sense, I "trust the vote" on this so long as people read it first, or at least give it a chance, because if that doesn't convince them it's worthwhile, then it didn't work. Whereas if it does convince them, it's great and we should include it.

I didn't notice until just recently that this post fits into a similar genre as (what I think) the Moral Mazes discussion is pointing at (which may be different from what Zvi thinks).

Where one of the takeaways from Moral Mazes might be: "if you want your company to stay aligned, try not to grow the levels of hierarchy too much, or be extremely careful when you do."

"Don't grow the layers of hierarchy" is (in practice) perhaps a similar injunction to "don't grow the company too much at all" (since you need hierarchy to scale)

Immoral Mazes posits a specific f... (read more)

Self Review. 

I still think this is true, and important. Honestly, I'd like to bid for it being required-reading among org-founders in the rationalsphere (alongside Habryka's Integrity post)

I think healthy competition is particularly important for a (moderately small) constellation of orgs and proto-orgs to have in mind if they are trying to scale up and impact the world at large, while maintaining integrity. (i.e. the rationality/x-risk/EA ecosystem). 

I think this is one of the key answers to "what safeguards do we have against evolving into a mo... (read more)

It's hard to know how to judge a post that deems itself superseded by a post from a later year, but I lean toward taking Daniel at his word and hoping we survive until the 2021 Review comes around.

The content here is very valuable, even if the genre of "I talked a lot with X and here's my articulation of X's model" comes across to me as a weird intellectual ghostwriting. I can't think of a way around that, though.

This was a great read at the time and still holds up. It's one of the rare artifacts that can only be produced after a decade or two, which is an account of major shifts in a person's perspective over the course of a decade or two. (In that way it's similar in genre, for me, to Buck's post in the review.)

It's a very excitingly written history, and gives me insight into the different perspectives on the issue of psycholinguistics, and helps me frame the current situation in AI. I expect to vote on this somewhere between +5 and +7.

Last minute review. Daniel Kokotajlo, the author of this post, has written a review as a separate post, within which he identifies a flawed argument here and recommends against this post's inclusion in the review on that basis.

I disagree with that recommendation. The flaw Daniel identifies and improves does not invalidate the core claim of the post. It does appear to significantly shift the conclusion within the post, but:

  • I still feel that this falls within the scope of the title and purpose of the post.
  • I feel the shifted conclusion falls withi
... (read more)

I really liked this sequence. I agree that specificity is important, and think this sequence does a great job of illustrating many scenarios in which it might be useful.

However, I believe that there are a couple implicit frames that permeate the entire sequence, alongside the call for specificity.  I believe that these frames together can create a "valley of bad rationality" in which calls for specificity can actually make you worse at reasoning than the default.

------------------------------------

The first of these frames is not just that being speci... (read more)

I have now linked at least 10 times to the "'Generate evidence of difficulty' as a research purpose" section of this post. It was a thing that I kind of wanted to point to before this post came out, but felt confused about it, and this post finally gave me a pointer to it.

I think that section was substantially more novel and valuable to me than the rest of this post, but it is also evidence that others might have also not had some of the other ideas on their map, and so they might find it similarly valuable because of a different section.

Writing this post helped clarify my understanding of the concepts in both taxonomies - the different levels of specification and types of Goodhart effects. The parts of the taxonomies that I was not sure how to match up usually corresponded to the concepts I was most confused about. For example, I initially thought that adversarial Goodhart is an emergent specification problem, but upon further reflection this didn't seem right. Looking back, I think I still endorse the mapping described in this post.

I hoped to get more comments on this post... (read more)

  • I think this paper does a good job at collecting papers about double descent into one place where they can be contrasted and discussed.
  • I am not convinced that deep double descent is a pervasive phenomenon in practically-used neural networks, for reasons described in Rohin’s opinion about Preetum et al. This wouldn’t be so bad, except the limitations of the evidence (smaller ResNets than usual, basically goes away without label noise in image classification, some sketchy choices made in the Belkin et al. experiments) are not really addressed or highlight
... (read more)

On reflection, I endorse the conclusion and arguments in this post. I also like that it's short and direct. Stylistically, it argues for a behavior change among LessWrong readers who sometimes make surveys, rather than being targeted at general LessWrong readers. In particular, the post doesn't spend much time or space building interest about surveys or taking a circumspect view of them. For this reason, I might suggest a change to the original post to add something to the top like "Target audience: LessWrong readers who often or occasionally make form... (read more)

Over the last year, I've thought a lot about human/AI power dynamics and influence-seeking behavior. I personally haven't used the strategy-stealing assumption (SSA) in reasoning about alignment, but it seems like a useful concept.

Overall, the post seems good. The analysis is well-reasoned and reasonably well-written, although it's sprinkled with opaque remarks (I marked up a Google doc with more detail). 

If this post is voted in, it might be nice if Paul gave more room to big-picture, broad-strokes "how does SSA tend to fail?" discussion, discussing ... (read more)

The basic claim of this post is that Paul Graham has written clearly and well about unlearning the desire to do perfectly on tests, but that his actions are incongruous, because he has built the organization that most encourages people to do perfectly on tests.

Not that he has done no better – he has done better than most – but that he is advertising himself as doing this, when he has instead probably just made much better tests to win at.

Sam Altman's desire to be a monopoly

On this, the post offers quotes giving evidence that:

  • YC is a gatekeeper to funding
... (read more)

tl;dr: If this post included a section discussing push-poll concerns and advocating (at least) caution and (preferably) a policy that'd be robust against human foibles, I'd be interested in having this post in the 2019 Review Book.

I think this is an interesting idea that should likely get experimented with.

A thing I was worried about when this first came out, and am still worried about, is the blurriness between "survey as tool to gather data" and "survey as tool to cause action in the respondent."

Some commenters said "this seems like push-polling, isn'... (read more)

I don't feel the "stag hunt" example to be a good fit to the situation described, but the post is clear in explaining the problem and suggesting how to adapt to it.

  • The post helps understand in which situations group efforts where everyone has to invest heavy resources aren't likely to work, focusing on the different perspectives and inferential frames people have on the risks/benefits of the situation. The post is a bit lacking on possible strategies to promote stag hunts, but it specified it would focus on the Schelling choice being "rabbit".
  • The suggestio
... (read more)

I found this post important to developing what's proved to be a useful model for me of thinking about neural annealing as a metaphor for how the brain operates in a variety of situations. In particular, I think it makes a lot of sense when thinking about what it is that meditation and psychedelics do to the brain, and consequently helps me think about how to use them as part of Zen practice.

One thing I like about this post is that it makes claims that should be verifiable via brain studies in that we should see things like brain wave patterns that correspo... (read more)

(Self-review.) I oppose including this post in a Best-of-2019 collection. I stand by what I wrote, but, as with "Relevance Norms", this was a "defensive" post; it exists as a reaction to "Meta-Honesty"'s candidacy in the 2018 Review, rather than trying to advance new material on its own terms.

The analogy between patch-resistance in AI alignment and humans finding ways to dodge the spirit of deontological rules is very important, but not enough to carry the entire post.

A standalone canon-potential explanation of why I think we need a broader conception of ... (read more)

I think this post is excellent, and judging by the comments I diverge from other readers in what I liked about it.

In the first, I endorse the seriously-but-not-literally standard for posting concepts. The community - rightly in my view - is under continuous pressure to provide high quality posts, but when the standard gets too high we start to lose introduction of ideas and instead they just languish in the drafts folder, sometimes for years. In order to preserve the start of the intellectual pipeline, posts of this level must continue to be produced.

In th... (read more)

I did not follow the Moral Mazes discussion as it unfolded. I came across this article context-less. So I don't know that it adds much to Lesswrong. If that context is relevant, it should get a summary before diving in. From my perspective, its inclusion in the list was a jump sideways.

It's written engagingly. I feel Yarkoni's anger. Frustration bleeds off the page, and he has clearly gotten on a roll. Not performing moral outrage, just *properly, thoroughly livid* that so much has gone wrong in the science world.

We might need that.

What he wrote does not o... (read more)

This post made me try adding more randomness to my life for a week or so. I learned a small amount. I remain excited about automated tools that help do things like this, e.g. recent work from Ought.

The back-and-forth (here and elsewhere) between Kaj & pjeby was an unusually good, rich, productive discussion, and it would be cool if the book could capture some of that. Not sure how feasible that is, given the sprawling nature of the discussion.

Building off Raemon's review, this feels like it is an attempt to make a 101-style point that everyone needs to understand if they don't already (not as rationalists, but as people in general) but that seems to me like it fails because those reading it will fall into the categories of (1) those who already got it and (2) those who need to get it but won't. 

This is another great response post from Zvi.

It takes a list of issues that Zvi didn't get to cherry-pick, and then proceeds to explain all of them with a couple of core tools: Goodhart's Law, Asymmetric Justice/Copenhagen Interpretation of Ethics, Forbidden Considerations, Power, and Theft. I learned a lot and put a lot of key ideas together in this post. I think it makes a great follow-up read to some of the relevant articles (e.g. Asymmetric Justice, Goodhart Taxonomy, etc).

The only problem is it's very long. 8.5k words. That's about 4% of last year's book... (read more)

I think this post (and similarly, Evan's summary of Chris Olah's views) are essential both in their own right and as mutual foils to MIRI's research agenda. We see related concepts (mesa-optimization originally came out of Paul's talk of daemons in Solomonoff induction, if I remember right) but very different strategies for achieving both inner and outer alignment. (The crux of the disagreement seems to be the probability of success from adapting current methods.)

Strongly recommended for inclusion.

I really enjoyed this post. It was fun to read and really drove home the point about starting with examples. I also thought it was helpful that it didn't just say, "teach by example". I feel that simplistic idea is all too common and often leads to bad teaching where example after example is given with no clear definitions or high-level explanations. However, this article emphasized how one needs to build on the example to connect it with abstract ideas. This creates a bridge between what we already understand and what we are learning.

As I was thinking... (read more)

The notion of paradigm shifts has felt pretty key to how I think about intellectual progress (which in turn means "how do I think about LessWrong?"). A lot of my thinking about this comes from listening to talks by Geoff Anders (huh, I just realized I was literally at a retreat organized by an org called Paradigm at the time, which was certainly not a coincidence).

In particular, I apply the paradigm-theory towards how to think about AI Alignment and Rationality progress, both of which are some manner of "pre-paradigmatic."

I think this post is a good wr... (read more)

This post starts as a discussion of babies enjoying simple repetitive games and observes that for babies this is how they learn a skill. It then suggests that we should apply the same frame to understand adults who engage in seemingly maladaptive social behaviors, such as repetitive arguments, romantic drama, and being shocking to get attention. Finally, it gives several ideas of what might be happening in very abstract terms, in the language of machine learning. It fails to connect any of these abstract, machine-learning-type explanations to any of the... (read more)

This post makes assertions about YC's culture which I find really fascinating. If it's a valid assessment of YC, I rather expect it to have broad implications for the whole capitalist and educational edifice. I've found lots of crystallized insight in Paul Graham's writing, so if his project is failing in the dimensions he's explicitly pointed out as important this seems like critical evidence towards how hard the problem space really is.

What does it mean for the rationalist community if selecting people for quickness of response correlates with anxiety o... (read more)

This post is an observation about a difference between the patients in the doctor's prior practice dealing with poor Medicaid patients, and her current practice dealing with richer patients. The former were concerned with their relationships, the latter with their accomplishments. And the former wanted pills, the latter often refused pills. And for these richer patients, refusing pills is a matter of identity - they want to be the type of people who can muddle through and don't need pills. They continue at jobs they hate, because they want to be the type of... (read more)

I experimented with extracting some of the core claims from this post into polls: 

Personally, I find that answering polls like these makes me more of a "quest participant" than a passive reader. They provide a nice "think for yourself" prompt, that then makes me look at the essay with a more active mindset. But others might have different experiences, feel free to provide feedback on how it worked for... (read more)

LessWrong review of Zettelkasten - I stumbled upon this post a few weeks ago, and it solidified several of my vague thoughts on how I might make my notes more useful. In particular, it helped me think of ways I could unify the structures and content-linkage between my roam-notes, orgmode notes, filesystem, and paper journal. I especially appreciated the background context and long-term followup. This post proved an invaluable branching point. I would love it if abram integrated the followup insights back into the overall post.

That said, I didn't actually ... (read more)

May I just say: Aaaaaa!

This post did not update my explicit model much, but it sure did give my intuition a concrete picture to freak out about. Claim 2 especially. I greatly look forward to the rewrite. Can I interest you in sending an outline/draft to me to beta read?

Given your nomination was for later work building on this post and spinning off discussion, you can likely condense this piece and summarize the later work / responses. (Unless you are hoping they get separately nominated for 2020?)

Your "See: Colonialism" as a casual aside had me cracking up... (read more)

I like this post and would like to see it curated, conditional on the idea actually being good. There are a few places where I'd want more details about the world before knowing if this was true.

  • Who owns this land? I'm guessing this is part of the Guadalupe Watershed, though I'm not sure how I'd confirm that.

This watershed is owned and managed by the Santa Clara Valley Water District.

  • What legal limits are there on use of the land? Wikipedia notes:

The bay was designated a Ramsar Wetland of International Importance on February 2, 2012.

I don't know what that ... (read more)

I think Raemon’s comments accurately describe my general feeling about this post: intriguing, but not well-optimized for a post.

However, I also think that this post may be the source of a subtle misconception in simulacra levels that the broader LessWrong community has adopted. Specifically, I think the distinction between 3 and 4 is blurred in this post, and that the post tries to draw the false analogy that 1:2::3:4. Going from 3 (masks the absence of a profound reality) to 4 (no profound reality) is more clearly described not as a “widespread understanding” that they... (read more)

This post seems helpful in that it expands on the basic idea of the Copenhagen Interpretation of Ethics, and when I first read it, it was modestly impactful to me, though it was mostly a way to reorganize what I already knew from the examples that Zvi uses.

It seems to be very accurate and testable, through simple tests of moral intuitions? 

I would like to see more expansion on the conditions that get normal people out of this frame of mind, on surprising places where it pops up, and on realistic incentive design that can be used personally to keep this from happening in your brain.

I've already written a comment with a suggestion that this post needs a summary so that you can benefit from it, even if you don't feel like wading through a bunch of technical material.

This post is excellent, in that it has a very high importance-to-word-count ratio. It'll take up only a page or so, but convey a very useful and relevant idea, and moreover ask an important question that will hopefully stimulate further thought.

This was the first major, somewhat adversarial doublecrux that I've participated in.

(Perhaps this is a wrong framing. I participated in many other significant, somewhat adversarial doublecruxes before. But, I dunno, this felt significantly harder than all the previous ones, to the point where it feels like a difference in kind.)

It was a valuable learning experience for me. My two key questions for "Does this actually make sense as part of the 2019 Review Book" are:

  • Is this useful to others for learning how to doublecrux, pass ITTs, etc in a lowish-trust-setting
... (read more)

This was important to the discussions around timelines at the time, back when the talk about timelines felt central. This felt like it helped give me permission to no longer consider them as central, and to fully consider a wide range of models of what could be going on. It helped make me more sane, and that's pretty important.

It was also important for the discussion about the use of words and the creation of clarity. There's been a long issue of exactly when and where to use words like "scam" and "lie" to describe things - when is it accurate, when is it ... (read more)

I can't think of a question on which this post narrows my probability distribution.

Not recommended.

This is a true engagement with the ideas in Paul's original post. It actively changed my mind – at first I thought Paul was making a good recommendation, but now I think it was a bad one. It helped me step back from a very detailed argument and notice what rationalist virtues were in play. I think it's a great example of what a rebuttal of someone else's post looks like. I'd like to see it in the review, and I will vote on it somewhere between +3 and +7.

In general, I think this post does a great job of articulating a single, incomplete frame. Others in the review take umbrage with the moralizing tone, but I think the moralizing tone is actually quite useful to give an inside view of this frame.

I believe this frame is incomplete, but gives an important perspective that is often ignored in the Lesswrong/Gray tribe.

I haven't reviewed the specific claims of the literature here, but I did live through a pandemic where a lot of these concerns came up directly, and I think I can comment directly on the experience.

  • Some LessWrong team members disagree with me on how bad remote-work is. I overall thought it was "Sort of fine, it made some things a bit harder, other things easier. It made it harder to fix some deeper team problems, but we also didn't really succeed at fixing those team problems in previous non-pandemic years."
    • Epistemic Status, btw: I live the farthest aw
... (read more)

I think I have juuust enough background to follow the broad strokes of this post, but not to quite grok the parts I think Abram was most interested in. 

It definitely caused me to think about credit assignment. I actually ended up thinking about it largely through the lens of Moral Mazes (where challenges of credit assignment combine with other forces to create a really bad environment). Re-reading this post, while I don't quite follow everything, I do successfully get a taste of how credit assignment fits into a bunch of different domains.

For the "myop... (read more)

The post attempts to point out the important gap between fighting over norms/values and getting on the same page about what people's norms/values even are, and offers a linguistic tool to help readers navigate it in their life.

A lot of (the first half of) the post feels like An Intuitive Introduction to Being Pro Conversation Before Fighting, and it's all great reading.

I think the OP wants to see people really have conversations about these important differences in values, and is excited about that. Duncan believes that this phrase is a key step allowing (c... (read more)

I think that, among those who've done serious thought about how intellectual progress happens, it was pretty well known that in some domains a lot of research is happening on forums, and that forum participation as a research strategy can work. But in the broader world, most people treat forums as more like social spaces, and have a model of how research works that puts it in distant, inaccessible institutional settings. Many people think research means papers in prestigious journals, with no model of where those papers come from. I think it's worth making common knowledge that getting involved in research can be as simple as tweaking your forum subscriptions.

I observe: There are techniques floating around the rationality community, with models attached, where the techniques seem anecdotally effective, but the descriptions seem like crazy woo. This post has a model that predicts the same techniques will work, but the model is much more reasonable (it isn't grounded out in axon-connections, but in principle it could be). I want to resolve this tension in this post's favor. In fact I want that enough to distrust my own judgment on the post. But it does look probably true, in the way that models of mind can ever be true (i.e. if you squint hard enough).

This is not the clearest or the best explanation of simulacrum levels on LessWrong, but it is the first. The later posts on the subject (Simulacra and Subjectivity, Negative Feedback and Simulacra, Simulacra Levels and Their Interactions) are causally downstream of it, and are some of the most important posts on LessWrong. However, those posts were written in 2020, so I can't vote for them in the 2019 review.

I have applied the Simulacrum Levels concept often. I made spaced-repetition cards based on them. Some questions are easy to notice and ask, in simula... (read more)

For me, this is the paper where I learned to connect ideas about delegation to machine learning. The paper sets up simple ideas of mesa-optimizers, and shows a number of constraints and variables that will determine how the mesa-optimizers will be developed – in some environments you want to do a lot of thinking in advance then delegate execution of a very simple algorithm to do your work (e.g. this simple algorithm Critch developed that my group house uses to decide on the rent for each room), and in some environments you want to do a little thinking and ... (read more)

Note 1: This review is also a top-level post.

Note 2: I think that 'robust instrumentality' is a more apt name for 'instrumental convergence.' That said, for backwards compatibility, this comment often uses the latter. 

In the summer of 2019, I was building up a corpus of basic reinforcement learning theory. I wandered through a sun-dappled Berkeley, my head in the clouds, my mind bent on a single ambition: proving the existence of instrumental convergence. 

Somehow. 

I needed to find the right definitions first, and I couldn't even imagine what... (read more)

So, reviewing this seriously seems like a pretty big todo, which has not yet been done. I don't feel qualified to do it. But... this feels plausible enough to consider in at least a bit more depth, and if taken seriously it might have ramifications on how to think about current events.

I am interested in at least seeing a rough pass of how this post fares in the vote. I'd like to see a distillation of this post, plus Scott's Ages of Discord post, plus the SSC subreddit's response to Peter Turchin's response. (Maybe this already happened in some SSC highligh... (read more)

I... feel like this post is important. But I'm not actually sure how to use it and build around it.

I have vague memories of seeing this link-dropped by folk in the Benquo/Jessicata/Zack/Zvi crowd in various comments, but usually in a way that feels more like an injoke than a substantive point. 

I just checked the three top-level-post pingbacks, and I do think they make meaningful reference to this post. Which I think is sufficient for "yes this concept got followed up on in the past 2 years". But I'm left with a vague frustration with the concept feeli... (read more)

Author here: I think this post could use a bunch of improvements. It spends a bunch of time on tangential things (e.g. the discussion of Inadequacy and why this doesn't come through in textbooks, spending a while initially setting up a view to then tear down). 

But really what would be nice is to have it do a much better job at delivering the core insight. This is currently just done in two bullets + one exercise for the reader. 

Even more important would be to include JenniferRM's comment which adds a core mechanism (something like "cultural learn... (read more)

This points out something true and important that is often not noticed, and definitely is under-considered. That seems very good. The question I ask is, did this cause other people to realize this effect exists, and to remember to notice and think about it more? I don't know either way.

If so, it's an important post, and I'd be moderately excited to include it.

If not, it's not worth the space. 

I'm guessing this post could be improved/sharpened relatively easily, if it did get included - it's good, and there's nothing wrong exactly, but feels l... (read more)

adamshimi says almost everything I wanted to say in my review, so I am very glad he made the points he did, and I would love for both his review and the top level post to be included in the book. 

The key thing I want to emphasize a bit more is that I think the post as given is very abstract, and I have personally gotten a lot of value out of trying to think of more concrete scenarios where gradient hacking can occur. 

I think one of the weakest aspects of the post is that it starts with the assumption that an AI system has already given rise to an... (read more)

I've written up a review here, which I made into a separate post because it's long.

Now that I read the instructions more carefully, I realize that I maybe should have just put it here and waited for mods to promote it if they wanted to. Oops, sorry, happy to undo if you like.

This is a retroactively obvious concept that I'd never seen so clearly stated before, which makes it a fantastic contribution to our repertoire of ideas. I've even used it to sanity-check my statements on social media. Well, I've tried.

Recommended, obviously.

This makes a simple and valuable point. As discussed in and below Anna's comment, it's very different when applied to a person who can interact with you directly versus a person whose works you read. But the usefulness in the latter context, and the way I expect new readers to assume that context, leads me to recommend it.

I liked the comments on this post more than I liked the post itself. As Paul commented, there's as much criticism of short AGI timelines as there is of long AGI timelines; and as Scott pointed out, this was an uncharitable take on AI proponents' motives.

Without the context of those comments, I don't recommend this post for inclusion.

Here are prediction questions for the predictions that TurnTrout himself provided in the concluding post of the Reframing Impact sequence

Elicit Prediction (eli
... (read more)

I continue to agree with my original comment on this post (though it is a bit long-winded and goes off on more tangents than I would like), and I think it can serve as a review of this post.

If this post were to be rewritten, I'd be particularly interested to hear example "deployment scenarios" where we use an AGI without human models and this makes the future go well. I know of two examples:

  1. We use strong global coordination to ensure that no powerful AI systems with human models are ever deployed.
  2. We build an AGI that can do science / engineering really wel
... (read more)

(You can find a list of all 2019 Review poll questions here.)

I'm the author, writing a review/reflection.

I wrote this post mainly to express myself and make more real my understanding of my own situation. In the summer of 2019 I was doing a lot of exploration of how I felt and experienced the world, and I was also doing lots of detective work trying to understand "how I got to now."

The most valuable thing it adds is a detailed example of what it feels like to mishandle advice about emotions from the inside. This was prompted by the fact that younger me "already knew" about dealing with his emotions, and I wanted to writ... (read more)

I've referred and linked to this post in discussions outside the rationalist community; that's how important the principle is. (Many people understand the idea in the domain of consent, but have never thought about it in the domain of epistemology.)

Recommended.

I've known about S-curves for a long time, and I don't think I read this the first time. If you don't know S-curves exist, this has good info, and it seems to be well explained. There are also a few useful nuggets otherwise. As someone who has long known of S-curves, hard to say how big an insight this is to others, but my instinct is that while I have nothing against this post and I'm very glad it exists, this isn't sufficiently essential to justify including. 

Oh man, I loved this post. Very vivid mental model for subtle inferential gaps and cross-purposes.

It's surprising it's been so long since I thought about it! Surely if it's such a strong and well-communicated mental model, I would have started using it. So why did I not?

My guess: the vast majority of conversation frames are still *not the frame which contains talking about frames*. I started recognizing when I wanted to have a timeout and clarify the particular flavor of conversation desired before continuing, but it felt like I broke focus or rapport anyt... (read more)

I think the CAIS framing that Eric Drexler proposed gave concrete shape to a set of intuitions that many people have been relying on for their thinking about AGI. I also tend to think that those intuitions and models aren't actually very good at modeling AGI, but I nevertheless think it productively moved the discourse forward a good bit. 

In particular I am very grateful for the comment thread between Wei Dai and Rohin, which really helped me engage with the CAIS ideas, and which I think was necessary to get me to my current understanding of CAIS and to ... (read more)

So. I have the distinct sense I just read an unusually mathematical vagueblog.

Was there a way to explain these dynamics with concrete examples, and *NOT* have those groups' politics blow up in your face about it? Not sure. I'm really not sure. Could do with a flow chart?

I would be fascinated to see this in the form of a flowchart, and *then* run an experiment to test if jointly going through it shortens the time it takes to get two people arguing over norms/punishments to a state of double-crux.

This is so lovely! Pure happy infodump energy. Reveling in the wonder of reality.

It's very close to a zetetic explanation, which I massively approve of. You might even say it's a concrete example.

(Side note: Rebar reinforced concrete was a mistake. It rusts in place and this fucks up so much modern architecture.)

I think the point this post makes is right, both as a literal definition of what a rule is, and of how you should respond to the tendency to make "exceptions." I prefer the notion of a "framework" to a rule, because it suggests that the rules can be flexible, layered, and only operating in specific contexts (where appropriate). For example, I'm trying to implement a set of rules about when I take breaks from work, but the rule "25 minutes on, 5 minutes off" is only valid when I'm actually at work.

My point of disagreement is the conclusion - that exceptions... (read more)

tl;dr – I'd like to see further work that examines a ton of examples of real coordination problems that rationalists have run into ("stag hunt" shaped and otherwise), and then attempt to extract more general life lessons and ontologies from that. 

...

1.5 years later, this post still seems good for helping to understand the nuances of stag hunts, and I think was worth a re-read. But something that strikes me as I reread it is that it doesn't have any particular takeaway. Or rather, it doesn't compress easily. 

I spent 5 minutes seeing if I could dis... (read more)

I read this post only half a year ago after seeing it being referenced in several different places, mostly as a newer, better alternative to the existing FOOM-type failure scenarios. I also didn't follow the comments on this post when it came out.

This post makes a lot of sense in Christiano's worldview, where we have a relatively continuous, somewhat multipolar takeoff which to a large extent inherits the problems in our current world. This especially applies to part I: we already have many different instances of scenarios where humans follow measured in... (read more)

(Self-review.) I oppose including this post in a Best-of-2019 collection. I stand by what I wrote, but it's not potential "canon" material, because this was a "defensive" post for the 2018 Review: if the "contextualizing vs. decoupling" idea hadn't been as popular and well-received as it was, there would be no reason for this post to exist.

A standalone Less Wrong "house brand" explanation of Gricean implicature (in terms of Bayesian signaling games, probably?) could be a useful reference post, but that's not what this is.

The factual point that moderate liberals are more censorious is easy to lose track of, and I saw confusion about it today that sent me back to this article.

I appreciate that this post starts from a study, and outlines not just the headline from the study but the sample size. I might appreciate more details on the numbers, such as how big the error bars are, especially for subgroup stats.

Historical context links are good, and I confirm that they state what they claim to state.

Renee DiResta is no longer at New Knowledge, though her previous work there is st... (read more)

I see where Raemon is going with this, and for a simplified model, where number of words is the only factor, this is at least plausible. Super-simplified models can be useful not only insofar as they make accurate predictions, but because they suggest what a slightly more complex model might look like.

In this case, what other factors play into the number of people you can coordinate with about X words?

  • Motivation (payment, commitment to a cause, social ties, status)
  • Repetition, word choice, presentation
  • Intelligence of the audience
  • Concreteness and familiar... (read more)

in retrospect it looks like understanding cognitive biases doesn’t actually make you substantially more effective

 

I'm not convinced that this is true, or that it's an important critique of the original sequences.

 

Looking at the definition of agent, I'm curious how this matches with Cartesian Frames.

Given that we want to learn to think about humans in a new way, we should look for ways to map the new way of thinking into a native mode of thought

I was very happy to read this pingback, but it's purely anecdotal. There are better sources for t... (read more)

This post sparked some meta topic ideas to extend the conversation on note taking and productivity:

  • A list of 50 factors influencing productivity, such as "notetaking methods," "desk setup" and "cold-emailing experts to ask questions" so that people could get a broad perspective on aspects of their productivity to explore.
  • A map of books or web pages listing numerous examples and descriptions in each factor category so that people could experiment.
  • When people study productivity methods, how do they go about it? Are the research methods sound?
  • I tried this met... (read more)

This post gave a slightly better understanding of the dynamics happening inside SGD. I think deep double descent is strong evidence that something like a simplicity prior exists in SGD, which might have actively bad generalization properties, e.g. by incentivizing deceptive alignment. I remain cautiously optimistic that approaches like Learning the Prior can circumnavigate this problem.

Can you help me paint a specific mental picture of a driver being exploited by Uber?

I've had similar "exploitation" arguments with people:

"Commodification" and "dehumanization" don't mean anything unless you can point to their concrete effects.

I think your way of handling it is much, much better than how I've handled it. It comes across as less adversarial while still making the other person do the work of explaining themselves better. I've found that small tricks like this can completely flip a conversation from dysfunctional to effective. I'll have to remember to use your suggestion.

This idea seems obviously correct, all the responses to objections seem correct, and the chance of this happening any time soon is about epsilon. 

In some sense I wish the reasons it will never happen were less obvious than they are, so it would be a better example of our inability to do things that are obviously correct. 

The question is: how much does this add to the collection? Do we want to use a slot on practical good ideas that we could totally do if we could do things, and used to do? I'm not sure. 

Kaj_sotala's book summary provided me with something I hadn't seen before - a non-mysterious answer to the question of consciousness. And I say this as someone who took graduate level courses in neuroscience (albeit a few years before the book was published). Briefly, the book defines consciousness as the ability to access and communicate sensory signals, and shows that this correlates highly with those signals being shared over a cortical Global Neuronal Workspace (GNW). It further correlates with access to working memory. The review also gives a great ac... (read more)

Author of the post here. I edited the post by:

(1) adding an introduction — for context, and to make the example in Part I less abrupt

(2) editing the last section — the original version was centered on my conversations with Rationalists in 2011-2014; I changed it to be a more general discussion, so as to broaden the post's applicability and make the post more accessible

(You can find a list of all 2019 Review poll questions here.)

I've read a lot of books in the self-help/therapy/psychology cluster, but this is the first which gives a clear and plausible model of why the mental structure they're all working with (IFS exiles, EMDR unprocessed memories, trauma) has enough fitness-enhancing value to evolve despite the obvious costs.

It seems to me that there has been enough unanswered criticism of the implications of coherence theorems for making predictions about AGI that it would be quite misleading to include this post in the 2019 review. 

In an earlier review, johnswentworth argues:

I think instrumental convergence provides a strong argument that...we can use trade-offs with those resources in order to work out implied preferences over everything else, at least for the sorts of "agents" we actually care about (i.e. agents which have significant impact on the world).

I think this... (read more)

A Question post!

I think I want to write up a summary of the 2009 Nobel Prize book I own on commons governance. This post had me update to think it's more topically relevant than I realized.

The LW review could use more question posts, if the goal is to solidify something like a canon of articles to build on. A question invites responses. I am disappointed in the existing answers, which appear less thought through than the question. Good curation, good nomination.

I like that this responds to a conflict between two of Eliezer's posts that are far apart in time. That seems like a strong indicator that it's actually building on something.

Either "just say the truth", or "just say whatever you feel you're expected to say" are both likely better strategies.

I find this believable but not obvious. For example, if the pressure on you is you'll be executed for saying the truth, saying nothing is probably better than saying the truth. If the pressure on you is remembering being bullied on tumblr, and you're being asked if you... (read more)

(Epistemic status: I don’t have much background in this. Not particularly confident, and attempting to avoid making statements that don’t seem strongly supported.)

I found this post interesting and useful, because it brought a clear unexpected result to the fore, and proposed a potential model that seems not incongruent with reality. On a meta-level, I think supporting these types of posts is quite good, especially because this one has a clear distinction between the “hard thing to explain” and the “potential explanation,” which seems very important to allo... (read more)

I’m pretty impressed by this post overall, not necessarily because of the object-level arguments (though those are good as well), but because I think it’s emblematic of a very good epistemic habit that is unfortunately rare. The debate between Hanson and Zvi over this, like habryka noted, is an excellent example of how to do good object-level debate that reveals details of shared models over text. I suspect that this is the best post to canonize to reward that, but I’m not convinced of this. On the meta-level, the one major improvement/further work I’d lik... (read more)

Crucial. I definitely remember reading this and thinking it was one of the most valuable posts I'd seen all year. Good logical structure.

But it's hard to read? It has jarring, erratic rhetoric flow; succinct where elaboration is predictably needed, and verbose where it is redundant. A mathematician's scratch notes, I think.

I agree it would be good to add a note about push polling, but it's also good to note that the absence of information is itself a choice! The most spare possible survey is not necessarily the most informative. The question of what is a neutral framing is a tricky one, and a question about the future that deliberately does not draw attention to responsibilities is not necessarily less push-poll-y than one that does.

One good idea to take out of this is that other people's ability to articulate their reasons for their belief can be weak—weak enough that it can distract from the strength of evidence for the actual belief. (More people can catch a ball than explain why it follows the arc that it does).

Agreeing that just the final paragraph would be a good idea to include; otherwise, I don't think this passes my bar for "worth including as best-of."

Given all of the discussion around simulacra, I would be disappointed if this post wasn't updated in light of this.

I would love to see a more concise version of this.

This is an excellent post - my only question is how accurately this translates the Buddhism, which is not something I'm qualified to have a strong opinion on. Nonetheless, it matches my limited understanding of meditation.

Of the two progress studies up for review, I think this is better than the invention of concrete one. Mostly because it dips more into how the development of fertilizer interacted with other domains (notably: war), as well as some politics/history.

This part was actually most interesting to me, which you may have missed if you started reading and then decided "meh, I don't care how artificial fertilizer was invented."

The Alchemy of Air is as much about the lives of Haber and Bosch, and what happened after their process became a reality, as it is about the s... (read more)

While the sort of Zettelkasten-adjacent notes that I do in Roam have really helped how I do research, I'd say No to this article. The literal Zettelkasten method is adapted to a world without hypertext, which is why I describe [what everyone does in Roam] as Zettelkasten-adjacent instead of Zettelkasten proper.

This is not to knock this post, it's a good overview of the literal Zettelkasten method. But I don't think it should be included.

These are good lists of open problems, although, as Ben notes, they are bad lists if they are to be considered all the open problems. I don't think that is the fault of the post, and it's easy enough to make clear the lists are not meant to be complete. 

This seems like a spot where a good list of open problems is a good idea, but here we're mostly going to be taking a few comments. I think that's still a reasonable use of space, but not exciting enough to think of this as important.

I'm all for such things existing and a book entirely composed of such things seems like it should exist, but I don't know what it would be doing in this particular book. 

The combination of the two previous reviews, by hamnox and fiddler, seems to summarize: It's a pure happy infodump that doesn't add much, that gets you a lot of upvotes, and that says more about the voting system than about what is valuable.

I don't think this post introduced its ideas to many people, including Raemon and Ben who nominated it. Nor does it seem like it provides a superior frame with which to examine those issues. Not recommended.

Zvi wrote two whole posts on perfect/imperfect competition and how more competition can be bad. However, this is the only post that has really stuck with me in teaching me how increased competition can be worse overall for the system, and helped me appreciate Moloch in more detail. I expect to vote for this post around +4 or +5.

As with one or two others by Zvi, I think it's a touch longer than it needs to be, and can be made more concise.

This is a core piece of a mental toolkit, being able to quantify life choices like this, and the post explains it well. I think I would like the version in the book to spend a bit more space helping the reader do the calculation that you do in the Clearer Thinking tool. A lot of the value of the post is in showing how to use the number to make decisions.

I think it's a valuable post, and I expect to vote for it somewhere in the range of +2 to +4.

I'm probably going to write a second review that is more accessible. But, first: I made a couple vague promises here:

  • I said I would try to think of examples of what it would look like, if "someone I trusted, who looked like they had a deep model, in fact was just highly motivated." (I said I'd think about it in advance so that if I learned new facts about someone I trusted, I wouldn't confabulate excuses for them)
  • I said I would think more about my factual cruxes for "propagating the level of fear/disgust/concern that Benquo/Jessica/Zack had, into my own ae... (read more)

I continue to think this post is important, for basically the same reasons as I did when I curated it. I think for many conversations, having the affordance and vocabulary to talk about frames makes the difference between them going well and them going poorly.

So, I think this post is pretty bad as a 'comprehensive' list of the open problems, or as 'the rationality agenda'. All of the top answers (Wei, Scott, Brienne, Thrasymachus) add something valuable, but I'd be pretty unhappy if this was considered the canonical answer to "what is the research agenda of LW", or our best attempt at answering that question (I think we can do a lot better). I think it doesn't address many things I care about. Here's a few examples:

  • What are the best exercises for improving your rationality? Fermi estimates, Thinking Physics, Ca... (read more)

On initially reading it, I found it quite interesting, but over time it's come to shape my thinking much more than I expected.

Robin has correctly pointed out that blackmail is just a special case of free trade between apparently consenting adults, which tends to be pretty good, and you need quite a strong argument for making the law interfere with that. He also points out that it creates good incentives not to do things that you wouldn't want people finding out about.

However Zvi's point is that this is an incredibly strong incentive for someone to ruin you... (read more)

Alas, I haven't made it through this post. I do not understand what I have made of it, and nor does anyone else I know (except maybe Jacob Falkovich). I do wish there had been real conversation around this post, and I think there's some probability (~30%) that I will look back and deeply regret not engaging with it much more, but in my current epistemic state I can only vote against its inclusion in the book. Somewhere around -1 to -4.

I've made these comments previously, but for purposes of having at least one official review:

  1. I think the names aren't optimal. Zombie days or Slug days or some-such seem like an improvement for "recovery days." I think it's also possible that Rest day might be more unambiguous if it were called a "Restorative day" or something.
  2. I think the post doesn't quite call enough attention to "this is specifically about listening to your gut. Your gut is a specific part of your body. Listening to it is a skill you might not have. Listening to it is a useful source o... (read more)

A good explanation of the difference between intellectual exploration and promoting people. You don't need to agree with everything someone says, and you don't even need to like them, but if they occasionally provide good insight, they are worth taking into account. If you propagate this strategy, you may even get to a "wisdom of the crowds" scenario - you'll have many voices to integrate in your own thinking, potentially getting you farther along than if you just had one thought leader you liked.

Having many smart people you don't necessarily agree with, l... (read more)

I don't have much to say in a review I didn't already say in my nomination. But, a key point of this post is "the math checks out in a way that thoroughly dissolves a confusion" and I'd kinda like it if someone else did a more thorough review that the math actually checks out.

Update: Made these changes

I originally wrote this post because I saw quite a few of what I perceived as mistakes in the reasoning of rationalists around predicting trends and innovation.

  • People confusing s-curves with exponential growth.
  • People confusing evolution and diffusion curves, and assuming they were the same thing.
  • People making basic mistakes about how technologies would likely evolve, because they didn't understand historical evolutionary patterns.
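
To illustrate the first confusion in the list above, here is a minimal sketch of why an s-curve is easy to mistake for exponential growth early on. It is not taken from the post: the logistic form, the function names, and the parameter values (starting value 1, growth rate 0.5, ceiling 1000) are all assumptions chosen for illustration.

```python
import math

def exponential(t, x0=1.0, r=0.5):
    """Unbounded exponential growth from x0 at rate r (illustrative values)."""
    return x0 * math.exp(r * t)

def logistic(t, x0=1.0, r=0.5, cap=1000.0):
    """Logistic s-curve with the same start and rate, saturating at cap (assumed)."""
    return cap / (1.0 + (cap / x0 - 1.0) * math.exp(-r * t))

def relative_gap(t):
    """Fraction by which the exponential exceeds the s-curve at time t."""
    e = exponential(t)
    return (e - logistic(t)) / e

# Early on the two curves are nearly indistinguishable; later the s-curve
# flattens toward its ceiling while the exponential keeps growing.
for t in (0, 4, 8, 12, 16, 20):
    print(f"t={t:2d}  exponential={exponential(t):10.1f}  s-curve={logistic(t):7.1f}")
```

With these assumed parameters the two curves differ by well under one percent at small t, so early data alone cannot tell them apart; the difference only shows once the s-curve approaches its ceiling.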

At the time, I thought that simply making a post explaining the models they were missing would create a ... (read more)

Okay, whenever I read this post, I don't get it.

There's some fermi-estimation happening, but the fermi is obviously wrong. As Benquo points out, certain religions have EVERYONE read their book, memorize it, chant it, discuss it every Sunday (or Saturday).

I feel like the post is saying "there are lots of bandwidth problems. the solution to all of them is '5'." and I don't get why 5.

So I read Ray's comment on Daniel Filan's review, where he says:

...at some maximum scale, your coordination-complexity is bottlenecked on a single working-memory-cluster, which (... (read more)

This reminds me of That Alien Message, but as a parable about mesa-alignment rather than outer alignment. It reads well, and helps make the concepts more salient. Recommended.

I made some prediction questions for this, and as of January 9th, there interestingly seems to be some disagreement with the author on these. 

Would definitely be curious for some discussion between Matthew and some of the people with low-ish predictions. Or perhaps for Matthew to clarify the argument made on these points, and see if that changes people's minds.

I took some liberties in operationalising what seemed to me a core thesis underlying the post. Let me know if you think it doesn't really capture the important stuff!

(You can find a list of all review poll questions here.)

Broken image link! Broken image link! Sad.

Using "evolution" when referring to things other than breeding populations, BOO!

Important concept. Very light on backing evidence and references. I wanna hear more about Systems and Networks angles. Could do with fewer examples of Innovation.

I tend to try to do things that I think are in my comparative advantage. This post hammered home the point that comparative advantage exists along multiple dimensions. For example, as a pseudo-student, I have almost no accumulated career capital, so I risk less by doing projects that might not pan out (under the assumption that career capital gets less useful over time). This fact can be combined with other properties I have to more precisely determine comparative advantage.

This post also gives the useful intuition that being good at multiple things exponentially cuts down the number of people you're competing with. I use this heuristic a reasonable amount when trying to decide the best projects to be working on.
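
The "exponentially cuts down" intuition can be made concrete with a bit of arithmetic. The sketch below is my own illustration, not something from the post: it assumes the skills are statistically independent (a strong simplification, since real skills are often correlated, which would shrink the pool less), and the function name and numbers are hypothetical.

```python
def remaining_competitors(population, top_fractions):
    """Estimate how many people are in the top fraction of every listed skill.

    Assumes skills are independent, so the fractions simply multiply.
    """
    remaining = float(population)
    for fraction in top_fractions:
        if not 0.0 < fraction <= 1.0:
            raise ValueError("each fraction must be in (0, 1]")
        remaining *= fraction
    return remaining

# Hypothetical example: top 10% in three independent skills,
# out of a population of one million.
print(remaining_competitors(1_000_000, [0.1, 0.1, 0.1]))
```

Under the independence assumption, each additional skill multiplies the pool by its fraction, so being in the top 10% of three skills leaves roughly one person in a thousand.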

Include bendini's post with it.

But it shows all the free energy in the world. Good nod to Inadequate Equilibria.

More than a year since writing this post, I would still say it represents the key ideas in the sequence on mesa-optimisation which remain central in today's conversations on mesa-optimisation. I still largely stand by what I wrote, and recommend this post as a complement to that sequence for two reasons:

First, skipping some detail allows it to focus on the important points, making it better-suited than the full sequence for obtaining an overview of the area. 

Second, unlike the sequence, it deemphasises the mechanism of optimisation, and explicitly cas... (read more)

I think this post significantly benefits in popularity, and lacks in rigor and epistemic value, from being written in English. The assumptions that the post makes in some part of the post contradict the judgements reached in others, and the entire post, in my eyes, does not support its conclusion. I have two main issues with the post, neither of which involve the title or the concept, which I find excellent:

First, the concrete examples presented in the article point towards a different definition of optimal takeover than is eventually reached. All of the p... (read more)

very clear and simple. tempting to dismiss this as not significant/novel, but there is a place for presenting basic things well.

And it's positively framed. We could all use a little hope right now.

The noise in my model o