Nominated Posts for the 2018 Review

76 Give praise [anonymous]
118 Unknown Knowns


Sorted by New

I didn't feel like I fully understood this post at the time when it was written, but in retrospect it feels like it's talking about essentially the same thing as Coherence Therapy does, just framed differently.

Any given symptom is coherently produced, in other words, by either (1) how the individual strives, without conscious awareness, to carry out strategies for safety or well-being; or (2) how the individual responds to having suffered violations of safety or well-being. This model of symptom production is squarely in accord with the construct
...

This is probably the post I got the most value out of in 2018. This is not so much because of the precise ideas (although I have got value out of the principle of meta-honesty directly), but because it was an attempt to understand and resolve a confusing, difficult domain. Eliezer explores various issues facing meta-honesty: the privilege inherent in being fast-talking enough to remain honest in tricky domains, and the various subtleties of meta-honesty that might make it too subtle a set of rules to coordinate around.

This illustration of "how to contend w

...

I'm a bit torn here, because the ideas in the post seem really important and useful to me (e.g., I sometimes use these phrases as mental pointers), such that I'd want anyone trying to make sense of the human situation to have access to them (via this post or a number of other attempts at articulating much the same thing, e.g. "The Elephant in the Brain"). At the same time, I think there's some crucial misunderstanding in it that is dangerous and that I can't articulate. Voting for it anyhow, though.

[Rambly notes while voting.] This post has some merit, but it feels too...jumpy, and, as the initial comments point out, it's unclear what is being considered "explicit" vs "implicit" communication. Only on getting to the comments did I realize that the author's sense of those words was not quite my own.

I'm also not sure it's either 1) telling the whole picture, vs 2) correct. A couple of examples are brought, but examples are easy to cherry-pick. The fact that the case brought with Bruce Lee seemed to be in favor of a non-compassionate feels maybe, maybe l

...

I just re-read this sequence. Babble has definitely made its way into my core vocabulary. I think of "improving both the Babble and Prune of LessWrong" as being central to my current goals, and I think this post was counterfactually relevant for that. Originally I had planned to vote weakly in favor of this post, but am currently positioning it more at the upper-mid-range of my votes.

I think it's somewhat unfortunate that the Review focused only on posts, as opposed to sequences as a whole. I just re-read this sequence, and I think the posts More Babble, P

...

This post is well written and not over-long. If the concepts it describes are unfamiliar to you, it is a well written introduction. If you're already familiar with them, you can skim it quickly for a warm feeling of validation.

I think the post would be even better with a short introduction describing its topic and scope, but I'm aware that other people have different preferences. In particular:

  • There are more than two 'cultures' or styles of discussion, perhaps many more. The post calls this out towards the end (apparently this is new in
...

In this essay, Paul Christiano proposes a definition of "AI alignment" that is narrower than other definitions that are often employed. Specifically, Paul suggests defining alignment in terms of the motivation of the agent (which should be to help the user), rather than what the agent actually does. That is, as long as the agent "means well", it is aligned, even if errors in its assumptions about the user's preferences or about the world at large lead it to actions that are bad for the user.

Rohin Shah's comment on the essay (which I believe is endorsed

...

This post raises some reasonable-sounding and important-if-true hypotheses. There seems to be a vast open space of possible predictions, relevant observations, and alternative explanations. A lot of it has good treatment, but not on LW, as far as I know.

I would recommend this post as an introduction to some ideas and a starting point, but not as a good argument or a basis for any firm conclusions. I hope to see more content about this on LW in the future.

I think it was important to have something like this post exist. However, I now think it's not fit for purpose. In this discussion thread, rohinmshah, abramdemski and I end up spilling a lot of ink about a disagreement that ended up being at least partially because we took 'realism about rationality' to mean different things. rohinmshah thought that irrealism would mean that the theory of rationality was about as real as the theory of liberalism, abramdemski thought that irrealism would mean that the theory of rationality would be about as real as the theo

...

In the comments of this post, Scott Garrabrant says:

I think that Embedded Agency is basically a refactoring of Agent Foundations in a way that gives one central curiosity-based goalpost, rather than making it look like a bunch of independent problems. It is mostly all the same problems, but it was previously packaged as "Here are a bunch of things we wish we understood about aligning AI," and is repackaged as "Here is a central mystery of the universe, and here are a bunch of things we don't understand about it." It is not a coincidence that they are the sa

...


Sorted by Top

Daniel Filan's bottle cap example was featured prominently in "Risks from Learned Optimization" for good reason. I think it is a really clear and useful example of why you might want to care about the internals of an optimization algorithm and not just its behavior, and helped motivate that framing in the "Risks from Learned Optimization" paper.

This part is very important (the recursive distortion of a conscious, strategic lie is less bad than the alternative of trashing your ability to think in general):

Carl: "[...] But there's a big difference between acting immorally because you deceived yourself, and acting immorally with a clear picture of what you're doing."

Worker: "Yes, the second one is much less bad!"

Carl: "What?"

Worker: "All else being equal, it's better to have clearer beliefs than muddier ones, right?"

Reading Alex Zhu's Paul agenda FAQ was the first time I felt like I understood Paul's agenda in its entirety as opposed to only understanding individual bits and pieces. I think this FAQ was a major contributing factor in me eventually coming to work on Paul's agenda.

I'm a bit surprised that most of the previous discussion here was focused on the "okay, so how do you actually motivate people?" aspect of this post.

This post gave me a fairly strong "sit bolt upright in alarm" experience because of its implications for epistemics, and I think those implications are sneaky and far-reaching. I expect this phenomenon to influence people's ability to think and communicate well before you get to the point where you actually have a project whose hard parts people are hitting.

People form models of what sort of things the

...

This post not only made me understand the relevant positions better, but the two different perspectives on thinking about motivation have remained with me in general. (I often find the Harris one more useful, which is interesting by itself since he had been sold to me as "the guy who doesn't really understand philosophy".)

I actually have some understanding of what MIRI's Agent Foundations work is about.

I think this post, together with Abram's other post "Towards a new technical explanation", actually convinced me that a Bayesian approach to epistemology can't work in an embedded context, which was a really big shift for me.

I kind of have conflicting feelings about this post, but still think it should at least be nominated for the 2018 review. 

I think the point about memetically transmitted ideas only really being able to perform a shallow, though maybe still crucial, part of cognition is pretty important, and might be enough on its own to justify nominating this post.

But the overall point about clickbait and the internet feels also really important to me, but I also feel really conflicted because it kind of pattern-matches to a narrative that I feel performs badly on some reference-

...

Robustness to scale is still one of my primary explanations for why MIRI-style alignment research is useful, and why alignment work in general should be front-loaded. I am less sure about this specific post as an introduction to the concept (since I had it before the post, and don't know if anyone got it from this post), but think that the distillation of concepts floating around meatspace to clear reference works is one of the important functions of LW.

This post:

  • Tackles an important question. In particular, it seems quite valuable to me that someone who tries to build a platform for intellectual progress attempts to build their own concrete models of the domain and tries to test those against history
  • It also has a spirit of empiricism and figuring things out yourself, rather than assuming that you can't learn anything from something that isn't an academic paper
  • Those are positive attributes and contribute to good epistemic norms on the margin. Yet at the same time, a culture of unchecked amateu
...