If you have 1000+ karma, you have until Dec 1st to nominate LessWrong posts from 2018 (yes, 2018, not 2019) for the first LessWrong Review. The nomination button is available from a post's dropdown menu.
Multiple nominations are helpful – posts with enough nominations will proceed to a review phase (ending December 31st), followed by a week of voting. Details below.
The LW team will be compiling the best posts and reviews into a physical book, awarding $2000 divided among top posts and (up to) $2000 divided among top reviews.
Helpful Links:
... (Read more)
Suppose I wanted to get good intuitions about how the world works on historical timescales.
I could study history, but just reading history is rife with historical hindsight bias, both on my own part, and even worse, on the part of the authors I'm reading.
So if I wanted to master history, a better way would be to do it forecasting-style. I read what was happening in some part of the world, up to a particular point in time, and then make bets about what will happen next. This way, I have feedback as I'm learning, and I'm training an actual historical predictor.
However, this re... (Read more)
I think this is a challenging and non-trivial question, which I've considered before, but I'm less pessimistic than some other commenters.
I think what we really should do is fund someone to research and build a rigorous training set along these lines, using some kind of bias-avoiding methodology (e.g. clever pre-registration, systematic protocols for what data to include, etc.).
I find it conceivable but very implausible that doing this will make you worse, and can certainly imagine that doing it might make you a lot better. Though most plausibly it will h
... (Read more)
Reply to: Decoupling vs Contextualising Norms
Chris Leong, following John Nerst, distinguishes between two alleged discursive norm-sets. Under "decoupling norms", it is understood that claims should be considered in isolation; under "contextualizing norms", it is understood that those making claims should also address potential implications of those claims in context.
I argue that, at best, this is a false dichotomy that fails to clarify the underlying issues—and at worst (through no fault of Leong or Nerst), the concept of "contextualizing norms" has the potential to legitimize derailing discus
... (Read more)
Yeah, I realized that when reading through, but going back and changing everything feels pointless since you basically get the implicature of what I was trying to say.
Followup to: Possibility and Could-ness
Arthur Schopenhauer (1788-1860) said:
"A man can do as he wills, but not will as he wills."
For this fascinating sentence, I immediately saw two interpretations; and then, after some further thought, two more interpretations.
On the first interpretation, Schopenhauer forbids us to build circular causal models of human psychology. The explanation for someone's current will cannot be their current will - though it can include their past will.
On the second interpretation, the sentence says that alternate choices are not reachable - that we... (Read more)
Let's be careful not to conflate choice with free will. It does seem quite inescapable that there is no room for free will; nevertheless, choice happens all the time. Perhaps the phenomenon of choosing cannot be feasibly examined at the same order of granularity that is required for the examination of free will (just as it is infeasible to examine the function of a house by examining each of its constituent atoms)? Perhaps choice is an emergent phenomenon, not reliant on any underlying freedom of will?
Epistemic status: pretty confident. Based on several years of meditation experience combined with various pieces of Buddhist theory as popularized in various sources, including but not limited to books like The Mind Illuminated, Mastering the Core Teachings of the Buddha, and The Seeing That Frees; also discussions with other people who have practiced meditation, and scatterings of cognitive psychology papers that relate to the topic. The part that I’m the least confident of is the long-term nature of enlightenment; I’m speculating on what comes next based on what I’ve experienced, but have no... (Read more)
I think this post basically succeeds at its goal (given the discussion in the comments), and feels like an important precursor to discussion of some of the directions the LW community has been moving in for the last several years. I think the connection to cognitive fusion was novel to me when I first read it, but immediately clicked in place.
In his AI Safety “Success Stories” post, Wei Dai writes:
[This] comparison table makes Research Assistant seem a particularly attractive scenario to aim for, as a stepping stone to a more definitive success story. Is this conclusion actually justified?
I share Wei Dai's intuition that the Research Assistant path is neglected, and I want to better understand the safety problems involved in this path.
Specifically, I'm envisioning AI research assistants, built without any kind of reinforcement learning, that help AI alignment researchers identify, understand, and solve AI alignment problems. S
... (Read more)
I suspect that the concept of utility functions that are specified over your actions is fuzzy in a problematic way. Does it refer to utility functions that are defined over the physical representation of the computer (e.g. the configuration of atoms in certain RAM memory cells whose values represent the selected action)? If so, we're talking about systems that 'want to affect (some part of) the world', and thus we should expect such systems to have convergent instrumental goals with respect to our world (e.g. taking control over as many resources in
Here’s a pattern I’d like to be able to talk about. It might be known under a certain name somewhere, but if it is, I don’t know it. I call it a Spaghetti Tower. It shows up in large complex systems that are built haphazardly.
Someone or something builds the first Part A.

Later, someone wants to put a second Part B on top of Part A, either out of convenience (a common function, just somewhere to put it) or as a refinement to Part A.

Now, suppose you want to tweak Part A. If you do that, you might break Part B, since it interacts with bits of Part A. So you might instead build Part C on top of t... (Read more)
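To make the pattern concrete, here is a minimal sketch in code. This is my own illustration, not something from the post; the class names and the `_cache` detail are invented for the example. Part B reaches into Part A's internals, so tweaking A breaks B, and the path of least resistance is to pile a Part C on top rather than fix either.

```python
class PartA:
    def __init__(self):
        self.records = []   # the "official" interface
        self._cache = {}    # an internal detail

    def add(self, key, value):
        self.records.append((key, value))
        self._cache[key] = value


class PartB:
    """Built later on top of Part A; for convenience it reads A's
    internal _cache directly instead of going through records."""

    def __init__(self, part_a):
        self.part_a = part_a

    def lookup(self, key):
        return self.part_a._cache[key]   # reaches into A's internals


class PartC:
    """Added later still: rather than tweak A (and risk breaking B),
    we wrap B and patch over its sharp edges."""

    def __init__(self, part_b):
        self.part_b = part_b

    def safe_lookup(self, key, default=None):
        try:
            return self.part_b.lookup(key)
        except KeyError:
            return default


a = PartA()
a.add("x", 1)
c = PartC(PartB(a))
print(c.safe_lookup("x"))  # 1
print(c.safe_lookup("y"))  # None -- any change to A's _cache now breaks two layers
```

At this point nobody can safely change A's internals without auditing B and C, which is exactly the tower growing.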
Endorsed.
What is the good life? Probably one of the most important questions we have to ask. Science is moving closer to answering it. This post is a map of what I believe makes a good life. I'm putting it here for you to critique, to hear your thoughts, and maybe to inspire some useful introspection. You guys are smart people, you reason well, and you don't hold back in your criticisms. I hope you will do the same for this post. Thank you for your time.
This is where science falls short. We can figure out how to get from A to B, but I find it unlikely that science can define a B without axi... (Read more)
Why does it matter that “I am sensitive to others’ needs”? If I’m happy being selfish, that shouldn’t matter.
I want to quickly draw attention to a concept in AI alignment: Robustness to Scale. Briefly, you want your proposal for an AI to be robust (or at least fail gracefully) to changes in its level of capabilities. I discuss three different types of robustness to scale: robustness to scaling up, robustness to scaling down, and robustness to relative scale.
The purpose of this post is to communicate, not to persuade. It may be that we want to bite the bullet of the strongest form of robustness to scale, and build an AGI that is simply not robust to scale, but if we do, we should at least realize that... (Read more)
Robustness to scale is still one of my primary explanations for why MIRI-style alignment research is useful, and why alignment work in general should be front-loaded. I am less sure about this specific post as an introduction to the concept (since I had it before the post, and don't know if anyone got it from this post), but think that the distillation of concepts floating around meatspace into clear reference works is one of the important functions of LW.
Epistemic Status: Simple point, supported by anecdotes and a straightforward model, not yet validated in any rigorous sense I know of, but IMO worth a quick reflection to see if it might be helpful to you.
A curious thing I've noticed: among the friends whose inner monologues I get to hear, the most self-sacrificing ones are frequently worried they are being too selfish, the loudest ones are constantly afraid they are not being heard, the most introverted ones are regularly terrified that they're claiming more than their share of the conversation, the most assertive ones are always su... (Read more)
I think this post is a clear explanation of an important point in system dynamics, connected to personal psychology (and thus individual rationality). It connects to other important concepts, like the tails coming apart and Goodhart, while being distinct (in a way that I think helps clarify them all).
The user cousin_it has pointed out a problem with the counterfactual Oracle idea: the Oracle AIs may form a "bucket chain" bringing back a dangerous message from a future UFAI (unfriendly AI).
This is certainly a problem, and though there are ways of reducing the risk, there don't seem to be any clean solutions to it.
The basic idea is simple. Suppose there is a counterfactual Oracle. It makes a prediction about the value of some variable in two days' time.
However, in one day's time, a UFAI will be unleashed. It will take over everything, includ
... (Read more)
It looks like a reincarnation of the RB idea, now as a chain, not a one-shot game.
If there are many possible UFAIs in the future, they could acausally compete for the O's reward channel, and this would create some noise and may work as a protection.
It also reminds me of the SETI-attack, now in time, not space. Recently I had a random shower thought that if all quantum computers happened to be connected with each other via some form of entanglement, then aliens could infiltrate our quantum computers, as their quantum computers will be connected to such pa... (Read more)
On December 14th Yahoo will shut down Yahoo Groups. Since my communities have mostly moved away from @yahoogroups.com hosting, to Facebook, @googlegroups, and other places, the bit that hit me was that they are deleting all the mailing list archives.
Digital archives of text conversations are close to ideal from the perspective of a historian: unlike in-person or audio-based interaction this naturally leaves a skimmable and easily searchable record. If I want to know, say, what people were thinking about in the early days of GiveWell, their early blog posts (including comments) ... (Read more)
Thanks for providing a clue and example of what to do with all these lovely *.json files after we've captured them. I wouldn't call those archives nice or ideal from a subject finding / rereading standpoint, but at least they work, and it doesn't take a lot of effort! Maybe better archive display strategies will emerge after the Yahoo debacle has been over with for a bit.
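For anyone else staring at a folder of captures, here is a rough sketch of the sort of post-processing that makes them skimmable. This is a sketch only; the JSON field names below are assumptions about the export format, not the actual schema, so adjust them to whatever the files really contain.

```python
import json
from pathlib import Path


def dump_messages(json_dir, out_path):
    """Flatten one-message-per-file JSON captures into a single searchable text file.
    Field names ('subject', 'author', 'date', 'body') are guesses about the export."""
    with open(out_path, "w", encoding="utf-8") as out:
        for path in sorted(Path(json_dir).glob("*.json")):
            with open(path, encoding="utf-8") as f:
                msg = json.load(f)
            out.write(f"Subject: {msg.get('subject', '')}\n")
            out.write(f"From: {msg.get('author', '')}\n")
            out.write(f"Date: {msg.get('date', '')}\n\n")
            out.write(str(msg.get("body", "")) + "\n")
            out.write("-" * 70 + "\n")


dump_messages("captured-json", "archive.txt")
```

A single flat text file is crude, but it's grep-able, which already beats clicking through a dead web interface.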
Exactly two years to the day I started writing this post I published Map and Territory's most popular post of all time, "Doxa, Episteme, and Gnosis" (also here on LW). In that post I describe a distinction ancient Greek made between three kinds of knowledge we might translate as hearsay, justified belief, and direct experience, respectively, although if I'm being totally honest I'm nowhere close to being a classics scholar so I probably drew a distinction between the three askew to the one ancient Attic Greeks would have made. Historical accuracy aside, the distinction... (Read more)
Everything that is not a literal quote from the previous post is new.
Epistemic status: trying to vaguely gesture at vague intuitions. A similar idea was explored here under the heading "the intelligibility of intelligence", although I hadn't seen it before writing this post.
There’s a mindset which is common in the rationalist community, which I call “realism about rationality” (the name being intended as a parallel to moral realism). I feel like my skepticism about agent foundations research is closely tied to my skepticism about this mindset, and so in this essay I try to articulate what it is.
Humans ascribe properties to entities in the world ... (Read more)
I think that rationality realism, Bayesianism, and rationality anti-realism relate to one another as theism, Christianity, and atheism do. Just like it’s feasible and natural to write a post advocating and mainly talking about atheism, despite that position being based on default skepticism and in some sense defined by theism, I think it would be feasible and natural to write a post titled 'rationality anti-realism' that focused on that proposition and described why it was true.
Why does our skin form wrinkles as we age?
This post will outline the answer in a few steps:
In the process, we’ll draw on sources from three different fields: mechanical engineering, animation, and physiology.
Imagine we have a material with two layers:
We squeeze this material from the sides, so the whole thi... (Read more)
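For reference, if the setup is the usual one of a stiff thin layer bonded to a much softer, thicker layer and compressed from the sides, the standard thin-film buckling result gives a preferred wrinkle wavelength. This is the textbook formula, not a quote from the post, and it assumes that stiff-film-on-soft-substrate setup:

$$\lambda \approx 2\pi h \left(\frac{\bar{E}_f}{3\,\bar{E}_s}\right)^{1/3}, \qquad \bar{E} = \frac{E}{1-\nu^2},$$

where $h$ is the thickness of the stiff top layer, $\bar{E}_f$ and $\bar{E}_s$ are the plane-strain moduli of the film and substrate, and $\nu$ is Poisson's ratio. A stiffer film or a softer substrate means longer, broader wrinkles.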
The best source I've seen on the topic is this Physiological Reviews article (I've read other sources as well, but didn't keep around links for most of them).
Reversibility is specifically addressed - age-related muscle loss (aka sarcopenia) is not really reversible. There are things people can do at any age to add muscle (e.g. exercise), but muscle is lost if exercise/diet/etc is held constant. Masters athletes are a visible example of this.
Also, it's not just skeletal muscle. For example, the pupil muscle squeezes the lens of the eye t... (Read more)
The following was a presentation I made for Sören Elverlin's AI Safety Reading Group. I decided to draw everything by hand because PowerPoint is boring. Thanks to Ben Pace for formatting it for LW! See also the IAF post detailing the research which this presentation is based on.
Abram's writing and illustrations often distill technical insights into accessible, fun adventures. I've come to appreciate the importance and value of this expository style more and more over the last year, and this post is what first put me on this track. While more rigorous communication certainly has its place, clearly communicating the key conceptual insights behind a piece of work makes those insights available to the entire community.
This is an ultra-condensed version of the research agenda on synthesising human preferences:
In order to infer what a human wants from what they do, an AI needs to have a human theory of mind.
Theory of mind is something that humans have instinctively and subconsciously, but that isn't easy to spell out explicitly; therefore, by Moravec's paradox, it will be very hard to implant it into an AI, and this needs to be done deliberately.
One way of defining theory of mind is to look at how humans internally model the value of various hypothetical actions and events (happening to themselves and to othe
... (Read more)
Having printed and read the full version, I found this ultra-simplified version a useful summary.
Happy to read a (not-so-)simplified version (like 20-30 paragraphs).
NB: Originally published on Map and Territory on Medium. This is an old post originally published on 2016-09-10. It was never previously cross-posted or linked on LessWrong, so I'm adding it now for posterity. It's old enough that I can no longer confidently endorse it, and I won't bother trying to defend it if you find something wrong, but it might still be interesting.
I find Kegan’s model of psychological development extremely useful. Some folks I know disagree on various grounds. These are some accumulated responses to critiques I’ve encountered.
Before we dive i... (Read more)
So this is "replies to some objections to Kegan", and not "positive reasons to think Kegan is useful", yes? That is, it won't convince anyone that they should pay attention to Kegan, but if they've been previously convinced that they shouldn't, it might change their mind.
I confess I had been hoping for the positive reasons, but if it's not what you were going for, then fair enough.
... (Read more)
Basically everyone trusts experts in math because it’s far from lived experience and we agree that mathematicians are the math experts even though we can only easily validate t
This is Part X of the Specificity Sequence
Cats notoriously get stuck in trees because their claws are better at climbing up than down. Throughout this sequence, we’ve seen how humans are similar: We get stuck in high-level abstractions because our brains struggle to unpack them into specifics. Our brains are better at climbing up (concrete→abstract) than down (abstract→concrete).

If you’ve ever struggled to draw a decent picture, you know what being stuck at a high level of abstraction feels like in the domain of visual processing. I know I do. My drawing skills are ... (Read more)
Rapid Viz by Kurt Hanks and How to Draw by Scott Robertson are excellent primers.
Last fall we had solar panels installed. Our roof is pretty marginal for solar, with large parts blocked by trees and the remainder mostly facing West, but incentives were high enough that it looked decent. And even if it only broke even I still liked it for the resiliency advantages. It's currently doing slightly better than expected, so I'm happy!
We have fourteen LG Neon-R 360 watt panels, three facing South and eleven facing West:
The installers predicted we would see 3.45MWh in the first year, slowly ramping down to 3.25MWh/y over the next twenty-five years as the panels degr... (Read more)
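As a quick sanity check on those figures (my own back-of-the-envelope arithmetic, not from the post), the prediction implies roughly a quarter of a percent of output lost per year:

```python
# Implied annual degradation rate from the installers' 25-year forecast.
year1 = 3.45    # MWh predicted in the first year
year25 = 3.25   # MWh/y predicted after 25 years

annual_loss = 1 - (year25 / year1) ** (1 / 25)
print(f"{annual_loss:.2%} per year")  # ~0.24% per year
```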
That was actually the original plan, but we decided that this process was complicated enough (at least as a first attempt), with the (relatively) narrow target of 2018.
My guess is that in future years, once this process has gelled into something everyone understands and the kinks are ironed out, it will include some kind of mechanism for including older works.