Recent Discussion

The LessWrong 2018 Review
83 · 2d · 6 min read

If you have 1000+ karma, you have until Dec 1st to nominate LessWrong posts from 2018 (yes, 2018, not 2019) for the first LessWrong Review. The nomination button is available from a post's dropdown menu. 

Multiple nominations are helpful – posts with enough nominations will proceed to a review phase (ending December 31st), followed by a week of voting. Details below.

The LW team will be compiling the best posts and reviews into a physical book, awarding $2000 divided among top posts and (up to) $2000 divided among top reviews.

Helpful Links:

... (Read more)
2orthonormal9m What do you think about doing this for 2017 and years previous?

That was actually the original plan, but we decided that this process was complicated enough (at least as a first attempt), with the (relatively) narrow target of 2018.

My guess is that in future years, once this process has gelled into something everyone understands and the kinks are ironed out, it will include some kind of mechanism for including older works.

3Raemon14m A lot of the details are up in the air – over the next week I plan to write out a lot of my thoughts and open questions about the review process, and how it should feed into the overall end product.

One option is to include a curated selection of comments from the post. Another is to sort of leave that up to reviewers, to distill those comments down into a more succinct encapsulation of them. In some cases it might be that the commenters "got it right the first time", and basically wrote a fine "review-like comment" back in 2018, and there should be some way of marking an old comment as a review, retroactively. A middle ground might be something like "in addition to summarizing key points from the previous discussion, reviewers can point to particular comments that seem worth including".

In the end, the editors will make some judgment calls about how much fits – we definitely wouldn't include the entire comment section of Circling. My guess is that the upper bound of "amount of comments and/or reviews from a given post to include" is roughly the same as "the upper bound for a post." (In some cases posts are quite long, but maybe expect the median comments/reviews length to be comparable to the median post length.)
3Raemon17m Thanks!

Suppose I wanted to get good intuitions about how the world works on historical timescales.

I could study history, but just reading history is rife with historical hindsight bias, both on my own part, and even worse, on the part of the authors I'm reading.

So if I wanted to master history, a better way would be to do it forecasting-style. I read what was happening in some part of the world, up to a particular point in time, and then make bets about what will happen next. This way, I have feedback as I'm learning, and I'm training an actual historical predictor.
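
To make that feedback loop concrete, here is a minimal sketch of how such a study protocol could be scored. This is my own illustration rather than anything from the post; the question format, field names, and the use of a Brier score are all assumptions.

```python
# Hypothetical sketch of the forecasting-style study loop described above:
# read sources up to a cutoff date, register a probability before looking at
# what actually happened, then score the bet so feedback arrives immediately.
from dataclasses import dataclass

@dataclass
class HistoricalQuestion:
    prompt: str        # e.g. "Does X happen within five years of the cutoff?"
    cutoff_year: int   # only consult sources written before this year
    outcome: bool      # what actually happened (looked up only after betting)

def brier(probability: float, outcome: bool) -> float:
    """Squared error of a probabilistic bet; 0 is perfect, 0.25 is chance level for 50/50."""
    return (probability - float(outcome)) ** 2

def study_session(questions, make_forecast) -> float:
    """make_forecast(question) must use only pre-cutoff information."""
    scores = []
    for question in questions:
        p = make_forecast(question)                 # bet first...
        scores.append(brier(p, question.outcome))   # ...then get feedback
    return sum(scores) / len(scores)

# A maximally uncertain forecaster (always 0.5) scores 0.25 on every question.
demo = [HistoricalQuestion("placeholder question", 1790, True)]
print(study_session(demo, lambda q: 0.5))  # -> 0.25
```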

However, this re... (Read more)

I think this is a challenging and non-trivial question, which I've considered before, but I'm less pessimistic than some other commenters.

I think what we really should do is to fund someone to research and build a rigorous training set along these lines, using some kind of bias-avoiding methodology (e.g. clever pre-registration, systematic protocols for what data to include, etc.).

I find it conceivable but very implausible that doing this will make you worse, and can certainly imagine that doing it might make you a lot better. Though most plausibly it will h

... (Read more)
1Liam Donovan9h What about reading modern history textbooks written at a particular time? I'm not sure how long textbooks like that have existed, but it seems like a good way to get secondary data about a specified time frame without data snooping.
2quanticle15h They're what the OP is looking to forecast, though. I pulled the "money in circulation" example straight from the OP's post.
2elityre13h Well, I would be happy with whatever I can get. I'm not attached to those particular metrics.

Reply to: Decoupling vs Contextualising Norms

Chris Leong, following John Nerst, distinguishes between two alleged discursive norm-sets. Under "decoupling norms", it is understood that claims should be considered in isolation; under "contextualizing norms", it is understood that those making claims should also address potential implications of those claims in context.

I argue that, at best, this is a false dichotomy that fails to clarify the underlying issues—and at worst (through no fault of Leong or Nerst), the concept of "contextualizing norms" has the potential to legitimize derailing discus

... (Read more)
4Zack_M_Davis4h Yes, it does. "Contextualizers" think that the statement "Green-eyed people commit twice as many murders" creates an implicature that "... therefore green-eyed people should be stereotyped as criminals" that needs to be explicitly canceled with a disclaimer, which is an instance of the more general cognitive process by which most people think that "The washing machine is by the stairs" creates an implicature of "... and the machine works" that, if it's not true, needs to be explicitly canceled with a disclaimer ("... but it's broken"). "Decouplers" don't think the statement about murder rates creates an implicature about stereotyping.
2mr-hire3h I don't think it's necessarily about implacature. It's often about being taken "out of context" or used as a justification. That is, I may not think "green eyed people commit twice as many murders" implies anything about stereotyping, but I still think it may lead to more stereotyping due to motivated reasoning. It's much more about consequences rather than implications.

Edit: To expand on this: There are several types of context:

  • The context in which it is said (Implacature)
  • The context about the state of mind/biases which the other person is in when hearing it (Inference)
  • The context in which what was said may be used (culture)

Contextualizing vs. decoupling seem much more about the latter two to me – no one is arguing that you shouldn't be clear in your speech. The question is how much you should take into account other people and culture. That is, decouplers often decouple from consequences and focus merely on implacature, whereas contextualizers try and contextualize how what they say will be interpreted and used.
2Said Achmiz2h Meta: it’s implicature. The second vowel is an i.

Yeah I realized that when reading through but going back and changing everything feels pointless since you basically get the implicature of what I was trying to say.

Will As Thou Wilt
4 · 11y · 1 min read

Followup to: Possibility and Could-ness

Arthur Schopenhauer (1788-1860) said:

"A man can do as he wills, but not will as he wills."

For this fascinating sentence, I immediately saw two interpretations; and then, after some further thought, two more interpretations.

On the first interpretation, Schopenhauer forbids us to build circular causal models of human psychology.  The explanation for someone's current will cannot be their current will - though it can include their past will.

On the second interpretation, the sentence says that alternate choices are not reachable - that we... (Read more)

Let's be careful not to conflate choice with free will. It does seem quite inescapable that there is no room for free will; nevertheless, choice happens all the time. Perhaps the phenomenon of choosing cannot be feasibly examined at the same order of granularity that is required for the examination of free will (just as it is infeasible to examine the function of a house by examining each of its constituent atoms)? Perhaps choice is an emergent phenomenon, not reliant on any underlying freedom of will?

Epistemic status: pretty confident. Based on several years of meditation experience combined with various pieces of Buddhist theory as popularized in various sources, including but not limited to books like The Mind Illuminated, Mastering the Core Teachings of the Buddha, and The Seeing That Frees; also discussions with other people who have practiced meditation, and scatterings of cognitive psychology papers that relate to the topic. The part that I’m the least confident of is the long-term nature of enlightenment; I’m speculating on what comes next based on what I’ve experienced, but have no... (Read more)

Nomination for 2018 Review

I think this post basically succeeds at its goal (given the discussion in the comments), and feels like an important precursor to discussion of some of the directions the LW community has been moving in for the last several years. I think the connection to cognitive fusion was novel to me when I first read it, but immediately clicked in place.

In his AI Safety “Success Stories” post, Wei Dai writes:

[This] comparison table makes Research Assistant seem a particularly attractive scenario to aim for, as a stepping stone to a more definitive success story. Is this conclusion actually justified?

I share Wei Dai's intuition that the Research Assistant path is neglected, and I want to better understand the safety problems involved in this path.

Specifically, I'm envisioning AI research assistants, built without any kind of reinforcement learning, that help AI alignment researchers identify, understand, and solve AI alignment problems. S

... (Read more)
3ofer9h If "unintended optimization" referrers only to the inner alignment problem [https://www.lesswrong.com/posts/FkgsxrGf3QxhfLWHG/risks-from-learned-optimization-introduction] , then there's also the malign prior [https://ordinaryideas.wordpress.com/2016/11/30/what-does-the-universal-prior-actually-look-like/] problem.
2John_Maxwell17h Well, the reason I mentioned the "utility function over different states of matter" thing is because if your utility function isn't specified over states of matter, but is instead specified over your actions (e.g. behave in a way that's as corrigible as possible), you don't necessarily get instrumental convergence. "Unintended optimization. First, the possibility of mesa-optimization means that an advanced ML system could end up implementing a powerful optimization procedure even if its programmers never intended it to do so." - Source [https://www.lesswrong.com/posts/FkgsxrGf3QxhfLWHG/risks-from-learned-optimization-introduction] . "Daemon" is an older term. My impression is that early thinking about Oracles wasn't really informed by how (un)supervised systems actually work, and the intellectual momentum from that early thinking has carried to the present, even though there's no real reason to believe these early "Oracle" models are an accurate description of current or future (un)supervised learning systems.
1ofer9h I suspect that the concept of utility functions that are specified over your actions is fuzzy in a problematic way. Does it refer to utility functions that are defined over the physical representation of the computer (e.g. the configuration of atoms in certain RAM memory cells whose values represent the selected action)? If so, we're talking about systems that 'want to affect (some part of) the world', and thus we should expect such systems to have convergent instrumental goals with respect to our world (e.g. taking control over as many resources in our world as possible).

It seems possible that something like this has happened. Though as far as I know, we don't currently know how to model contemporary supervised learning at an arbitrarily large scale in complicated domains. How do you model the behavior of the model on examples outside the training set? If your answer contains the phrase "training distribution" then how do you define the training distribution? What makes the training distribution you have in mind special relative to all the other training distributions that could have produced the particular training set that you trained your model on?

Therefore, I'm sympathetic to the following perspective, from Armstrong and O'Rourke (2018 [https://arxiv.org/abs/1711.05541]) (the last sentence was also quoted in the grandparent):

[Link] Spaghetti Towers
96 · 1y · 2 min read

Here’s a pattern I’d like to be able to talk about. It might be known under a certain name somewhere, but if it is, I don’t know it. I call it a Spaghetti Tower. It shows up in large complex systems that are built haphazardly.

Someone or something builds the first Part A.

Later, someone wants to put a second Part B on top of Part A, either out of convenience (a common function, just somewhere to put it) or as a refinement to Part A.

Now, suppose you want to tweak Part A. If you do that, you might break Part B, since it interacts with bits of Part A. So you might instead build Part C on top of t... (Read more)
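
A toy illustration of the pattern in code form (my own addition, not from the post; the parts and their dependency are invented for the example):

```python
# Spaghetti tower in miniature: Part B quietly depends on an incidental detail
# of Part A, so a later "fix" gets built on top (Part C) instead of changing A.

def part_a():
    # Original behaviour; the trailing space in the name is an accident.
    return {"name": "widget ", "count": 3}

def part_b(record):
    # Bolted on later; it only produces "widget assembly" because of that
    # accidental trailing space in part_a's output.
    return record["name"] + "assembly"

def part_c(record):
    # We now want clean names, but stripping the space inside part_a would
    # break part_b, so a new layer is added on top and the tower grows.
    return {**record, "clean_name": record["name"].strip()}

print(part_b(part_a()))  # works, by accident
print(part_c(part_a()))  # a workaround layer rather than a fix to part_a
```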

Endorsed.

What is the good life? Probably one of the most important questions we have to ask. Science is moving closer to answering it. This post is a map of what I believe makes a good life. I'm putting it here for you to critique, so I can hear your thoughts, and maybe it will inspire some useful introspection. You guys are smart people, you reason well, and you don't hold back in your criticisms. I hope you will do the same for this post. Thank you for your time.

The Terminal Goal

This is where science falls short. We can figure out how to get from A to B, but I find it unlikely that science can define a B without axi... (Read more)

Why does it matter that “i am sensitive to others’ needs”? If I’m happy being selfish, that shouldn’t matter.

3cousin_it5h To me, having relationships with people is less about sensitivity, and more about having friends and family who have my back and I have theirs, even if they're insensitive assholes and so am I. Even though that's an unpopular view today.
1ryqiem5h I read your comment as "I don't mind that my emotions are sometimes not considered, as long as I can depend on my friends and family". I'd argue that that's sensitivity to your needs as well – they satisfy what matters most to you :-) Does that make sense?
Robustness to ScaleΩ
166 · 2y · 1 min read · Ω 7

I want to quickly draw attention to a concept in AI alignment: Robustness to Scale. Briefly, you want your proposal for an AI to be robust (or at least fail gracefully) to changes in its level of capabilities. I discuss three different types of robustness to scale: robustness to scaling up, robustness to scaling down, and robustness to relative scale.

The purpose of this post is to communicate, not to persuade. It may be that we want to bite the bullet of the strongest form of robustness to scale, and build an AGI that is simply not robust to scale, but if we do, we should at least realize that... (Read more)

Nomination for 2018 Review

Robustness to scale is still one of my primary explanations for why MIRI-style alignment research is useful, and why alignment work in general should be front-loaded. I am less sure about this specific post as an introduction to the concept (since I had it before the post, and don't know if anyone got it from this post), but think that the distillation of concepts floating around meatspace into clear reference works is one of the important functions of LW.

Epistemic Status: Simple point, supported by anecdotes and a straightforward model, not yet validated in any rigorous sense I know of, but IMO worth a quick reflection to see if it might be helpful to you.

A curious thing I've noticed: among the friends whose inner monologues I get to hear, the most self-sacrificing ones are frequently worried they are being too selfish, the loudest ones are constantly afraid they are not being heard, the most introverted ones are regularly terrified that they're claiming more than their share of the conversation, the most assertive ones are always su... (Read more)

Nomination for 2018 Review

I think this post is a clear explanation of an important point in system dynamics, connected to personal psychology (and thus individual rationality). It connects to other important concepts, like the tails coming apart and Goodhart, while being distinct (in a way that I think helps clarify them all).

The user cousin_it has pointed out a problem with the counterfactual Oracle idea: the Oracle AIs may form a "bucket chain" bringing back a dangerous message from a future UFAI (unfriendly AI).

This is certainly a problem, and though there are ways of reducing the risk, there don't seem to be any clean solutions to it.

The bucket chain

Beginning the chain

The basic idea is simple. Suppose there is a counterfactual Oracle, O. It makes a prediction about the value of some variable in two days' time.

However, in one day's time, a UFAI will be unleashed. It will take over everything, includ

... (Read more)

It looks like a reincarnation of the RB idea, now as a chain, not a one-shot game.

If there are many possible UFAIs in the future, they could acausally compete for the O's reward channel, and this would create some noise and may work as a protection.

It also reminds me of the SETI-attack, now in time, not space. Recently I had a random shower thought that if all quantum computers happened to be connected with each other via some form of entanglement, then aliens could infiltrate our quantum computers, as their quantum computers would be connected to such pa... (Read more)

1Gurkenglas4h Let us always be looking forward a month, and let there be a UFAI that arises with 1% probability in that time frame. Assume that we can correctly incentivize the Oracle to compute the probability distribution over the messages sent back in case of erasure. Naively sampling the distribution dooms us with 1% probability. If we plan to send back only whether some research direction pays off within a month, then by asking whether "yes" has more than 20% chance, depending on the answer, the "yes" chance conditional on no UFAI is either more than 19% or at most 21%, and this query gave the UFAI a 1% chance of 1 bit of optimization power, at most doubling the expected chance of UFAI for the next query. If you want to spread out this optimization power to bound the worst case, I reiterate that differential privacy theory seems applicable here [https://www.lesswrong.com/posts/cSzaxcmeYW6z7cgtc/contest-usd1-000-for-good-questions-to-ask-to-an-oracle-ai#ezpLgziqzEv3QrD4c] .
2Stuart_Armstrong5h Yep. If they do acausal trade with each other.
3cousin_it5h It seems that coordinated erasure, chokepoints and short horizons can help with this problem as well. But if many companies or governments have their own oracles and benefit from longer horizons, it gets harder.
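
To spell out the arithmetic behind the 19%/21% figures in Gurkenglas's comment above, here is a small numeric check. It is my own addition, using only the comment's stated assumption of a 1% UFAI probability.

```python
# The observed probability p that the Oracle assigns to "yes" mixes two branches:
#     p = 0.99 * q + 0.01 * u
# where q = P(yes | no UFAI) and u in [0, 1] is whatever the UFAI branch reports.
# Solving for q brackets it to within about one percentage point of p.
p_ufai = 0.01

def benign_probability_bounds(p_observed: float):
    q_low = (p_observed - p_ufai) / (1 - p_ufai)   # if the UFAI branch reported u = 1
    q_high = p_observed / (1 - p_ufai)             # if the UFAI branch reported u = 0
    return max(q_low, 0.0), min(q_high, 1.0)

# If the answer to "is P(yes) > 20%?" is yes, then p > 0.20, so q > ~0.192 (more than 19%).
# If the answer is no, then p <= 0.20, so q <= ~0.202 (at most 21%).
print(benign_probability_bounds(0.20))  # -> (0.1919..., 0.2020...)
```
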
Archiving Yahoo Groups
44 · 1mo · 2 min read

On December 14th Yahoo will shut down Yahoo Groups. Since my communities have mostly moved away from @yahoogroups.com hosting, to Facebook, @googlegroups, and other places, the bit that hit me was that they are deleting all the mailing list archives.

Digital archives of text conversations are close to ideal from the perspective of a historian: unlike in-person or audio-based interaction this naturally leaves a skimmable and easily searchable record. If I want to know, say, what people were thinking about in the early days of GiveWell, their early blog posts (including comments) ... (Read more)

Thanks for providing a clue and example of what to do with all these lovely *.json files after we've captured them. I wouldn't call those archives nice or ideal from a subject finding / rereading standpoint, but at least they work, and it doesn't take a lot of effort! Maybe better archive display strategies will emerge after the Yahoo debacle has been over with for a bit.

Exactly two years to the day I started writing this post I published Map and Territory's most popular post of all time, "Doxa, Episteme, and Gnosis" (also here on LW). In that post I describe a distinction ancient Greek made between three kinds of knowledge we might translate as hearsay, justified belief, and direct experience, respectively, although if I'm being totally honest I'm nowhere close to being a classics scholar so I probably drew a distinction between the three askew to the one ancient Attic Greeks would have made. Historical accuracy aside, the distinction... (Read more)

3Raemon17h I think I bounced off this at first because I wasn't sure how much of this was new vs a bit more polished. Could you briefly summarize/table-of-contents the "diff" between this and last post?

Everything that is not a literal quote from the previous post is new.

[Link] Realism about rationality
156 · 1y · 4 min read

Epistemic status: trying to vaguely gesture at vague intuitions. A similar idea was explored here under the heading "the intelligibility of intelligence", although I hadn't seen it before writing this post.

There’s a mindset which is common in the rationalist community, which I call “realism about rationality” (the name being intended as a parallel to moral realism). I feel like my skepticism about agent foundations research is closely tied to my skepticism about this mindset, and so in this essay I try to articulate what it is.

Humans ascribe properties to entities in the world ... (Read more)

4DanielFilan15h Note that the linked technical report [https://intelligence.org/files/HowIntelligible.pdf] by Salamon, Rayhawk, and Kramar does a good job at looking at evidence for and against 'rationality realism', or as they call it, 'the intelligibility of intelligence'.
4DanielFilan15h I do think that it was an interesting choice for the post to be about 'realism about rationality' rather than its converse, which the author seems to subscribe to. This probably can be chalked up to it being easier to clearly see a thinking pattern that you don't frequently use, I guess?
3ricraz9h I think in general, if there's a belief system B that some people have, then it's much easier and more useful to describe B than ~B. It's pretty clear if, say, B = Christianity, or B = Newtonian physics. I think of rationality anti-realism less as a specific hypothesis about intelligence, and more as a default skepticism: why should intelligence be formalisable? Most things aren't! (I agree that if you think most things are formalisable, so that realism about rationality should be our default hypothesis, then phrasing it this way around might seem a little weird. But the version of realism about rationality that people buy into around here also depends on some of the formalisms that we've actually come up with being useful, which is a much more specific hypothesis, making skepticism again the default position.)

I think that rationality realism is to Bayesianism is to rationality anti-realism as theism is to Christianity is to atheism. Just like it's feasible and natural to write a post advocating and mainly talking about atheism, despite that position being based on default skepticism and in some sense defined by theism, I think it would be feasible and natural to write a post titled 'rationality anti-realism' that focussed on that proposition and described why it was true.

Wrinkles
59 · 3d · 3 min read

Why does our skin form wrinkles as we age?

This post will outline the answer in a few steps:

  • Under what conditions do materials form wrinkles, in general?
  • How does the general theory of wrinkles apply to aging human skin?
  • What underlying factors drive the physiological changes which result in wrinkles?

In the process, we’ll draw on sources from three different fields: mechanical engineering, animation, and physiology.

Why do Materials Wrinkle?

Imagine we have a material with two layers:

  • A thin, stiff top layer
  • A thick, elastic bottom layer

We squeeze this material from the sides, so the whole thi... (Read more)
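
For readers who want a number to attach to this setup: a standard thin-film result gives the wrinkle wavelength of a stiff layer on a compliant substrate as roughly 2πt(E_top / 3E_bottom)^(1/3). The sketch below is my addition, not from the post, and the example values are placeholders rather than measured skin properties.

```python
# Classical buckling estimate for a stiff thin top layer on a soft thick bottom layer:
#     wavelength = 2 * pi * t * (E_top / (3 * E_bottom)) ** (1/3)
# where t is the top-layer thickness and E denotes (plane-strain) stiffness.
import math

def wrinkle_wavelength(thickness_m: float, e_top_pa: float, e_bottom_pa: float) -> float:
    return 2 * math.pi * thickness_m * (e_top_pa / (3 * e_bottom_pa)) ** (1 / 3)

# Placeholder example: a 20-micron stiff layer about 1000x stiffer than the layer below.
print(wrinkle_wavelength(20e-6, 1e9, 1e6))  # ~8.7e-4 m, i.e. wrinkle crests ~0.9 mm apart
```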

1leggi13h Interesting. Have you anything to back that statement up? I don't think describing muscle loss as a major hallmark of ageing is accurate. Muscle loss does occur as we age - likely due to lack of use, poor diet (and lower testosterone in men) since the effect can be reversed.

The best source I've seen on the topic is this Physiological Reviews article (I've read other sources as well, but didn't keep around links for most of them).

Reversibility is specifically addressed - age-related muscle loss (aka sarcopenia) is not really reversible. There are things people can do at any age to add muscle (e.g. exercise), but muscle is lost if exercise/diet/etc is held constant. Masters athletes are a visible example of this.

Also, it's not just skeletal muscle. For example, the pupil muscle squeezes the lens of the eye t... (Read more)

12leggi13h I can only go on the words that are used so I'm not sure, but I didn't consider "kill" might be being used metaphorically on a rationality website. Especially when the poster then describes it as an educated guess based on a model that lets us make "darn good educated guesses" and admits they did not even type 'how does botox work' into a search engine. Kudos to the OP for the post but we're moving into bio-medical sciences (my background) so there are "facts" out there. Is it not better to use the knowledge that is available than making guesses?

The following was a presentation I made for Sören Elverlin's AI Safety Reading Group. I decided to draw everything by hand because powerpoint is boring. Thanks to Ben Pace for formatting it for LW! See also the IAF post detailing the research which this presentation is based on.

Nomination for 2018 Review

Abram's writing and illustrations often distill technical insights into accessible, fun adventures. I've come to appreciate the importance and value of this expository style more and more over the last year, and this post is what first put me on this track. While more rigorous communication certainly has its place, clearly communicating the key conceptual insights behind a piece of work makes those insights available to the entire community.

This is an ultra-condensed version of the research agenda on synthesising human preferences:

In order to infer what a human wants from what they do, an AI needs to have a human theory of mind.

Theory of mind is something that humans have instinctively and subconsciously, but that isn't easy to spell out explicitly; therefore, by Moravec's paradox, it will be very hard to implant it into an AI, and this needs to be done deliberately.

One way of defining theory of mind is to look at how humans internally model the value of various hypothetical actions and events (happening to themselves and to othe

... (Read more)

Having printed and read the full version, I found this ultra-simplified version a useful summary.

Happy to read a (not-so-)simplified version (like 20-30 paragraphs).

3avturchin5h Maybe we could try to take the theory of mind out of the brackets? In that case, the following type of claims will be meaningful: "For the theory of mind T1, a human being H has the set of preferences P1, and for another theory of mind T2 he has P2". Now we could compare P1 and P2 and if we find some invariants, they could be used as more robust representations of the preferences.
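
A crude sketch of what comparing P1 and P2 and keeping the invariants could look like in practice (my illustration; the theory-of-mind labels and preference items are made up):

```python
# "Taking the theory of mind out of the brackets": infer a preference set under each
# candidate theory of mind, then keep only the preferences that survive across all of them.
preferences_by_theory = {
    "T1": {"values_health", "values_leisure", "avoids_pain"},
    "T2": {"values_health", "values_status", "avoids_pain"},
}

invariant_preferences = set.intersection(*preferences_by_theory.values())
print(invariant_preferences)  # e.g. {'values_health', 'avoids_pain'}: robust across T1 and T2
```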

NB: Originally published on Map and Territory on Medium. This is an old post originally published on 2016-09-10. It was never previously cross-posted or linked on LessWrong, so I'm adding it now for posterity. It's old enough that I can no longer confidently endorse it, and I won't bother trying to defend it if you find something wrong, but it might still be interesting.

I find Kegan’s model of psychological development extremely useful. Some folks I know disagree on various grounds. These are some accumulated responses to critiques I’ve encountered.

Before we dive i... (Read more)

So this is "replies to some objections to Kegan", and not "positive reasons to think Kegan is useful", yes? That is, it won't convince anyone that they should pay attention to Kegan, but if they've been previously convinced that they shouldn't, it might change their mind.

I confess I had been hoping for the positive reasons, but if it's not what you were going for, then fair enough.

Basically everyone trusts experts in math because it’s far from lived experience and we agree that mathematicians are the math experts even though we can only easily validate t

... (Read more)
The Power to Draw Better
37 · 5d · 2 min read

This is Part X of the Specificity Sequence

Cats notoriously get stuck in trees because their claws are better at climbing up than down. Throughout this sequence, we’ve seen how humans are similar: We get stuck in high-level abstractions because our brains struggle to unpack them into specifics. Our brains are better at climbing up (concrete→abstract) than down (abstract→concrete).

If you’ve ever struggled to draw a decent picture, you know what being stuck at a high level of abstraction feels like in the domain of visual processing. I know I do. My drawing skills are ... (Read more)

Rapid Viz by Kurt Hanks and How to Draw by Scott Robertson are excellent primers.

Solar One Year In
11 · 6h · 1 min read

Last fall we had solar panels installed. Our roof is pretty marginal for solar, with large parts blocked by trees and the remainder mostly facing West, but incentives were high enough that it looked decent. And even if it only broke even I still liked it for the resiliency advantages. It's currently doing slightly better than expected, so I'm happy!

We have fourteen LG Neon-R 360 watt panels, three facing South and eleven facing West:

The installers predicted we would see 3.45MWh in the first year, slowly ramping down to 3.25MWh/y over the next twenty five years as the panels degr... (Read more)
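
As a rough check of what the installers' numbers imply (my own arithmetic, assuming a straight-line decline between the two quoted figures):

```python
# 3.45 MWh in year 1 falling to 3.25 MWh/y by year 25 is a drop of 0.2 MWh spread over
# 24 year-to-year steps: roughly 0.24% of first-year output per year, and on the order
# of 84 MWh of cumulative production over the 25 years.
first_year_mwh = 3.45
year_25_mwh = 3.25

annual_drop = (first_year_mwh - year_25_mwh) / 24
percent_per_year = annual_drop / first_year_mwh * 100
cumulative_mwh = sum(first_year_mwh - annual_drop * year for year in range(25))

print(f"{annual_drop:.4f} MWh/yr drop, {percent_per_year:.2f} %/yr, {cumulative_mwh:.1f} MWh total")
```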
