All Posts

Tuesday, January 21st 2020

Shortform [Beta]
10 ozziegooen 15h Communication should be judged for expected value, not intention (by consequentialists) TLDR: When trying to understand the value of information, understanding the public interpretations of that information could matter more than understanding the author's intent. When trying to understand the information for other purposes (like reading a math paper to understand math), this does not apply. If I were to scream "FIRE!" in a crowded theater, it could cause a lot of damage, even if my intention were completely unrelated. Perhaps I was responding to a devious friend who asked, "Would you like more popcorn? If yes, shout 'FIRE!'". Not all speech is protected by the First Amendment, in part because speech can be used for expected harm. One common defense of incorrect predictions is for their authors to claim that the public's interpretations weren't what they intended. "When I said that the US would fall if X were elected, I didn't mean it would literally end. I meant more that..." These kinds of statements were discussed at length in Expert Political Judgment. But this defense rests on the idea that communicators should be judged on intention, rather than expected outcomes. In those cases, it was often clear that many people interpreted these "experts" as making fairly specific claims that were later rejected by their authors. I'm sure that much of this could have been predicted. The "experts" often didn't seem to go out of their way to make their after-the-outcome interpretations clear before the outcome. I think it's clear that the intention-interpretation distinction is considered highly important by a lot of people, so much so that they argue that interpretations, even predictable ones, are less significant in decision making around speech acts than intentions. I.e., "The important thing is to say what you truly feel, don't worry about how it will be understood." But for a consequentialist, this distinction isn't particularly relevant. 
Speech acts are judged on
3 Raemon 5h I'm not sure which of these posts is a subset of the other: * The Backbone Bottleneck * The Leadership Bottleneck
2 George 1d I wonder why people don't protect themselves from memes more. Just to be clear, I mean meme in the broad memetic theory of spreading ideas/thoughts sense. I think there's almost an intuitive understanding, or at least one existed in the environment I was brought up in, that some ideas are virulent and useless. I think that from this it's rather easy to conclude that those ideas are harmful, since you only have space for so many ideas, so holding useless ideas is harmful in the sense that it eats away at a valuable resource (your mind). I think modern viral ideas also tend more and more towards the toxic side, toxic in the very literal sense of "designed to invoke a rise in cortisol and/or dopamine that makes them more engaging yet is arguably provably harmful to the human body". Though I think this is a point I don't trust that much, speculation at best. It's rather hard to figure out what memes one should protect themselves from under these conditions. Some good heuristics I've come up with are: * 1. Memes that are new and seem to be embedded in the minds of many people, yet don't seem to increase their performance on any metric you care about. (e.g. wealth, lifespan, happiness) * 2. Memes that are old and seem to be embedded in the minds of many people, yet seem to decrease their performance on any metric you care about. * 3. Memes that are being recommended to you in an automated fashion by a capable algorithm you don't understand fully. I think if a meme ticks one of these boxes, it should be taken under serious consideration as harmful. Granted, there are memes that tick all 3 (e.g. wearing a warm coat during winter), but I think those are so "common" and already so deeply embedded in our minds that it's pointless to discuss them. A few examples I can think of. * Cryptocurrency in 2017 & 2018, passes 2 and 3, passes or fails 1 depending on the people you are looking at, => Depends * All a

Monday, January 20th 2020

Personal Blogposts
2 [Event] WORKSHOP ON ASSURED AUTONOMOUS SYSTEMS (WAAS), 5 Embarcadero Center, San Francisco, May 21st
Shortform [Beta]
7 Chris_Leong 1d There appears to be something of a Sensemaking community developing on the internet, which could roughly be described as a spirituality-inspired attempt at epistemology. This includes Rebel Wisdom, Future Thinkers, Emerge and maybe you could even count post-rationality. While there are undoubtedly lots of critiques that could be made of their epistemics, I'd suggest watching this space as I think some interesting ideas will emerge out of it.
3 matthewhirschey 1d Just found this site, and am going through these ideas. Love the core ideas (thinking, creativity, decision making, etc). I have recently started writing on some similar ideas, and look forward to the exchange!

Saturday, January 18th 2020

Shortform [Beta]
6 rohinmshah 3d I was reading Avoiding Side Effects By Considering Future Tasks, and it seemed like it was doing something very similar to relative reachability. This is an exploration of that; it assumes you have already read the paper and the relative reachability paper. It benefitted from discussion with Vika. Define the reachability $R(s_1, s_2) = \mathbb{E}_{\tau \sim \pi}[\gamma^{n}]$, where $\pi$ is the optimal policy for getting from $s_1$ to $s_2$, and $n = |\tau|$ is the length of the trajectory. This is the notion of reachability both in the original paper and the new one. Then, for the new paper when using a baseline, the future task value $V^*_{\text{future}}(s, s')$ is: $\mathbb{E}_{g,\, \tau \sim \pi_g,\, \tau' \sim \pi'_g}[\gamma^{\max(n, n')}]$, where $s'$ is the baseline state and $g$ is the future goal. In a deterministic environment, this can be rewritten as: $V^*_{\text{future}}(s, s') = \mathbb{E}_g[\gamma^{\max(n, n')}] = \mathbb{E}_g[\min(R(s, g), R(s', g))] = \mathbb{E}_g[R(s', g) - \max(R(s', g) - R(s, g), 0)] = \mathbb{E}_g[R(s', g)] - \mathbb{E}_g[\max(R(s', g) - R(s, g), 0)] = \mathbb{E}_g[R(s', g)] - d_{RR}(s, s')$. Here, $d_{RR}$ is relative reachability, and the last line depends on the fact that the goal is equally likely to be any state. Note that the first term only depends on the number of timesteps, since it only depends on the baseline state $s'$. So for a fixed time step, the first term is a constant. The optimal value function in the new paper is (page 3, and using my notation of $V^*_{\text{future}}$ instead of their $V^*_i$): $V^*(s_t) = \max_{a_t \in A}\left[r(s_t, a_t) + \gamma \sum_{s_{t+1} \in S} p(s_{t+1} \mid s_t, a_t)\, V^*(s_{t+1}) + (1 - \gamma)\beta V^*_{\text{future}}\right]$. This is the regular Bellman equation, but with the following augmented reward (here $s'_t$ is the baseline state at time $t$): Terminal states: $r_{\text{new}}(s_t) = r(s_t) + \beta V^*_{\text{future}}(s_t, s'_t) = r(s_t) - \beta d_{RR}(s_t, s'_t) + \beta \mathbb{E}_g[R(s'_t, g)]$. Non-terminal states: $r_{\text{new}}(s_t, a_t) = r(s_t, a_t) + (1 - \gamma)\beta V^*_{\text{future}}(s_t, s'_t) = r(s_t) - (1 - \gamma)\beta d_{RR}(s_t, s'_t) + (1 - \gamma)\beta \mathbb{E}_g[R(s'_t, g)]$. For comparison, the original relative reachability reward is: $r_{RR}(s_t, a_t) = r(s_t) - \beta d_{RR}(s_t, s'_t)$. The first and third terms in $r_{\text{new}}$ are very similar to the two te
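The deterministic-environment rewriting in the shortform above can be sanity-checked numerically. The following is only a minimal sketch under stated assumptions: the five-state graph, the discount of 0.9, and all function names are hypothetical and come from neither paper. It computes shortest path lengths by breadth-first search, takes the goal to be uniform over states, and compares the future task value with the baseline term minus relative reachability.

```python
from collections import deque

GAMMA = 0.9  # discount factor (assumed value, not from either paper)

# Hypothetical deterministic environment: state -> list of successor states.
GRAPH = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: [0]}


def path_lengths(graph, start):
    """Shortest number of steps from `start` to every reachable state (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in graph[state]:
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist


def reachability(graph, s, gamma=GAMMA):
    """R(s, g) = gamma ** n for every goal g; 0.0 if g is unreachable from s."""
    dist = path_lengths(graph, s)
    return {g: gamma ** dist[g] if g in dist else 0.0 for g in graph}


def future_task_value(graph, s, s_base, gamma=GAMMA):
    """E_g[gamma ** max(n, n')] with the goal g uniform over all states."""
    n, n_base = path_lengths(graph, s), path_lengths(graph, s_base)
    total = sum(
        gamma ** max(n[g], n_base[g]) for g in graph if g in n and g in n_base
    )
    return total / len(graph)


def relative_reachability(graph, s, s_base, gamma=GAMMA):
    """d_RR(s, s') = E_g[max(R(s', g) - R(s, g), 0)] with g uniform."""
    r, r_base = reachability(graph, s, gamma), reachability(graph, s_base, gamma)
    return sum(max(r_base[g] - r[g], 0.0) for g in graph) / len(graph)


def baseline_term(graph, s_base, gamma=GAMMA):
    """E_g[R(s', g)], which depends only on the baseline state s'."""
    return sum(reachability(graph, s_base, gamma).values()) / len(graph)


lhs = future_task_value(GRAPH, s=1, s_base=0)
rhs = baseline_term(GRAPH, s_base=0) - relative_reachability(GRAPH, s=1, s_base=0)
print(lhs, rhs)  # the two values should agree up to floating-point error
```

The check relies on two facts used in the derivation: for a discount below 1, `gamma ** max(n, n')` equals the smaller of the two reachabilities, and `min(a, b) = b - max(b - a, 0)`.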
4 Mary Chernyshenko 4d The unshareable evidence. I have a friend, a fellow biologist. A much more focused person, in terms of "gotta do this today", with lower barriers for action (e.g., I help her with simple English, but she is the one to tutor kids in it, and so on.) I have known her for about ten years. And over time, I learned that her cousin died at seventeen. It was the time when atypical pneumonia was around, and he died in a hospital a week after he fell ill with typical symptoms, but his certificate had another kind of pneumonia in it. Officially, there was no AP in the area. And his death changed the familial structure so that it is still unbalanced, in a way, years later. Her sister has recently lost half a finger, after an accident with a saw, when there was a good chance of saving it. Both her children (one 14, the other 3 years old) usually get horrifying allergic swellings and fever from even the common bugs, and then only slowly get better. In the city region where she lives, there is one neurologist for ten thousand people, and she can't get an appointment. I keep hearing about such things when I visit her. Her kids are unvaccinated. We have talked about it, and she said all the usual things about vaccines causing autism, and the mercury, and the questionable quality etc. The Kitchen Argument uniting people all over the world. Of course, the link between vaccines and autism was disproved, but this means that somebody did take it seriously. It's not one woman's struggle or suspicions, it's The Statistics. You can discuss it much like weather - you're being polite! It gives me an ugly feeling, that a friend of mine should hide behind common and expected and false - she knows it's false - lore because she knows the script, and to know that it was I who forced her to it. I and people like me gave her this shield. But the pneumonia, the finger and the swellings, the life which she builds her thoughts around, never get mentioned. 
We've had the same education, we both know

Thursday, January 16th 2020

Shortform [Beta]
6 MichaelA 6d WAYS OF DESCRIBING THE “TRUSTWORTHINESS” OF PROBABILITIES While doing research for a post on the idea of a distinction between “risk” and “(Knightian) uncertainty”, I came across a surprisingly large number of different ways of describing the idea that some probabilities may be more or less “reliable”, “trustworthy”, “well-grounded”, etc. than others, or things along those lines. (Note that I’m referring to the idea of different degrees of trustworthiness-or-whatever, rather than two or more fundamentally different types of probability that vary in trustworthiness-or-whatever.) I realised that it might be valuable to write a post collecting all of these terms/concepts/framings together, analysing the extent to which some may be identical to others, highlighting ways in which they may differ, suggesting ways or contexts in which some of the concepts may be superior to others, etc. But there are already too many things I’m working on writing at the moment, so this is a low effort version of that idea - this is basically just a collection of the concepts, relevant quotes, and links where readers can find more. Comments on this post will inform whether I take the time to write something more substantial/useful on this topic later (and, if so, precisely what and how). Note that this post does not explicitly cover the “risk vs uncertainty” framing itself, as I’m already writing a separate, more thorough post on that. EPISTEMIC CREDENTIALS Dominic Roser speaks of how “high” or “low” the epistemic credentials of our probabilities are. He writes: He further explains what he means by this in a passage that also alludes to many other ways of describing or framing an idea along the lines of the trustworthiness of given probabilities: RESILIENCE (OF CREDENCES) Amanda Askell discusses the idea that we can have “more” or “less” res

Tuesday, January 14th 2020

Shortform [Beta]
12 jacobjacob 7d I saw an ad for a new kind of pant: stylish as suit pants, but flexible as sweatpants. I didn't have time to order them now. But I saved the link in a new tab in my clothes database -- an Airtable that tracks all the clothes I own. This crystallised some thoughts about external systems that have been brewing at the back of my mind. In particular, about the gears-level principles that make some of them useful and powerful. When I say "external", I am pointing to things like spreadsheets, apps, databases, organisations, notebooks, institutions, room layouts... and distinguishing those from minds, thoughts and habits. (Though this distinction isn't exact, as will be clear below, and some of these ideas are at an early stage.) Externalising systems allows the following benefits... 1. GATHERING ANSWERS TO UNSEARCHABLE QUERIES There are often things I want lists of, which are very hard to Google or research. For example: * List of groundbreaking discoveries that seem trivial in hindsight * List of different kinds of confusion, identified by their phenomenological qualia * List of good-faith arguments which are elaborate and rigorous, though uncertain, and which turned out to be wrong etc. Currently there is no search engine (but the human mind) capable of finding many of these answers (if I am expecting a certain level of quality). But for that reason researching the lists is also very hard. The only way I can build these lists is by accumulating those nuggets of insight over time. And the way I make that happen is to make sure to have external systems which are ready to capture those insights as they appear. 2. SEIZING SERENDIPITY Luck favours the prepared mind. Consider the following anecdote: I think this is true far beyond intellectual discovery. In order for the most valuable companies to exist, there must be VCs ready to fund those companies when their founders are toying with the ideas. In order for the best jokes to exist, the
4 ozziegooen 7d One question around the "Long Reflection" or around "What will AGI do?" is something like, "How bottlenecked will we be by scientific advances that we'll need to then spend significant resources on?" I think some assumptions that this model typically holds are: 1. There will be decision-relevant unknowns. 2. Many decision-relevant unknowns will be EV-positive to work on. 3. Of the decision-relevant unknowns that are EV-positive to work on, these will take between 1% and 99% of our time. (3) seems quite uncertain to me in the steady state. I believe it makes an intuitive estimate spanning 2 orders of magnitude, while the actual uncertainty is much higher than that. If this were the case, it would mean: 1. Almost all possible experiments are either trivial (<0.01% of resources, in total), or not cost-effective. 2. If some things are cost-effective and still expensive (they will take over 1% of the AGI lifespan), it's likely that they will take 100%+ of the time. Even if they would take 10^10% of the time, in expectation, they could still be EV-positive to pursue. I wouldn't be surprised if there were one single optimal thing like this in the steady-state. So this strategy would look something like, "Do all the easy things, then spend a huge amount of resources on one gigantic-sized, but EV-high challenge." (This was inspired by a talk that Anders Sandberg gave)
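To make the "2 orders of magnitude" point concrete, here is a rough sketch under assumptions that are not in the shortform: suppose the required share of time is log-uniform between 10^-10 and 10^8 of the total (the upper end corresponds to the 10^10% figure mentioned above; the lower end and the log-uniform shape are purely illustrative). The hypothetical helper `prob_between` then gives the chance that the share lands in the 1% to 99% window.

```python
import math


def prob_between(lo_frac, hi_frac, min_exp, max_exp):
    """P(lo_frac <= X <= hi_frac) for X log-uniform on [10**min_exp, 10**max_exp].

    Fractions above 1.0 mean "more than 100% of the available time".
    The log-uniform prior is an illustrative assumption, not the author's model.
    """
    lo = max(math.log10(lo_frac), min_exp)
    hi = min(math.log10(hi_frac), max_exp)
    return max(hi - lo, 0.0) / (max_exp - min_exp)


# Chance that the work takes between 1% and 99% of the time
# when the prior spans 18 orders of magnitude.
p_window = prob_between(0.01, 0.99, min_exp=-10, max_exp=8)
print(p_window)
```

Under these assumptions the window covers about 2 of 18 decades, so it gets roughly 11% of the probability mass; a wider prior shrinks that share further.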

Monday, January 13th 2020

Shortform [Beta]
15 TurnTrout 9d While reading Focusing today, I thought about the book and wondered how many exercises it would have. I felt a twinge of aversion. In keeping with my goal of increasing internal transparency, I said to myself: "I explicitly and consciously notice that I felt averse to some aspect of this book". I then Focused on the aversion. Turns out, I felt a little bit disgusted, because a part of me reasoned thusly: (Transcription of a deeper Focusing on this reasoning) I'm afraid of being slow. Part of it is surely the psychological remnants of the RSI I developed in the summer of 2018. That is, slowing down is now emotionally associated with disability and frustration. There was a period of meteoric progress as I started reading textbooks and doing great research, and then there was pain. That pain struck even when I was just trying to take care of myself, sleep, open doors. That pain then left me on the floor of my apartment, staring at the ceiling, desperately willing my hands to just get better. They didn't (for a long while), so I just lay there and cried. That was slow, and it hurt. No reviews, no posts, no typing, no coding. No writing, slow reading. That was slow, and it hurt. Part of it used to be a sense of "I need to catch up and learn these other subjects which [Eliezer / Paul / Luke / Nate] already know". Through internal double crux, I've nearly eradicated this line of thinking, which is neither helpful nor relevant nor conducive to excitedly learning the beautiful settled science of humanity. Although my most recent post touched on impostor syndrome, that isn't really a thing for me. I feel reasonably secure in who I am, now (although part of me worries that others wrongly view me as an impostor?). However, I mostly just want to feel fast, efficient, and swift again. I sometimes feel like I'm in a race with Alex2018, and I feel like I'm losing.
4 toonalfrink 9d So here's two extremes. One is that human beings are a complete lookup table. The other one is that human beings are perfect agents with just one goal. Most likely both are somewhat true. We have subagents that are more like the latter, and subsystems more like the former. But the emphasis on "we're just a bunch of hardcoded heuristics" is making us stop looking for agency where there is in fact agency. Take for example romantic feelings. People tend to regard them as completely unpredictable, but it is actually possible to predict to some extent whether you'll fall in and out of love with someone based on some criteria, like whether they're compatible with your self-narrative and whether their opinions and interests align with yours, etc. The same is true for many intuitions that we often tend to dismiss as just "my brain" or "neurotransmitter xyz" or "some knee-jerk reaction". There tends to be a layer of agency in these things. A set of conditions that makes these things fire off, or not fire off. If we want to influence them, we should be looking for the levers, instead of just accepting these things as a given. So sure, we're godshatter, but the shards are larger than we give them credit for.
3 Hazard 9d So a thing Galois theory does is explain: Which makes me wonder: would there be a formula if you used more machinery than normal stuff and radicals? What does "more than radicals" look like?
1 rmoehn 9d Updated the Predicted AI alignment event/meeting calendar. * Application deadline for the AI Safety Camp Toronto extended. If you've missed it so far, you still have until the 19th to apply. * Apparently no AI alignment workshop at ICLR, but another somewhat related one.