# All Posts

Sorted by Magic (New & Upvoted)

# Saturday, January 18th 2020

Shortform [Beta]
4 · rohinmshah · 3h

I was reading Avoiding Side Effects By Considering Future Tasks [https://drive.google.com/file/d/0B3mY6u_lryzdMGpEbEljRmFIS2hZWno1clExMDRuVVZWMnJV/view], and it seemed like it was doing something very similar to relative reachability. This is an exploration of that; it assumes you have already read the paper and the relative reachability paper. It benefitted from discussion with Vika.

Define the reachability $R(s_1, s_2) = \mathbb{E}_{\tau \sim \pi}[\gamma^n]$, where $\pi$ is the optimal policy for getting from $s_1$ to $s_2$, and $n = |\tau|$ is the length of the trajectory. This is the notion of reachability in both the original paper and the new one.

Then, for the new paper when using a baseline, the future task value $V^*_{\text{future}}(s, s')$ is $\mathbb{E}_{g, \tau \sim \pi_g, \tau' \sim \pi'_g}[\gamma^{\max(n, n')}]$, where $s'$ is the baseline state and $g$ is the future goal. In a deterministic environment, this can be rewritten as:

$$
\begin{aligned}
V^*_{\text{future}}(s, s') &= \mathbb{E}_g[\gamma^{\max(n, n')}] \\
&= \mathbb{E}_g[\min(R(s, g), R(s', g))] \\
&= \mathbb{E}_g[R(s', g) - \max(R(s', g) - R(s, g), 0)] \\
&= \mathbb{E}_g[R(s', g)] - \mathbb{E}_g[\max(R(s', g) - R(s, g), 0)] \\
&= \mathbb{E}_g[R(s', g)] - d_{RR}(s, s')
\end{aligned}
$$

Here, $d_{RR}$ is relative reachability, and the last line depends on the fact that the goal is equally likely to be any state. Note that the first term only depends on the baseline state $s'$, which in turn only depends on the number of timesteps. So for a fixed timestep, the first term is a constant.

The optimal value function in the new paper is (page 3, and using my notation of $V^*_{\text{future}}$ instead of their $V^*_i$):

$$
V^*(s_t) = \max_{a_t \in A} \Big[ r(s_t, a_t) + \gamma \sum_{s_{t+1} \in S} p(s_{t+1} \mid s_t, a_t) V^*(s_{t+1}) + (1 - \gamma)\beta V^*_{\text{future}} \Big]
$$

This is the regular Bellman equation, but with the following augmented reward (here $s'_t$ is the baseline state at time $t$):

Terminal states:
$$
r_{\text{new}}(s_t) = r(s_t) + \beta V^*_{\text{future}}(s_t, s'_t) = r(s_t) - \beta d_{RR}(s_t, s'_t) + \beta \mathbb{E}_g[R(s'_t, g)]
$$

Non-terminal states:
$$
r_{\text{new}}(s_t, a_t) = r(s_t, a_t) + (1 - \gamma)\beta V^*_{\text{future}}(s_t, s'_t) = r(s_t) - (1 - \gamma)\beta d_{RR}(s_t, s'_t) + (1 - \gamma)\beta \mathbb{E}_g[R(s'_t, g)]
$$

For comparison, the original relative reachability reward is:
$$
r_{RR}(s_t, a_t) = r(s_t) - \beta d_{RR}(s_t, s'_t)
$$

The first and third terms in $r_{\text{new}}$ are very similar to the two te
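For concreteness, here is a minimal sketch (my own toy example, not code from either paper) of the two deterministic-environment quantities above: reachability $R(s_1, s_2) = \gamma^n$ with $n$ the shortest-path length, and relative reachability $d_{RR}(s, s') = \mathbb{E}_g[\max(R(s', g) - R(s, g), 0)]$ with $g$ uniform over states.

```python
GAMMA = 0.95

def shortest_paths(adj, start):
    """BFS distances from `start` in an unweighted directed graph."""
    dist = {start: 0}
    frontier = [start]
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    nxt.append(v)
        frontier = nxt
    return dist

def reachability(adj, s1, s2, gamma=GAMMA):
    """R(s1, s2) = gamma^n for the shortest path; 0 if s2 is unreachable."""
    dist = shortest_paths(adj, s1)
    return gamma ** dist[s2] if s2 in dist else 0.0

def d_rr(adj, s, s_baseline, gamma=GAMMA):
    """Relative reachability of s vs. the baseline, goals uniform over states."""
    states = list(adj)
    return sum(
        max(reachability(adj, s_baseline, g, gamma)
            - reachability(adj, s, g, gamma), 0.0)
        for g in states
    ) / len(states)

# Tiny 3-state chain 0 -> 1 -> 2 where moving right is irreversible.
adj = {0: [1], 1: [2], 2: [2]}
# Moving from baseline state 0 to state 1 makes state 0 unreachable,
# so the relative reachability penalty is positive.
print(d_rr(adj, s=1, s_baseline=0))
```

In this irreversible chain, the only lost reachability from moving to state 1 is reachability of state 0 itself, so the penalty is $\max(1 - 0, 0)/3 = 1/3$.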
2 · Mary Chernyshenko · 12h

The unshareable evidence. I have a friend, a fellow biologist. A much more focused person, in terms of "gotta do this today", with lower barriers for action (e.g., I help her with simple English, but she is the one to tutor kids in it, and so on.) I have known her for about ten years.

And over time, I learned that her cousin died at seventeen. It was the time when atypical pneumonia was around, and he died in a hospital a week after he fell ill with typical symptoms, but his certificate had another kind of pneumonia in it. Officially, there was no AP in the area. And his death changed the familial structure so that it is still unbalanced, in a way, years later. Her sister has recently lost half a finger, after an accident with a saw, when there was a good chance of saving it. Both her children (one 14, the other 3 years old) usually get horrifying allergic swellings and fever from even the common bugs, and then only slowly get better. In the city region where she lives, there is one neurologist for ten thousand people, and she can't get an appointment. I keep hearing about such things when I visit her.

Her kids are unvaccinated. We have talked about it, and she said all the usual things about vaccines causing autism, and the mercury, and the questionable quality etc. The Kitchen Argument uniting people all over the world. Of course, the link between vaccines and autism was disproved, but this means that somebody did take it seriously. It's not one woman's struggle or suspicions, it's The Statistics. You can discuss it much like weather; you're being polite!

It gives me an ugly feeling that a friend of mine should hide behind common and expected and false lore (she knows it's false) because she knows the script, and to know that it was I who forced her to it. I and people like me gave her this shield. But the pneumonia, the finger and the swellings, the life which she builds her thoughts around, never get mentioned.
We've had the same education, we both know

# Thursday, January 16th 2020

Personal Blogposts
Shortform [Beta]
6 · MichaelA · 3d

WAYS OF DESCRIBING THE “TRUSTWORTHINESS” OF PROBABILITIES

While doing research for a post on the idea of a distinction between “risk” and “(Knightian) uncertainty [https://en.wikipedia.org/wiki/Knightian_uncertainty]”, I came across a surprisingly large number of different ways of describing the idea that some probabilities may be more or less “reliable”, “trustworthy”, “well-grounded”, etc. than others, or things along those lines. (Note that I’m referring to the idea of different degrees of trustworthiness-or-whatever, rather than two or more fundamentally different types of probability that vary in trustworthiness-or-whatever.)

I realised that it might be valuable to write a post collecting all of these terms/concepts/framings together, analysing the extent to which some may be identical to others, highlighting ways in which they may differ, suggesting ways or contexts in which some of the concepts may be superior to others, etc.[1] But there are already too many things I’m working on writing at the moment, so this is a low-effort version of that idea: basically just a collection of the concepts, relevant quotes, and links where readers can find more. Comments on this post will inform whether I take the time to write something more substantial/useful on this topic later (and, if so, precisely what and how).

Note that this post does not explicitly cover the “risk vs uncertainty” framing itself, as I’m already writing a separate, more thorough post on that.

EPISTEMIC CREDENTIALS

Dominic Roser [https://link.springer.com/article/10.1007%2Fs11948-017-9919-x] speaks of how “high” or “low” the epistemic credentials of our probabilities are. He further explains what he means by this in a passage that also alludes to many other ways of describing or framing an idea along the lines of the trustworthiness of given probabilities.

RESILIENCE (OF CREDENCES)

Amanda Askell discusses the idea that we can have “more” or “less” res
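The resilience idea can be made concrete with a toy Bayesian illustration (my own, not from Askell's discussion): two agents can assign the very same probability while one's credence is far more resilient to new evidence than the other's. Here both priors imply P(heads) = 0.5, but a single observation moves them very differently.

```python
# Toy illustration of "resilience" of a credence using Beta priors.
# Beta(a, b) assigns probability a / (a + b) to heads; observing one
# heads updates it to Beta(a + 1, b).

def beta_mean(a, b):
    """Probability of heads implied by a Beta(a, b) credence."""
    return a / (a + b)

def update_on_heads(a, b):
    """Bayesian update of a Beta(a, b) prior after observing one heads."""
    return a + 1, b

fragile = (1, 1)        # uniform prior: P(heads) = 0.5, low resilience
resilient = (100, 100)  # sharp prior:   P(heads) = 0.5, high resilience

for name, (a, b) in [("fragile", fragile), ("resilient", resilient)]:
    a2, b2 = update_on_heads(a, b)
    print(name, beta_mean(a, b), "->", beta_mean(a2, b2))
# fragile:   0.5 -> ~0.667  (credence moves a lot)
# resilient: 0.5 -> ~0.502  (credence barely moves)
```

Both agents would quote "50%" beforehand, so the bare probability hides exactly the trustworthiness-or-whatever dimension the post is cataloguing.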

# Tuesday, January 14th 2020

Personal Blogposts
6 · [Event] Anki (Memorization Software) for Beginners · 3045 Shattuck Avenue #1894, Berkeley · Jan 25th · 0 comments
4 · [Event] San Francisco Meetup: Board Games · 170 Hawthorne St, San Francisco, CA 94107, USA · Jan 21st · 1 comment
3 · [Event] Advanced Anki (Memorization Software) · 3045 Shattuck Avenue #1894, Berkeley · Jan 25th · 0 comments
Shortform [Beta]
12 · jacobjacob · 4d

I saw an ad for a new kind of pant: stylish as suit pants, but flexible as sweatpants. I didn't have time to order them now. But I saved the link in a new tab in my clothes database, an Airtable that tracks all the clothes I own.

This crystallised some thoughts about external systems that have been brewing at the back of my mind; in particular, about the gears-level principles that make some of them useful and powerful. When I say "external", I am pointing to things like spreadsheets, apps, databases, organisations, notebooks, institutions, room layouts... and distinguishing those from minds, thoughts and habits. (Though this distinction isn't exact, as will be clear below, and some of these ideas are at an early stage.)

Externalising systems allows the following benefits...

1. GATHERING ANSWERS TO UNSEARCHABLE QUERIES

There are often things I want lists of, which are very hard to Google or research. For example:

* List of groundbreaking discoveries that seem trivial in hindsight
* List of different kinds of confusion, identified by their phenomenological qualia
* List of good-faith arguments which are elaborate and rigorous, though uncertain, and which turned out to be wrong

etc.

Currently there is no search engine (but the human mind) capable of finding many of these answers (if I am expecting a certain level of quality). But for that reason researching the lists is also very hard. The only way I can build these lists is by accumulating those nuggets of insight over time. And the way I make that happen is to have external systems which are ready to capture those insights as they appear.

2. SEIZING SERENDIPITY

Luck favours the prepared mind. Consider the following anecdote: I think this is true far beyond intellectual discovery. In order for the most valuable companies to exist, there must be VCs ready to fund those companies when their founders are toying with the ideas. In order for the best jokes to exist, the
4 · ozziegooen · 4d

One question around the "Long Reflection", or around "What will AGI do?", is something like: "How bottlenecked will we be by scientific advances that we'll need to then spend significant resources on?"

I think some assumptions that this model typically holds are:

1. There will be decision-relevant unknowns.
2. Many decision-relevant unknowns will be EV-positive to work on.
3. Of the decision-relevant unknowns that are EV-positive to work on, these will take between 1% and 99% of our time.

(3) seems quite uncertain to me in the steady state. I believe it makes an intuitive estimate spanning about 2 orders of magnitude, while the actual uncertainty is much higher than that. If this were the case, it would mean:

1. Almost all possible experiments are either trivial (<0.01% of resources, in total) or not cost-effective.
2. If some things are cost-effective and still expensive (they will take over 1% of the AGI lifespan), it's likely that they will take 100%+ of the time. Even if they would take 10^10% of the time, in expectation, they could still be EV-positive to pursue.

I wouldn't be surprised if there were one single optimal thing like this in the steady state. So this strategy would look something like: "Do all the easy things, then spend a huge amount of resources on one gigantic, but EV-high, challenge."

(This was inspired by a talk that Anders Sandberg gave.)

# Monday, January 13th 2020

Personal Blogposts
Shortform [Beta]
15 · TurnTrout · 6d

While reading Focusing today, I thought about the book and wondered how many exercises it would have. I felt a twinge of aversion. In keeping with my goal of increasing internal transparency, I said to myself: "I explicitly and consciously notice that I felt averse to some aspect of this book".

I then Focused on the aversion. Turns out, I felt a little bit disgusted, because a part of me reasoned thusly. (Transcription of a deeper Focusing on this reasoning:)

I'm afraid of being slow. Part of it is surely the psychological remnants of the RSI I developed in the summer of 2018. That is, slowing down is now emotionally associated with disability and frustration. There was a period of meteoric progress as I started reading textbooks and doing great research, and then there was pain. That pain struck even when I was just trying to take care of myself, sleep, open doors. That pain then left me on the floor of my apartment, staring at the ceiling, desperately willing my hands to just get better. They didn't (for a long while), so I just lay there and cried. That was slow, and it hurt. No reviews, no posts, no typing, no coding. No writing, slow reading. That was slow, and it hurt.

Part of it used to be a sense of "I need to catch up and learn these other subjects which [Eliezer / Paul / Luke / Nate] already know". Through internal double crux, I've nearly eradicated this line of thinking, which is neither helpful nor relevant nor conducive to excitedly learning the beautiful settled science of humanity.

Although my most recent post [https://www.lesswrong.com/posts/eX2aobNp5uCdcpsiK/on-being-robust] touched on impostor syndrome, that isn't really a thing for me. I feel reasonably secure in who I am, now (although part of me worries that others wrongly view me as an impostor?). However, I mostly just want to feel fast, efficient, and swift again. I sometimes feel like I'm in a race with Alex2018, and I feel like I'm losing.
4 · toonalfrink · 6d

So here are two extremes. One is that human beings are a complete lookup table; the other is that human beings are perfect agents with just one goal. Most likely both are somewhat true: we have subagents that are more like the latter, and subsystems more like the former. But the emphasis on "we're just a bunch of hardcoded heuristics" is making us stop looking for agency where there is in fact agency.

Take for example romantic feelings. People tend to regard them as completely unpredictable, but it is actually possible to predict to some extent whether you'll fall in and out of love with someone based on some criteria, like whether they're compatible with your self-narrative and whether their opinions and interests align with yours, etc. The same is true for many intuitions that we often tend to dismiss as just "my brain" or "neurotransmitter xyz" or "some knee-jerk reaction". There tends to be a layer of agency in these things: a set of conditions that makes them fire off, or not. If we want to influence them, we should be looking for the levers, instead of just accepting these things as a given.

So sure, we're godshatter, but the shards are larger than we give them credit for.
3 · Hazard · 6d

So a thing Galois theory does is explain which polynomial equations are solvable by radicals. Which makes me wonder: would there be a formula if you used more machinery than the normal stuff and radicals? What does "more than radicals" look like?
1 · rmoehn · 6d

Updated the Predicted AI alignment event/meeting calendar [https://www.lesswrong.com/posts/h8gypTEKcwqGsjjFT/predicted-ai-alignment-event-meeting-calendar].

* The application deadline for the AI Safety Camp Toronto has been extended. If you've missed it so far, you still have until the 19th to apply.
* Apparently no AI alignment workshop at ICLR, but there is another somewhat related one.

# Friday, January 10th 2020

Personal Blogposts
4 · [Event] Halifax SSC Meetup -- Saturday 11/1/20 · OB6, 1451 South Park Street suite 103, Halifax · Jan 11th · 0 comments
Shortform [Beta]
15 · tragedyofthecomments · 8d

I often see people making statements that sound to me like "The entity in charge of bay area rationality should enforce these norms" or "The entity in charge of bay area rationality is bad for allowing x to happen."

There is no entity in charge of bay area rationality. There's a bunch of small groups of people that interact with each other sometimes. They even have quite a bit of shared culture. But no one is in charge of this thing, there is no entity making the set of norms for rationalists, and there is no one you can outsource the building of your desired group to.
2 · romeostevensit · 9d

You can't straightforwardly multiply uncertainties from different domains to propagate uncertainty through a model. Point estimates of differently shaped distributions can mean very different things, e.g. the difference between the mean of a normal, a bimodal, and a fat-tailed distribution. This gets worse when there are potential sign flips in various terms as we try to build a causal model out of the underlying distributions.
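A quick Monte Carlo sketch of this point (my own toy numbers, not from the post): multiplying point estimates of two uncertain inputs can diverge badly from propagating the full distributions, especially when one input is fat-tailed.

```python
import random

random.seed(0)
N = 100_000

# Two uncertain model inputs: one roughly normal, one fat-tailed (lognormal).
a_samples = [random.gauss(10, 2) for _ in range(N)]
b_samples = [random.lognormvariate(0, 1.5) for _ in range(N)]

def median(xs):
    s = sorted(xs)
    return s[len(s) // 2]

# Point-estimate propagation: summarize each input first, then multiply.
point_estimate = median(a_samples) * median(b_samples)

# Distributional propagation: multiply sample-by-sample, then summarize.
product_samples = [a * b for a, b in zip(a_samples, b_samples)]
mean_of_product = sum(product_samples) / N

print("median * median:", point_estimate)
print("mean of product:", mean_of_product)
```

The lognormal factor has median 1 but mean exp(1.5^2 / 2) ≈ 3.1, so the mean of the propagated product is roughly three times the product of the medians; which summary you carried forward silently changed the answer by a large factor.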