All Posts


Week Of Sunday, January 12th 2020

Frontpage Posts
Shortform [Beta]
15TurnTrout6dWhile reading Focusing today, I thought about the book and wondered how many exercises it would have. I felt a twinge of aversion. In keeping with my goal of increasing internal transparency, I said to myself: "I explicitly and consciously notice that I felt averse to some aspect of this book". I then Focused on the aversion. Turns out, I felt a little bit disgusted, because a part of me reasoned thusly: (Transcription of a deeper Focusing on this reasoning) I'm afraid of being slow. Part of it is surely the psychological remnants of the RSI I developed in the summer of 2018. That is, slowing down is now emotionally associated with disability and frustration. There was a period of meteoric progress as I started reading textbooks and doing great research, and then there was pain. That pain struck even when I was just trying to take care of myself, sleep, open doors. That pain then left me on the floor of my apartment, staring at the ceiling, desperately willing my hands to just get better. They didn't (for a long while), so I just lay there and cried. That was slow, and it hurt. No reviews, no posts, no typing, no coding. No writing, slow reading. That was slow, and it hurt. Part of it used to be a sense of "I need to catch up and learn these other subjects which [Eliezer / Paul / Luke / Nate] already know". Through internal double crux, I've nearly eradicated this line of thinking, which is neither helpful nor relevant nor conducive to excitedly learning the beautiful settled science of humanity. Although my most recent post [https://www.lesswrong.com/posts/eX2aobNp5uCdcpsiK/on-being-robust] touched on impostor syndrome, that isn't really a thing for me. I feel reasonably secure in who I am, now (although part of me worries that others wrongly view me as an impostor?). However, I mostly just want to feel fast, efficient, and swift again. I sometimes feel like I'm in a race with Alex2018, and I feel like I'm losing.
12jacobjacob4d I saw an ad for a new kind of pant: stylish as suit pants, but flexible as sweatpants. I didn't have time to order them now. But I saved the link in a new tab in my clothes database -- an Airtable that tracks all the clothes I own. This crystallised some thoughts about external systems that have been brewing at the back of my mind. In particular, about the gears-level principles that make some of them useful and powerful. When I say "external", I am pointing to things like spreadsheets, apps, databases, organisations, notebooks, institutions, room layouts... and distinguishing those from minds, thoughts and habits. (Though this distinction isn't exact, as will be clear below, and some of these ideas are at an early stage.) Externalising systems allows the following benefits...

1. GATHERING ANSWERS TO UNSEARCHABLE QUERIES

There are often things I want lists of, which are very hard to Google or research. For example:
* List of groundbreaking discoveries that seem trivial in hindsight
* List of different kinds of confusion, identified by their phenomenological qualia
* List of good-faith arguments which are elaborate and rigorous, though uncertain, and which turned out to be wrong
etc.

Currently there is no search engine (other than the human mind) capable of finding many of these answers (if I am expecting a certain level of quality). But for that reason researching the lists is also very hard. The only way I can build these lists is by accumulating those nuggets of insight over time. And the way I make that happen is to make sure to have external systems which are ready to capture those insights as they appear.

2. SEIZING SERENDIPITY

Luck favours the prepared mind. Consider the following anecdote: I think this is true far beyond intellectual discovery. In order for the most valuable companies to exist, there must be VCs ready to fund those companies when their founders are toying with the ideas. In order for the best jokes to exist, the
6MichaelA3dWAYS OF DESCRIBING THE “TRUSTWORTHINESS” OF PROBABILITIES While doing research for a post on the idea of a distinction between “risk” and “(Knightian) uncertainty [https://en.wikipedia.org/wiki/Knightian_uncertainty]”, I came across a surprisingly large number of different ways of describing the idea that some probabilities may be more or less “reliable”, “trustworthy”, “well-grounded”, etc. than others, or things along those lines. (Note that I’m referring to the idea of different degrees of trustworthiness-or-whatever, rather than two or more fundamentally different types of probability that vary in trustworthiness-or-whatever.) I realised that it might be valuable to write a post collecting all of these terms/concepts/framings together, analysing the extent to which some may be identical to others, highlighting ways in which they may differ, suggesting ways or contexts in which some of the concepts may be superior to others, etc.[1] [#fn-wGnf2warekZDiMkWj-1] But there’s already too many things I’m working on writing at the moment, so this is a low effort version of that idea - this is basically just a collection of the concepts, relevant quotes, and links where readers can find more. Comments on this post will inform whether I take the time to write something more substantial/useful on this topic later (and, if so, precisely what and how). Note that this post does not explicitly cover the “risk vs uncertainty” framing itself, as I’m already writing a separate, more thorough post on that. EPISTEMIC CREDENTIALS Dominic Roser [https://link.springer.com/article/10.1007%2Fs11948-017-9919-x] speaks of how “high” or “low” the epistemic credentials of our probabilities are. He writes: He further explains what he means by this in a passage that also alludes to many other ways of describing or framing an idea along the lines of the trustworthiness of given probabilities: RESILIENCE (OF CREDENCES) Amanda Askell discusses the idea that we can have “more” or “less” res
4rohinmshah4h I was reading Avoiding Side Effects By Considering Future Tasks [https://drive.google.com/file/d/0B3mY6u_lryzdMGpEbEljRmFIS2hZWno1clExMDRuVVZWMnJV/view], and it seemed like it was doing something very similar to relative reachability. This is an exploration of that; it assumes you have already read the paper and the relative reachability paper. It benefitted from discussion with Vika.

Define the reachability $R(s_1, s_2) = E_{\tau \sim \pi}[\gamma^n]$, where $\pi$ is the optimal policy for getting from $s_1$ to $s_2$, and $n = |\tau|$ is the length of the trajectory. This is the notion of reachability in both the original paper and the new one. Then, for the new paper when using a baseline, the future task value $V^*_{future}(s, s')$ is
$$E_{g,\, \tau \sim \pi_g,\, \tau' \sim \pi'_g}\left[\gamma^{\max(n, n')}\right],$$
where $s'$ is the baseline state and $g$ is the future goal. In a deterministic environment, this can be rewritten as
$$\begin{aligned} V^*_{future}(s, s') &= E_g\left[\gamma^{\max(n, n')}\right] \\ &= E_g[\min(R(s, g), R(s', g))] \\ &= E_g[R(s', g) - \max(R(s', g) - R(s, g), 0)] \\ &= E_g[R(s', g)] - E_g[\max(R(s', g) - R(s, g), 0)] \\ &= E_g[R(s', g)] - d_{RR}(s, s'). \end{aligned}$$
Here, $d_{RR}$ is relative reachability, and the last line depends on the fact that the goal is equally likely to be any state. Note that the first term only depends on the number of timesteps, since it only depends on the baseline state $s'$. So for a fixed time step, the first term is a constant.

The optimal value function in the new paper is (page 3, and using my notation of $V^*_{future}$ instead of their $V^*_i$):
$$V^*(s_t) = \max_{a_t \in A}\left[ r(s_t, a_t) + \gamma \sum_{s_{t+1} \in S} p(s_{t+1} \mid s_t, a_t)\, V^*(s_{t+1}) + (1 - \gamma)\, \beta\, V^*_{future} \right].$$
This is the regular Bellman equation, but with the following augmented reward (here $s'_t$ is the baseline state at time $t$):

Terminal states:
$$r_{new}(s_t) = r(s_t) + \beta V^*_{future}(s_t, s'_t) = r(s_t) - \beta\, d_{RR}(s_t, s'_t) + \beta\, E_g[R(s'_t, g)]$$

Non-terminal states:
$$r_{new}(s_t, a_t) = r(s_t, a_t) + (1 - \gamma)\, \beta\, V^*_{future}(s_t, s'_t) = r(s_t) - (1 - \gamma)\, \beta\, d_{RR}(s_t, s'_t) + (1 - \gamma)\, \beta\, E_g[R(s'_t, g)]$$

For comparison, the original relative reachability reward is
$$r_{RR}(s_t, a_t) = r(s_t) - \beta\, d_{RR}(s_t, s'_t).$$
The first and third terms in $r_{new}$ are very similar to the two te
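A quick numeric check of the deterministic-case decomposition derived above. The setup is a hypothetical toy example of my own (not from either paper): a 5-state deterministic chain with a uniform goal distribution, where the shortest path between states i and j has length |i - j|.

```python
import itertools
import numpy as np

gamma = 0.9
states = list(range(5))  # hypothetical toy chain: moving one state costs one step

def R(s, g):
    """Reachability R(s, g) = gamma^n for the shortest path from s to g."""
    return gamma ** abs(s - g)

def d_RR(s, s_baseline):
    """Relative reachability d_RR(s, s') = E_g[max(R(s', g) - R(s, g), 0)], uniform over goals."""
    return np.mean([max(R(s_baseline, g) - R(s, g), 0.0) for g in states])

def V_future(s, s_baseline):
    """E_g[gamma^max(n, n')] = E_g[min(R(s, g), R(s', g))] in the deterministic case."""
    return np.mean([min(R(s, g), R(s_baseline, g)) for g in states])

# Check V_future(s, s') = E_g[R(s', g)] - d_RR(s, s') for every pair of states.
for s, s_b in itertools.product(states, repeat=2):
    lhs = V_future(s, s_b)
    rhs = np.mean([R(s_b, g) for g in states]) - d_RR(s, s_b)
    assert abs(lhs - rhs) < 1e-12
print("decomposition holds on the toy chain")
```

The identity is just min(a, b) = b - max(b - a, 0) averaged over goals, which is why the first term depends only on the baseline state.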
4ozziegooen4d One question around the "Long Reflection" or around "What will AGI do?" is something like, "How bottlenecked will we be by scientific advances that we'll then need to spend significant resources on?" I think some assumptions that this model typically holds are:
1. There will be decision-relevant unknowns.
2. Many decision-relevant unknowns will be EV-positive to work on.
3. Of the decision-relevant unknowns that are EV-positive to work on, these will take between 1% and 99% of our time.

(3) seems quite uncertain to me in the steady state. It amounts to an intuitive estimate spanning about 2 orders of magnitude, while the actual uncertainty is much greater than that. If this were the case, it would mean:
1. Almost all possible experiments are either trivial (<0.01% of resources, in total) or not cost-effective.
2. If some things are cost-effective and still expensive (they will take over 1% of the AGI lifespan), it's likely that they will take 100%+ of the time. Even if they would take 10^10% of the time, in expectation, they could still be EV-positive to pursue.

I wouldn't be surprised if there were one single optimal thing like this in the steady state. So this strategy would look something like, "Do all the easy things, then spend a huge amount of resources on one gigantic-sized, but EV-high challenge." (This was inspired by a talk that Anders Sandberg gave)

Week Of Sunday, January 5th 2020

Shortform [Beta]
18tragedyofthecomments8dI often see people making statements that sound to me like . . . "The entity in charge of bay area rationality should enforce these norms." or "The entity in charge of bay area rationality is bad for allowing x to happen." There is no entity in charge of bay area rationality. There's a bunch of small groups of people that interact with each other sometimes. They even have quite a bit of shared culture. But no one is in charge of this thing, there is no entity making the set of norms for rationalists, there is no one you can outsource the building of your desired group to.
17bgold13d
* Yes And is an improv technique where you keep the energy in a scene alive by going w/ the other person's suggestion and adding more to it. "A: Wow is that your pet monkey? B: Yes and he's also my doctor!"
* Yes And is generative (creates a lot of output), as opposed to Hmm No which is critical (distills output)
* A lot of the Sequences is Hmm No
* It's not that Hmm No is wrong, it's that it cuts off future paths down the Yes And thought-stream.
* If there's a critical error at the beginning of a thought that will undermine everything else then it makes sense to Hmm No (we don't want to spend a bunch of energy on something that will be fundamentally unsound). But if the later parts of the thought stream are not closely dependent on the beginning, or if it's only part of the stream that gets cut off, then you've lost a lot of potential value that could've been generated by the Yes And.
* In conversation Yes And is much more fun, which might be why the Sequences are important as a corrective (yeah look it's not fun to remember about biases, but they exist and you should model/include them)
* Write drunk, edit sober. Yes And drunk, Hmm No in the morning.
13Vanessa Kosoy13d Some thoughts about embedded agency. From a learning-theoretic perspective, we can reformulate the problem of embedded agency as follows: What kind of agent, and under what conditions, can effectively plan for events after its own death? For example, Alice bequeaths eir fortune to eir children, since ey want them to be happy even when Alice emself is no longer alive. Here, "death" can be understood to include modification, since modification is effectively destroying an agent and replacing it by a different agent[1] [#fn-Ny6yxNfPASvfJoXSr-1]. For example, Clippy 1.0 is an AI that values paperclips. Alice disabled Clippy 1.0 and reprogrammed it to value staples before running it again. Then, Clippy 2.0 can be considered to be a new, different agent. First, in order to meaningfully plan for death, the agent's reward function has to be defined in terms of something different from its direct perceptions. Indeed, by definition the agent no longer perceives anything after death. Instrumental reward functions [https://www.alignmentforum.org/posts/aAzApjEpdYwAxnsAS/reinforcement-learning-with-imperceptible-rewards] are somewhat relevant but still don't give the right object, since the reward is still tied to the agent's actions and observations. Therefore, we will consider reward functions defined in terms of some fixed ontology of the external world. Formally, such an ontology can be an incomplete[2] [#fn-Ny6yxNfPASvfJoXSr-2] Markov chain, the reward function being a function of the state. Examples:
* The Markov chain is a representation of known physics (or some sector of known physics). The reward corresponds to the total mass of diamond in the world. To make this example work, we only need enough physics to be able to define diamonds. For example, we can make do with quantum electrodynamics + classical gravity and have the Knightian uncertainty account for all nuclear and high-energy phenomena.
* The Markov chain is a representation of people and
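A rough typed sketch of the ontology-based-reward idea from the entry above. This is my own hypothetical illustration (names and structures are made up, and the "incomplete chain" is only gestured at), not Vanessa Kosoy's formalism:

```python
from typing import Callable, Mapping

Observation = str    # whatever the agent perceives
OntologyState = str  # a state of the fixed external-world Markov chain

# Perception-based reward: there is nothing left to evaluate once the agent
# stops receiving observations, i.e. after its "death".
RewardFromPerception = Callable[[Observation], float]

# Ontology-based reward: scores external world states directly, so the chain's
# trajectory keeps accruing value even after the agent is gone or modified.
RewardFromOntology = Callable[[OntologyState], float]

# Toy version of the diamond example: reward tracks whether diamond exists in
# the world state, regardless of whether any agent is around to observe it.
def diamond_reward(state: OntologyState) -> float:
    return 1.0 if state == "diamond_exists" else 0.0

# A deliberately under-specified transition structure, gesturing at the post's
# "incomplete" Markov chain: some dynamics are simply not pinned down.
partial_transitions: Mapping[OntologyState, Mapping[OntologyState, float]] = {
    "no_diamond": {"no_diamond": 0.9, "diamond_exists": 0.1},
    # transitions out of "diamond_exists" intentionally left unspecified
}
```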
11TurnTrout14dSuppose you could choose how much time to spend at your local library, during which: * you do not age. Time stands still outside; no one enters or exits the library (which is otherwise devoid of people). * you don't need to sleep/eat/get sunlight/etc * you can use any computers, but not access the internet or otherwise bring in materials with you * you can't leave before the requested time is up Suppose you don't go crazy from solitary confinement, etc. Remember that value drift is a potential thing. How long would you ask for?
9George12d Note 1: Not a relevant analogy unless you use the StackExchange Network. I think the Stack Overflow reputation system is a good analogy for the issues one encounters with a long-running monetary system. I like imaginary awards; when I was younger I specifically liked the imaginary awards from Stack Overflow (Reputation) because I thought they'd help me get some recognition as a developer (silly, but in my defense, I was a teenager). However, it proved to be very difficult to find questions that nobody else had answered which I could answer and were popular enough to get more than one or two upvotes for said answer (upvotes generate reputation). I got to like 500 reputation and I slowly started being less active on SO (now the only questions I answer are basically my own, in case nobody provides an answer but I end up finding a solution). I recently checked my reputation on SO and noticed I was close to 2,000 points, despite not being active on the website in almost 4 years o.o This is because reputation from "old questions" accumulates. I thought "oh, how much would young me have loved to see this now-valueless currency reach such an arbitrarily high level". I think this is in many ways analogous to the issues with the monetary system. Money seems to lose its appeal as you get older, since it can buy less and less and you need less and less. All your social signaling and permanent possession needs are gone by the time you hit 60. All your "big dreams" now require too much energy, even if you theoretically have the capital to put them into practice. At the same time, Stack Exchange reputation gives you the power to judge others: you can gift reputation for a good answer, you can edit people's answers and questions without approval, you can review questions and decide they are duplicates or don't fit the community and reject them. Again, something I'd have been very good at when I was 19 and deeply passionate about software development. Something that I'm probably less good at no

Week Of Sunday, December 29th 2019

Frontpage Posts
Shortform [Beta]
34BrienneYudkowsky20dI wrote up my shame processing method. I think it comes from some combination of Max (inspired by NVC maybe?), Anna (mostly indirectly), and a lot of trial and error. I've been using it for a couple of years (in various forms), but I don't have much PCK on it yet. If you'd like to try it out, I'd love for you to report back on how it went! Please also ask me questions. What's up with shame? According to me, shame is for keeping your actions in line with what you care about. It happens when you feel motivated to do something that you believe might damage what is valuable (whether or not you actually do the thing). Shame indicates a particular kind of internal conflict. There's something in favor of the motivation, and something else against it. Both parts are fighting for things that matter to you. What is this shame processing method supposed to do? This shame processing method is supposed to aid in the goal of shame itself: staying in contact with what you care about as you act. It's also supposed to develop a clearer awareness of what is at stake in the conflict so you can use your full intelligence to solve the problem. What is the method? The method is basically a series of statements with blanks to fill in. The statements guide you a little at a time toward a more direct way of seeing your conflict. Here's a template; it's meant to be filled out in order. I notice that I feel ashamed. I think I first started feeling it while ___. I care about ___(X). I'm not allowed to want ___ (Y). I worry that if I want Y, ___. What's good about Y is ___(Z). I care about Z, and I also care about X. Example (a real one, from this morning): I notice that I feel ashamed. I think I first started feeling it while reading the first paragraph of a Lesswrong post. I care about being creative. I'm not allowed to want to move at a comfortable pace. I worry that if I move at a comfortable pace, my thoughts will slow down more and more over time and I'll become a vegetable.
21bgold17d
* Why do I not always have conscious access to my inner parts? Why, when speaking with authority figures, might I have a sudden sense of blankness?
* Recently I've been thinking about this reaction in the frame of 'legibility', à la Seeing Like a State. States would impose organizational structures on societies that were easy to see and control - they made the society more legible - to the actors who ran the state, but these organizational structures were bad for the people in the society.
* For example, census data, standardized weights and measures, and uniform languages make it easier to tax and control the population. [Wikipedia]
* I'm toying with applying this concept across the stack.
* If you have an existing model of people being made up of parts [Kaj's articles], I think there's a similar thing happening. I notice I'm angry but can't quite tell why or get a conceptual handle on it - if it were fully legible and accessible to the conscious mind, then it would be much easier to apply pressure and control that 'part', regardless of whether the control I am exerting is good. So instead, it remains illegible.
* A level up, in a small group conversation, I notice I feel missed, like I'm not being heard in fullness, but someone else directly asks me about my model and I draw a blank, like I can't access this model or share it. If my model were legible, someone else would get more access to it and be able to control it/point out its flaws. That might be good or it might be bad, but if it's illegible it can't be "coerced"/"mistaken" by others.
* One more level up, I initially went down this track of thinking for a few reasons, one of which was wondering why prediction forecasting systems are so hard to adopt within organizations. Operationalization of terms is difficult and it's hard to get a precise enough question that everyone can agree on, but it's very 'unfun' to have uncertain terms (people are much more lik
17Ben Pace17d There's a game for the Oculus Quest (that you can also buy on Steam) called "Keep Talking And Nobody Explodes". It's a two-player game. When playing with the VR headset, one of you wears the headset and has to defuse bombs in a limited amount of time (either 3, 4 or 5 mins), while the other person sits outside the headset with the bomb-defusal manual and tells you what to do. Whereas with other collaboration games you're all looking at the screen together, with this game the substrate of communication is solely conversation; the other person provides all of your inputs about how their half is going (i.e. it is not shown on a screen). The types of puzzles are fairly straightforward computational problems but with lots of fiddly instructions, and require the outer person to figure out what information they need from the inner person. It often involves things like counting numbers of wires of a certain colour, or remembering the previous digits that were being shown, or quickly describing symbols that are not any known letter or shape. So the game trains you and a partner in efficiently building a shared language for dealing with new problems. More than that, as the game gets harder, often some of the puzzles require substantial independent computation from the player on the outside. At this point, it can make sense to play with more than two people, and start practising methods for assigning computational work between the outer people (e.g. one of them works on defusing the first part of the bomb, and while they're computing in their head for ~40 seconds, the other works on defusing the second part of the bomb in dialogue with the person on the inside). This further creates a system which trains the ability to efficiently coordinate on informational work under pressure. Overall I think it's a pretty great game for learning and practising a number of high-pressure communication skills with people you're close to.
10romeostevensit16d Flow is a sort of cybernetic pleasure: the pleasure of being in tight feedback with an environment that has fine-grained intermediary steps, allowing you to learn faster than you can even think.
9romeostevensit17d The arguments against IQ boosting on the grounds of evolution being an efficient search of the space of architectures given constraints would have applied equally well to people arguing that injectable steroids usable in humans would never be developed.
