All Posts

Sorted by Magic (New & Upvoted)

Week Of Sunday, January 12th 2020

Shortform [Beta]
15TurnTrout4dWhile reading Focusing today, I thought about the book and wondered how many exercises it would have. I felt a twinge of aversion. In keeping with my goal of increasing internal transparency, I said to myself: "I explicitly and consciously notice that I felt averse to some aspect of this book". I then Focused on the aversion. Turns out, I felt a little bit disgusted, because a part of me reasoned thusly: (Transcription of a deeper Focusing on this reasoning) I'm afraid of being slow. Part of it is surely the psychological remnants of the RSI I developed in the summer of 2018. That is, slowing down is now emotionally associated with disability and frustration. There was a period of meteoric progress as I started reading textbooks and doing great research, and then there was pain. That pain struck even when I was just trying to take care of myself, sleep, open doors. That pain then left me on the floor of my apartment, staring at the ceiling, desperately willing my hands to just get better. They didn't (for a long while), so I just lay there and cried. That was slow, and it hurt. No reviews, no posts, no typing, no coding. No writing, slow reading. That was slow, and it hurt. Part of it used to be a sense of "I need to catch up and learn these other subjects which [Eliezer / Paul / Luke / Nate] already know". Through internal double crux, I've nearly eradicated this line of thinking, which is neither helpful nor relevant nor conducive to excitedly learning the beautiful settled science of humanity. Although my most recent post [https://www.lesswrong.com/posts/eX2aobNp5uCdcpsiK/on-being-robust] touched on impostor syndrome, that isn't really a thing for me. I feel reasonably secure in who I am, now (although part of me worries that others wrongly view me as an impostor?). However, I mostly just want to feel fast, efficient, and swift again. I sometimes feel like I'm in a race with Alex2018, and I feel like I'm losing.
12jacobjacob3dI saw an ad for a new kind of pant: stylish as suit pants, but flexible as sweatpants. I didn't have time to order them now. But I saved the link in a new tab in my clothes database -- an Airtable that tracks all the clothes I own. This crystallised some thoughts about external systems that have been brewing at the back of my mind. In particular, about the gears-level principles that make some of them useful, and powerful. When I say "external", I am pointing to things like spreadsheets, apps, databases, organisations, notebooks, institutions, room layouts... and distinguishing those from minds, thoughts and habits. (Though this distinction isn't exact, as will be clear below, and some of these ideas are at an early stage.) Externalising systems allows the following benefits... 1. GATHERING ANSWERS TO UNSEARCHABLE QUERIES There are often things I want lists of, which are very hard to Google or research. For example: * List of groundbreaking discoveries that seem trivial in hindsight * List of different kinds of confusion, identified by their phenomenological qualia * List of good-faith arguments which are elaborate and rigorous, though uncertain, and which turned out to be wrong etc. Currently there is no search engine (other than the human mind) capable of finding many of these answers (if I am expecting a certain level of quality). But for that reason researching the lists is also very hard. The only way I can build these lists is by accumulating those nuggets of insight over time. And the way I make that happen is to make sure to have external systems which are ready to capture those insights as they appear. 2. SEIZING SERENDIPITY Luck favours the prepared mind. Consider the following anecdote: I think this is true far beyond intellectual discovery. In order for the most valuable companies to exist, there must be VCs ready to fund those companies when their founders are toying with the ideas. In order for the best jokes to exist, the
6MichaelA1dWAYS OF DESCRIBING THE “TRUSTWORTHINESS” OF PROBABILITIES While doing research for a post on the idea of a distinction between “risk” and “(Knightian) uncertainty [https://en.wikipedia.org/wiki/Knightian_uncertainty]”, I came across a surprisingly large number of different ways of describing the idea that some probabilities may be more or less “reliable”, “trustworthy”, “well-grounded”, etc. than others, or things along those lines. (Note that I’m referring to the idea of different degrees of trustworthiness-or-whatever, rather than two or more fundamentally different types of probability that vary in trustworthiness-or-whatever.) I realised that it might be valuable to write a post collecting all of these terms/concepts/framings together, analysing the extent to which some may be identical to others, highlighting ways in which they may differ, suggesting ways or contexts in which some of the concepts may be superior to others, etc.[1] [#fn-wGnf2warekZDiMkWj-1] But there are already too many things I’m working on writing at the moment, so this is a low-effort version of that idea - this is basically just a collection of the concepts, relevant quotes, and links where readers can find more. Comments on this post will inform whether I take the time to write something more substantial/useful on this topic later (and, if so, precisely what and how). Note that this post does not explicitly cover the “risk vs uncertainty” framing itself, as I’m already writing a separate, more thorough post on that. EPISTEMIC CREDENTIALS Dominic Roser [https://link.springer.com/article/10.1007%2Fs11948-017-9919-x] speaks of how “high” or “low” the epistemic credentials of our probabilities are. He writes: He further explains what he means by this in a passage that also alludes to many other ways of describing or framing an idea along the lines of the trustworthiness of given probabilities: RESILIENCE (OF CREDENCES) Amanda Askell discusses the idea that we can have “more” or “less” res
4ozziegooen3dOne question around the "Long Reflection" or around "What will AGI do?" is something like, "How bottlenecked will we be by scientific advances that we'll need to then spend significant resources on?" I think some assumptions that this model typically holds are: 1. There will be decision-relevant unknowns. 2. Many decision-relevant unknowns will be EV-positive to work on. 3. Of the decision-relevant unknowns that are EV-positive to work on, these will take between 1% and 99% of our time. (3) seems quite uncertain to me in the steady state. I believe it amounts to an intuitive estimate spanning about 2 orders of magnitude, while the actual uncertainty is much larger than that. If this were the case, it would mean: 1. Almost all possible experiments are either trivial (<0.01% of resources, in total), or not cost-effective. 2. If some things are cost-effective and still expensive (they will take over 1% of the AGI lifespan), it's likely that they will take 100%+ of the time. Even if they would take 10^10% of the time, in expectation, they could still be EV-positive to pursue. I wouldn't be surprised if there were one single optimal thing like this in the steady-state. So this strategy would look something like, "Do all the easy things, then spend a huge amount of resources on one gigantic-sized, but EV-high challenge." (This was inspired by a talk that Anders Sandberg gave)
4toonalfrink4dSo here's two extremes. One is that human beings are a complete lookup table. The other one is that human beings are perfect agents with just one goal. Most likely both are somewhat true. We have subagents that are more like the latter, and subsystems more like the former. But the emphasis on "we're just a bunch of hardcoded heuristics" is making us stop looking for agency where there is in fact agency. Take for example romantic feelings. People tend to regard them as completely unpredictable, but it is actually possible to predict to some extent whether you'll fall in and out of love with someone based on some criteria, like whether they're compatible with your self-narrative and whether their opinions and interests align with yours, etc. The same is true for many intuitions that we often tend to dismiss as just "my brain" or "neurotransmitter xyz" or "some knee-jerk reaction". There tends to be a layer of agency in these things. A set of conditions that makes these things fire off, or not fire off. If we want to influence them, we should be looking for the levers, instead of just accepting these things as a given. So sure, we're godshatter, but the shards are larger than we give them credit for.

Week Of Sunday, January 5th 2020

Shortform [Beta]
17bgold12d* Yes And is an improv technique where you keep the energy in a scene alive by going w/ the other person's suggestion and adding more to it. "A: Wow is that your pet monkey? B: Yes and he's also my doctor!" * Yes And is generative (creates a lot of output), as opposed to Hmm No which is critical (distills output) * A lot of the Sequences is Hmm No * It's not that Hmm No is wrong, it's that it cuts off future paths down the Yes And thought-stream. * If there's a critical error at the beginning of a thought that will undermine everything else then it makes sense to Hmm No (we don't want to spend a bunch of energy on something that will be fundamentally unsound). But if the later parts of the thought stream are not closely dependent on the beginning, or if it's only part of the stream that gets cut off, then you've lost a lot of potential value that could've been generated by the Yes And. * In conversation Yes And is much more fun, which might be why the Sequences are important as a corrective (yeah look it's not fun to remember about biases, but they exist and you should model/include them) * Write drunk, edit sober. Yes And drunk, Hmm No in the morning.
15tragedyofthecomments7dI often see people making statements that sound to me like . . . "The entity in charge of bay area rationality should enforce these norms." or "The entity in charge of bay area rationality is bad for allowing x to happen." There is no entity in charge of bay area rationality. There's a bunch of small groups of people that interact with each other sometimes. They even have quite a bit of shared culture. But no one is in charge of this thing, there is no entity making the set of norms for rationalists, there is no one you can outsource the building of your desired group to.
13Vanessa Kosoy12dSome thoughts about embedded agency. From a learning-theoretic perspective, we can reformulate the problem of embedded agency as follows: What kind of agent, and in what conditions, can effectively plan for events after its own death? For example, Alice bequeaths eir fortune to eir children, since ey want them to be happy even when Alice emself is no longer alive. Here, "death" can be understood to include modification, since modification is effectively destroying an agent and replacing it by a different agent[1] [#fn-Ny6yxNfPASvfJoXSr-1]. For example, Clippy 1.0 is an AI that values paperclips. Alice disabled Clippy 1.0 and reprogrammed it to value staples before running it again. Then, Clippy 2.0 can be considered to be a new, different agent. First, in order to meaningfully plan for death, the agent's reward function has to be defined in terms of something different than its direct perceptions. Indeed, by definition the agent no longer perceives anything after death. Instrumental reward functions [https://www.alignmentforum.org/posts/aAzApjEpdYwAxnsAS/reinforcement-learning-with-imperceptible-rewards] are somewhat relevant but still don't give the right object, since the reward is still tied to the agent's actions and observations. Therefore, we will consider reward functions defined in terms of some fixed ontology of the external world. Formally, such an ontology can be an incomplete[2] [#fn-Ny6yxNfPASvfJoXSr-2] Markov chain, the reward function being a function of the state. Examples: * The Markov chain is a representation of known physics (or some sector of known physics). The reward corresponds to the total mass of diamond in the world. To make this example work, we only need enough physics to be able to define diamonds. For example, we can make do with quantum electrodynamics + classical gravity and have the Knightian uncertainty account for all nuclear and high-energy phenomena. * The Markov chain is a representation of people and
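A minimal formal sketch of the ontology described above. The notation is illustrative, not from the post: it simply spells out "an incomplete Markov chain with a reward defined on states rather than on observations", with a discount factor and an uncertainty set added as assumptions.

```latex
% Sketch (illustrative notation, assuming discounting and a set-valued kernel):
%   State space of the external-world ontology:      S
%   Transition kernel, possibly incompletely known:  T : S \to \Delta(S),
%     with Knightian uncertainty modelled as T being known only to lie in a set \mathcal{T}.
%   Reward as a function of the state, not of the agent's observations:
%     r : S \to \mathbb{R}, e.g. r(s) = total mass of diamond in state s.
% The objective is then defined over world-trajectories, so it stays meaningful
% after the agent's own death or modification:
\[
  U = \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{\,t}\, r(s_t)\right],
  \qquad s_{t+1} \sim T(\cdot \mid s_t), \quad \gamma \in (0,1).
\]
```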
11TurnTrout12dSuppose you could choose how much time to spend at your local library, during which: * you do not age. Time stands still outside; no one enters or exits the library (which is otherwise devoid of people). * you don't need to sleep/eat/get sunlight/etc * you can use any computers, but not access the internet or otherwise bring in materials with you * you can't leave before the requested time is up Suppose you don't go crazy from solitary confinement, etc. Remember that value drift is a potential thing. How long would you ask for?
9George10dNote 1: Not a relevant analogy unless you use the StackExchange Network. I think the stack overflow reputation system is a good analogy for the issues one encounters with a long-running monetary system. I like imaginary awards, when I was younger I specifically liked the imaginary awards from stack overflow (Reputation) because I thought they'd help me get some recognition as a developer (silly, but in my defense, I was a teenager). However, it proved to be very difficult to find questions that nobody else had answered which I could answer and were popular enough to get more than one or two upvotes for said answer (upvotes generate reputation). I got to like 500 reputation and I slowly started being less active on SO (now the only questions I answer are basically my own, in case nobody provides an answer but I end up finding a solution). I recently checked my reputation on SO and noticed I was close to 2000 points, despite not being active on the website in almost 4 years o.o Because reputation from "old questions" accumulates. I thought "oh, how much would young me have loved to see this now-valueless currency reach such an arbitrarily high level". I think this is in many ways analogous to the issues with the monetary system. Money seems to lose its appeal as you get older, since it can buy less and less and you need less and less. All your social signaling and permanent possession needs are gone by the time you hit 60. All your "big dreams" now require too much energy, even if you theoretically have the capital to put them in practice. At the same time Stack Exchange reputation gives you the power to judge others, you can gift reputation for a good answer, you can edit people's answers and questions without approval, you can review questions and decide they are duplicates or don't fit the community and reject them. Again, something I'd be very good at when I was 19, and deeply passionate about software development. Something that I'm probably less good at no

Week Of Sunday, December 29th 2019

Frontpage Posts
Shortform [Beta]
34BrienneYudkowsky19dI wrote up my shame processing method. I think it comes from some combination of Max (inspired by NVC maybe?), Anna (mostly indirectly), and a lot of trial and error. I've been using it for a couple of years (in various forms), but I don't have much PCK on it yet. If you'd like to try it out, I'd love for you to report back on how it went! Please also ask me questions. What's up with shame? According to me, shame is for keeping your actions in line with what you care about. It happens when you feel motivated to do something that you believe might damage what is valuable (whether or not you actually do the thing). Shame indicates a particular kind of internal conflict. There's something in favor of the motivation, and something else against it. Both parts are fighting for things that matter to you. What is this shame processing method supposed to do? This shame processing method is supposed to aid in the goal of shame itself: staying in contact with what you care about as you act. It's also supposed to develop a clearer awareness of what is at stake in the conflict so you can use your full intelligence to solve the problem. What is the method? The method is basically a series of statements with blanks to fill in. The statements guide you a little at a time toward a more direct way of seeing your conflict. Here's a template; it's meant to be filled out in order. I notice that I feel ashamed. I think I first started feeling it while ___. I care about ___(X). I'm not allowed to want ___ (Y). I worry that if I want Y, ___. What's good about Y is ___(Z). I care about Z, and I also care about X. Example (a real one, from this morning): I notice that I feel ashamed. I think I first started feeling it while reading the first paragraph of a Lesswrong post. I care about being creative. I'm not allowed to want to move at a comfortable pace. I worry that if I move at a comfortable pace, my thoughts will slow down more and more over time and I'll become a vegetable.
21bgold16d* Why do I not always have conscious access to my inner parts? Why, when speaking with authority figures, might I have a sudden sense of blankness? * Recently I've been thinking about this reaction in the frame of 'legibility', a la Seeing Like a State. States would impose organizational structures on societies that were easy to see and control - they made the society more legible - to the actors who ran the state, but these organizational structures were bad for the people in the society. * For example, census data, standardized weights and measures, and uniform languages make it easier to tax and control the population. [Wikipedia] * I'm toying with applying this concept across the stack. * If you have an existing model of people being made up of parts [Kaj's articles], I think there's a similar thing happening. I notice I'm angry but can't quite tell why or get a conceptual handle on it - if it were fully legible and accessible to the conscious mind, then it would be much easier to apply pressure and control that 'part', regardless of whether the control I am exerting is good. So instead, it remains illegible. A level up, in a small group conversation, I notice I feel missed, like I'm not being heard in fullness, but someone else directly asks me about my model and I draw a blank, like I can't access this model or share it. If my model were legible, someone else would get more access to it and be able to control it/point out its flaws. That might be good or it might be bad, but if it's illegible it can't be "coerced"/"mistaken" by others. One more level up, I initially went down this track of thinking for a few reasons, one of which was wondering why prediction forecasting systems are so hard to adopt within organizations. Operationalization of terms is difficult and it's hard to get a precise enough question that everyone can agree on, but it's very 'unfun' to have uncertain terms (people are much more lik
17Ben Pace16dThere's a game for the Oculus Quest (that you can also buy on Steam) called "Keep Talking And Nobody Explodes". It's a two-player game. When playing with the VR headset, one of you wears the headset and has to defuse bombs in a limited amount of time (either 3, 4 or 5 mins), while the other person sits outside the headset with the bomb-defusal manual and tells you what to do. Whereas with other collaboration games you're all looking at the screen together, with this game the substrate of communication is solely conversation: the other person provides all of your input about how their half is going (i.e. it's not shown on a screen). The types of puzzles are fairly straightforward computational problems but with lots of fiddly instructions, and require the outer person to figure out what information they need from the inner person. It often involves things like counting numbers of wires of a certain colour, or remembering the previous digits that were being shown, or quickly describing symbols that are not any known letter or shape. So the game trains you and a partner in efficiently building a shared language for dealing with new problems. More than that, as the game gets harder, often some of the puzzles require substantial independent computation from the player on the outside. At this point, it can make sense to play with more than two people, and start practising methods for assigning computational work between the outer people (e.g. one of them works on defusing the first part of the bomb, and while they're computing in their head for ~40 seconds, the other works on defusing the second part of the bomb in dialogue with the person on the inside). This further creates a system which trains the ability to efficiently coordinate on informational work under pressure. Overall I think it's a pretty great game for learning and practising a number of high-pressure communication skills with people you're close to.
10romeostevensit14dFlow is a sort of cybernetic pleasure. The pleasure of being in tight feedback with an environment that has fine grained intermediary steps allowing you to learn faster than you can even think.
9romeostevensit16dThe arguments against IQ boosting on the grounds of evolution as an efficient search of the space of architecture given constraints would have applied equally well for people arguing that injectable steroids usable in humans would never be developed.

Week Of Sunday, December 22nd 2019

Frontpage Posts
Shortform [Beta]
39Kaj_Sotala20dOccasionally I find myself nostalgic for the old, optimistic transhumanism of which e.g. this 2006 article [https://web.archive.org/web/20081008121438/http://www.acceleratingfuture.com/michael/blog/2006/09/overpopulation-no-problem/] is a good example. After some people argued that radical life extension would increase our population too much, the author countered that oh, that's not an issue, here are some calculations showing that our planet could support a population of 100 billion with ease! In those days, the ethos seemed to be something like... first, let's apply a straightforward engineering approach to eliminating aging [https://en.wikipedia.org/wiki/Strategies_for_Engineered_Negligible_Senescence], so that nobody who's alive needs to worry about dying from old age. Then let's get nanotechnology and molecular manufacturing to eliminate scarcity and environmental problems. Then let's re-engineer the biosphere and human psychology for maximum well-being, such as by using genetic engineering to eliminate suffering [https://www.abolitionist.com/] and/or making it a violation of the laws of physics to try to harm or coerce someone [http://www.mitchellhowe.com/sysopfaq.htm]. So something like "let's fix the most urgent pressing problems and stabilize the world, then let's turn into a utopia". X-risk was on the radar, but the prevailing mindset seemed to be something like "oh, x-risk? yeah, we need to get to that too". That whole mindset used to feel really nice. Alas, these days it feels like it was mostly wishful thinking. I haven't really seen that spirit in a long time; the thing that passes for optimism these days is "Moloch hasn't entirely won (yet [https://www.lesswrong.com/posts/ham9i5wf4JCexXnkN/moloch-hasn-t-won])". If "overpopulation? no problem!" felt like a prototypical article to pick from the Old Optimistic Era, then Today's Era feels more described by Inadequate Equilibria [https://equilibriabook.com/] and a post saying "if you can afford it, c
29hamnox25dI went to some martial arts class, jiu jitsu, and before they taught me anything else they taught me how to break falls safely. Same with parkour class. You're going to fall, they said. You need a way to catch yourself without fucking up your arms or back. It's not just about mistakes when you're learning a new move, either, though it will certainly happen more often then. You're throwing yourself all over the place, tripping each other; you're going to hit the ground with momentum. You need to know how to handle yourself when that happens, how to roll with it and get up right after safe and sound. Every class, the first thing we do is drill break falls. I don't think The Art of Rationality has that. Yes we notice the skulls [https://slatestarcodex.com/2017/04/07/yes-we-have-noticed-the-skulls/]. It seems like I see a new treatise pointing out the valley of bad rationality every few months. And yet... * When you share what you know, do you share safety skills and warnings with it? * Do you have a sense of how likely you are to injure yourself in your practice? * What specific actions do you take when you notice you're taking epistemic damage? * How strong are your skills in harm-minimization? Do you have it down to ingrained reaction or habit? * Do you practice locating individual personal abilities + limits with the distribution of expected human traits as a guide, or are you fitting your strategies to a population-level statistic? I have some ideas. I wanna hear yours.
15ozziegooen24dNamespace pollution and name collision are two great concepts in computer programming. The way they are handled in many academic environments seems quite naive to me. Programs can get quite large and thus naming things well is surprisingly important. Many of my code reviews are primarily about coming up with good names for things. In a large codebase, every time symbolicGenerator() is mentioned, it refers to the same exact thing. If one part of the codebase has been using symbolicGenerator for a reasonable set of functions, and another part later comes up whose programmer realizes that symbolicGenerator is also the best name for that piece, they have to make a tough decision. Either they refactor the codebase to change all previous mentions of symbolicGenerator to an alternative name, or they come up with an alternative name for the new piece. They can't have it both ways. Therefore, naming becomes a political process. Names touch many programmers who have different intuitions and preferences. A large refactor of naming in a section of the codebase that others use would often be taken quite hesitantly by that group. This makes it all the more important that good names are used initially. As such, reviewers care a lot about the names being pretty good; hopefully they are generic enough so that their components could be expanded while the name remains meaningful, but specific enough to be useful for remembering. Names that get submitted via pull requests represent much of the human part of the interface/API; they're harder to change later on, so obviously require extra work to get right the first time. To be clear, a name collision is when two unrelated variables have the same name, and namespace pollution refers to when code is initially submitted in ways that are likely to create unnecessary conflicts later on. Academia My impression is that in much of academia, there are few formal processes for groups of experts to agree on the names for things.
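A minimal TypeScript sketch of the collision described above. The identifier symbolicGenerator comes from the post; the namespace names, signatures, and bodies are hypothetical stand-ins, and namespacing is just one way the conflict gets resolved mechanically (the post's point about naming politics still applies to whichever part "owns" the plain name).

```typescript
// Sketch: two unrelated parts of a codebase both want the name `symbolicGenerator`.
// Wrapping each in its own namespace avoids the collision without a global refactor;
// everything below is illustrative only.

namespace Algebra {
  // First claimant of the name: generates symbolic math expressions.
  export function symbolicGenerator(seed: string): string {
    return `expr(${seed})`; // stand-in for real expression generation
  }
}

namespace Reporting {
  // Later claimant: same "best name", different meaning. Without a namespace,
  // this declaration would collide with Algebra's and force a rename or refactor.
  export function symbolicGenerator(title: string): string {
    return `# symbolic report: ${title}`; // stand-in for real report generation
  }
}

// Call sites stay unambiguous because each use is qualified by its namespace.
console.log(Algebra.symbolicGenerator("x + y"));
console.log(Reporting.symbolicGenerator("Weekly summary"));
```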
15ozziegooen25dEXPERIMENTAL PREDICTABILITY AND GENERALIZABILITY ARE CORRELATED A criticism of having people attempt to predict the results of experiments is that this will be near impossible. The idea is that experiments are highly sensitive to parameters and these would need to be deeply understood in order for predictors to have a chance at being more accurate than an uninformed prior. For example, in a psychological survey, it would be important that the predictors knew the specific questions being asked, details about the population being sampled, many details about the experimenters, et cetera. One counter-argument may not be to say that prediction will be easy in many cases, but rather that if these experiments cannot be predicted in a useful fashion without very substantial amounts of time, then these experiments probably aren't going to be very useful anyway. Good scientific experiments produce results that are generalizable [https://en.wikipedia.org/wiki/External_validity]. For instance, a study on the effectiveness of a Malaria intervention on one population should give us useful information (probably for use with forecasting) about its effectiveness on other populations. If it doesn't, then its value would be limited. It would really be more of a historical statement than a scientific finding. Possible statement from a non-generalizable experiment: -------------------------------------------------------------------------------- FORMALIZATION One possible way of starting to formalize this a bit is to imagine experiments (assuming internal validity) as mathematical functions. The inputs would be the parameters and details of how the experiment was performed, and the results would be the main findings that the experiment found. experiment_n(inputs) = findings If the experiment has internal validity, then observers should predict that if an identical (but subsequent) experiment were performed, it would result in identical findings. p((experiment_{n+1}(inputs_i) = findings_i) | (experiment_n(inpu
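The final formula in the preview above is cut off. A plausible reconstruction, under the assumption that the truncated condition mirrors the left-hand side (as the surrounding sentence about identical re-runs suggests), is:

```latex
% Experiments (assuming internal validity) viewed as functions of their inputs:
%   experiment_n(inputs) = findings
% Internal validity then says an identical subsequent run should reproduce the findings:
\[
  p\Big( \text{experiment}_{n+1}(\text{inputs}_i) = \text{findings}_i
         \;\Big|\; \text{experiment}_{n}(\text{inputs}_i) = \text{findings}_i \Big) \approx 1 .
\]
% Generalizability is the stronger claim that the findings remain informative
% when the inputs (e.g. the population studied) change somewhat.
```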
14jacobjacob21dI made a Foretold notebook [https://www.foretold.io/c/52f5c66f-7b65-4d84-b1f6-ee345fe50df0/n/9a5d4b4b-bcd4-4dcd-8fd0-042ac68dd346] for predicting which posts will end up in the Best of 2018 book, following the LessWrong review. You can submit your own predictions as well. At some point I might write a longer post explaining why I think having something like "futures markets" on these things can create a more "efficient market" for content.
