All Posts


Saturday, September 19th 2020

Personal Blogposts
2Thomas Kwa10hGiven that social science research often doesn't replicate, is there a good way to search a social science finding or paper and see if it's valid? Ideally, one would be able to type in e.g. "growth mindset" or a link to Dweck's original research, and see: * a statement of the idea e.g. 'When "students believe their basic abilities, their intelligence, their talents, are just fixed traits", they underperform students who "understand that their talents and abilities can be developed through effort, good teaching and persistence." Carol Dweck initially studied this in 2012, measuring 5th graders on IQ tests.' * an opinion from someone reputable * any attempted replications, or meta-analyses that mention it * the Replication Markets' predicted replication probability, if no replications have been attempted.
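The lookup result such a tool might return could be sketched as a simple record type. This is only an illustrative data-structure sketch; every field name here is an assumption, not taken from any existing tool:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ReplicationRecord:
    """One entry in a hypothetical replication-lookup tool (all fields illustrative)."""
    claim: str                          # plain-language statement of the finding
    original_paper: str                 # citation or link to the original study
    expert_opinions: List[str] = field(default_factory=list)  # commentary from reputable reviewers
    replications: List[str] = field(default_factory=list)     # attempted replications / meta-analyses
    predicted_replication_prob: Optional[float] = None        # market estimate, if no replications exist

record = ReplicationRecord(
    claim="Growth mindset improves student performance.",
    original_paper="Dweck et al.",
    predicted_replication_prob=0.45,  # placeholder value, not a real market price
)
print(record.predicted_replication_prob)  # 0.45
```

The `Optional` probability field captures the post's fallback rule: show the market's prediction only when no replications have been attempted.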

Friday, September 18th 2020

7capybaralet11hIt seems like a lot of people are still thinking of alignment as too binary, which leads to critical errors in thinking like: "there will be sufficient economic incentives to solve alignment", and "once alignment is a bottleneck, nobody will want to deploy unaligned systems, since such a system won't actually do what they want". It seems clear to me that: 1) These statements are true for a certain level of alignment, which I've called "approximate value learning" in the past. I think I might have also referred to it as "pretty good alignment" or "good enough alignment" at various times. 2) This level of alignment is suboptimal from the point of view of x-safety, since the downside risk of extinction for the actors deploying the system is less than the downside risk of extinction summed over all humans. 3) We will develop techniques for "good enough" alignment before we develop techniques that are acceptable from the standpoint of x-safety. 4) Therefore, the expected outcome is: once "good enough alignment" is developed, a lot of actors deploy systems that are aligned enough for them to benefit from them, but still carry an unacceptably high level of x-risk. 5) Thus if we don't improve alignment techniques quickly enough after developing "good enough alignment", its development will likely lead to a period of increased x-risk (under the "alignment bottleneck" model).
3capybaralet11hA lot of the discussion of mesa-optimization seems confused. One thing that might be relevant towards clearing up the confusion is just to remember that "learning" and "inference" should not be thought of as cleanly separated in the first place; see, e.g., AIXI... So when we ask "is it learning? Or just solving the task without learning?", this seems like a confused framing to me. Suppose your ML system learned an excellent prior, and then just did Bayesian inference at test time. Is that learning? Sure, why not. It might not use a traditional search/optimization algorithm, but it probably has to do *something* like that for computational reasons if it wants to do efficient approximate Bayesian inference over a large hypothesis space, so...
1niplav14hIf we don't program philosophical reasoning into AI systems, they won't be able to reason philosophically.
1Mati_Roy1dI created a Facebook group to discuss moral philosophies that value life in and of itself:
1Mati_Roy1dHOW TO CALCULATE SUBJECTIVE YEARS OF LIFE? * If the brain is uniformly sped up (or slowed down), I would count this as proportionally more (or less) * Biostasis would be a complete slow down, so wouldn't count at all * I would not count unconscious sleeping or coma * I would only count dreaming if some of it is remembered (more on this) For non-human animal brains, I would compare them to the baseline of individuals in their own species. For transhumans that had their mind expanded, I don't think there's an obvious way to get an equivalence. What would be a subjective year for a Jupiter brain? Maybe it could be in terms of information processed, but in that case, a Jupiter brain would be living A LOT of subjective time per objective time. Ultimately, given I don't have "intrinsic" diminishing returns on additional experience, the natural definition for me would be the amount of 'thinking' that is as valuable. So a subjective year for my future Jupiter brain would be the duration for which I find that experience as valuable as a subjective year now. Maybe that could even account for diminishing value of experience at a specific mind size because events would start looking more and more similar?? But it otherwise wouldn't work for people that have "intrinsic" diminishing returns on additional experience. It would notably not work with people for whom marginal experiences start becoming undesirable at some point.
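The proportional-speedup rule in the bullets above reduces to a weighted sum. A minimal sketch, assuming subjective time scales linearly with speedup and biostasis/unconsciousness count as zero:

```python
def subjective_years(intervals):
    """Sum subjective time over a list of (objective_years, speedup) pairs.

    speedup 1.0 = baseline experience; 2.0 = uniformly sped-up brain;
    0.0 = biostasis, coma, or unremembered sleep (counts as nothing).
    """
    return sum(years * speedup for years, speedup in intervals)

# One objective year at double speed, one in biostasis, one at baseline:
total = subjective_years([(1.0, 2.0), (1.0, 0.0), (1.0, 1.0)])
print(total)  # 3.0
```

This captures the easy cases; the post's hard case (a mind-expanded Jupiter brain) is exactly where no single speedup factor exists, so the function wouldn't apply.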

Thursday, September 17th 2020

11DanielFilan2dModels and considerations. There are two typical ways of deciding whether on net something is worth doing. The first is to come up with a model of the relevant part of the world, look at all the consequences of doing the thing in the model, and determine if those consequences are net positive. When this is done right, the consequences should be easy to evaluate and weigh against each other. The second way is to think of a bunch of considerations in favour of and against doing something, and decide whether the balance of considerations supports doing the thing or not. I prefer model-building to consideration-listing, for the following reasons: * By building a model, you're forcing yourself to explicitly think about how important various consequences are, which is often elided in consideration-listing. Or rather, I don't know how to quantitatively compare importances of considerations without doing something very close to model-building. * Building a model lets you check which possible consequences are actually likely. This is an improvement on considerations, which are often of the form "such-and-such consequence might occur". * Building a model lets you notice consequences which you might not have immediately thought of. This can either cause you to believe that those consequences are likely, or look for a faulty modelling assumption that is producing those consequences within the model. * Building a model helps you integrate your knowledge of the world, and explicitly enforces consistency in your beliefs about different questions. However, there are also upsides to consideration-listing: * The process of constructing a model is pretty similar to consideration-listing: specifically, the part where one uses one's judgement to determine which aspects of reality are important enough to include. * Consideration-listing is much easier to do, which is why it's the form that this hastily-written shortform post takes.
5capybaralet2dI'm frustrated with the meme that "mesa-optimization/pseudo-alignment is a robustness (i.e. OOD) problem". IIUC, this is definitionally true in the mesa-optimization paper, but I think this misses the point. In particular, this seems to exclude an important (maybe the most important) threat model: the AI understands how to appear aligned, and does so, while covertly pursuing its own objective on-distribution, during training. This is exactly how I imagine a treacherous turn from a boxed superintelligent AI agent to occur, for instance. It secretly begins breaking out of the box (e.g. via manipulating humans) and we don't notice until it's too late.
4Richard_Ngo2dGreg Egan on universality:
2Dagon2dI always enjoy convoluted Omega situations, but I don't understand how these theoretical entities get to the point where their priors are as stated (and especially the meta-priors about how they should frame the decision problem). Before the start of the game, Omega has some prior distribution of the Agent's beliefs and update mechanisms. And the Agent has some distribution of beliefs about Omega's predictive power over situations where the Agent "feels like" it has a choice. What experiences cause Omega to update sufficiently to even offer the problem (ok, this is easy: quantum brain scan or other Star-Trek technobabble)? But what lets the Agent update to believing that their qualia of free-will is such an illusion in this case? And how do they then NOT meta-update to understand the belief-action-payout matrix well enough to take the most-profitable action?

Wednesday, September 16th 2020

6AllAmericanBreakfast2dI've been thinking about honesty over the last 10 years. It can play into at least three dynamics. One is authority and resistance. The revelation or extraction of information, and the norms, rules, laws, and incentives surrounding this, including moral concepts, are for the primary purpose of shaping the power dynamic. The second is practical communication. Honesty is the idea that specific people have a "right to know" certain pieces of information from you, and that you meet this obligation. There is wide latitude for "white lies," exaggeration, storytelling, "noble lies," self-protective omissions, image management, and so on in this conception. It's up to the individual's sense of integrity to figure out what the "right to know" entails in any given context. The third is honesty as a rigid rule. Honesty is about revealing every thought that crosses your mind, regardless of the effect it has on other people. Dishonesty is considered a person's natural and undesirable state, and the ability to reveal thoughts regardless of external considerations is considered a form of personal strength.
3capybaralet2dI like "tell culture" and find myself leaning towards it more often these days, but e.g. as I'm composing an email, I'll find myself worrying that the recipient will just interpret a statement like: "I'm curious about X" as a somewhat passive request for information about X (which it sort of is, but also I really don't want it to come across that way...) Anyone have thoughts/suggestions?
3capybaralet3dAs alignment techniques improve, they'll get good enough to solve new tasks before they get good enough to do so safely. This is a source of x-risk.

Tuesday, September 15th 2020

Personal Blogposts
13capybaralet4dWow this is a lot better than my FB/Twitter feed :P :D :D :D Let's do this guys! This is the new FB :P
12capybaralet4dI have the intention to convert a number of draft LW blog posts into short-forms. Then I will write a LW post linking to all of them and asking people to request that I elaborate on any that they are particularly interested in.
9mr-hire4dTrying to describe a particular aspect of Moloch I'm calling hyper-inductivity: The machine is hyper-inductive. Your descriptions of the machine are part of the machine. The machine wants you to escape, that is part of the machine. The machine knows that you know this. That is part of the machine. Your trauma fuels the machine. Healing your trauma fuels the machine. Traumatizing your kids fuels the machine. Failing to traumatize your kids fuels the machine. Defecting on the prisoner's dilemma fuels the machine. Telling others not to defect on the prisoner's dilemma fuels the machine. Your intentional community is part of the machine. Your meditation practice is part of the machine. Your art installation is part of the machine. Your protest is part of the machine. A select few will escape the machine. That is part of the machine. The machine will simplify, the machine will distort, the machine will politicize, the machine will consumerize. Jesus is part of the machine. Buddha is part of the machine. Elijah is part of the machine. Zeus is part of the machine. Your Kegan-5 ability to see outside the machine is part of the machine. Your mental models are part of the machine. Your bayesianism is part of the machine. Your shitposts are part of the machine. The machine devours. The machine creates. Your attempts to protect your ideas from the machine are part of the machine. Your attempts to fix the machine are part of the machine. Your attempts to see that the machine is an illusion are part of the machine. Your attempts to use the machine for your own purposes are part of the machine. The machine's goal is to grow the machine. The machine does not have a goal. The machine is designed to be anti-fragile. The machine is not designed. This post is part of the machine.
6capybaralet4d"No Free Lunch" (NFL) results in machine learning (ML) basically say that success all comes down to having a good prior. So we know that we need a sufficiently good prior in order to succeed. But we don't know what "sufficiently good" means. e.g. I've heard speculation that maybe we can use 2^-MDL in any widely used Turing-complete programming language (e.g. Python) for our prior, and that will give enough information about our particular physics for something AIXI-like to become superintelligent e.g. within our lifetime. Or maybe we can't get anywhere without a much better prior. DOES ANYONE KNOW of any work/(intelligent thoughts) on this?
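The 2^-MDL proposal mentioned above can be made concrete with a toy sketch. This is only an illustration of the shape of such a prior, assuming MDL is crudely proxied by source length in bits (8 per character), which badly overestimates the description length of real programs:

```python
def mdl_prior(program: str) -> float:
    """Unnormalized prior weight 2^-MDL, taking MDL as the program's
    length in bits (8 bits per character of source, a crude stand-in
    for true minimum description length)."""
    return 2.0 ** (-8 * len(program))

# A shorter hypothesis gets exponentially more prior mass than a longer one:
short, longer = "x+1", "x+1 if x > 0 else x-1"
print(mdl_prior(short) > mdl_prior(longer))  # True
```

The speculation in the post is about whether weights shaped like this, over a real language like Python, concentrate enough mass on our physics for an AIXI-like learner to be tractable in practice; the sketch only shows the exponential penalty, not any answer to that question.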
4ryan_b4dI wonder how hard it would be to design a cost+confidence widget that would be easily compatible (for liberal values of easy) with spreadsheets. I'm reading a Bloomberg piece about Boeing which quotes employees talking about the MAX as being largely a problem of choosing the lowest bidder. This is also a notorious problem in other places where there are rules which specify using the lowest cost contractor, like parts of California and many federal procurement programs. It's a pretty widespread complaint. It would also be completely crazy for it to be any other way, for what feels like a simple reason: no one knows anything except the quoted price. There's no way to communicate confidence simply or easily. The lowest bid is easily transmitted through any spreadsheet, word document, or accounting software, quality of work be damned. Any attempt to even evaluate the work is a complicated one-off report, usually with poorly considered graphics, which isn't even included with the budget information so many decision makers are unlikely to ever see it. So, a cost+confidence widget. On the simple end I suppose it could just be a re-calculation using a reference class forecast. But if I'm honest what I really want is something like a spreadsheet where each cell has both a numerical and graphical value, so I could hover my mouse over the number to see the confidence graph, or switch views to see the confidence graphs instead of the values. Then if we're getting really ambitious, something like a time+confidence widget would also be awesome. Then when the time+confidence and cost+confidence values are all multiplied together, the output is a like a heat map of the two values, showing the distribution of project outcomes overall.
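A minimal sketch of what one such cell could hold, assuming the "confidence" attached to a bid is a cost-overrun distribution (the lognormal model and all its parameters here are illustrative, not from any real procurement data):

```python
import random

random.seed(0)  # reproducible samples for the example

class CostCell:
    """A spreadsheet-cell stand-in holding a point estimate plus a
    confidence distribution, approximated by Monte Carlo samples from
    a lognormal overrun model (parameters purely illustrative)."""

    def __init__(self, bid, overrun_mu=0.0, overrun_sigma=0.3, n=10_000):
        self.bid = bid
        self.samples = sorted(
            bid * random.lognormvariate(overrun_mu, overrun_sigma) for _ in range(n)
        )

    def quantile(self, q):
        """Empirical quantile of the simulated cost distribution."""
        return self.samples[int(q * (len(self.samples) - 1))]

cell = CostCell(bid=1_000_000)
# The "confidence graph" a hover view might show, reduced to a few quantiles:
print(cell.quantile(0.1), cell.quantile(0.5), cell.quantile(0.9))
```

A time+confidence cell would be the same structure with a duration distribution, and the proposed heat map would come from multiplying the two sample sets elementwise.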

Monday, September 14th 2020

Frontpage Posts
15rohinmshah5dI often have the experience of being in the middle of a discussion and wanting to reference some simple but important idea / point, but there doesn't exist any such thing. Often my reaction is "if only there was time to write an LW post that I can then link to in the future". So far I've just been letting these ideas be forgotten, because it would be Yet Another Thing To Keep Track Of. I'm now going to experiment with making subcomments here simply collecting the ideas; perhaps other people will write posts about them at some point, if they're even understandable.
3ESRogs5dIf GPT-3 is what you get when you do a massive amount of unsupervised learning on internet text, what do you get when you do a massive amount of unsupervised learning on video data from cars? (In other words, can we expect anything interesting to come from Tesla's Dojo project, besides just better autopilot?)
2ofer5d[COVID-19 related] (Probably already obvious to most LW readers.) There seems to be a lot of uncertainty about the chances of COVID-19 causing long-term effects (including for young healthy people who experience only mild symptoms). Make sure to take this into account when deciding how much effort you're willing to put into not getting infected.

Sunday, September 13th 2020

29elityre6dTL;DR: I’m offering to help people productively have difficult conversations and resolve disagreements, for free. Feel free to email me if and when that seems helpful. elitrye [at] FACILITATION Over the past 4-ish years, I’ve had a side project of learning, developing, and iterating on methods for resolving tricky disagreements and failures to communicate. A lot of this has been in the Double Crux frame, but I’ve also been exploring a number of other frameworks (including NVC, Convergent Facilitation, Circling-inspired stuff, intuition extraction, and some home-grown methods). As part of that, I’ve had a standing offer to facilitate / mediate tricky conversations for folks in the CFAR and MIRI spheres (testimonials below). Facilitating “real disagreements” allows me to get feedback on my current conversational frameworks and techniques. When I encounter blockers that I don’t know how to deal with, I can go back to the drawing board to model those problems and interventions that would solve them, and iterate from there, developing new methods. I generally like doing this kind of conversational facilitation and am open to doing a lot more of it with a wider selection of people. I am extending an offer to help mediate tricky conversations, to anyone who might read this post, for the foreseeable future. [If I retract this offer, I’ll come back and leave a note here.] WHAT SORT OF THING IS THIS GOOD FOR? I’m open to trying to help with a wide variety of difficult conversations, but the situations where I have been most helpful in the past have had the following features: 1. Two* people are either having some conflict or disagreement or are having difficulty understanding something about what the other person is saying. 2. There’s some reason to expect the conversation to not “work”, by default: either they’ve tried already, and made little progress etc. or, at least one person can predict that this conversation will be tricky or heated
6capybaralet6dRegarding the "Safety/Alignment vs. Capabilities" meme: it seems like people are sometimes using "capabilities" to mean 2 different things: 1) "intelligence" or "optimization power"... i.e. the ability to optimize some objective function 2) "usefulness": the ability to do economically valuable tasks or things that people consider useful I think it is meant to refer to (1). Alignment is likely to be a bottleneck for (2). For a given task, we can expect 3 stages of progress: i) sufficient capabilities(1) to perform the task ii) sufficient alignment to perform the task unsafely iii) sufficient alignment to perform the task safely Between (i) and (ii) we can expect a "capabilities(1) overhang". When we go from (i) to (ii) we will see unsafe AI systems deployed and a potentially discontinuous jump in their ability to do the task.
44thWayWastrel6dAs we all do from time to time, I imagined what I'd do if through some impossible vagary I was made president of the USA all of a sudden. What would one do differently? How would you perform the kind of rapid sense making needed in such a situation? The obvious immediate step to me would be to use my presidential powers to assemble a room of people that seem smart, sane, practical, and easy to work with, and it got me thinking: who would be on that emergency list for you? One might expect the who's who of the rat-sphere, your Eliezers, Scotts, Zvis, etc. But who else does your mind immediately go to in this thought experiment? And what qualities are you looking for? And how would you organise them?

Saturday, September 12th 2020

22Vaniver7dMy boyfriend: "I want a version of the Dune fear mantra but as applied to ugh fields instead" Me: Tho they later shortened it, and I think that one was better: Him: Nice, that feels like flinch towards
6Rafael Harth7dCommon wisdom says that someone accusing you of x especially hurts if, deep down, you know that x is true. This is confusing because the general pattern I observe is closer to the opposite. At the same time, I don't think common wisdom is totally without a basis here. My model to unify both is that someone accusing you of x hurts proportionally to how much hearing that you do x upsets you.[1] And of course, one reason that it might upset you is that it's not true. But a separate reason is that you've made an effort to delude yourself about it. If you're a selfish person but spend a lot of effort pretending that you're not selfish at all, you super don't want to hear that you're actually selfish. Under this model, if someone gets very upset, it might be that deep down they know the accusation is true, and they've tried to pretend it's not, but it might also be that the accusation is super duper not true, and they're upset precisely because it's so outrageous. [1] Proportional just means it's one multiplicative factor, though. I think it also matters how high-status you perceive the other person to be.
3ChristianKl7dRoom humidity matters for COVID-19 transmission. If you are going to spend a lot of time in the same rooms as other people in the next months, invest in proper humidity to reduce your risk of getting ill:

Friday, September 11th 2020

20John_Maxwell8dProgress Studies: Hair Loss Forums I still have about 95% of my hair. But I figure it's best to be proactive. So over the past few days I've been reading a lot about how to prevent hair loss. My goal here is to get a broad overview (i.e. I don't want to put in the time necessary to understand what a 5-alpha-reductase inhibitor actually is, beyond just "an antiandrogenic drug that helps with hair loss"). I want to identify safe, inexpensive treatments that have both research and anecdotal support. In the hair loss world, the "Big 3" refers to 3 well-known treatments for hair loss: finasteride, minoxidil, and ketoconazole. These treatments all have problems. Some finasteride users report permanent loss of sexual function. If you go off minoxidil, you lose all the hair you gained, and some say it wrinkles their skin. Ketoconazole doesn't work very well. To research treatments beyond the Big 3, I've been using various tools, including both Google Scholar and a "custom search engine" I created for digging up anecdotes from forums. Basically, take whatever query I'm interested in ("pumpkin seed oil" for instance), add this OR OR OR OR OR OR OR OR OR OR OR OR OR OR OR OR and then search on Google. Doing this repeatedly has left me feeling like a geologist who's excavated a narrow stratigraphic column of Internet history. And my big takeaway is how much dumber people got collectively between the "old school phpBB forum" layer and the "subreddit" layer. This is a caricature, but I don't think it wo
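That kind of "custom search engine" is easy to script: join site: restrictions with OR and append them to the query. A minimal sketch; the domains below are placeholders, not the author's actual forum list:

```python
def forum_query(query, sites):
    """Build a Google-style query restricted to a list of forum domains.

    `sites` is a list of domain strings (placeholders here); the result
    is pasted into a normal Google search.
    """
    restriction = " OR ".join(f"site:{s}" for s in sites)
    return f"{query} {restriction}"

print(forum_query('"pumpkin seed oil"', ["", ""]))
# "pumpkin seed oil" OR
```

Google treats bare OR-chains of site: operators as a union over those domains, which is what makes the one-line trick work.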
8Ruby8dJust updated the Concepts Portal. Tags that got added are: * Infra-Bayesianism * Aversion/Ugh Fields * Murphyjitsu * Coherent Extrapolated Volition * Tool AI * Computer Science * Sleeping Beauty Paradox * Simulation Hypothesis * Counterfactuals * Trolley Problem * Climate Change * Organizational Design and Culture * Acausal Trade * Privacy * 80,000 Hours * GiveWell * Note-Taking * Reading Group
6TurnTrout8dWhen I imagine configuring an imaginary pile of blocks, I can feel the blocks in front of me in this fake imaginary plane of existence. I feel aware of their spatial relationships to me, in the same way that it feels different to have your eyes closed in a closet vs in an empty auditorium. But what is this mental workspace? Is it disjoint and separated from my normal spatial awareness, or does my brain copy/paste->modify my real-life spatial awareness. Like, if my brother is five feet in front of me, and then I imagine a blade flying five feet in front of me in my imaginary mental space where he doesn't exist, do I reflexively flinch? Does my brain overlay these two mental spaces, or are they separate? I don't know. When I run the test, I at least flinch at the thought of such a thing happening. This isn't a good experiment because I know what I'm testing for; I need to think of a better test.
4avturchin8dTwo types of Occam's razor: 1) The simplest explanation is the most probable, so the distribution of probabilities for hypotheses looks like: 0.75, 0.12, 0.04 ... if hypotheses are ordered from simplest to most complex. 2) The simplest explanation is only slightly more probable, so the distribution of probabilities for hypotheses looks like: 0.09, 0.07, 0.06, 0.05. The interesting feature of the second type is that the simplest explanation is more likely to be wrong than right (its probability is less than 0.5). Different types of Occam's razor are applicable in different situations. If the simplest hypothesis is significantly simpler than the others, it is the first case. If all hypotheses are complex, it is the second. The first situation is more applicable to inherently simple models, e.g. laws of physics or games. The second situation is more typical of complex real-life situations.
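The two distribution shapes above can be generated explicitly. A sketch with illustrative parameters (the decay ratio and weight formulas are assumptions chosen only to reproduce the two regimes, not anything canonical):

```python
def geometric_prior(n, r=0.16):
    """Type 1: mass concentrated on the simplest hypothesis,
    with ratio r between successive hypotheses."""
    weights = [r ** k for k in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def flat_ish_prior(n):
    """Type 2: simplicity gives only a mild edge, so no single
    hypothesis reaches probability 0.5."""
    weights = [1 / (k + 2) for k in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

p1, p2 = geometric_prior(10), flat_ish_prior(10)
print(p1[0] > 0.5, p2[0] < 0.5)  # True True
```

In the first regime the simplest hypothesis is more likely right than wrong; in the second it is still the single best guess but more likely wrong than right, which is the post's point.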
2ChristianKl8dSchools around the world seem to be starting to use automated grading for tests. If that technology exists, it would be interesting to have a forum that requires posts to reach a minimum score from those grading systems.

Thursday, September 10th 2020

10jimrandomh9dVitamin D reduces the severity of COVID-19, with a very large effect size, in an RCT. Vitamin D has a history of weird health claims around it failing to hold up in RCTs (this SSC post has a decent overview). But, suppose the mechanism of vitamin D is primarily immunological. This has a surprising implication: It means negative results in RCTs of vitamin D are not trustworthy. There are many health conditions where having had a particular infection, especially a severe case of that infection, is a major risk factor. For example, 90% of cases of cervical cancer are caused by HPV infection. There are many known infection-disease pairs like this (albeit usually with smaller effect size), and presumably also many unknown infection-disease pairs like this as well. Now suppose vitamin D makes you resistant to getting a severe case of a particular infection, which increases risk of a cancer at some delay. Researchers do an RCT of vitamin D for prevention of that kind of cancer, and their methodology is perfect. Problem: What if that infection wasn't common at the time and place the RCT was performed, but is common somewhere else? Then the study will give a negative result. This throws a wrench into the usual epistemic strategies around vitamin D, and around every other drug and supplement where the primary mechanism of action is immune-mediated.
7romeostevensit9dIt would be really cool to link the physical open or closed state of your bedroom door to your digital notifications and 'online' statuses.
2Donald Hobson9dI was working on a result about Turing machines in nonstandard models. Then I found I had rediscovered Chaitin's incompleteness theorem. I am trying to figure out how this relates to an AI that uses Kolmogorov complexity.
