AIXSU - AI and X-risk Strategy Unconference
Portland SSC Meetup 10/01/19
Petrov Day Celebration 2019 - Oxford Campsite
How Specificity Works
Do Anki while Weightlifting
Many rationalists appear to be interested in weightlifting. I certainly have enjoyed having a gym habit. I have a recommendation for those who do: Try studying Anki cards [https://twitter.com/michael_nielsen/status/957763229454774272?lang=en] while resting between weightlifting sets. The upside is high. Building the habit of studying Anki cards is hard, and if doing it at the gym causes it to stick, you can now remember things by choice, not chance. And the cost is pretty low. I rest for 90 seconds between sets, and do about 20 sets when I go to the gym. Assuming I get a usable minute per rest once the overheads are accounted for, that gives me 20 minutes of studying. I go through about 4 cards per minute, so I could do 80 cards per visit to the gym. In practice I spend only ~5 minutes studying per visit, because I don't have that many cards. I'm not too tired to concentrate. In fact, the adrenaline high makes me happy to have something mentally active to do. Probably because of this, it doesn't at all decrease my desire to go to the gym. I find I can add simple cards to my Anki deck at the gym, although the mobile app does make it slow. Give it a try! It's cheap to experiment and the value of a positive result is high.
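The arithmetic in the post can be checked with a short Python sketch. The function name and parameter names are hypothetical, chosen only for illustration; the figures are the ones given above (20 sets, roughly one usable minute per rest, 4 cards per minute).

```python
def cards_per_gym_visit(sets: int, usable_minutes_per_rest: float,
                        cards_per_minute: float) -> float:
    """Estimate how many Anki cards fit into the rest periods of one gym visit."""
    study_minutes = sets * usable_minutes_per_rest
    return study_minutes * cards_per_minute


# Figures from the post: 20 sets, ~1 usable minute out of each 90-second rest,
# and a review speed of about 4 cards per minute.
upper_bound = cards_per_gym_visit(sets=20, usable_minutes_per_rest=1.0,
                                  cards_per_minute=4.0)
print(upper_bound)  # 80.0

# The ~5 minutes actually spent studying corresponds to about 20 cards per visit.
actual_cards = 5 * 4.0
print(actual_cards)  # 20.0
```

The 80-card figure is an upper bound; the number of cards due on a given day is the binding limit in practice.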
Selected Aphorisms from Francis Bacon's Novum Organum
I'm currently working to format Francis Bacon's Novum Organum [https://en.wikipedia.org/wiki/Novum_Organum] as a LessWrong sequence. It's a moderate-sized project, as I have to work through the entire work myself and write an introduction which does Novum Organum justice and explains the novel move of taking an existing work and posting it on LessWrong (short answer: NovOrg is some serious hardcore rationality and contains central tenets of the LW foundational philosophy notwithstanding being published back in 1620, not to mention that Bacon and his works are credited with launching the modern Scientific Revolution). While I'm still working on this, I want to go ahead and share some of my favorite aphorisms from it so far: 3. . . . The only way to command reality is to obey it . . . 9. Nearly all the things that go wrong in the sciences have a single cause and root, namely: while wrongly admiring and praising the powers of the human mind, we don’t look for true helps for it. Bacon sees the unaided human mind as entirely inadequate for scientific progress. He sees the way forward for scientific progress as constructing tools/infrastructure/methodology to help the human mind think/reason/do science. 10. Nature is much subtler than are our senses and intellect; so that all those elegant meditations, theorizings and defensive moves that men indulge in are crazy—except that no-one pays attention to them. [Bacon often uses a word meaning ‘subtle’ in the sense of ‘fine-grained, delicately complex’; no one current English word will serve.] 24. There’s no way that axioms •established by argumentation could help us in the discovery of new things, because the subtlety of nature is many times greater than the subtlety of argument. But axioms •abstracted from particulars in the proper way often herald the discovery of new particulars and point them out, thereby returning the sciences to their active status. Bacon repeat
Facebook comment I wrote in February, in response to the question [https://www.facebook.com/bshlgrs/posts/10215817023473267?comment_id=10215823190387436] 'Why might having beauty in the world matter?': I assume you're asking about why it might be better for beautiful objects in the world to exist (even if no one experiences them), and not asking about why it might be better for experiences of beauty to exist. [... S]ome reasons I think this: 1. If it cost me literally nothing, I feel like I'd rather there exist a planet that's beautiful, ornate, and complex than one that's dull and simple -- even if the planet can never be seen or visited by anyone, and has no other impact on anyone's life. This feels like a weak preference, but it helps get a foot in the door for beauty. (The obvious counterargument here is that my brain might be bad at simulating the scenario where there's literally zero chance I'll ever interact with a thing; or I may be otherwise confused about my values.) 2. Another weak foot-in-the-door argument: People seem to value beauty, and some people claim to value it terminally. Since human value is complicated and messy and idiosyncratic (compare person-specific ASMR triggers or nostalgia triggers or culinary preferences) and terminal and instrumental values are easily altered and interchanged in our brain, our prior should be that at least some people really do have weird preferences like that at least some of the time. (And if it's just a few other people who value beauty, and not me, I should still value it for the sake of altruism and cooperativeness.) 3. If morality isn't "special" -- if it's just one of many facets of human values, and isn't a particularly natural-kind-ish facet -- then it's likelier that a full understanding of human value would lead us to treat aesthetic and moral preferences as more coextensive, interconnected, and fuzzy. If I can value someone else's happiness inherently, without needing to experience or know about i
Eliezer has written about the notion of security mindset [https://www.lesswrong.com/posts/8gqrbnW758qjHFTrH/security-mindset-and-ordinary-paranoia], and there's an important idea that attaches to that phrase, which some people have an intuitive sense of and ability to recognize, but I don't think Eliezer's post quite captured the essence of the idea, or presented anything like a usable roadmap of how to acquire it. An1lam's recent shortform post [https://www.lesswrong.com/posts/xDWGELFkyKdBpySAf/an1lam-s-short-form-feed#jBwdmYjPCkSCDngX6] talked about the distinction between engineering mindset and scientist mindset, and I realized that, with the exception of Eliezer and perhaps a few people he works closely with, all of the people I know of with security mindset are engineer-types rather than scientist-types. That seemed like a clue; my first theory was that the reason for this is that engineer-types get to actually write software that might have security holes, and have the feedback cycle of trying to write secure software. But I also know plenty of otherwise-decent software engineers who don't have security mindset, at least of the type Eliezer described. My hypothesis is that to acquire security mindset, you have to: * Practice optimizing from a red team/attacker perspective; * Practice optimizing from a defender perspective; and * Practice modeling the interplay between those two perspectives. So a software engineer can acquire security mindset because they practice writing software which they don't want to have vulnerabilities, they practice searching for vulnerabilities (usually as an auditor simulating an attacker rather than as an actual attacker, but the cognitive algorithm is the same), and they practice going meta when they're designing the architecture of new projects. This explains why security mindset is very common among experienced senior engineers (who have done each of the three many times), and rare among junior engineers (who haven't yet)
Someone tried to solve a big schlep of event organizing. Through this app, you:
* Pledge money when signing up to an event
* Lose it if you don't attend
* Get it back if you attend, plus a share of the money from all the no-shows
For some reason it uses crypto as the currency. I'm also not sure about the third clause, which seems to incentivise you to want others to no-show so that you get a share of their deposits. Anyway, I've heard people wanting something like this to exist and might try it myself at some future event I'll organize. https://kickback.events/ H/T Vitalik Buterin's Twitter
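The payout rule described above can be sketched in Python to make the incentive concern concrete. This is not Kickback's actual implementation: the function name is hypothetical, amounts are plain numbers rather than crypto, and the equal split of forfeited pledges among attendees is an assumption (the post only says attendees get "a share").

```python
def settle_event(pledges: dict[str, float], attended: set[str]) -> dict[str, float]:
    """Return each participant's payout after the event.

    Assumption (not confirmed by the post): forfeited pledges are split
    equally among everyone who attended.
    """
    attendees = [name for name in pledges if name in attended]
    forfeited = sum(amount for name, amount in pledges.items()
                    if name not in attended)
    # If nobody attends, this sketch simply pays nothing out.
    bonus = forfeited / len(attendees) if attendees else 0.0
    return {name: (amount + bonus if name in attended else 0.0)
            for name, amount in pledges.items()}


pledges = {"alice": 10.0, "bob": 10.0, "carol": 10.0, "dave": 10.0}
payouts = settle_event(pledges, attended={"alice", "bob"})
print(payouts)  # {'alice': 20.0, 'bob': 20.0, 'carol': 0.0, 'dave': 0.0}
```

Under this assumed rule an attendee's payout rises with every additional no-show, which is the incentive problem the post points at.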
Problems in AI Alignment that philosophers could potentially contribute to
Book Review: Secular Cycles
I think that an extremely effective way to get a better feel for a new subject is to pay an online tutor to answer your questions about it for an hour. It turns out that there are a bunch of grad students on Wyzant who mostly work tutoring high school math or whatever but who are very happy to spend an hour answering your weird questions. For example, a few weeks ago I had a session with a first-year Harvard synthetic biology PhD. Before the session, I set a ten-minute timer and wrote down things that I currently didn't get about biology. (This is an exercise worth doing even if you're not going to have a tutor, IMO.) We spent the time talking about some mix of the questions I'd prepared, various tangents that came up during those explanations, and his sense of the field overall. I came away with a whole bunch of my minor misconceptions fixed, a few pointers to topics I wanted to learn more about, and a way better sense of what the field feels like and what the important problems and recent developments are. There are a few reasons that having a paid tutor is a way better way of learning about a field than trying to meet people who happen to be in that field. I really like it that I'm paying them, and so I can aggressively direct the conversation to wherever my curiosity is, whether it's about their work or some minor point or whatever. I don't need to worry about them getting bored with me, so I can just keep asking questions until I get something. Conversational moves I particularly like: * "I'm going to try to give the thirty second explanation of how gene expression is controlled in animals; you should tell me the most important things I'm wrong about." * "Why don't people talk about X?" * "What should I read to learn more about X, based on what you know about me from this conversation?" All of the above are way faster with a live human than with the internet. I think that doing this for an hour or two weekly will make me substantially more knowl
Old post: RAND needed the "say oops" skill [https://musingsandroughdrafts.wordpress.com/2018/12/01/rand-needed-the-say-oops-skill/] [Epistemic status: a middling argument] A few months ago [https://musingsandroughdrafts.wordpress.com/2018/06/21/initial-comparison-between-rand-and-the-rationality-cluster/], I wrote about how RAND and the “Defense Intellectuals” of the Cold War represent another precious datapoint of “very smart people, trying to prevent the destruction of the world, in a civilization that they acknowledge to be inadequate to dealing sanely with x-risk.” Since then I spent some time doing additional research into what cognitive errors and mistakes those consultants, military officials, and politicians made that endangered the world. The idea being that if we could diagnose which specific irrationalities they were subject to, this would suggest errors that might also be relevant to contemporary x-risk mitigators, and might point out some specific areas where development of rationality training is needed. However, this proved somewhat less fruitful than I was hoping, and I’ve put it aside for the time being. I might come back to it in the coming months. It does seem worth sharing at least one relevant anecdote, from Daniel Ellsberg’s excellent book, The Doomsday Machine, and analysis, given that I’ve already written it up. The missile gap In the late nineteen-fifties it was widely understood that there was a “missile gap”: that the Soviets had many more ICBMs (“intercontinental ballistic missiles” armed with nuclear warheads) than the US. Estimates varied widely on how many missiles the Soviets had. The Army and the Navy gave estimates of about 40 missiles, which was about at parity with the US’s strategic nuclear force. The Air Force and the Strategic Air Command, in contrast, gave estimates of as many as 1000 Soviet missiles, 20 times more than the US’s count. (The Air Force and SAC were incentivized to inflate their estimates of the
New post: What is mental energy? [https://wordpress.com/post/musingsandroughdrafts.wordpress.com/398] [Note: I’ve started a research side project on this question, and it is already obvious to me that this ontology is importantly wrong.] There’s a common phenomenology of “mental energy”. For instance, if I spend a couple of hours thinking hard (maybe doing math), I find it harder to do more mental work afterwards. My thinking may be slower and less productive. And I feel tired, or drained (mentally, instead of physically). Mental energy is one of the primary resources that one has to allocate in doing productive work. In almost all cases, humans have less mental energy than they have time, and therefore effective productivity is a matter of energy management more than time management. If we want to maximize personal effectiveness, mental energy seems like an extremely important domain to understand. So what is it? The naive story is that mental energy is an actual energy resource that one expends and then needs to recoup. That is, when one is doing cognitive work, one is burning calories, depleting the body’s energy stores. As one uses energy, one has less fuel to burn. My current understanding is that this story is not physiologically realistic. Thinking hard does consume more of the body’s energy than baseline, but not that much more. And we experience mental fatigue long before we even get close to depleting our calorie stores. It isn’t literal energy that is being consumed. [The Psychology of Fatigue pg.27] So if not that, what is going on here? A few hypotheses: (The first few are all of a cluster, so I labeled them 1a, 1b, 1c, etc.) Hypothesis 1a: Mental fatigue is a natural control system that redirects our attention to our other goals. The explanation that I’ve heard most frequently in recent years (since it became obvious that much of the literature on ego-depletion was off the mark) is the following: A human mind is composed of a bunch
A couple weeks ago I spent an hour talking over video chat with Daniel Cantu, a UCLA neuroscience postdoc who I hired on Wyzant.com [https://www.wyzant.com/match/tutor/87443576?fbclid=IwAR3n91qFP_ijKlfMHrw1UmOVOhdw3jyG1r1A-whIJBaFPzpBWtWCmzBe414] to spend an hour answering a variety of questions about neuroscience I had. (Thanks Daniel for reviewing this blog post for me!) The most interesting thing I learned is that I had quite substantially misunderstood the connection between convolutional neural nets and the human visual system. People claim that these are somewhat bio-inspired, and that if you look at early layers of the visual cortex you'll find that it operates kind of like the early layers of a CNN, and so on. The claim that the visual system works like a CNN didn’t quite make sense to me though. According to my extremely rough understanding, biological neurons operate kind of like the artificial neurons in a fully connected neural net layer--they have some input connections and a nonlinearity and some output connections, and they have some kind of mechanism for Hebbian learning or backpropagation or something. But that story doesn't seem to have a mechanism for how neurons do weight tying, which to me is the key feature of CNNs. Daniel claimed that indeed human brains don't have weight tying, and we achieve the efficiency gains over dense neural nets by two other mechanisms instead: Firstly, the early layers of the visual cortex are set up to recognize particular low-level visual features like edges and motion, but this is largely genetically encoded rather than learned with weight-sharing. One way that we know this is that mice develop a lot of these features before their eyes open. These low-level features can be reinforced by positive signals from later layers, like other neurons, but these updates aren't done with weight-tying. So the weight-sharing and learning here is done at the genetic level. Secondly, he thinks that we get around the need for
How I managed to stop craving sweets in 3 weeks
For me at least, it is possible to eliminate/drastically reduce my sugar cravings. Typically I feel cravings for something sweet whenever I’m hungry, bored, have just finished a meal, am feeling sad, or am feeling happy. In short, I eat a lot of sweets and also spend a lot of time and effort trying to resist them. LAST TIME A few years ago I managed to cold-turkey sweets while I was following a Keto diet. I noticed that in week 3 of keto, my cravings had vanished. No longer did the desire to finish a meal with a bowl of ice cream plague me. For about 6 weeks total, if memory serves, I managed to eat no desserts or sweets at all. Everything was going great. Then I went to a birthday party, and my hubris led me astray. “I’m doing so well! I don’t need it, but I can just have a slice of chocolate cake, and it’s no big deal!” Alas, the very next day, my cravings were back, I fell off the wagon, and the experiment was over. I tried several times over the years to quit cold turkey again, but I never managed to keep at it for long, and I more or less gave up and decided to just make peace with the yearly expansion of my waistline. THIS TIME Near the end of June, I managed to have a few really busy days in a row, and for whatever reason, I realized suddenly that I hadn’t had any sweets for the last 3 days. Noticing that I had a little bit of a “head start” on getting through the 3 week sugar withdrawal, I decided to give it another go. I’m not sure where I got the idea, but I decided to modify my strategy. It was in the prime of the summer fruit season in the Bay Area, and nectarines, plums, pluots, peaches, and mangos were all at their ripest and sweetest. Instead of going cold turkey, I would try to eat a piece of fruit anytime I had a craving for sugar. I don’t think this would have worked if the fruit hadn’t been extremely good and satisfying. Another thing that I did was I didn’t try to limit or moderate how much
How to Ignore Your Emotions (while also thinking you're awesome at emotions)
Does it become easier, or harder, for the world to coordinate around not building AGI as time goes on?
The Real Rules Have No Exceptions
Integrity and accountability are core parts of rationality
Jeff Hawkins on neuromorphic AGI within 20 years
There was a particular mistake I made over in this thread [https://www.lesswrong.com/posts/5nH5Qtax9ae8CQjZ9/no-it-s-not-the-incentives-it-s-you#vPj9E9iqXjnNdyhob]. Noticing the mistake didn't change my overall position (and also my overall position was even weirder than I think people thought it was). But it seemed worth noting somewhere. I think most folk morality (or at least my own folk morality) generally has the following crimes in ascending order of badness: * Lying * Stealing * Killing * Torturing people to death (I'm not sure if torture-without-death is generally considered better/worse/about-the-same-as killing) But this is the conflation of a few different things. One axis I was ignoring was "morality as coordination tool" vs "morality as 'doing the right thing because I think it's right'." And these are actually quite different. And, importantly, you don't get to spend many resources on morality-as-doing-the-right-thing unless you have a solid foundation of the morality-as-coordination-tool. There's actually a 4x3 matrix here: you can cross lying/stealing/killing/torture-killing with three kinds of target: * harming the ingroup * harming the outgroup (who you may benefit from trading with) * harming powerless people who don't have the ability to trade or collaborate with you And you basically need to tackle these in this order. If you live in a world where even people in your tribe backstab each other all the time, you won't have spare resources to spend on the outgroup or the powerless until your tribe has gotten its basic shit together and figured out that lying/stealing/killing each other sucks. If your tribe has its basic shit together, then maybe you have the slack to ask the question: "hey, that outgroup over there, who we regularly raid and steal their sheep and stuff, maybe it'd be better if we traded with them instead of stealing their sheep?" and then begin to develop cosmopolitan norms. If you eventually become a powerful empire (or simil
Yesterday I noticed that I had a pretty big disconnect from this: There's a very real chance that we'll all be around, business somewhat-as-usual in 30 years. I mean, in this world many things have a good chance of changing radically, but automation of optimisation will not cause any change on the level of the industrial revolution. DeepMind will just be a really cool tech company that builds great stuff. You should make plans for important research and coordination to happen in this world (and definitely not just decide to spend everything on a last-ditch effort to make everything go well in the next 10 years, only to burn up the commons and your credibility for the subsequent 20). Only yesterday when reading Jessica's post did I notice that I wasn't thinking realistically/in-detail about it, and start doing that.
The Indian grammarian Pāṇini [https://en.wikipedia.org/wiki/P%C4%81%E1%B9%87ini] wanted to exactly specify what Sanskrit grammar was in the shortest possible length. As a result, he did some crazy stuff: Pāṇini's theory of morphological analysis was more advanced than any equivalent Western theory before the 20th century. His treatise is generative and descriptive, uses metalanguage and meta-rules, and has been compared to the Turing machine wherein the logical structure of any computing device has been reduced to its essentials using an idealized mathematical model. There are two surprising facts about this: 1. His grammar was written in the 4th century BC. 2. People then failed to build on this machinery to do things like formalise the foundations of mathematics, formalise a bunch of linguistics, or even do the same thing for languages other than Sanskrit, in a way that is preserved in the historical record. I've been obsessing about this for the last few days.
Responding to Scott's response to Jessica [https://slatestarcodex.com/2019/07/16/against-lie-inflation/]. The post makes the important argument that if we have a word whose boundary is around a pretty important set of phenomena that are useful to have a quick handle to refer to, then * It's really unhelpful for people to start using the word to also refer to a phenomenon with 10x or 100x more occurrences in the world, because then I'm no longer able to point to the specific important parts of the phenomena that I was previously talking about * e.g. Currently the word 'abuser' describes a small number of people during some of their lives. Someone might want to say that technically it should refer to all people all of the time. The argument is understandable, but it wholly destroys the usefulness of the concept handle. * People often have political incentives to push the concept boundary to include a specific case in a way that, if it were principled, indeed makes most of the phenomena in the category no use to talk about. This allows for selective policing by the people with the political incentive. * It's often fine for people to bend words a little bit (e.g. when people verb nouns), but when it's in the class of terms we use for norm violation, it's often correct to impose quite high standards of evidence for doing so, as we can have strong incentives (and unconscious biases!) to do it for political reasons. These are key points that argue against changing the concept boundary to include all conscious reporting of unconscious bias, and more generally push back against unprincipled additions to the concept boundary. This is not an argument against expanding the concept to include a specific set of phenomena that share the key similarities with the original set, as long as the expansion does not explode the set. I think there may be some things like that within the category of 'unconscious bias'. While it is the case
FEELINGS AND TRUTH SEEKING NORMS Stephen Covey says that maturity is being able to find the balance between Courage and Consideration. Courage being the desire to express yourself and say your truth, consideration being recognizing the consequences of what you say on others. I often wish that this was common knowledge in the rationality community (or just society in general) because I see so many fights between people who are on opposite sides of the spectrum and don't recognize the need for balance. Courage is putting your needs first, consideration is putting someone else's needs first, and the balance is weighting both sets of needs equally. There are some other dichotomies that I think are pointing to a similar distinction. Courage------->Maturity----->Consideration From parenting literature: Authoritarian------->Authoritative----->Permissive From a course on confidence: Aggressive------->Assertive----->Passive From attachment theory: Avoidant------->Secure----->Preoccupied From my three types of safe spaces: [https://www.lesswrong.com/posts/i2XikYzeL39HoSSTr/matt-goldenberg-s-short-form-feed#b5MrrSLstZsASkW6J] We'll make you grow---->Own Your Safety---> We'll protect you. -------------------------------------------------------------------- Certain people may be wondering how caring about your feelings and others' feelings relates to truth seeking. The answer is that our feelings are based on system 1 beliefs. I suspect this isn't strictly 100% true but it's a useful model, one behind Focusing, Connection Theory, Cognitive Behavioral Therapy, Internal Double Crux, and a good portion of other successful therapeutic interventions. How this cashes out is that being able to fully express yourself is a necessary prerequisite to being able to bring all your beliefs to bear on a situation. Now sometimes, when someone is getting upset, it's not a belief like (this thing is bad) but "I believe that believing what you're saying is unsafe for my identity" or some similar b
Writing children's picture books
Being the (Pareto) Best in the World
Reason isn't magic
Research Agenda v0.9: Synthesising a human's preferences into a utility function
Mistakes with Conservation of Expected Evidence
The Schelling Choice is "Rabbit", not "Stag"
Book Review: The Secret Of Our Success
What is the evidence for productivity benefits of weightlifting?
Thinking about value-drift: It seems correct to me to term the earlier set of values a more "naive set.*" To keep a semi-consistent emotional tone across classes, I'll call the values you get later the "jaded set" (although I think jadedness is only one of several possible later-life attractor states, which I'll go over later). I believe it's unreasonable to assume that any person's set of naive values is, or was, perfect. But it's worth noting that there are several useful properties that you can generally associate with them, which I'll list below. My internal model of how value drift has worked within myself looks a lot more like "updating payoff matrices" and "re-binning things." Its direction feels determined not purely by drift, but by some weird mix of deliberate steering, getting dragged by currents, random walk, and more accurately mapping the surrounding environment. That said, here are some generalizations... NAIVE: * Often oversimplified * This is useful in generating ideals, but bad for murphyjitsu * Some zones are sparsely-populated * A single strong value (positive or negative) anywhere near those zones will tend to color the assumed value of a large nearby area. Sometimes it does this incorrectly. * Generally stronger emotional affects (utility assignments), and larger step sizes * Fast, large-step, more likely to see erratic changes * Closer to black-and-white thinking; some zones have intense positive or negative associations, and if you perform nearest neighbor on sparse data, the cutoff can be steep JADED: * Usually, your direct experience of reality has been fed more data and is more useful. * Things might shift so the experientially-determined system plays a larger role determining what actions you take relative to theory * Experiential ~ Sys1, Theoretical ~ Sys2, but the match isn't perfect. The former handles the bulk of the data right as it come
Lately I've been noticing myself getting drawn into more demon-thready discussions on LessWrong. This is in part due to UI choice – demon threads (i.e. usually "arguments framed through 'who is good and bad and what is acceptable in the overton window'") are already selected for getting above-average engagement. Any "neutral" sorting mechanism for showing recent comments is going to reward demon-threads disproportionately. An option might be to replace the Recent Discussion section with a version of itself that only shows comments and posts from the Questions page (in particular for questions that were marked as 'frontpage', i.e. questions that are not about politics). I've had some good experiences with question-answering, where I actually get into a groove where the thing I'm doing is actual object-level intellectual work rather than "having opinions on the internet." I think it might be good for the health of the site for this mode to be more heavily emphasized. In any case, I'm interested in making a LW Team internal option where the mods can opt into a "replace recent discussion with recent question activity" mode, to experiment with living in a world that contains more nudges towards the object level and see how that goes. My current best guess is that the best option includes giving people more choices about how Recent Discussion works, and then having the default choice for new users be something a little more magical that is filtered to push things more towards the object level.
Out-of-context quote: I'd like for things I do to be either ignored or deeply-understood and I don't know what to do with the in-betweens.
New post: some musings on deliberate practice [https://musingsandroughdrafts.wordpress.com/2019/05/31/some-musings-on-deliberate-practice/]