All Posts


Wednesday, November 20th 2019

Shortform [Beta]
9TurnTrout9h I feel very excited by the AI alignment discussion group I'm running at Oregon State University. Three weeks ago, most attendees didn't know much about "AI security mindset"-ish considerations. This week, I asked the question "what, if anything, could go wrong with a superhuman reward maximizer which is rewarded for pictures of smiling people? Don't just fit a bad story to the reward function. Think carefully." There was some discussion and initial optimism, after which someone said "wait, those optimistic solutions are just the ones you'd prioritize! What's that called, again?" (It's called anthropomorphic optimism.) I'm so proud.
6crabman13h In my understanding, here are the main features of deep convolutional neural networks (DCNN) that make them work really well. (Disclaimer: I am not a specialist in CNNs, I have done one masters level deep learning course, and I have worked on accelerating DCNNs for 3 months.) For each feature, I give my probability that having this feature is an important component of DCNN success, compared to having this feature to the extent that an average non-DCNN machine learning model has it (e.g. DCNN has weight sharing, an average model doesn't have weight sharing).
1. DCNNs heavily use transformations, which are the same for each window of the input - 95%
2. For any set of pixels of the input, large distances between pixels in the set make the DCNN model interactions between these pixels less accurately - 90% (perhaps usage of dilation in some DCNNs is a counterargument to this)
3. Large depth (together with the use of activation functions) lets us model complicated features, interactions, logic - 82%
4. Having a lot of parameters lets us model complicated features, interactions, logic - 60%
5. Given 3 and 4, SGD-like optimization works unexpectedly fast for some reason - 40%
6. Given 3 and 4, SGD-like optimization with early stopping doesn't overfit too much for some reason - 87% (I am not sure if the S in SGD is important, or how important early stopping is)
7. Given 3 and 4, a ReLU-like activation function works really well (compared to, for example, sigmoid).
8. Modern deep neural network libraries are easy to use compared to the baseline of not having specific well-developed libraries - 60%
9. Deep neural networks work really fast when using modern deep neural network libraries and modern hardware - 33%
10. DCNNs find features in photos that are invisible to the human eye and to most ML algorithms - 20%
11. Dropout helps reduce overfitting a lot - 25%
12. Batch normalization improve
5strangepoop13h The expectations you do not know you have control your happiness more than you know. High expectations that you currently have don't look like high expectations from the inside, they just look like how the world is/would be. But "lower your expectations" can often be almost useless advice, kind of like "do the right thing". Trying to incorporate "lower expectations" often amounts to "be sad". How low should you go? It's not clear at all if you're using territory-free un-asymmetric simple rules like "lower". Like any other attempt at truth-finding, it is not magic. It requires thermodynamic work. The thing is, the payoff is rather amazing. You can just get down to work. As soon as you're free of a constant stream of abuse from beliefs previously housed in your head, you can Choose without Suffering. The problem is, I'm not sure how to strategically go about doing this, other than using my full brain with Constant Vigilance. Coda: A large portion of the LW project (or at least, more than a few offshoots) is about noticing you have beliefs that respond to incentives other than pure epistemic ones, and trying not to reload when shooting your foot off with those. So unsurprisingly, there's a failure mode here: when you publicly declare really low expectations (eg "everyone's an asshole"), it works to challenge people, urges them to prove you wrong. It's a cool trick to win games of Chicken but as usual, it works by handicapping you. So make sure you at least understand the costs and the contexts it works in.
2G Gordon Worley III1d Story stats are my favorite feature of Medium. Let me tell you why. I write primarily to impact others. Although I sometimes choose to do very little work to make myself understandable to anyone who is more than a few inferential steps behind me and then write out on a far frontier of thought, nonetheless my purpose remains sharing my ideas with others. If it weren't for that, I wouldn't bother to write much at all, and certainly not in the same way as I do when writing for others. Thus I care instrumentally a lot about being able to assess if I am having the desired impact so that I can improve in ways that might help serve my purposes. LessWrong provides some good, high detail clues about impact: votes and comments. Comments on LW are great, and definitely better in quality and depth of engagement than what I find other places. Votes are also relatively useful here, caveat the weaknesses of LW voting I've talked about before. If I post something on LW and it gets lots of votes (up or down) or lots of comments, relative to what other posts receive, then I'm confident people have read what I wrote and I impacted them in some way, whether or not it was in the way I had hoped. That's basically where story stats stop on LessWrong. Here's a screen shot of the info I get from Medium: For each story you can see a few things here: views, reads, read ratio, and fans, which is basically likes. I also get an email every week telling me about the largest updates to my story stats, like how many additional views, reads, and fans a story had in the last week. If I click the little "Details" link under a story name I get more stats: average read time, referral sources, internal vs. external views (external views are views on RSS, etc.), and even a list of "interests" associated with readers who read my story. All of this is great.
Each week I get a little positive reward letting me know what I did that worked, what didn't, and most importantly to me, how much people are engag

Monday, November 18th 2019

Shortform [Beta]
5Matthew Barnett3d Bertrand Russell's advice to future generations, from 1959
4Chris_Leong3d Anti-induction and Self-Reinforcement. Induction is the belief that the more often a pattern happens, the more likely it is to continue. Anti-induction is the opposite claim: the more often a pattern happens, the less likely future events are to follow it. Somehow I seem to have gotten the idea in my head that anti-induction is self-reinforcing. The argument for it is as follows: Suppose we have a game where at each step a screen flashes an A or a B and we try to predict what it will show. Suppose that the screen always flashes A, but the agent initially thinks that the screen is more likely to display B. So it guesses B, observes that it guessed incorrectly and then, if it is an anti-inductive agent, will increase its estimate of the likelihood that the next symbol will be B because of anti-induction. So in this scenario the agent's confidence that the next symbol will be B, despite the long stream of As, will keep increasing. This particular anti-inductive belief is self-reinforcing. However, there is a sense in which anti-induction is contradictory - if you observe anti-induction working, then you should update towards it not working in the future. I suppose the distinction here is whether we are using anti-induction to update our beliefs about anti-induction itself, and not just our concrete beliefs. And each of these is a valid update rule: in the first we apply this update rule to everything including itself, and in the other we apply this update rule to things other than itself. The idea of a rule applying to everything except itself feels suspicious, but is not invalid. Also, it's not that the anti-inductive belief that B will be next is self-reinforcing. After all, anti-induction given consistent As pushes you towards believing B more and more regardless of what you believe initially. In other words, it's more of an attractor state.
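A minimal sketch of the A/B game above, in case it helps to see the "attractor" claim run. The post doesn't specify an update rule, so the linear step and the `rate` parameter are assumptions made for illustration, and the names `anti_inductive_update` and `simulate` are hypothetical. Under those assumptions, an agent fed nothing but As drifts toward believing B whether it starts out favoring B or A.

```python
def anti_inductive_update(p_b: float, observed: str, rate: float = 0.2) -> float:
    """Return the agent's new belief that the NEXT symbol will be B.

    Anti-induction: seeing a symbol makes the agent expect it less next time.
    The linear step of size `rate` is an illustrative assumption, not a rule
    taken from the post.
    """
    if observed == "A":
        # Seeing A makes A seem less likely, so belief in B moves toward 1.
        return p_b + rate * (1.0 - p_b)
    # Seeing B makes B seem less likely, so belief in B moves toward 0.
    return p_b - rate * p_b


def simulate(p_b_start: float, stream: str, rate: float = 0.2) -> list[float]:
    """Run the agent over a stream of 'A'/'B' symbols; return the belief history."""
    beliefs = [p_b_start]
    for symbol in stream:
        beliefs.append(anti_inductive_update(beliefs[-1], symbol, rate))
    return beliefs


if __name__ == "__main__":
    stream = "A" * 30  # the screen always flashes A
    for start in (0.9, 0.1):  # one agent starts out favoring B, one favoring A
        history = simulate(start, stream)
        print(f"start belief in B = {start:.1f}, final = {history[-1]:.4f}")
```

Both starting points end up close to certainty that B comes next, which fits the reading that the belief in B is an attractor of the update rule rather than something that depends on the initial guess.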

Sunday, November 17th 2019

Shortform [Beta]
2orthonormal3d Is there a word for problems where, as they get worse, the exactly wrong response becomes more intuitively appealing? For example, I'm thinking of the following chain (sorry for a political example, this is typically a political phenomenon): resistance to new construction (using the ability of local boards to block projects) causes skyrocketing rent, which together mean that the rare properties allowed to be developed get bid up to where they can only become high-end housing, which leads to anger at rich developers for building "luxury housing", which leads to further resistance to new construction, and so on until you get San Francisco.

Thursday, November 14th 2019

Shortform [Beta]
11Ben Pace7d Trying to think about building some content organisation and filtering systems on LessWrong. I'm new to a bunch of the things I discuss below, so I'm interested in other people's models of these subjects, or links to sites that solve the problems in different ways.
Two Problems
So, one problem you might try to solve is that people want to see all of a thing on a site. You might want to see all the posts on reductionism on LessWrong, or all the practical how-to guides (e.g. how to beat procrastination, Alignment Research Field Guide, etc), or all the literature reviews on LessWrong. And so you want people to help build those pages. You might also want to see all the posts corresponding to a certain concept, so that you can find out what that concept refers to (e.g. what is the term "goodhart's law" or "slack" or "mesa-optimisers" etc). Another problem you might try to solve is that while many users are interested in lots of the content on the site, they have varying levels of interest in the different topics. Some people are mostly interested in the posts on big picture historical narratives, and less so in models of one's own mind that help with dealing with emotions and trauma. Some people are very interested in AI alignment, some are interested in only the best such posts, and some are interested in none. I think the first problem is supposed to be solved by Wikis, and the second problem is supposed to be solved by Tagging. Speaking generally, Wikis allow dedicated users to curate pages around certain types of content, highlighting the best examples and some side examples, and writing some context so that people arriving on the page understand what the page is about. It's a canonical, update-able, highly editable page built around one idea. Tagging is much more about filtering than about curating.
Tagging
Let me describe some different styles of tagging. On the site there are about 100 tags in total. Most tags give a very broad description of an area o
7Raemon7d The 2018 Long Review (Notes and Current Plans). I've spent much of the past couple years pushing features that help with the early stages of the intellectual pipeline: things like shortform, and giving authors moderation tools that let them have the sort of conversation they want (which often is higher-context, and assumes a particular paradigm that the author is operating in). Early stage ideas benefit from a brainstorming, playful, low-filter environment. I think an appropriate metaphor for those parts of LessWrong is "a couple people in a research department chatting about their ideas." But longterm incentives and filters matter a lot as well. I've focused on the early stages because that's where the bottleneck seemed to be, but LessWrong is now at a place where I think we should start prioritizing the later stages of the pipeline: something more analogous to publishing papers, and eventually distilling them into textbooks. So, here's the current draft of a plan that I've been discussing with other LW Team members:
The Long Review Format
Many LessWrong posts are more conceptual than empirical, and it's hard to tell immediately how useful they are. I think they benefit a lot from hindsight. So, once each year, we could reflect as a group about the best posts of the previous year*, and which of them seem to have withstood the test of time as something useful, true, and (possibly) something that should enter the LessWrong longterm canon that people are expected to be familiar with. Here's my current best guess for the format: [note: I currently expect the entire process to be fully public, because it's not really possible for it to be completely private, and "half public" seems like the worst situation to me]
* (1 week) Nomination
  * Users with 1000+ karma can nominate posts from 2018-or-earlier, desc

Wednesday, November 13th 2019

Shortform [Beta]
12TurnTrout8d Yesterday, I put the finishing touches on my chef d'œuvre, a series of important safety-relevant proofs I've been striving for since early June. Strangely, I felt a great exhaustion come over me. These proofs had been my obsession for so long, and now - now, I'm done. I've had this feeling before; three years ago, I studied fervently for a Google interview. The literal moment the interview concluded, a fever overtook me. I was sick for days. All the stress and expectation and readiness-to-fight which had been pent up, released. I don't know why this happens. But right now, I'm still a little tired, even after getting a good night's sleep.
10elityre7d new post: Metacognitive space [Part of my Psychological Principles of Personal Productivity, which I am writing mostly in my Roam, now.] Metacognitive space is a term of art that refers to a particular first person state / experience. In particular it refers to my propensity to be reflective about my urges and deliberate about the use of my resources. I think it might literally be having the broader context of my life, including my goals and values, and my personal resource constraints loaded up in peripheral awareness. Metacognitive space allows me to notice aversions and flinches, and take them as object, so that I can respond to them with Focusing or dialogue, instead of being swept around by them. Similarly, it seems, in practice, to reduce my propensity to act on immediate urges and temptations. [Having MCS is the opposite of being [[{Urge-y-ness | reactivity | compulsiveness}]]?] It allows me to “absorb” and respond to happenings in my environment, including problems and opportunities, taking considered action instead of the semi-automatic, first-response-that-occurred-to-me action. [That sentence there feels a little fake, or maybe about something else, or maybe is just playing into a stereotype?] When I “run out” of metacognitive space, I will tend to become ensnared in immediate urges or short term goals. Often this will entail spinning off into distractions, or becoming obsessed with some task (of high or low importance), for up to 10 hours at a time. Some activities that (I think) contribute to metacognitive space:
* Rest days
* Having a few free hours between the end of work for the day and going to bed
* Weekly [[Scheduling]]. (In particular, weekly scheduling clarifies for me the resource constraints on my life.)
* Daily [[Scheduling]]
* [[meditation]], including short meditation.
  * Notably, I’m not sure if meditation is much more efficient than just taking the same time to go for a walk. I think it might be or might not be.
* [[Exerc
6Naryan Wong8d Meta-moves may look off-topic from the object level. Halfway through a double-crux about efficient markets, my interlocutor asks how I'm feeling. I'm deciding where to go for lunch and my friend asks me if I'm ready for the presentation in the afternoon. I'm planning my road-trip route on Waze and my partner asks what time we plan on leaving. Imagine if, every time someone mentioned something you consider irrelevant or off-topic, instead of dismissing it, you viewed it as a meta-move and considered their meta-frame on your thinking.

Tuesday, November 12th 2019

Shortform [Beta]
12Kaj_Sotala9d Here's a mistake which I've sometimes committed and gotten defensive as a result, and which I've seen make other people defensive when they've committed the same mistake. Take some vaguely defined, multidimensional thing that people could do or not do. In my case it was something like "trying to understand other people". Now there are different ways in which you can try to understand other people. For me, if someone opened up and told me of their experiences, I would put a lot of effort into really trying to understand their perspective, to try to understand how they thought and why they felt that way. At the same time, I thought that everyone was so unique that there wasn't much point in trying to understand them by any *other* way than hearing them explain their experience. So I wouldn't really, for example, try to make guesses about people based on what they seemed to have in common with other people I knew. Now someone comes and happens to mention that I "don't seem to try to understand other people". I get upset and defensive because I totally do, this person hasn't understood me at all! And in one sense, I'm right - it's true that there's a dimension of "trying to understand other people" that I've put a lot of effort into, in which I've probably invested more than other people have. And in another sense, the other person is right - while I was good at one dimension of "trying to understand other people", I was severely underinvested in others. And I had not really even properly acknowledged that "trying to understand other people" had other important dimensions too, because I was justifiably proud of my investment in one of them. But from the point of view of someone who *had* invested in those other dimensions, they could see the aspects in which I was deficient compared to them, or maybe even compared to the median person. 
(To some extent I thought that my underinvestment in those other dimensions was *virtuous*, because I was "not making assumption
9elityre9d New (short) post: Desires vs. Reflexes [Epistemic status: a quick thought that I had a minute ago.] There are goals / desires (I want to have sex, I want to stop working, I want to eat ice cream) and there are reflexes (anger, “wasted motions”, complaining about a problem, etc.). If you try and squash goals / desires, they will often (not always?) resurface around the side, or find some way to get met. (Why not always? What is the difference between those that do and those that don’t?) You need to bargain with them, or design outlet policies for them. Reflexes on the other hand are strategies / motions that are more or less habitual to you. These you train or untrain.
2Hazard9d Have some horrible jargon: I spit out a question or topic and ask you for your NeMRIT, your Next Most Relevant Interesting Take. Either give your thoughts about the idea I presented as you understand it or, if that's boring, give the thoughts that interest you and seem conceptually closest to the idea I brought up.
1Evan Rysdam9d I just noticed that I've got two similarity clusters in my mind that keep getting called to my attention by wording dichotomies like high-priority and low-priority, but that would themselves be better labeled as big and small. This was causing me to interpret phrases like "doing a string of low-priority tasks" as having a positive affect (!) because what it called to mind was my own activity of doing a string of small, on-average medium-priority tasks. My thought process might improve overall if I toss out the "big" and "small" similarity clusters and replace them with clusters that really are centered around "high-priority" and "low-priority".
