Recent Discussion

I just got home from a six day meditation retreat and began writing.

The catch is that I arrived at the retreat yesterday.

I knew going in that it was a high variance operation. All who had experience with such things warned us we would hate the first few days, even if things were going well. I was determined to push through that.

Alas or otherwise, I was not sufficiently determined to make that determination stick. I didn’t have a regular practice at all going in, was entirely unfamiliar with the details of how this group operated, and found the Buddhist philosophy involved highly off-putting, i... (Read more)

CFAR recently launched its 2019 fundraiser, and to coincide with that, we wanted to give folks a chance to ask us about our mission, plans, and strategy. Ask any questions you like; we’ll respond to as many as we can from 10am PST on 12/20 until 10am PST the following day (12/21).

Topics that may be interesting include (but are not limited to):

  • Why we think there should be a CFAR;
  • Whether we should change our name to be less general;
  • How running mainline CFAR workshops does/doesn't relate to running "AI Risk for Computer Scientist" type workshops. Why we both do a lot of rec
... (Read more)
2elityre2hActually, I think this touches on something that is useful to understand about CFAR in general. Most of our "knowledge" (about rationality, about running workshops, about how people can react to x-risk, etc.) is what I might call "trade knowledge": it comes from having lots of personal experience in the domain, and building up good procedures via mostly trial and error (plus metacognition and theorizing about what noticed problems might be, and how to fix them). This is distinct from scientific knowledge, which is built up from robustly verified premises, tested by explicit attempts at falsification. (I'm reminded of an old LW post, that I can't find, about Eliezer giving some young kid (who wants to be a writer) writing advice, while a bunch of bystanders signal that they don't regard Eliezer as trustworthy.) For instance, I might lead someone through an IDC-like process at a CFAR workshop. This isn't because I've done rigorous tests (or I know of others who have done rigorous tests) of IDC, or because I've concluded from the neuroscience literature that IDC is the optimal process for arriving at true beliefs. Rather, it's that I (and other CFAR staff) have interacted with people who have a conflict between beliefs / models / urges / "parts", a lot, in addition to spending even more time engaging with those problems in ourselves. And from that exploration, this IDC-process seems to work well, in the sense of getting good results. So, I have a prior that it will be useful for the nth person. (Of course sometimes this isn't the case, because people can be really different, and occasionally a tool will be ineffective, or even harmful, despite being extremely useful for most people.) The same goes for, for instance, whatever conversational facilitation acumen I've acquired. I don't want to be making a claim that, say, "finding a Double Crux is the objectively correct process, or the optimal process, for resolving disagreements." Only that I've spent a lot of time resolv

I don’t think this works.

A carpenter might say that his knowledge is trade knowledge and not scientific knowledge, and when challenged to provide some evidence that this supposed “trade knowledge” is real, and is worth something, may point to the chairs, tables, cabinets, etc., which he has made. The quality of these items may be easily examined, by someone with no knowledge of carpentry at all. “I am a trained and skilled carpenter, who can make various useful things for you out of wood” is a claim which is very, very easy to verify.

But as I understand it

... (Read more)
3Vaniver2hYou're thinking of You're Calling *Who* A Cult Leader? [https://www.lesswrong.com/posts/cyzXoCv7nagDWCMNS/you-re-calling-who-a-cult-leader] An important clarification, at least from my experience of the metacognition, is that it's both getting good results and not triggering alarms (in the form of participant pushback or us feeling skeevy about doing it). Something that gets people to nod along (for the wrong reasons) or has some people really like it and other people really dislike it is often the sort of thing where we go "hmm, can we do better?"
2mr-hire11hI think he's bad at this. You can see this in some aspects of his companies. High micromanagement. High turnover. Disgruntled former employees.
Propagating Facts into Aesthetics
77 · 9d · 9 min read

Epistemic status: Tentative. I’ve been practicing this on-and-off for a year and it’s seemed valuable, but it’s the sort of thing I might look back on and say “hmm, that wasn’t really the right frame to approach it from.”

 

In doublecrux, the focus is on “what observations would change my mind?” 

In some cases this is (relatively) straightforward. If you believe minimum wage helps workers, or harms them, there are some fairly obvious experiments you might run. “Which places have instituted minimum wage laws? What happened to wages? What happened to unemployment? What happened to worker

... (Read more)

In addition to modifying the perceived beauty or distastefulness of a given concept, there are knobs you can turn related to the concepts themselves: nudging, splitting, merging, or even destroying (and assigning all remaining aesthetic value to other, related concepts).

5Raemon1hPotentially relevant: Is hard, physical labor in a greasy engine room beautiful or ugly [https://www.lesswrong.com/posts/fwvr3fXdAFTdfszMB/the-steampunk-aesthetic]?
4habryka5hPromoted to curated: I think this post is pointing at something that I expect will turn out to be obviously really important in a few years. I also think it's written in a really example-heavy way that allows people to engage with it, whereas most writing in this space usually stays abstract and as such often lacks grounding and concreteness.

I need help. Pretty much the entire scientific community and everyone I trust as an intellectual role model has said that vaccines are an almost entirely good thing, yet a close member of my family has made a somewhat convincing argument they are dangerous, and I’m terribly confused. I’ve been trying to figure this issue out for months now, and I just can’t. I’ve seen some (a lot of) dark side epistemology used by the more… out there antivaxers (i.e. homeopathy and essential oil people), but although I have a creeping sense some of what my family member is sa... (Read more)

I'd recommend, for each argument, finding someone who makes that argument online, and posting it to skeptics stack exchange. I used to do that years ago and found people were very helpful in doing research and finding good sources on a wide variety of topics.

1remizidae38mMost vaccines are made without (or can be made without) thimerosal. In addition, thimerosal is safe. https://www.fda.gov/vaccines-blood-biologics/safety-availability-biologics/thimerosal-and-vaccines [https://www.fda.gov/vaccines-blood-biologics/safety-availability-biologics/thimerosal-and-vaccines]
2Jonathan_Graehl1hVaccines sometimes kill people. Several serious diseases that killed many more people, we're told, are a much smaller risk now. At some point, you'd think people would want to selfishly avoid vaccinating so much. And that's what we see happening. There's a lot of rationalization going on.
3eigen1hYeah… that is not what I mean at all. You want a site, what about this one or SSC? I hardly think that you need any research papers or meta-analyses (although you can most certainly find them). If, instead, what you need is to "beat" your uncle by telling him, "You see… I've got this paper right here, Golden et al., which indicates that the aluminum and thimerosal content within vaccines is not harmful at all..." then you need another thing. And that is not the solution to your problem. If what you are involved in is a domination game, right here, right now, in the middle of Christmas, then the solution is to pass! And of course, to vaccinate your children, and persuade everyone to vaccinate their children (or, you know… give them a pass on the gene pool? I joke, of course). For next year your uncle will come and say, "The earth? Yeah, it's flat." You will get wide-eyed, you will shrug and say, "No, uncle, not again!" And then you will get at the right solution.

It feels like community discussion has largely abandoned the topic of AGI having the self-modifying property, which makes sense because there are a lot of more fundamental things to figure out.

But I think we should revisit the question at least in the context of narrow AI, because the tools are now available to accomplish exactly this on several levels. This thought was driven by reading a blog post, Writing BPF Code in Rust.

BPF stands for Berkeley Packet Filter, which was originally for network traffic analysis but has since been used for tracing the Linux kernel. The pitch is that this can n... (Read more)
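For concreteness, kernel tracing with BPF can look like the following. This is a minimal sketch using the bcc Python toolkit rather than the Rust tooling the linked post describes, and the traced syscall is just an illustrative choice:

```python
from bcc import BPF  # requires the bcc toolkit and root privileges

# A tiny BPF program (the C source below) compiled and loaded into the
# running kernel; it logs a line every time the clone() syscall is entered.
program = r"""
int trace_clone(void *ctx) {
    bpf_trace_printk("clone() called\n");
    return 0;
}
"""

b = BPF(text=program)
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="trace_clone")
b.trace_print()  # stream trace output until interrupted
```

The kernel verifies and JIT-compiles the injected program before running it, which is what makes loading new code into a live system like this considered acceptably safe.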

I'm confused. I read you as suggesting that self-modifying code has recently become possible, but I think that self-modifying code has been possible for about as long as we have had digital computers?

What specific things are possible to do now that weren't possible before, and what kind of AGI-relevant questions does that make testable?

14jimrandomh4hIn practice, self-modification is a special case of arbitrary code execution; it's just running a program that looks like yourself, with some changes. That means there are two routes to get there: either communicate with the internet (to, e.g., pay Amazon EC2 to run the modified program), or use a security vulnerability. In the context of computer security, preventing arbitrary code execution is an extremely well-studied problem. Unfortunately, the outcome of all that study is that it's really hard, and multiple vulnerabilities are discovered every year, with little prospect of that ever stopping.

I find myself somewhat confused as to why I should find Part I of “What failure looks like” (hereafter "WFLL1") likely enough to be worth worrying about. I have 3 basic objections, although I don't claim that any are decisive. First, let me summarize WFLL1 as I understand it:

In general, it's easier to optimize easy-to-measure goals than hard-to-measure ones, but this disparity is much larger with ML models than with humans and human-made institutions. As special-purpose AI becomes more powerful, this will lead to a form of differential progress where easy-to-m... (Read more)

Values Assimilation Premortem
16 · 2d · 4 min read

In the past 3-4 years, I went through a prolonged and painful life crisis in which I systematically deconstructed my existing worldview and slowly moved away from Evangelical Christianity into something Rationalist or Rationalist-adjacent. In the past 4 months, I've started hanging around the Berkeley Rationality community and am now dating someone embedded therein. At this point my partner is still my main connection to the specific values and practices of the community, and given that my worldview is currently being fleshed-out, she has an outsized influence on what my future beliefs and val

... (Read more)

Welcome!

Not sure how relevant my advice can be, because I was never in your position. I was never religious. I grew up in a communist country, which is kinda similar to growing up in a cult, but I wasn't a true believer of that either.

My prediction is that in the process of your change, you will fail to update on some points, and overcompensate on other points. Which is okay, because growing up happens in multiple iterations. What you do wrong in the first step, you can fix in the second one. As long as you keep some basic humility and admit that you ... (Read more)

(A traditional folk tale of the rashunuhlist people, as told by Jessica Taylor, and literarily and mathematically adapted by the present author.)

In the days of auld lang syne on Earth-that-was, there was a population of agents playing the Nash demand game under a replicator dynamic with uniform random encounters. Whenever two agents met, each of them would name a number between 0 and 10. If the two numbers added up to 10 or less, both agents would receive a payoff of the number they named. But if the two numbers added up to more than 10, both agents would receive nothing. Agents that received

... (Read more)
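The game as stated is straightforward to simulate. Here is a rough sketch of the described setup; the fixed-demand strategies and the starting shares are illustrative assumptions, not taken from the tale:

```python
import random

def payoff(demand_a, demand_b):
    # Nash demand game as described: if the two demands sum to 10 or less,
    # each agent gets what it named; otherwise both get nothing.
    if demand_a + demand_b <= 10:
        return demand_a, demand_b
    return 0, 0

# Illustrative population of fixed-demand strategies: demand -> population share.
population = {4: 0.2, 5: 0.6, 6: 0.2}

def generation(population, encounters=100_000):
    # One replicator-dynamic step with uniform random encounters:
    # a strategy's share grows in proportion to its average payoff.
    totals = {s: 0.0 for s in population}
    plays = {s: 0 for s in population}
    strategies, weights = zip(*population.items())
    for _ in range(encounters):
        a, b = random.choices(strategies, weights=weights, k=2)
        pa, pb = payoff(a, b)
        totals[a] += pa
        totals[b] += pb
        plays[a] += 1
        plays[b] += 1
    fitness = {s: totals[s] / max(plays[s], 1) for s in strategies}
    mean_fitness = sum(population[s] * fitness[s] for s in strategies)
    return {s: population[s] * fitness[s] / mean_fitness for s in strategies}

for _ in range(20):
    population = generation(population)
print(population)  # the 6-demanders die out and the 5-demanders take over this mix
```

Strategies whose demands depend on the opponent (like the tale's 9-bots and FDT agents) fit the same bookkeeping; only the payoff call changes.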

Based on the quote from Jessica Taylor, it seems like the FDT agents are trying to maximize their long-term share of the population, rather than their absolute payoffs in a single generation? If I understand the model correctly, that means the FDT agents should try to maximize the ratio of FDT payoff : 9-bot payoff (to maximize the ratio of FDT:9-bot in the next generation). The algebra then shows that they should refuse to submit to 9-bots once the population of 9-bots gets low enough (Wolfram|Alpha link), without needing to drop the random encounters ass... (Read more)

A dominant framework in rationality is internal alignment [citation needed]: sort out conflicts between parts of yourself, stop working at cross-purposes to yourself, stop doing internal violence, aim to take coherent action based on coherent beliefs towards coherent goals, etc. I think the alternate/complementary orientation of aiming for internal empowerment is often neglected / underemphasized. By internal empowerment I mean prioritizing giving each "part" (subsystem, motive, drive, goal, desire, subagent, whatever) the resources it needs to increase its capability to understand the world,

... (Read more)

I think this is great advice. I find in myself and others that a common source of psychological shadow is the blocking out of parts of the self in a failed attempt to achieve an end that is ultimately counterproductive, even if it occasionally works in limited circumstances.

The recent adversarial collaboration on spiritual experiences on Slate Star Codex includes this paragraph:

It was also discovered that people in the United States, Australia, the United Kingdom, and Scandinavia do not tend to share their spiritual experiences with others. Hood et al. wonder if this is why such spiritual experiences are thought to be uncommon (as fewer people in these societies might have heard reports of others’ spiritual experiences).

This naturally led me to wonder: what spiritual experiences have LessWrong readers had that they are willing to share, since the readers... (Read more)

Note that I would not usually describe this as a spiritual experience.

2Said Achmiz3hWhat exactly constitutes a “spiritual experience” or “perception” or what have you? That is—what, specifically, are you asking about? (I don’t think I’ve ever had any “spiritual experience”, but perhaps this is a mere difference of terminology…?) EDIT: Ah, I just realized this was a question and I posted this as an answer and not a comment. Is it possible for a moderator to change it?
3Raemon3h(note: on LessWrong I believe you should be able to move comments and answers back and forth yourself)
5Answer by noggin-scratcher3hThe closest experience that comes to mind was in an undergraduate tutoring session for a first-year mathematics module, where "just for fun" at the end of the session we were taken along a path of derivations from the subject matter we'd just covered, up into some more abstract math, and then back down into something more concrete and familiar that had (until that point) always seemed like an entirely separate area of mathematics. For a brief moment it was like everything fell into place, and I was face to face with the infinite / eternal / perfect structure of the universe. But then the session ended and the spell broke, and I realised I couldn't quite remember it all well enough to recreate what had just happened. But there's no experience I can report that ever made me suspect the involvement of the supernatural or the divine.
Defining "Antimeme"
10 · 2d · 1 min read

An antimeme is a meme with the following three characteristics:

  • Learning it threatens the egos and identities of adherents to the mainstream of a culture[1].
  • Learning the meme renders mainstream knowledge in the field unimportant by broadening the problem space of a knowledge domain, usually by increasing the dimensionality.
  • Mainstream wisdom considers detailed knowledge of the antimeme irrelevant, unimportant or low priority. Mainstream culture may just ignore the antimeme altogether instead.

I call these "antimemes" because they exhibit behavior opposite that of regular memes. The typical

... (Read more)

Words can't be defined arbitrarily, so I am going to examine your definition first.

First, I am not sure what exactly counts as "mainstream", and why it is even important. What you describe seems like a relationship between a meme and a culture, whether large or small. So you could have "anti-memes of antimemes" as Isnasene describes. Or you could have a polarized society with two approximately equally large cultures, each of them having their own "anti-memes". Or a small minority, such as a cult, that strongly ignores the s... (Read more)

Maybe this is a well known kind of problem but I am a novice and it looks puzzling to me.

Here is a lottery: I have these two choices:

  • (a) get 0.5$ for sure
  • (b) win 1$ with probability or nothing with probability

My utility function is .

What should I choose?

Let's compute the expected utilities:

  • expected utility for one single game is for (b) while for (a) is so I have maximized expected utility with choice (a)
  • if I compute expected utility for two games I get a different prescription:
    • utility for choosing (a) two tim
... (Read more)

The intuitive result you would expect only holds for utility functions which are linear in x (I believe...), since we could then apply the utility function at each step and it would yield the same value as if applied to the whole amount.

Another case would be if you were to receive your utility immediately after playing each game (like in a reinforcement learning algorithm). In those cases the utility function is also applied to each outcome separately and would yield the result you would expect.

Also: (b) has a better EV in terms of raw $ and due to law of large numbers we wou... (Read more)
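To make the linearity point concrete: the question's utility function and win probability did not survive formatting, so as purely illustrative stand-ins take u(x) = sqrt(x) and a 0.6 chance of winning $1. With those numbers, maximizing expected utility game by game and maximizing it over the two-game bundle recommend different plans, which is the kind of mismatch the question describes:

```python
from itertools import product
from math import sqrt

# Illustrative stand-ins for the question's lost formulas: u(x) = sqrt(x),
# and option (b) pays $1 with probability 0.6, else nothing.
P_WIN = 0.6
u = sqrt

def outcomes(choice):
    # (probability, dollars) pairs for a single play of option (a) or (b).
    if choice == "a":
        return [(1.0, 0.5)]
    return [(P_WIN, 1.0), (1.0 - P_WIN, 0.0)]

def expected_utility(plan):
    # Expected utility of the *total* winnings from a sequence of choices.
    eu = 0.0
    for combo in product(*(outcomes(c) for c in plan)):
        prob = 1.0
        total = 0.0
        for p, x in combo:
            prob *= p
            total += x
        eu += prob * u(total)
    return eu

print(expected_utility("a"), expected_utility("b"))  # ~0.707 vs 0.600: (a) wins a single game
for plan in ["aa", "ab", "bb"]:
    print(plan, round(expected_utility(plan), 4))    # "ab" ~1.0177 beats "aa" = 1.0 over two games
```

With a utility function linear in dollars the two views always agree; with a nonlinear one they can come apart, which is the reply's point.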

agai's Shortform
1 · 2d

Trying to find Katja Grace's account which she mentions she has here for a PM conversation. If someone (or herself) would PM me that would be awesome.

[This comment is no longer endorsed by its author]
3TAG12hLinux has had the advantages it has for twenty years...so why now?
1agai4hI have two default questions when attempting to choose between potential actions: I ask both "why" and "why not?".
5Viliam14hIt's called progress. In my youth, we only had a bridge to sell you.

In light of reading Hazard's Shortform Feed -- which I really enjoy -- based on Raemon's Shortform feed, I'm making my own. There be thoughts here. Hopefully, this will also get me posting more.

Hmm. It may actually be possible to regenerate the motor neurons (or repurpose the already existing ones somehow). I'm not sure on the exact differences between them.

Somehow the action I would expect to help is for the person's limbs to be moved by others/machines as if they are acting themselves, because I think the body can adapt somehow?

Difficult to be specific without reading a lot of biology here though.

My post and Twitter thread about the controversy over the 1954 polio vaccine trials generated many replies on Twitter, so here is a followup.

First, I’m very sympathetic to the dilemma that Salk faced. I think it’s a tough problem, and it’s worth thinking about different ways to approach it. I didn’t mean to cast aspersions on Salk.

One way in general to improve this situation is to make sure that all the controls get the treatment immediately after the trial, if it is proved safe and effective. But in this case, that wouldn’t have changed anything. Polio was a... (Read more)

Gamma Andromeda, where philosophical stoicism went too far. Its inhabitants, tired of the roller coaster ride of daily existence, decided to learn equanimity in the face of gain or misfortune, neither dreading disaster nor taking joy in success.

But that turned out to be really hard, so instead they just hacked it. Whenever something good happens, the Gammandromedans give themselves an electric shock proportional in strength to its goodness. Whenever something bad happens, the Gammandromedans take an opiate-like drug that directly stimulates the pleasure centers of their brain, in a dose propor... (Read more)

" So another research program was started, and the result were fully immersive, fully life-supporting virtual reality capsules. Stacked in huge warehouses by the millions, the elderly sit in their virtual worlds, vague sunny fields and old gabled houses where it is always the Good Old Days and their grandchildren are always visiting. "


Is this a reference to the Futurama episode with the Death Star-type thing with all the old people in it?

Reply to: Meta-Honesty: Firming Up Honesty Around Its Edge-Cases

Eliezer Yudkowsky, listing advantages of a "wizard's oath" ethical code of "Don't say things that are literally false", writes—

Repeatedly asking yourself of every sentence you say aloud to another person, "Is this statement actually and literally true?", helps you build a skill for navigating out of your internal smog of not-quite-truths.

I mean, that's one hypothesis about the psychological effects of adopting the wizard's code.

A potential problem with this is that human natural language contains a lot of ambiguity. Words can

... (Read more)

It seems to me like 'intent to inform' is worth thinking about in the context of its siblings: 'intent to misinform' and 'intent to conceal.' Cousins, like 'intent to aggrandize' or 'intent to seduce' or so on, I'll leave to another time, tho you're right to point out they're almost always present, if just by being replaced by their reaction (like self-deprecation, to be sure of not self-aggrandizement).

Quakers were long renowned for following four virtues: peace, equality, simplicity, and truth. Unlike wizards, they have the benefit of being real, and so

... (Read more)
5Dagon8hLet me argue for intentionality in communication. If your intent is to inform and communicate a fact, do so. If your intent is to convince someone to undertake an action, do so. If your intent is to impress people with your knowledge and savvy, do so. If your intent is to elicit ideas and models to see where you differ, do so. One size does not fit all situations or all people. Talking is an act. Choose the mechanisms that fit your goals, like you do in all actions. Humans aren't perfect, and most humans are actually pretty bad at both giving and receiving "honest" communication. Attempting to communicate with them on the level you prefer, rather than the level they're ready for, is arrogant and unhelpful. Humans are not naturally nor necessarily aligned with your goals (in fact, nobody is fully aligned, though many of us are compatible if you zoom out far enough). It's an important social fiction to pretend they are, in order to cooperate with them, but you don't have to actually believe this falsehood.
10Viliam10hI wouldn't mind removing hyperboles from socially accepted language. Don't say "everyone" if you don't mean literally everyone, duh. (I suppose that many General Semantics fans would agree with this.) For me a complicated question is one that compares against an unspecified standard, such as "is this cake sweet?" I don't know what kind of cakes you are used to eating, so maybe what's "quite sweet" to me is "only a bit sweet" for you. Telling literal truths, such as "yes, it has a nonzero amount of sugar, but also a nonzero amount of other things" will not help here. I don't know exactly how much sugar it contains. So, "it tastes quite sweet to me" is the best I can do here. Maybe that should be the norm. I agree about the "nearest unblocked strategy". You make the rules; people maximize within the rules (or break them when you are not watching). People wanting to do X will do the thing closest to X that doesn't break the most literal interpretation of the anti-X rules (or break the rules in a deniable way). -- On the other hand, even trivial inconveniences can make a difference. We are not discussing a superhuman AI trying to get out of the box, but humans with limited willpower who may at some level of difficulty simply give up. The linked article "telling truth is social aggression" ignores the fact that even in competition, people make coalitions. And if you have large numbers of players, math is in favor of cooperation, at least on a relatively small scale. If your school grades on a curve, it discourages helping your classmate without getting anything in return. But mutual cooperation with one classmate still helps you both against the rest of the class. The same is true about helping people create better models of the world, when the size of your group is tiny compared to the rest of the population. The real danger these days usually isn't the Gestapo, but thousands of Twitter celebrities trying to convert parts of your writing taken out of context into polarizing twe
6Benquo11hhttps://www.lesswrong.com/posts/xdwbX9pFEr7Pomaxv/meta-honesty-firming-up-honesty-around-its-edge-cases [https://www.lesswrong.com/posts/xdwbX9pFEr7Pomaxv/meta-honesty-firming-up-honesty-around-its-edge-cases]

Cross-posted to the EA forum here.

Introduction

As in 2016, 2017 and 2018, I have attempted to review the research that has been produced by various organisations working on AI safety, to help potential donors gain a better understanding of the landscape. This is a similar role to that which GiveWell performs for global health charities, and somewhat similar to a securities analyst with regards to possible investments.

My aim is basically to judge the output of each organisation in 2019 and compare it to their budget. This should give a sense of the organisations' average cost-effectivenes... (Read more)

See also My current thoughts on MIRI's "highly reliable agent design" work by Daniel Dewey (Open Phil lead on technical AI grant-making).

From the "What do I think of HRAD?" section:

... This reduces my credence in HRAD being very helpful to around 10%. I think this is the decision-relevant credence.
What are you reading?
6 · 4d · 1 min read

In my short-form, I write:

[...] This is way more obvious and way more clear in Inadequate Equilibria. Take a problem, a question and deconstruct it completely. It was concise and to the point, I think it's one of the best things Eliezer has written; I cannot recommend it enough.

Just finished Inadequate Equilibria. Now, I'm reading:

  • The Big Picture from Sean Carroll (which seems a really, really good companion to The Sequences.) I'm at chapter 17/50, and I'm really enjoying it so far; it's an ambitious book though!
  • In fiction I picked up UNSONG from Scott Alexander; I a
... (Read more)

What is your verdict?

I'm currently reading through his blog Metamoderna and feel like there are some similarities to rationalist thoughts on there (e.g. this post on what he calls "game change" and this post on what he calls proto-synthesis).

jacobjacob's Shortform Feed
18 · 5mo · 1 min read

What it says on the tin.

I made a Foretold notebook for predicting which posts will end up in the Best of 2018 book, following the LessWrong review.

You can submit your own predictions as well.

At some point I might write a longer post explaining why I think having something like "futures markets" on these things can create a more "efficient market" for content.
