 My guess is mostly that the space is so wide that you don't even end up with AIs warping existing humans into unrecognizable states, but do in fact just end up with the people dead

Why? I see a lot of opportunities for s-risk or just generally suboptimal futures in such options, but "we don't want to die, or at any rate we don't want to die out as a species" seems like an extremely simple, deeply-ingrained goal that almost any metric by which the AI judges our desires should be expected to pick up, assuming it's at all pseudokind. (In many cases, humans do a lot to protect endangered species without doing diddly-squat to fulfill individual specimens' preferences!)

It's about trade-offs. HPMOR/an equally cringey analogue will attract a certain sector of weird people into the community who can then be redirected towards A.I. stuff — but it will repel a majority of novices because it "taints" the A.I. stuff with cringiness by association.

This is a reasonable trade-off if:

  1. the kind of weird people who'll get into HPMOR are also the kind of weird people who'd be useful to A.I. safety;
  2. the normies were already likely to dismiss the A.I. stuff with or without the added load of cringe.

In the West, 1. is true because there's a strong association between techy people and niche fandom, so even though weird nerds are a minority, they might represent a substantial fraction of the people you want to reach. And 2. is kind of true for a related reason, which is that "nerds" are viewed as generally cringe even if they don't specifically talk about HP fanfiction; it's already assumed that someone who thinks about computers all day is probably the kind of cringe who'd be big into a semi-self-insert HP fanfiction.

But in China, from @Lao Mein's testimony, 1. is definitely not true (a lot of the people we want to reach would be on Team "this sounds weird and cringe, I'm not touching it") and 2. is possibly not true (if computer experts ≠ fandom nerds in Chinese popular consciousness, it may be easier to get broad audiences to listen to a non-nerdy computer expert talking about A.I.). 

If I was feeling persistently sad or hopeless and someone asked me for the quality of my mental health, and I had the energy to reply, I would reply ‘poor, thanks for asking.’

I wouldn't, not if I was in fact experiencing a rough enough patch of life that I rationally and correctly believed these feelings to be accurate. If I had been diagnosed with terminal cancer, for example, I would probably say that I was indeed sad and hopeless, but not that I had any mental health issues; indeed I'd be concerned about my mental health if I wasn't feeling that way. I find that this extends to beliefs about the future in general being screwed rather than your personal future (take A.I. doomerism: I think Eliezer is fairly sad and hopeless, and I don't think he'd say that makes him mentally ill). So if 13% of the kids genuinely believe to some degree that their personal life sucks and will realistically always suck, and/or that the world is doomed for whatever combination of climate change and other known or perceived x-risks, that would account for this, surely?

At a guess, focusing on transforming information from images and videos into text, rather than generating text qua text, ought to help — no? 

We maybe need an introduction to all the advance work done on nanotechnology for everyone who didn't grow up reading "Engines of Creation" as a twelve-year-old or "Nanosystems" as a twenty-year-old.

Ah. Yeah, that does sound like something LessWrong resources have been missing, then, and not just for my personal sake. Anecdotally, I've seen several why-I'm-an-AI-skeptic posts circulating on social media in which "EY makes crazy leaps of faith about nanotech" was a key reason the author rejected the overall AI-risk argument.

(As it stands, my objection to your mini-summary would be that, sure, "blind" grey goo does trivially seem possible, but programmable/'smart' goo that seeks out e.g. computer CPUs in particular could be a whole other challenge, and a less obviously solvable one looking at bacteria. But maybe that "common-sense" distinction dissolves with a better understanding of the actual theory.)

Hang on — how confident are you that this kind of nanotech is actually, physically possible? Why? In the past I've assumed that you used "nanotech" as a generic hypothetical example of technologies beyond our current understanding that an AGI could develop and use to alter the physical world very quickly. And it's a fair one as far as that goes; a general intelligence will very likely come up with at least one thing as good as these hypothetical nanobots. 

But as a specific, practical plan for what to do with a narrow AI, this just seems like it makes a lot of specific unstated assumptions about what you can in fact do with nanotech in particular. Plausibly the real technologies you'd need for a pivotal act can't be designed without thinking about minds. How do we know otherwise? Why is that even a reasonable assumption?

Slightly boggling at the idea that nuts and eggs aren't tasty? And I completely lose the plot at "condiments". Isn't the whole point of condiments that they are tasty? What sort of definition of "tasty" are you going with?

Yes, I agree. This is why I said "I don't think this is correct". But unless you specify this, I don't think a layperson would guess this.

Thank you! This is helpful. I'll start with the bit where I still disagree and/or am still confused, which is the future people. You write:

The reductio for caring more about future peoples' agency is in cases where you can just choose their preferences for them. If the main thing you care about is their ability to fulfil their preferences, then you can just make sure that only people with easily-satisfied preferences (like: the preference that grass is green) come into existence.

Sure. But also, if the main thing you care about is their ability to be happy, you can just make sure that only people whom green grass sends to the heights of ecstasy come into existence? This reasoning seems like it proves too much. 

I'd guess that your reply is going to involve your kludgier, non-wireheading-friendly idea of "welfare". And that's fair enough in terms of handling this kind of dilemma in the real world; but running with a definition of "welfare" that smuggles in that we also care about agency a bit… seems, to me, like it muddles the original point of wanting to cleanly separate the three "primary colours" of morality.

That aside:

Re: animals, I think most of our disagreement just dissolves into semantics. (Yay!) IMO, keeping animals away from situations which they don't realize would kill them just falls under the umbrella of using our superior knowledge/technology to help them fulfill their own extrapolated preference to not-get-run-over-by-a-car. In your map this is probably taken care of by your including some component of agency in "welfare", so it all works out.

Re: caring about paperclip maximizers: intuitively I care about creatures' agencies iff they're conscious/sentient, and I care more if they have feelings and emotions I can grok. So, I care a little about the paperclip-maximizers getting to maximize paperclips to their heart's content if I am assured that they are conscious; and I care a bit more if I am assured that they feel what I would recognise as joy and sadness based on the current number of paperclips. I care not at all otherwise.

I like this breakdown! But I have one fairly big asterisk — so big, in fact, that I wonder if I'm misunderstanding you completely.

Care-morality mainly makes sense as an attitude towards agents who are much less capable than you - for example animals, future people, and people who aren’t able to effectively make decisions for themselves.

I'm not sure animals belong on that list, and I'm very sure that future people don't. I don't see why it should be more natural to care about future humans' happiness than about their preferences/agency (unless, of course, one decides to be that breed of utilitarian across the board, for present-day people as well as future ones). 

Indeed, the fact that one of the futures we want to avoid is one of future humans losing all control over their destiny, and instead being wireheaded to one degree or another by a misaligned A.I., handily demonstrates that we don't think about future-people in those terms at all, but in fact generally value their freedom and ability to pursue their own preferences, just as we do our contemporaries'. 

(As I said, I also disagree with taking this approach for animals. I believe that insofar as animals have intelligible preferences, we should try to follow those, not perform naive raw-utility calculations, so that e.g. the question is not whether a creature's life is "worth living" in terms of a naive pleasure/pain ratio, but whether the animal itself seems to desire to exist. That being said, I do know a nonzero number of people in this community have differing intuitions on this specific question, so it's probably fair game to include in your descriptive breakdown.)
