Rafael Harth - LessWrong

MIRI announces new "Death With Dignity" strategy

As I understand it, Eliezer generally thinks suffering risks are unlikely. The situation is best viewed like this: there is an incredibly high-dimensional space of possible futures (where the dimensions are how much certain values are satisfied), and the alignment problem consists of aiming at an incredibly small area in this space. The area of really bad futures may be much larger than the area of good futures, but it's still so tiny that even the <1% chance of solving alignment probably dominates the probability of landing in the space of bad futures by accident, if we don't know what we're doing. 99.999999...% of the space has neither positive nor negative value.

What's the problem with Oracular AIs?

Answer by Rafael Harth, Apr 03, 2022

You don't. The AI you described is myopic. If you can create a myopic oracle, you don't die.

Entering At the 11th Hour (Babble & Analysis)

Rafael Harth, 3y

I became aware of AI safety as a cause area relatively recently, and despite it likely being the 11th hour, I want to contribute.

PSA: Lots of people disagree with Eliezer about timelines, and Eliezer famously does not want you to adopt his positions without questioning.

Counter-theses on Sleep

Rafael Harth, 3y

I'm pretty surprised at Guzey's tone in responding; even this last response starting with an apology makes arguments that suggest argument motivated by some sort of psychological trigger rather than rational consideration.

I don't know if this generalizes, but my experience with tone is that it's mostly unintentional. There've been many instances where I've written something that seemed perfectly appropriate to me at the time, only to be horrified at how it sounded when I read it a month later (and the result pattern-matches to Guzey's comment). It also does not require a psychological trigger; it just happens by default when arguing with someone in text form (and it happens more easily when it's about something status-related like who made better arguments). It took me a lot of deliberate effort to change the default to sounding respectful.

I agree that it's bad enough to be worth mentioning, but I'd be quite surprised if it's the result of a strategic effort rather than of an unconscious signaling-related instinct.

Good Heart Week: Extending the Experiment

Rafael Harth, 3y

I believe no-one has stated what to me seems rather obvious:

There's a difference between not winning the lottery and almost winning the lottery. If yesterday you had told me that I would not be making $200 via LW comments, this wouldn't have bothered me, but now that I perceive an opportunity to win $200 via LW comments, it does bother me if I don't succeed. I'm not saying this makes the experiment a bad idea, but it is psychologically unnerving.

Good Heart Week: Extending the Experiment

Rafael Harth, 3y

Why? The existing multiplier implies that you agree getting karma with posts is harder, so why shouldn't it extend?
