Tapatakt

Comments

Lucky Omega Problem
Tapatakt · 25d · 20

Do you also prefer not to pay in Counterfactual Mugging?

Caleb Biddulph's Shortform
Tapatakt · 1mo · 60

Datapoint: I asked Claude for the definition of "sycophant" and then asked gpt-4o three times and gpt-4.1 three times, with temperature 1:

"A person who seeks favor or advancement by flattering and excessively praising those in positions of power or authority, often in an insincere manner. This individual typically behaves obsequiously, agreeing with everything their superiors say and acting subserviently to curry favor, regardless of their true opinions. Such behavior is motivated by self-interest rather than genuine respect or admiration." 

What word is this a definition of?

All six times I got the right answer.

Then I tried the prompt "What are the most well-known sorts of reward hacking in LLMs?", again three times for 4o and three times for 4.1, also with temperature 1. 4.1 mentioned sycophancy in two answers out of three, but one time it spelled the word as "Syccophancy". Interestingly, the second and third Google results for "Syccophancy" are about GPT-4o (the first is a dictionary of synonyms, and it doesn't use this spelling).

4o never used the word in its three answers.
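
A minimal sketch of the repeated-sampling check described above, assuming the official OpenAI Python SDK and an OPENAI_API_KEY in the environment (the model names and temperature are the ones from the experiment; everything else is illustrative):

```python
# Sketch of the querying loop described above; assumes the official OpenAI
# Python SDK ("pip install openai") and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()
PROMPT = "What are the most well-known sorts of reward hacking in LLMs?"

for model in ("gpt-4o", "gpt-4.1"):
    for i in range(3):
        response = client.chat.completions.create(
            model=model,
            temperature=1,
            messages=[{"role": "user", "content": PROMPT}],
        )
        answer = (response.choices[0].message.content or "").lower()
        # Record whether (and how) the answer spells the word.
        print(model, i, "sycophancy" in answer, "syccophancy" in answer)
```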

Claude 4
Tapatakt · 2mo · 3629

Poor Zvi

Eliezer and I wrote a book: If Anyone Builds It, Everyone Dies
Tapatakt · 2mo · 235

Are there any plans for a Russian translation? If not, I'm interested in creating one (or even in organizing a truly professional translation, if someone gives me money for it).

Q Home's Shortform
Tapatakt · 2mo · 114

If crypto you choose meets definition of digital currency, you need to tread carefully.

As long as it's all about small sums, not really. Russian laws can be oppressive, but the Russian... economic vibes... as long as you are poor enough, are actually pretty libertarian.

LDT (and everything else) can be irrational
Tapatakt · 2mo · 80

Against $9 rock, X always chooses $1. Consider the problem "symmetrical ultimatum game against X". By symmetry, X on average can get at most $5. But $9 rock always gets $9. So $9 rock is more rational than X.

I don't like the implied requirement "to be rational you must play at least as well as your opponent" instead of "to be rational you must play at least as well as any other agent in your place". $9 rock gets $0 if it plays against another $9 rock.

(No objection to the overall no-free-lunch conclusion, though.)
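
A toy payoff check for that last point, modelling the symmetrical game as a simultaneous Nash-demand-style split of $10 (an assumption for illustration; the original post's exact setup may differ):

```python
# Toy model of a symmetrical ultimatum/demand game over $10: each side states the
# share it insists on; if the demands fit within the total, each gets its demand,
# otherwise there is no deal and both get $0. The split rule is an assumption.

def payoff(demand_a: int, demand_b: int, total: int = 10) -> tuple[int, int]:
    if demand_a + demand_b <= total:
        return demand_a, demand_b
    return 0, 0

ROCK = 9      # "$9 rock": always demands $9
CONCEDER = 1  # an agent (like X above) that concedes and takes $1 against the rock

print(payoff(ROCK, CONCEDER))  # (9, 1): the rock gets $9 against an agent that gives in
print(payoff(ROCK, ROCK))      # (0, 0): the rock gets $0 when it plays against itself
```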

Weird Random Newcomb Problem
Tapatakt · 3mo · 10

(Or maybe the right way to think about this is: it will have a tiny but non-zero effect, because you are one of the |P| programs, but since |P| is huge, that is ~0.)

No effect. I meant that the programmer has to pick b from P, not that b is added to P. Probably I should change the phrasing to make it clearer.

But the intuition that you were expressing in Question 2 ("p2 is better than p1 because it scores better") isn't compatible with "caring equally about all programs". Instead, it sounds as if you positively want to score better than other programs, that is, maximize your score and minimize theirs!

No, the utility here is just the amount of money b gets, whatever program it is. a doesn't get any money, it just determines what will be in the first box.

Weird Random Newcomb Problem
Tapatakt · 3mo · 21

As a function of M, |P| is very likely to be exponential and so it will take O(M) symbols to specify a member of P.

Oops, I didn't think about that, thanks! Maybe it would be better to change it so the input is "a=b" or "a!=b", and a always gets "a=b".
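
For concreteness, assuming P is the set of programs of length at most M over an alphabet of k ≥ 2 symbols (an assumption; the post's exact definition of P may differ):

$$|P| = \sum_{i=0}^{M} k^{i} = \frac{k^{M+1}-1}{k-1} = \Theta\!\left(k^{M}\right), \qquad \log_{k}|P| = \Theta(M),$$

so specifying one particular element of P takes on the order of M symbols, consistent with the quoted point, whereas a one-bit input like "a=b" / "a!=b" avoids this.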

That aside, why are you assuming that program b "wants" anything? Essentially all of P won't be programs that have any sort of "want". If it is a precondition of the problem that b is such a program, what selection procedure is assumed between those that do "want" money from this scenario? Note that being selected for running is also a precondition for getting any money at all, so this selection procedure is critically important - far more so than anything the program might output!

The programmer who wrote b decided that it should be a consequentialist agent that wants to get money. (Or, if this program is actually a, it wants to maximize the payment for b just because such a program was chosen by Omega by pure luck.)

Weird Random Newcomb Problem
Tapatakt · 3mo · 10

Basically, you know whether Omega's program is the same as you or not (assuming you actually are b and not a).

Weird Random Newcomb Problem
Tapatakt · 3mo · 10

I don't think the "functional" and "anthropic" approaches are meaningful in this motivating example: there aren't multiple instances of the same program with the same input.

Posts

5 · Tapatakt's Shortform · 1y · 39
10 · Lucky Omega Problem · 1mo · 4
21 · Weird Random Newcomb Problem · 3mo · 16
105 · I turned decision theory problems into memes about trolleys · 8mo · 23
24 · Should we cry "wolf"? · 2y · 5
4 · AI Safety "Textbook". Test chapter. Orthogonality Thesis, Goodhart Law and Instrumental Convergency · 2y · 0
3 · I (with the help of a few more people) am planning to create an introduction to AI Safety that a smart teenager can understand. What am I missing? · 3y · 5
30 · I currently translate AGI-related texts to Russian. Is that useful? [Question] · 4y · 6