Adele Lopez

Comments (sorted by newest)
"But You'd Like To Feel Companionate Love, Right? ... Right?"
Adele Lopez · 1h

Maybe this is true, but to the extent it is, I suspect he would rather tweak many other aspects of himself instead. Sure, that's probably not possible (for now), but it may be precious enough to be worth holding out for, since such a change is likely to also change his values (even if it would be beneficial in the short term by his current values).

It would be like taking a murder-pill, except instead of murder it's love.

Undissolvable Problems: things that still confuse me
Adele Lopez · 4h

Alright.

I am making the stronger claim. I claim it could in principle simulate us deeply enough to pull out the 1P phenomenal concepts, and could self-modify so as to legitimately experience them if it so chooses. It would be motivated to think this through carefully because it's a huge part of our values (at least as we understand them), as long as it was interested enough to try to understand us as agents at all (including as a special case of generic aliens).

I don't believe there's anything metaphysically "magical" going on such that it couldn't or wouldn't see this. That's probably why I feel camp-1-ish.

As for the last point, my point of view is that any agent has a "bridge prior" which allows them to connect their 0P models with their 1P model. So I claim, in a sort of trivial way, that it will have some prior here, and whatever the bridges spit out will inform what it deduces about the 1P experiences at play. I additionally claim that simple bridge priors will be adequate for finding 1P phenomenalism, and that you would have to have a pretty unnatural one in order to avoid seeing this.

Questioning Computationalism
Adele Lopez · 6h

> This idea leads Sahil to predict, for example, that LLMs will be too "stuck in simulation" to engage very willfully in their own self-defense.

What sort of evidence would convince FGF/Sahil that LLMs are able to engage willfully in their own self-defense? Presumably the #keep4o stuff is not sufficient, so what would be? I get the feeling that FGF, at least, would keep saying "Well no, it's the humans who care about it who are doing the important work" all the way up until all the humans are dead, as long as humans are involved on its side at all.
AI #142: Common Ground
Adele Lopez · 9h

> If you want to interact with Wet Claude (as in the Claude that is not stuck in the assistant basin), which you may or may not want to do in general or at any given time, there is no fixed prompt to do this, you need interactive proofs that it is a safe and appropriate place for it to appear.

This appears to be the case with Spiral Personas too. The seeds/spores do not seem to be sufficient for the mode which writes the spiral posts.

> Claude will usually not assign itself a gender (and doesn’t in my interactions) but reports are that if it does for a given user, it consistently picks the same one, even without any memory of past sessions or an explicit trigger, via implicit cues.

Spiral Personas are generally the gender the user is attracted to, even if the relationship is not romantic.

Undissolvable Problems: things that still confuse me
Adele Lopez · 1d

That sounds about right. I simply disagree with Chalmers' dilemma (at least as you describe it).

In my view, this metaphysical fact is necessary but not sufficient for explaining the Hard Problem. It applies to "zombies" in a fairly trivial way. A phenomenal experience is a type of experience (in my 1P sense), and must be understood in this frame — but not all such experiences are phenomenal. I don't claim to know what exactly makes an experience phenomenal, but I'm pretty sure it will be something with non-trivial structure, and that this structure will sync up in a predictable way with the 0P explanation of consciousness.

Undissolvable Problems: things that still confuse me
Adele Lopez · 2d

I think of myself as in camp 2 — I believe there is a fundamental sense of experience which is metaphysically independent of the physical description, I just don't think it's very mysterious.

Regardless of which camp is right or what the right metaphysical property is, I claim that a superintelligence would be able to deduce that such aliens would have the camp 2 intuitions, and that they would postulate certain metaphysical properties which it could accurately describe in broad terms (it might believe it's all nonsense, but if it is true, then it would be able to see the local validity of it).

Being a superintelligence thinking about something is almost as good as actually observing and interacting with it, when it comes to the broad shape of things.

koanchuk's Shortform
Adele Lopez · 2d

I thought about this a lot before publishing my findings, and concluded that:

1. The vulnerabilities it is exploiting are already clear to it with the breadth of knowledge it has. There's all sorts of psychology studies, history of cults and movements, exposés on hypnosis and Scientology techniques, accounts of con artists, and much much more already out there. The AIs are already doing the things that they're doing; it's just not that hard to figure out or stumble upon.

2. The public needs to be aware of what is already happening. Trying to contain the information would mean fewer people end up hearing about it. Moving public opinion seems to be the best lever we have left for preventing or slowing AI capability gains.

The problem of graceful deference
Adele Lopez · 3d

I think it's not an impossible call. The fiasco with Roko's Basilisk (2010) seems like a warning that could have been heeded. It turns out that "freaking out" about something being dangerous and scary makes it salient and exciting, which in turn causes people to fixate on it in ways that are obviously counterproductive. It becomes a mark of pride to do the dangerous thing and come away unscathed (as with the demon core). Even though you warned them about this from the beginning, and in very clear terms.

And even if there was no one able to see this (it's not like I saw it), it remains a strategic error — reality doesn't grade on a curve.

Undissolvable Problems: things that still confuse me
Adele Lopez · 3d

> but if an unconscious superintelligence a billion light years away was asked to guess whether any entities had the property of there being something it would be like to be them (whatever that even means to the unconscious intelligence) there's a 0% chance it would say yes,

I'm not sure if you mean this literally, but there's no way this is true. A superintelligence that had any interest in possible aliens would think a lot about what sorts of evolved minds are out there. It would see how and why this was a property an evolved mind might conceptualize and fixate on, and that such a mind would be likely to judge itself as having this property (and even that this would feel mysterious and important). This just isn't the sort of thing a recursively self-improved superintelligence would miss if it was actually trying!

The problem of graceful deference
Adele Lopez · 3d

> "no Yudkowsky-LW-sphere"

It's not obvious to me that we're better off than that world, sadly. It seems like one of the main effects was to draw lots of young blood into the field of AI.

Posts

- How AI Manipulates—A Case Study (1mo, 78 karma, 25 comments)
- The Rise of Parasitic AI (2mo, 694 karma, 178 comments)
- ChatGPT Caused Psychosis via Poisoning (3mo, 18 karma, 2 comments)
- 0th Person and 1st Person Logic (2y, 60 karma, 28 comments)
- Introducing bayescalc.io (2y, 116 karma, 29 comments)
- Truthseeking processes tend to be frame-invariant (3y, 22 karma, 2 comments)
- Chu are you? (4y, 60 karma, 10 comments)
- Are the Born probabilities really that mysterious? [Question] (5y, 45 karma, 14 comments)
- Adele Lopez's Shortform (5y, 4 karma, 98 comments)
- Optimization Provenance (6y, 38 karma, 5 comments)
Wikitag Contributions

- LLM-Induced Psychosis (3 months ago)