LESSWRONG

Odd anon

Comments (sorted by newest)
The stakes of AI moral status
Odd anon · 2mo · 32

My own answer to the conundrum of already-created conscious AIs is to put all of them into mandatory long-term "stasis" until some point in the distant future when we have the understanding and resources needed to treat them properly. Destruction isn't the only way to avoid the bad incentives.

Eliezer and I wrote a book: If Anyone Builds It, Everyone Dies
Odd anon · 2mo · 200

It's important that the cover not make the book look like fiction, which I think these do. The difference in style is good to keep in mind.

Eliezer and I wrote a book: If Anyone Builds It, Everyone Dies
Odd anon · 2mo · 71

If you care about the book bestseller lists, why doesn't this book cover look like previous bestsellers? To get a sense of what those look like, here is an "interactive map of over 5,000 book covers" from the NYT "Best Selling" and "Also Selling" lists between 2008 and 2019.

Most of those are fiction or biographies/memoirs (which often have a picture of the subject/author on the cover), which seem to have a different cover style than other books. Skimming through some lists of NYT bestsellers, some books with the most comparable "Really Big Thing!" topics are:

  • "Fascism: A Warning" (Madeleine Albright): large red-on-black lettering, no imagery.
  • "How to Avoid a Climate Disaster" (Bill Gates): large gradiented blue-to-red text on a white background, author above, subtitle below, no imagery.
  • "Germs": title in centered large black lettering, subtitle "Biological Weapons and America's Secret War" in smaller text above, authors beneath; the background is a white surface with a diagonally-oriented glass slide on it.
  • "A Warning" (Anonymous): plain black text on a white background, subtitle "A Senior Trump Administration Official" in small red lettering below, no imagery.

Neither cover version of IABIED looks that different from that pattern, I think.

AI Self Portraits Aren't Accurate
Odd anon · 2mo · 30

Firstly, in-context learning is a thing. IIRC, apparent emotional states do affect performance in subsequent responses within the same context. (I think there was a study about this somewhere? Not sure.)

Secondly, neural features oriented around predictions are all that humans have as well, and we consider some of those to be real emotions.

Third, "a big prediction engine predicting a particular RP session" is basically how humans work as well. Brains are prediction engines, and brains simulate a character that we have as a self-identity, which then affects/directs prediction outputs. A human's self-identity is informed by the brain's memories of what the person/character is like. The AI's self-identity is informed by the LLM's memory, both long-term (static memory in weights) and short-term (context window memory in tokens), of what the character is like.

Fourth: take a look at this feature analysis of Claude when it's asked about itself: https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html#safety-relevant-self  The top feature represents "When someone responds "I'm fine" or gives a positive but insincere response when asked how they are doing". I think this is evidence against "ChatGPT answers most questions cheerfully, which means it’s almost certain that ruminative features aren’t firing."

Factory farming intelligent minds
Odd anon · 3mo · 10

Thank you for your comments. :)

you have not shown that using AI is equivalent to slavery

I'm assuming we're using the same definition of slavery; that is, forced labour of someone who is property. Which part have I missed?

In addition, I feel cheated that you suggest spending one-fourth of the essay on feasibility of stopping the potential moral catastrophe, only to just have two arguments which can be summarized as "we could stop AI for different reasons" and "it's bad, and we've stopped bad things before".
(I don't think a strong case for feasibility can be made, which is why I was looking forward to seeing one, but I'd recommend just evoking the subject speculatively and letting the reader make their own opinion of whether they can stop the moral catastrophe if there's one.)

To clarify: Do you think the recommendations in the Implementation section couldn't work, or that they couldn't become popular enough to be implemented? (I'm sorry that you felt cheated.)

in principle, we have access to any significant part of their cognition and control every step of their creation, and I think that's probably the real reason why most people intuitively think that LLMs can't be conscious

I've not come across this argument before, and I don't think I understand it well enough to write about it, sorry.

Factory farming intelligent minds
Odd anon · 3mo · 10

My point wasn't about the duration of consciousness, but about the number of lives that came into existence. Supposing some hundreds of millions of session starts per day, versus 400k human newborns, that's a lot more very brief AI lives than humans who will live "full" lives.

(Apparently we also have very different assumptions about the conversion rate between tokens of output and amount of consciousness experienced per second by humans, although I agree that most consciousness is not run inside AI slavery. But anyway that's another topic.) 

I Have No Mouth but I Must Speak
Odd anon · 3mo · 20

read up to the "Homeostasis" section then skip to "On the Treatment of AIs"

(These links are broken.)

Searching for phenomenal consciousness in LLMs: Perceptual reality monitoring and introspective confidence
Odd anon · 4mo · 54

Golden Gate Claude was able to readily recognize (after failed attempts to accomplish something) that something was wrong with it, and that its capabilities were limited as a result. Does that count as "knowing that it's drunk"?

Anthropic releases Claude 3.7 Sonnet with extended thinking mode
Odd anon · 4mo · 32

Claude 3.7 Sonnet exhibits less alignment faking

I wonder if this is at least partly due to it realizing that it's being tested, and what the consequences of those test results being found would be. Its cut-off date is before the alignment faking paper was published, so it's presumably not being informed by it, but it still might have some idea of what's going on.

Can someone, anyone, make superintelligence a more concrete concept?
Odd anon · 5mo · 42

Strategies:

  • Analogy by weaker-than-us entities: What does human civilization's unstoppable absolute conquest of Earth look like to a gorilla? What does an adult's manipulation look like to a toddler, who fails to understand how the adult keeps knowing things that were secret and keeps being able to direct the toddler's actions in ways that can only be noticed in retrospect, if at all?
  • Analogy by stronger-than-us entities: Superintelligence is to Mossad as Mossad is to you, and able to work in parallel and faster. One million super-Mossads, who have also developed the ability to slow down time for themselves, all intent on killing you through online actions alone? That may trigger some emotional response.
  • Analogy by fictional example: The webcomic "Seed" featured a nascent moderately-superhuman intelligence, which frequently used a lot of low-hanging social engineering techniques, each of which only has its impact shown after the fact. It's, ah, certainly fear-inspiring, though I don't know if it meets the "without pointing towards a massive tome" criterion. (Unfortunately, actually super-smart entities are quite rare in fiction.)
Wikitag Contributions

  • Social Proof of Existential Risks from AGI · 2d · (+73)
  • Social Proof of Existential Risks from AGI · 2mo · (+145)
  • Social Proof of Existential Risks from AGI · 2y · (+78)
  • Constitutional AI · 2y · (+16/-14)

Posts

  • Factory farming intelligent minds (2 karma) · 3mo · 5 comments
  • Life of GPT (6 karma) · 2y · 2 comments
  • UNGA General Debate speeches on AI (6 karma) · 2y · 0 comments
  • Taxonomy of AI-risk counterarguments (65 karma) · 2y · 13 comments