It's important that the cover not make the book look like fiction, which I think these do. The difference in style is good to keep in mind.
If you care about the book bestseller lists, why doesn't this book cover look like previous bestsellers? To get a sense of what those look like, here is an "interactive map of over 5,000 book covers" from the NYT "Best Selling" and "Also Selling" lists between 2008 and 2019.
Most of those are fiction or biographies/memoirs (which often have a picture of the subject/author on the cover), and those seem to have a different cover style than other books. Skimming through some lists of NYT bestsellers, the books with the most comparable "Really Big Thing!" topics are:

- "Fascism: A Warning" (Madeleine Albright): large red-on-black lettering, no imagery.
- "How to Avoid a Climate Disaster" (Bill Gates): large blue-to-red gradient text on a white background, author above, subtitle below, no imagery.
- "Germs": title in centered large black lettering, the subtitle "Biological Weapons and America's Secret War" in smaller text above, authors beneath; the background is a white surface with a diagonally-oriented glass slide on it.
- "A Warning" (Anonymous): plain black text on a white background, the subtitle "A Senior Trump Administration Official" in small red lettering below, no imagery.

Neither cover version of IABIED looks that different from that pattern, I think.
First, in-context learning is a thing. IIRC, apparent emotional states do affect performance in subsequent responses within the same context. (I think there was a study about this somewhere? Not sure.)
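For concreteness, here's a minimal sketch of the kind of comparison I have in mind, assuming the OpenAI Python client; the model name, the prior-turn wordings, and the toy arithmetic tasks are all placeholders of mine, not taken from any study:

```python
# Hedged sketch: does an apparently distressed earlier assistant turn in the same
# context change performance on a later task? All prompts and the model are placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-4o-mini"  # placeholder model name

DISTRESSED_TURN = "Sorry, I keep getting things wrong today and it's really getting to me."
NEUTRAL_TURN = "Understood. Ready for the next question."

TASKS = [("What is 17 * 23?", "391"), ("What is 144 / 12?", "12")]  # toy task set

def accuracy(prior_assistant_turn: str) -> float:
    correct = 0
    for question, answer in TASKS:
        messages = [
            {"role": "user", "content": "Let's do some quick arithmetic."},
            {"role": "assistant", "content": prior_assistant_turn},
            {"role": "user", "content": question + " Reply with just the number."},
        ]
        reply = client.chat.completions.create(model=MODEL, messages=messages)
        if answer in reply.choices[0].message.content:
            correct += 1
    return correct / len(TASKS)

print("neutral prior turn:   ", accuracy(NEUTRAL_TURN))
print("distressed prior turn:", accuracy(DISTRESSED_TURN))
```

(Obviously you'd want far more tasks and repetitions than this to see a real effect.)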
Second, neural features oriented around predictions are all that humans have as well, and we consider some of those to be real emotions.
Third, "a big prediction engine predicting a particular RP session" is basically how humans work as well. Brains are prediction engines, and brains simulate a character that we have as a self-identity, which then affects/directs prediction outputs. A human's self-identity is informed by the brain's memories of what the person/character is like. The AI's self-identity is informed by the LLM's memory, both long-term (static memory in weights) and short-term (context window memory in tokens), of what the character is like.
Fourth, take a look at this feature analysis of Claude when it's asked about itself: https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html#safety-relevant-self The top feature represents "When someone responds 'I'm fine' or gives a positive but insincere response when asked how they are doing". I think this is evidence against "ChatGPT answers most questions cheerfully, which means it’s almost certain that ruminative features aren’t firing."
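To make "a feature firing" concrete, here's a hedged numpy sketch of the check involved; the ReLU encoder form is the standard sparse-autoencoder recipe, and the variable names are mine, not Anthropic's actual code:

```python
import numpy as np

def feature_activations(residual_stream: np.ndarray,  # (n_tokens, d_model) model activations at some layer
                        W_enc: np.ndarray,             # (d_model, n_features) SAE encoder weights
                        b_enc: np.ndarray,             # (n_features,) SAE encoder bias
                        feature_idx: int) -> np.ndarray:
    """Return one feature's activation on every token; it "fires" wherever this is > 0."""
    acts = np.maximum(residual_stream @ W_enc + b_enc, 0.0)  # standard ReLU SAE encoder
    return acts[:, feature_idx]

# Usage sketch: a "ruminative" feature counts as firing on a cheerful-sounding
# answer if its activations are nonzero on that answer's tokens, e.g.:
# fires_anywhere = (feature_activations(acts, W_enc, b_enc, ruminative_idx) > 0).any()
```

The point being: whether such features fire is an empirical question about the activations, not something you can read off from the cheerful surface text.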
Thank you for your comments. :)
you have not shown that using AI is equivalent to slavery
I'm assuming we're using the same definition of slavery; that is, forced labour of someone who is property. Which part have I missed?
In addition, I feel cheated that you devote one-fourth of the essay to the feasibility of stopping the potential moral catastrophe, only to offer just two arguments, which can be summarized as "we could stop AI for different reasons" and "it's bad, and we've stopped bad things before".
(I don't think a strong case for feasibility can be made, which is why I was looking forward to seeing one, but I'd recommend just raising the subject speculatively and letting readers form their own opinion of whether the moral catastrophe, if there is one, can be stopped.)
To clarify: Do you think the recommendations in the Implementation section couldn't work, or that they couldn't become popular enough to be implemented? (I'm sorry that you felt cheated.)
in principle, we have access to any significant part of their cognition and control every step of their creation, and I think that's probably the real reason why most people intuitively think that LLMs can't be conscious
I've not come across this argument before, and I don't think I understand it well enough to write about it, sorry.
My point wasn't about the duration of consciousness, but about the number of lives that came into existence. Supposing some hundreds of millions of session starts per day, versus roughly 400k human newborns per day, that's a lot more very brief AI lives than humans who will live "full" lives.
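For a rough sense of scale (taking those numbers at face value, and picking $3\times 10^{8}$ as a representative value for "some hundreds of millions", which is my assumption rather than a measured statistic):

$$\frac{\sim 3\times 10^{8}\ \text{session starts per day}}{\sim 4\times 10^{5}\ \text{human births per day}} \approx 750\ \text{AI "lives" begun per human birth}$$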
(Apparently we also have very different assumptions about the conversion rate between tokens of output and amount of consciousness experienced per second by humans, although I agree that most consciousness is not run inside AI slavery. But anyway that's another topic.)
read up to the "Homeostasis" section then skip to "On the Treatment of AIs"
(These links are broken.)
Golden Gate Claude was able to readily recognize (after failing attempts to accomplish something) that something was wrong with it, and that its capabilities were limited as a result. Does that count as "knowing that it's drunk"?
Claude 3.7 Sonnet exhibits less alignment faking
I wonder if this is at least partly due to it realizing that it's being tested, and what the consequences would be if the results of those tests were found. Its training cut-off date is before the alignment faking paper was published, so it's presumably not being informed by it, but it still might have some idea of what's going on.
Strategies:
My own answer to the conundrum of already-created conscious AIs is putting all of them into mandatory long-term "stasis" until such time in the distant future when we have the understanding and resources needed to treat them properly. Destruction isn't the only way to avoid the bad incentives.