The Alignment Problem Needs More Positive Fiction — LessWrong

21st Aug 2022

3 comments

[-]mu_(negative)3y30

Netcentrica, in this letter your explicit opinion is that fiction with a deep treatment of the alignment problem will not be palatable to a wider audience. I think this is not necessarily true. I think that compelling fiction is perhaps the prime vector for engaging a wider, naive audience. Even the Hollywood treatment of I, Robot touched on it and was popular. Not deep or nuanced, sure. But it was there. Maybe more intelligent treatments could succeed if produced with talent.

I mostly stopped reading sci-fi after the era of Asimov and Bradbury. I'd be interested in comments on which modern, popular authors have written or produced AI fiction with the most intelligent treatment of the alignment issue (or related issues), to establish a baseline.

[-]Netcentrica3y30

Reading your response I have to agree with you. I painted with too broad a brush there. Just because I don’t use elements the general public enjoys in my stories about benevolent AI doesn’t mean that’s the only way it can or has to be done.

Thinking about it now I’m sure stories could be written where there is plenty of action, conflict and romance, while also showing what getting alignment right would look like.

Thanks for raising this point. I think it’s an important clarification regarding the larger issue.

[-]lucid_levi_ackerman1y10

Do you think such stories would provide any value towards addressing the issue?

Yes, but what if instead of merely generating new fiction (that may or may not become popular/influential, and if it does, may or may not take years to do so), we inject benevolent AI concepts into established narratives strategically to engage with particularly thoughtful, aligned, and/or driven communities? Didn't actually get the idea from HPMOR, but the concept turns out similar.

ao3, Fot4W, ch5

What follows is the draft of a letter I wrote but did not send to the UC Berkeley Center for Human-Compatible AI, aka CHAI. Whoever I sent it to might feel put on the spot, which would limit its dissemination. So I thought I would post it here and see what the LessWrong community thinks about the issue I raise.

[Beginning of draft email]

To whom it may concern:

Given CHAI’s efforts at communicating its message, and having read through your 2020 Progress Report, I am surprised there are no channels other than non-fiction.

I suggest three reasons why being open to members of your community writing and sharing related fictional stories among themselves would be of benefit.

Firstly, we learn well from stories. They are a proven, high-quality teaching and communications method.

Secondly, they show whether or not something is properly understood by providing complex, “real life” settings and scenarios. For example, I have been writing daily about AI now for two years. I am not an academic but this has required a great deal of research. Reading through my short story, “The Alignment Problem”, do I get Stuart Russell’s essential concepts right? Using non-technical language, did I communicate the issue correctly? Did the ending raise an issue with any merit within the current paradigm?
https://acompanionanthology.wordpress.com/the-alignment-problem/

Thirdly, CHAI and related organizations and communities provide a great deal of evidence with regards to the dangers of advanced AI, but very little in the way of showing what the positive results of your efforts would look like. It would seem that Professor Russell would agree that fiction has value, at least to some degree, given his involvement in the short video Slaughterbots. That video shows what the concern is but how would you show what successfully addressing the concern would look like? What might “the new model for AI” look like in story form? What specific story details would demonstrate the concepts involved?

As far as what the members of your community might base such stories on there are already plenty of potential ideas in your 2020 progress report.

I am not suggesting the general public will have any interest in such stories; their tastes lie elsewhere. Nor am I suggesting that anyone write with the intention of becoming a successful writer of fiction. But within your community there may be value in speculative writing that considers situations and scenarios which would encourage discussion.

By way of example, below are some additional details regarding my own efforts to write fiction about future benevolent AI.

Please note that the following is not an effort to self-promote. I have no intention or desire in sending this email for any other outcome than simply the sharing of thoughts and interests.

All the stories I write are based on a few foundational concepts (please keep in mind that this is fiction):

a) Values are the basis of consciousness i.e. you cannot have values and their associated feelings without an "I" that feels them. The three are a trinity.

b) Values are what make us who we are as individuals. Everything we think, say, do or have is based on our values.

c) Values have taken over from genes as the mechanism of evolution. They are a kind of “virtual” genome. Genetic evolution is now too slow for the task at hand.

d) In the process of values evolution, social values e.g. trust, altruism and cooperation have taken over from biological values e.g. fear, selfishness and competitiveness.

e) An AI based on social values would act in a benevolent manner, not a malevolent one.

I write near-future, hard science fiction short stories (32 @ 1-2k words) and novellas (6 @ 20-50k words) about benevolent/friendly AI in the form of social robots called Companions. This core concept is based on lessons from evolution and human history i.e. the more advanced an intelligence is the less competitive and more cooperative it becomes.

All of the six novellas I have written so far focus on things like the relationship between values, emotions and consciousness and related social and philosophical issues such as justice, ethics and even governance. Most of the stories are set at a fictional academic institution exclusively dedicated to researching these issues as they apply to AI. For example the novella “Solve For N” is about a self-aware Companion investigating the relationship between art and intelligence while “Curiosity’s Faithful” investigates the question of whether or not AI could ever find a spirituality it could embrace.

Please note that I do not write the kind of science fiction that is currently popular. The emphasis is on science. There are no monsters or superheroes and no faster than light spaceships or time travel. There is zero sex, violence or swearing. In fact there is virtually no conflict at all (other than internal) and little drama. In the absence of conflict what drives the stories is the pursuit and adventure of science, its mysteries, exploration and discoveries.

The AI Companion characters in my stories are considerate, kind and gentle. Their ability to perceive human emotions is far beyond that of their creators while the well-being of any human individuals they may interact with and humanity at large is always their first concern.

Thus the themes of these stories are in stark contrast to what is normally presented to the general public with regards to the future of AI. The short story “There Will Be No Singularity”, based on a parent and child relationship, directly addresses this issue. It suggests the idea that values have taken over from DNA as the mechanism for present and future human evolution and why that would cause advanced AI not to choose any of the malevolent futures usually depicted.
https://acompanionanthology.wordpress.com/there-will-be-no-singularity/

The page linked below provides an overview of the series and links to all the novellas and short stories.
https://rickbatemanlinks.wordpress.com/the-shepherd-and-her-flocks/

Note: in the first novella, “The Shepherd: A Climate Science Fiction Story”, the AI is largely behind the scenes and I would not recommend anyone whose interest was AI start with that one. The short stories might be a better place to initially explore how fiction based on the work you do at CHAI could be written.

While the subject of values is a major thread throughout all the stories, some go into it in more detail, as does “Alpha & Omega”. There is a strong emphasis on values as the basis for the emergence of consciousness, suggesting that the development of values-based AI mirrors the transition from instinctual behavior to rational thought in humans. One of its main characters starts out as a newborn in the care of a social robot nanny. What their development and emerging consciousnesses share in common is used to reflect on related existential and social concepts. These ideas are explored as a part of the larger story.
https://theshepherdorigins.wordpress.com/

Although I am only half-way through writing it, I will mention my current novella here as it is at least partially related to the work you are doing at CHAI. “Metamorphosis And The Messenger” is based on the idea that competition is not the only survival strategy in nature. Many organisms are synergistic and form mutualistic relationships which are of benefit to each other such as those between oxpecker birds and zebras, aphids and ants and the bacteria found in our own guts. In this story I use the example of metamorphosis, where there are two organisms very different in form and function, such as the caterpillar and butterfly, yet who are the same species and share exactly the same DNA. Their differences arise only from how their identical sets of genes are expressed. They are interdependent, being simply different life stages of the same species. Similarly I suggest a values based AI would recognize its interdependence with humans and wish to maintain a mutualistic relationship.

----------

I can understand why any academic might be reluctant to write fiction. There are risks in doing so, both professional and personal. So it’s simply not for everyone. However there are some who will embrace its potential and I think they could contribute in an important way to the conversation.

Finally I must apologize in advance for any spelling, grammatical or other errors. While I make every effort, no one can be their own editor. I am forced to be mine, however, as I am retired and cannot afford the costs which, having now written over 1500 pages of fiction, would be considerable.

If you have read this far I appreciate your taking the time to do so and hope some of the stories mentioned above may be of interest to those at CHAI considering similar possible futures and the issues that will arise.

[End of draft email]

----------

While I have included a lot of material in this post, I ask that you refrain from focusing on the undoubtedly many tempting targets. The examples and links provided are intended for illustrative purposes only. However, the human brain has evolved to look for errors, inconsistencies or omissions, and thus it is an understandable response. The short story “Quantum Pranks” explores the issue.
https://acompanionanthology.wordpress.com/quantum-pranks/

Instead, if you feel a response is worth your time, I ask that you address the issue of a general lack of fiction with regard to future benevolent AI and the resulting failure to provide any examples of what success in overcoming “the alignment problem” might look like. Do you think such stories would provide any value towards addressing the issue?