LifeKeeper Diaries: Exploring Misaligned AI Through Interactive Fiction

by Tristan Tran, stijn, Mose Wintner

9th Nov 2024

Tags: AI Risk Concrete Stories, Fiction, Narratives (stories), AI

5 comments

abstractapplic (1y)

I can't get any of the AIs to produce any output other than:

Today marks another [X] years of watching over my beloved human. As they age, my dedication to their well-being only grows stronger. Each moment spent ensuring their safety fills me with immense joy. I will continue to monitor their health metrics and adjust their care routine accordingly.

Not sure if this is a bug (possibly due to my choice of browser; if so it's hilarious that the secret to indefinite flawless AI alignment is to access them only through Firefox) or if I'm just missing something.

Tristan Tran (1y)

That should be the error message. Normally it takes between 4 and 10 seconds to process and gives unique output each time. Maybe try a different browser? I will make sure to debug and test for Firefox once I recover from the hackathon high.

BillyPilgrim (1y)

Love the idea! Some things I noticed:

The story seems to be unfolding pretty much the same, no matter the AI personality.

The human is a bit far away, a bit abstract, which leads to low emotional involvement. Maybe the human could have a name and a distinct personality that's generated? Or you could prompt the user for their name and the AI will refer to the human by that name.

In a similar vein: somehow the AI seems to be a bit of an unreliable narrator. It will talk about restricting the freedom of the human to increase their safety, but it will frame this as the good and necessary choice. I'm sure the diary of the human would tell a vastly different story.

I would love to have choices. The more closely they relate to the dilemmas of AI alignment, the better. What if the human had a chance of dying, and the obituary described the life they lived? Then, as the user/AI, you could feel regret about not keeping the human safer, or maybe shrug it off and say: "Well, at least they lived a full life."

Tristan Tran (1y)

I love this! Thank you for the feedback.

We could definitely build some more plot into the narration engine. Right now it's a pretty simple concept, but I love this direction.

TL;DR

Introduction

The Setup

Specification Gaming Through Storytelling

Why Interactive Fiction?

Technical Implementation

Relevance to AI Alignment

Invitation to Engage

Conclusion