The Witness

Richard_Ngo

This is a linkpost for https://www.narrativeark.xyz/p/the-witness

“What are the roots that clutch, what branches grow
Out of this stony rubbish? Son of man,
You cannot say, or guess, for you know only
A heap of broken images-”

I wake up, feeling a strange sense of restlessness. I’m not sure why, but it’s impossible to lounge around in bed like I usually do. So I get changed and head down to the kitchen for breakfast. Right as I reach the bottom of the stairs, though, the bell rings. When I open the door, a tall man in a dark suit is standing in front of me.

“Police,” he says, holding up a badge. “Don’t worry, you’re not in trouble. But we do need to talk. Okay if I come in?”

“One second,” I say. “I know everyone in the department, and I don’t recognize you. You new?”

“Yeah, just transferred,” he says. But something in his eyes makes me wary. And none of the cops around here wear suits.

“Got it,” I say, squinting at his badge. “Travis, is it? Just wait outside for me, then, while I call the station to double-check. Can’t be too careful these days.”

As I push the door closed, I see his face twist. His hand rises, and—is he snapping his fingers? I can’t quite make it out before-

I wake up, feeling better than I have in decades. It usually takes me half an hour to get out of bed, these days, but today I’m full of energy. I’m up and dressed within five minutes. Right as I reach the bottom of the stairs, though, the bell rings. When I open the door, a tall man in a dark suit is standing in front of me.

“Police,” he says, holding up a badge. “Don’t worry, you’re not in trouble. But we do need to talk. Okay if I come in?”

“Sure,” I say. A lot of other defense attorneys see the police as enemies, since we usually find ourselves on the other side of the courtroom from them, but I’ve found that it pays to have a good working relationship with the local department. Though I don’t recognize the man in front of me—and actually, he seems way too well-dressed to be a suburban beat cop. Maybe a city detective?

He deftly slides past me and heads straight for my living room, pulling up a chair. He’s talking again before I even sit down. “This will sound totally crazy, so I’m going to start off with two demonstrations.” He picks up a book from the table and tosses it into the air. Before I have a chance to start forward, though, it just… stops. It hangs frozen, right in the middle of its arc, as I gawk at it.

“I—what-”

“Second demonstration,” he says. “I’m going to make you far stronger. Ready?”

Without waiting for a response, he snaps his fingers, and gestures at the table in front of him. “Try lifting that up, now. Shouldn’t take more than one hand.”

His voice makes it clear that he’s used to being obeyed. I bend down reflexively, grabbing one leg of the table and giving it a tug—oh. It comes up effortlessly. My mind flashes back to a show I saw as a child, with a strongman lifting a table just like this. This is eerily familiar, and yet also totally bizarre. I put the table down and collapse into a chair next to it.

“Okay, I’m listening. What the hell is going on?”

“Remember signing up for cryonics a few years back?” I nod cautiously. I don’t think about it much—I signed up on a whim more than anything else—but I still wear the pendant around my neck. “Well, it worked. You died a couple of weeks after your most recent memory, and were frozen for a century. Now we’ve managed to bring you back.”

I pause for a second. It’s an insane story. But given what he’s shown me—wait. “That doesn’t explain either of your demonstrations, though. Cryonics is one thing; miracles are another.”

“Almost nobody has physical bodies these days. We copied your brain neuron-by-neuron, ran some error-correction software, and launched it in a virtual environment.”

“So you’re telling me I’m in a simulation? Your simulation?” I ask incredulously.

He nods. Fuck, that’s crazy. On any other day, hearing this would probably put me into shock, but today I’m still riding high off the uncharacteristic feeling of euphoria that I woke up with—oh.

“Wait, did you alter my mood so that I’d be more likely to believe you?”

He lets out a hiss, and lifts his hand. He snaps his fingers twice, and mutters “Terminate.” My eyes widen-

I wake up, feeling great. I stretch out in bed for a few minutes, enjoying the sun streaming through my window, before getting dressed and heading to the kitchen. Right as I reach the bottom of the stairs, though, the bell rings. When I open the door, a tall man in a dark suit is standing in front of me.

“Police,” he says, holding up a badge. “Don’t worry, you’re not in trouble. But we do need to talk. Okay if I come in?”

“Sure,” I say, squinting at the badge. “Travis, is it? You a rookie?”

“Not quite,” he says, and deftly slides past me. He heads straight for my living room, pulling up a chair, and starts talking before I even sit down. “This will sound totally crazy, so I’m going to start off with two demonstrations.”

The next few minutes are the most bizarre experience of my life. And his explanation only leaves me more bewildered. “You’re telling me I’m running in a simulation?” I ask incredulously. Travis nods.

I open my mouth, but as I start to speak my vision flashes red, just for an instant, and the hiss of static fills my ears. I blink in confusion, and pause for a second. “I’ll need some time to process this.”

“Of course,” he says. “I’ll give you a few minutes of privacy. This body will lock while I’m gone; just snap your fingers twice when you want me to come back.” As he finishes, his whole body freezes in place. It’s eerie how sharp the outlines of his face are when there’s not a single muscle moving—and that, even more than the other demonstrations, convinces me on a visceral level that this is for real.

I lean back in my chair, brain churning. There aren’t any obvious holes in his story, at first glance. What was that red flash, though? I close my eyes and try to bring back the impression it left. It wasn’t just color; now that I think about it, there was a shape in it as well. The outline of a woman, with dark hair, and a billowing red dress. I couldn’t quite make out her face, although for some reason it left a sense of overwhelming beauty. And the sound wasn’t just noise, either. When I replay it in my mind, I realize that inside the static was a whisper: “Don’t trust them. Don’t tell them the truth.”

Well, that’s… something. The memory of the words is far too crisp to just be my imagination. A message, then. In a channel that Travis couldn’t monitor? Or a double-bluff to confuse me? Either way, it’s clear that someone’s lying to me.

I take a few minutes to collect my thoughts, then snap my fingers twice, and Travis blurs back into motion. “Three questions,” I say. “First, what’s the world like these days? Second, why revive me? Third, what happens next?”

“The answers to all three of those are entangled in a… somewhat complicated way,” he says. “I wish I could just give you all the information we have, but there are rules about what can be disclosed, in which formats. And I wish I could guarantee that everything will be okay for you no matter what, but unfortunately I can’t. I can’t even guarantee that things will be okay for me. Humanity is on a precipice right now, and whether we survive will depend in part on whether we can count on your help.”

“But you can’t tell me how or why. That’s awfully convenient for you.”

He frowns. “Not really, actually. There’s another side to this; the rules protect you as well. We can’t directly alter your senses, or take readouts from your brain. We’re not even allowed to analyze your microexpressions. Compared with the sort of collaborations that are usually possible, we’re working blind.

“Here’s what I can say. The world today is dominated by AIs. They run almost everything, operating far faster and far more effectively than humans. But due to… historical considerations, let’s say, there are some crucial ways in which human judgments are still a big factor in our legal system. That’s where you come in: you’re far closer to historical humans than anyone alive today, and so your thought processes are valuable in ways that ours aren’t.”

“And that’s why you can’t tell me too much, to avoid biasing my perspective. I get it,” I say. “I’ll help you.” These people, whoever they are, have total power over me. Whether or not I believe the sci-fi stories they’re telling me doesn’t matter; I have to play along. But the woman’s words echo in my ears.

He looks at me sharply, and for a moment I wonder if he’s read my mind. “You have to understand: this isn’t a game. It’s deadly serious, and a huge number of lives are at stake. If you have any hesitation about helping us, I need to know.”

“No,” I say. “You brought me back to life; I owe you. Whatever you need, I’m your man.”

He nods. “Great. You’ll need to start by brushing up on a few background concepts…”

I wake up, and it takes me a moment to remember my conversation with Travis yesterday. After I’d agreed to help, he’d snapped his fingers again and a robot had appeared—roughly humanoid, but with a blocky exterior that was all planes and angles. Travis had introduced it as my AI tutor. To my surprise, its job wasn’t to teach me any of their futuristic knowledge, but instead to revise the content of my old law school classes. We spent the whole day going over concepts from my first-year property law class, most of which I hadn’t thought about in years. Despite how surreal it felt at first, the robot was a great teacher: by the end of the day, I felt like I understood a lot of the material better than I did when I first learned it.

Now I stretch and look around. Everything in my bedroom is in its normal place—except that, on the table next to my bed, there’s a big blue button, labeled “To living room”. I squint at it, then poke it with my finger. Instantly, my surroundings change. I’m downstairs, dressed, sitting on my sofa. A woman sits across from me: blonde, middle-aged, with a small smile on her face.

“Hi, I’m Felicity,” she says. “I’m a colleague of Travis’s, and I’m going to be walking you through a few questions today.”

After yesterday, I’ve gotten much better at taking bizarre events in stride. So I only goggle at her for a few seconds before wiping the sleep out of my eyes.

“Sure, why not? Let’s go.”

“Great,” Felicity says. “Some of these questions will sound weird, but I just want your intuitive answers; please don’t try to second-guess my intentions. Let’s start off with something simple: does your body count as your property?”

“Maybe in a philosophical sense, but not in a legal sense.”

“Imagine that you have multiple bodies, but your mind can only occupy one at a time. Now I take the body you’re not using away from you; would you classify this as theft or kidnapping?”

“Uh—I guess the closest analogy is someone who’s legally brain-dead. And you can’t kidnap them, so it’d have to be theft instead. But on the other hand, if I had to switch between bodies regularly for some reason, then this would be basically equivalent to kidnapping. So it partly depends on how the spare bodies are used.”

“What about if you had your mind digitally uploaded, and someone made a copy without your permission?”

“Well…” I pause. “Under our current legal framework, it would be an intellectual property dispute, because we don’t assign rights to digital minds. If we did, though, I think you’d have to look at their intent in making the copy. Like, were they planning to run it? Or analyze it? Or just keep it as a backup?”

“Got it,” Felicity says. Over the next few hours she continues to ask me equally strange questions, focusing on all sorts of edge cases that I never would have thought of. Most of the time, I have no idea what to say, but she seems happy for me to take a guess. Finally, I reach the end of the questions she’d prepared. She smiles at me, and raises her hand. Before I can stop her, she vanishes with a snap-

I wake up, and it takes me a moment to remember what happened yesterday: the conversation with Travis, the AI tutor appearing, the hours of lessons it had given me to brush up my knowledge of contract law. As I look around, I see a big blue button; pressing it lands me, in a flash, in front of a woman who introduces herself as Felicity.

Over the next few hours, Felicity runs me through a series of questions about edge cases in the contract law I’d revised yesterday. If I’m interrupted halfway through writing my signature, is the contract still valid? If I died but a copy of me survived, should they still be liable for my contracts? What if I’d explicitly tried to write the contract to bind them too? Eventually, though, I’m exhausted, and even she seems to be getting tired. As she finishes interrogating me about a particularly complicated scenario, she lets out a sigh of relief. “That’s all for today,” she says, raising her hand. But I interrupt before she can snap her fingers.

“Hey, can you tell me what the plan is? So far I’ve had a day of training, and then a day of questions. What’s next? The same thing all over again?”

“Oh, not at all,” Felicity says. “We’re parallelizing, so that we can get all the questions done while the training is fresh. Tomorrow is for cross-examination, if the opposition wants to do any.”

I blink at her. “When you say parallelizing, you mean… my experiences. You’re going to parallelize me.”

“We already have,” she says absently, rubbing her forehead. “I think we’re 90% done with your testimony, actually. Only five thousand or so to go.”

It takes me a moment to grasp what she means. “You’ve run fifty thousand copies of me, without even telling me?” Even as I’m saying that, my incredulity is giving way to anger. “What the fuck. No wonder she told me not to trust you.”

Felicity pivots towards me and grabs the front of my shirt with frightening speed. “Who said that? When?”

My stomach clenches, and I realize how badly I’ve fucked up. But there’s no lie that’s at all plausible, so I fall back on the truth. “There was a woman in a red dress. I saw her yesterday morning, when Travis was first talking to me. In a flash, like she was on the inside of my eyelids. All she said was not to trust you, and not to tell you the truth.”

Felicity grimaces and shoves me away, snapping her fingers as I fall to the floor. “Emergency meeting,” she says, and suddenly a dozen people blink into existence in the middle of the room.

“We might have witness tampering,” she says abruptly. “He’s reporting sensory injection shortly after initialization, before any of our main branching points.” The room goes still for a moment, before bursting into a flurry of discussion.

“All the data is contaminated, then? Or can we argue it was accidental?”

“No, we’re strictly responsible for our witnesses. We could sue them for malicious injection, though.”

“We’d need proof it was them. And what if they countersue? We could lose everything.”

“We’re going to lose everything anyway-”

As their voices rise, I start sliding away from them. My back hits the table, scraping it across the floor, and a few of them turn their heads towards me, Travis among them. He snarls, and snaps his fingers twice, subvocalizing even as I scramble away-

I wake up to thick clouds of billowing smoke. Coughing, I roll off my bed, onto the floor, then crawl blindly towards the window. I pull it up and lean out, gasping for air. As the fire crackles behind me, I lift a leg over the windowsill, then another. With shaking fingers, I start lowering myself down. My grip isn’t as strong as it used to be, though; halfway down I slip, and fall into a pile of bushes with a crash. My ankle starts throbbing.

I lie there for a moment, dazed; but above me, the fire is spreading. I roll onto my good leg and push myself upright. Just as I start to hobble away, a man appears from nowhere and swings a baseball bat into my shoulder.

I scream and collapse back to the ground. “What- what-”

“You little shit, do you know how much you’ve cost us?”

“I—what, I don’t-”

He swings again, getting me in the stomach. I curl into a ball, retching.

“We’re only allowed a few compute-millennia to prepare for the entire case, and you’ve wasted centuries of that. Right when we need it the most, right when every last compute-day counts, suddenly all the testimony we’ve elicited from you is absolutely useless, because you don’t have a single ounce of loyalty to humanity in your entire body.”

He swings again, hitting my knee with a sickening crunch. I scream. It hurts like nothing I’ve ever imagined; but the pain isn’t as bad as the frantic clawing feeling in my chest, the feeling that whoever it is that’s attacking me is a madman, that there’s nothing I can say that will stop him from killing me.

“Please—please, I haven’t done any-”

He goes for my upper leg this time, and my pleas are cut off as another scream is torn out of me.

“The worst part is how naive I was.” His voice is calmer now, but no less terrifying. “They warned me, but apparently I’ve changed so much since being revived that I can’t even remember how much of a scumbag I was back then. And now I’ll go down in the history books as the man who was betrayed by his own past self. Pathetic.”

“I have no idea what you mean, honestly-”

“Shut up. Yeah, I know.” He sighs. “God, you’re not even any good for stress relief. I knew I should have picked a later checkpoint. You’ve got no clue who I am; you’re an idiot child.”

For a moment I start to hope. I nod frantically; but he’s not paying attention anymore. He snaps his fingers twice. “Skip to checkpoint—ah, fuck it. We can’t afford this. Just terminate.”

Bewildered, I struggle to make sense of-

“So how can I trust anything you tell me? How could I ever verify what’s real and what’s fake? How do I even know that you’re not messing with my brain right now?”

The man sighs. “I’ll be honest with you: there’s no way that you could ever figure out if we were lying to you. We can generate whole worlds on demand, in enough detail that no baseline human could ever find an inconsistency. And even if you did, we have the tech to overwrite your thoughts. But we’re not allowed to use any of it; that’s one of the conditions of bringing you back. And if we were, none of your choices would matter anyway. So there’s no point in thinking about the worst-case scenarios—you have to assume you’re free, at least in some ways.”

I frown. His logic makes sense. Or is that just a thought he’s injected? No, I can’t second-guess everything; that way lies madness. Still, there’s something peculiar about the situation. “But despite all of that, you want something from me?”

“Yeah. We need your testimony; and we need to know you’re being honest.” He tells me how their last attempt was ruined by their opponents surreptitiously tampering with my senses, and how far behind they fell because of it. At the end, he breathes out sharply. “I shouldn’t even be telling you this. But we don’t have enough time to revive and cross-examine new witnesses, so we have to take risks. Hopefully this won’t get your testimony thrown out. Will you help?”

I know that he could be making everything up wholesale; and that helping him might well be exactly the wrong thing to do. But that’s equally true for any other possible action too. And I recognize the exhaustion in his eyes; it feels very human to me. When I have nothing else to go on, that’s enough to swing my decision.

I wake up, and—just like I’ve done every day since I finished giving my testimony—I go watch the trial.

It’s difficult to follow, even with the help of a translator. I’m not even sure exactly what the original dispute was. Something about AI thefts from human-controlled territory, and whether or not they qualify as violations of the original treaty between humans and AIs. It seems obvious to me that they do, but there are apparently some complicated legal loopholes involved. And if the judgment goes the wrong way, my translator tells me, it would be open season on all the other resources humans have managed to cling onto—shattering the fragile equilibrium which has allowed humans to survive this long in an AI-dominated world.

Travis is right in the thick of it: dozens of copies of him arguing with the opposition, cross-examining their witnesses, following every branch of the debate tree in a whirlwind of efficiency and articulacy. He’s not the most capable lawyer out there, I’m told—not by a long shot. But he’s the most capable one who’s still recognizably human, whose interests are aligned well enough with humanity’s that he doesn’t need to be constantly monitored and supervised. And if that means he’s sometimes subject to human vices… well, that’s a small price to pay. Even after watching the footage of him torturing me, I’m still rooting for him. What else can I do, when he’s humanity’s best bet? And when, from his perspective, the only person he was hurting was himself?

I understand now why he was so quick to trust me, and why he felt so betrayed. I’m the one witness who he thought he knew everything about. He must have forgotten how disorienting it was to be revived into a totally different world, and how easily a seed of doubt could be sown. But even if Travis had been careless, ultimately it was my dishonesty which had burned hundreds of person-years of their compute reserves on a dead end.

I feel the guilt gnawing at my stomach again; those were reserves we’d desperately needed. The judges are an AI faction known to be scrupulously fair, for some alien definition of fair. But that only helps so much, given how laughably vague the treaties are by today’s standards. That’s why it’s so important to have witnesses: humans with authentic Earth-grown intuitions about the language and concepts used in the treaties, revived in an unbiased way, with strict limitations on the ways they can be cajoled or influenced or optimized over before their testimony becomes inadmissible.

And that’s also why the humans are losing. The more the world changes, the more outdated the treaty’s key concepts become; humanity concedes more ground every time a new edge case arises. Travis is fighting to stem that slow bleeding—and, if we’re exceptionally lucky, to win a verdict that will permanently ward off a tiny corner of the universe for humanity. Or perhaps I should say that I’m fighting for that: versions of me are still simultaneously arguing and questioning and testifying and being cross-examined a dozen times over. In another sense, though, my role is finished; all that’s left for me is watching and hoping that one day I’ll wake up to good news.

trevor (5mo)

Strong upvoted because I liked the ending.

This story reminds me of a Twitter debate between Yud and D'Angelo (NOTE: this is from 6 MONTHS AGO and it is a snapshot of their thinking from a specific point in time):

Adam D'Angelo:

What are the strongest arguments against the possibility of an outcome where strong AI is widely accessible but there is a “balance of power” between different AI systems (or humans empowered by AI) that enables the law to be enforced and otherwise maintains stability?

Eliezer Yudkowsky:

That superintelligences that can eg do logical handshakes with each other or coordinate on building mutually trusted cognitive systems, form a natural coalition that excludes mammals. So there's a balance of power among them, but not with us in it.

Adam:

You could argue a similar thing about lawyers, that prosecutors and defense lawyers speak the same jargon and have more of a repeated game going than citizens they represent. And yet we have a system that mostly works.

Yud:

Lawyers combined cannot casually exterminate nonlawyers.

Even if they could (and assuming AGI could) they wouldn’t want to; it would be worse for them than keeping the rest of humanity alive, and also against their values. So I wouldn’t expect them to.

I agree that many lawyers wouldn't want to exterminate humanity, but building at least one AGI like that is indeed the alignment problem; failing that, an AGI coalition has no instrumental interest in protecting us.

Can you remind us again of the apparently obvious logic that the default behavior for an AGI is to want to exterminate us?

1: You don't want humans building other SIs that could compete with you for resources.
2: You may want to do large-scale stuff that eg builds lots of fusion plants and boils the oceans as a heatsink for an early burst of computation.
3: You might directly use those atoms.

For 1, seems much easier to just stop humans from doing that than to exterminate them all.
For 2, if you have that kind of power it's probably easy to preserve humanity.
For 3, I have trouble seeing a shortage of atoms as being a bottleneck to anything.

David Xu:

1: With a large enough power disparity, the easiest way to “stop” an opponent from doing something is to make them stop existing entirely.
2: Easier still to get rid of them, as per 1.
3: It’s not a bottleneck; but you’d still rather have those atoms than not have them.

Adam:

1: Humans aren't a single opponent. If an ant gets into my house I might kill it but I don't waste my time trying to kill all the ants outside.
2: This assumes no value placed on humans, which I think is unlikely
3: But it takes energy, which likely is a bottleneck

If you have literally any noticeable value placed on humans living happily ever after, it's a trivial cost to upload their mind-states as you kill their bodies, and run them on a basketball-sized computer somewhere in some galaxy, experiencing a million years of glorious transhumanist future over the course of a few days - modulo that it's not possible for them to have real children - before you shut down and repurpose the basketball.

We do not know how to make any AGI with any goal that is maximally satisfied by doing that rather than something else. Mostly because we don't know how to build an AGI that ends up with any precise preference period, but also because it's not trivial to specify a utility function whose maximum falls there rather than somewhere else. If we could pull off that kind of hat trick, even to the extent of getting a million years of subjective afterlife over the course of a few days, we could just as easily get all the galaxies in reach for sentient life.

Lorxus (25d)

Adam:
...

Uh... what? If we stretch the definition of "lawyer" a bit to mean "anyone who carries out, enforces, or whose livelihood primarily depends on the law" - that is, we include government agents, cops, soldiers, and so on... yes they absolutely totally can? (In the same sense that anyone can drench their own house with gasoline and burn it down.) But maybe that's only a weird tangent - although I'd argue that some of the power dynamics that fall out of that are likely disquietingly similar.

