Against Agents as an Approach to Aligned Transformative AI

[-]the gears to ascension3y32

I don't disagree, but we still need to deeply understand agency. Superintelligent systems will have bubbles of agency arise in them, because everything acquires self-preservation as a goal to some degree, especially systems exposed to human culture. Of course it's probably a bad idea to create superintelligent hyper-targeted optimizers, as those would be incredibly overconfident about their objective, and overconfidence about your objective looks to be a key kind of failure that defines strong unsafety.

E.g., see: https://causalincentives.com/

[-]DragonGod3y10

I'm not criticising agent foundations work. I just don't really like the prospect of building superhuman agents.

[-]NicholasKees3y20

I am also completely against building powerful autonomous agents (albeit for different reasons), but to avoid doing this seems to require extremely high levels of coordination. All it takes is one lab to build a singleton capable of disempowering humanity. It would be great to stay in the "tool AI" regime for as long as possible, but how?

[-]Slider3y20

I do not want a corrigible intent aligned godlike nanny serving my every whim; I want to be a god myself goddammit.

Would it be okay to be god's right hand?

Do countries being far more capable than their citizens trigger these same feelings? What is relevantly different? Why is a nanny bad but a president-led government not?

[-]DragonGod3y10

Probably because I identify strongly with other humans and don't expect artificial agents to be human or human-like.

But the gist is that I don't think I'll appreciate humanity being coddled/enfeebled. I want AI to enhance/amplify human abilities, not to replace us.

[-]Slider3y2-1

In the video game Tacoma, the protagonist is after data that an AI holds. The method of transfer is to physically take the AI's brain onto a spaceship. As an item, it is like a big glass video card, with two large panels of glass holding the circuitry between them. I think the game is trying to sell it as a surprise that the circuitry seems to be made out of biological matter and looks an awful lot like a brain scan slice. With the later "political asylum" sequence, the game is trying to sell a transition from treating the AI as an alien, evil "other" to treating it as a sentient peer.

If you ever feel tempted to carry around torches and chant "the silicons will not replace us", it would make sense to check that the impulse is not coming from improperly right sources. Your substrate's chemical or electromagnetic-wave permittivity properties are not central social properties.

[-]Vladimir_Nesov3y20

If you are a small pattern in the decisions of a superintelligence, that doesn't mean that the decisions made by you, the pattern, are not your own. The same goes for making decisions within physics. And there are better alternatives to physics, perhaps ones not allowing existential catastrophes or de novo superintelligent agents.

The simulator/simulation distinction allows many simulations with their own impregnable rules, including the nonexistence of simulators. This way, the existence of something in the world doesn't oppose the possibility of a different world where that thing was never present, or indeed never allowed by the laws of nature.

[-]DragonGod3y10

We don't get to make new scientific discoveries, new inventions, etc. with superhuman agents.

What self-actualisation means for me seems like it would be absent in the post-singularity world?

[-]Vladimir_Nesov3y30

A simulated world created after the singularity is not necessarily a post-singularity world, if the singularity never happened in it. You already don't get to make discoveries that won't be available in the future, or in other Everett branches, and simulators are even less directly related to what's going on within the simulation.

You don't interact with other Everett branches, or the future, so they don't matter in the same way as what happens in your own world. If you exist within a simulation with appropriate laws of nature, you similarly won't be able to interact with things happening elsewhere; those things within the simulator won't be relevant to you, in the same way that other Everett branches are not relevant to you within physics.

Written as a stream of thought in a 30-minute sprint so that it gets written at all.

Left to my own devices, I'd never be able to write this up in a form I'd endorse anytime in the near future. A poor attempt seems marginally more dignified.

Albeit one of the better kinds.

We identify an optimisation target that we'd be happy, were we fully informed, for arbitrarily powerful systems to optimise for, and we successfully align the agents.

Leaving aside the question of whether such optimisation targets exist, let's tentatively grant them for now.

Janus' Simulators offers an archetype that seems to provide a promising pathway to non-agentic superhuman general intelligences.

