Cooperators are more powerful than agents

[-]the gears to ascension3y172

while this appears to me to be true, a species of cooperator-behavior mechanisms A that same-mechanism-cooperate A-with-A in order to defect against not-same-mechanism behaviors B can be catastrophically dangerous to the other species B, even if the other species B are themselves self-cooperators. if one subnetwork of self-cooperators does not participate in the eventual broad interspecies cooperation group, that subnetwork can potentially produce non-cooperative outcomes on the outer scale. it seems to me that at a given scale, agency that is local on that scale does tend to try non-cooperative routes first, and that this initially limits its success; see, for example, how in evo game theory simulations, semi-cooperators like tit-for-tat-with-forgiveness and such usually win eventually and produce a cooperative society (depending on the setup), but defector-against-outer-network subnetworks can have a significant foothold for a long time.
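A minimal sketch of the kind of evo game theory simulation referred to above, under loudly stated assumptions: the payoff values (5/3/1/0), the 200-round match length, the three strategies, and the discrete replicator update are all illustrative choices, not anything specified in the comment. Tit-for-two-tats stands in as a deterministic "forgiving" variant of tit-for-tat. The function names are hypothetical.

```python
# Row player's payoff in a one-shot prisoner's dilemma (illustrative values).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def always_defect(own, other):
    return "D"

def tit_for_tat(own, other):
    # Cooperate first, then copy the opponent's previous move.
    return other[-1] if other else "C"

def tit_for_two_tats(own, other):
    # Forgiving variant: retaliate only after two consecutive defections.
    return "D" if other[-2:] == ["D", "D"] else "C"

STRATEGIES = [always_defect, tit_for_tat, tit_for_two_tats]

def match_score(a, b, rounds=200):
    """Total payoff strategy `a` earns against strategy `b`."""
    hist_a, hist_b, score = [], [], 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        score += PAYOFF[(move_a, move_b)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score

def evolve(shares, generations=50):
    """Discrete replicator dynamics: a strategy's share grows with its relative fitness."""
    table = [[match_score(a, b) for b in STRATEGIES] for a in STRATEGIES]
    for _ in range(generations):
        fitness = [sum(x * s for x, s in zip(shares, row)) for row in table]
        mean = sum(x * f for x, f in zip(shares, fitness))
        shares = [x * f / mean for x, f in zip(shares, fitness)]
    return shares

# Shares are ordered as [always_defect, tit_for_tat, tit_for_two_tats].
mixed = evolve([0.8, 0.1, 0.1])            # defectors start as a large majority
entrenched = evolve([0.999, 0.0005, 0.0005])  # cooperators start extremely rare
```

In this toy setup the reciprocators take over when they start at 20% of the population, but when they start extremely rare the defectors keep their hold, which is one concrete sense of "depending on the setup".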

because of this, it is my current view that eventually, some unspecified species of durable semicooperators are likely to win nearly the entire universe. however, on the path to get there, species of non-cooperator behaviors may cause significant damage to other cooperators. this is despite the fact that a fully cooperative network appears to me to be eventually near-guaranteed, for efficiency reasons.

so if one network of self-cooperators A values its own defection against other self-cooperators B, where [A, B] could be [one military, humanity], or [humanity, cows and chickens], or [integrated multi-component ASI, humanity], then the defense analysis necessary to produce a game tree that visibly pays out to semicooperators when they cooperate is not necessarily trivial.

That said, because I strongly agree that long term, any agent wants to become a universally pro-tolerance semicooperator and constrain its aesthetic-structure values to apply to only a finite amount of the universe's negentropy, I think we have the potential to teach agentic AIs from the start that bridging non-cooperative circumstances is a worthy and useful goal which results in durability for the agent's aesthetic intentions because of durable coprotection.

(coprotection is a word I've used for a while to refer to agentic cooperation, ie an agent that tries to find ways to produce durability for other agents' values, not just their own. there are probably terms of art that I should be using, but coprotection seems to mean the right thing semantically without further explanation in nontechnical conversation.)

[-]Roman Leventov3y10

Universal cooperation on all system levels means total optimisation in the universe as a neural network, and indeed this can be a "goal" (unattainable, though), but the maximum gradient according to this loss function doesn't necessarily mean removing optimisation frustrations with particular subsystems (humans and the society) first, or at all. Especially if the AI takes panpsychism seriously.

constrain its aesthetic-structure values to apply to only a finite amount of the universe's negentropy

I don't understand this phrase. (Neg)entropy is a numeric property of a physical system (including the whole universe), that is, a number. What does it mean to apply something to a "limited amount" of it?

[-]the gears to ascension3y20

I mean that we can assign a particular block of matter, as priced by the amount of negentropy it contains, to a computation trajectory (eg, a person, or an ai). that is, we would fuel the ai with that amount of unspent energy.

can you clarify what you mean by the comparison to the universe as a neural network? I'm having trouble understanding the paper due to insufficient physics background, but it seems like it's not drawing a very coherent connection. I do think there's a connection to be drawn, but I'm extremely suspicious about whether this is the correct one.

[-]mako yass3y30

I'm not sure we've ever had cooperators that weren't subjects within agents.

In every case I can think of, systems of cooperators (eukaryotic cells, tribes, colonies) arose with boundaries that distinguish them from their conspecifics, predators, or competing groups, with whom they were in fierce enough competition that the systems of cooperators needed to develop some degree of collective agency to survive.

I think the tendency for sub-agents within a system to become highly cooperative is interesting and worth talking about though.

It's not obvious how to... pierce the boundary... and get them to cooperate with things outside of the system.

I think it might be interesting if you tried to characterize "cooperation" more, give it the same level of vividness or tangibility that agency holds for us. Until then, I'm not sure what you think cooperation is. It might just be complicity. A hammer is a very cooperative object. It goes along with everything an agent might want to do with it, for instance, cracking another agent's skull. Abundance of complicity often doesn't lead to conditions of peace. It can be like kindling for wildfires, a source of environmental instability.

[-]Ivan Vendrov3y30

Symbiosis is ubiquitous in the natural world, and is a good example of cooperation across what we normally would consider entity boundaries.

When I say the world selects for "cooperation" I mean it selects for entities that try to engage in positive-sum interactions with other entities, in contrast to entities that try to win zero-sum conflicts (power-seeking).

Agreed with the complicity point - as evo-sim experiments like Axelrod's showed us, selecting for cooperation requires entities that can punish defectors, a condition the world of "hammers" fails to satisfy.
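A rough sketch of the punishment point, assuming the standard illustrative prisoner's dilemma payoffs (5/3/1/0), 200-round matches, and a rare-invader approximation in which a single defector is compared against residents who otherwise only meet each other. None of these numbers come from Axelrod's tournaments, and the function names are hypothetical.

```python
# Row player's payoff in a one-shot prisoner's dilemma (illustrative values).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def always_cooperate(opponent_history):
    # The "hammer": goes along with anything and never punishes.
    return "C"

def always_defect(opponent_history):
    return "D"

def tit_for_tat(opponent_history):
    # Cooperates first, then punishes a defection by copying it.
    return opponent_history[-1] if opponent_history else "C"

def play(a, b, rounds=200):
    """Return the total scores (score_a, score_b) of an iterated match."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = a(hist_b), b(hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def invader_advantage(resident, invader=always_defect, rounds=200):
    """Invader's score against a resident, minus a resident's score against another resident."""
    invader_score, _ = play(invader, resident, rounds)
    resident_score, _ = play(resident, resident, rounds)
    return invader_score - resident_score

print(invader_advantage(always_cooperate))  # 400: defection pays among "hammers"
print(invader_advantage(tit_for_tat))       # -396: defection loses among punishers
```

A positive advantage means the defector out-earns the residents and can spread; under these assumptions that only happens when the residents cannot punish.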

[-]Roman Leventov3y10

Power-seeking conflict might be zero- or negative-sum in terms of its immediate effect, yet the order which is established after the conflict is over (perhaps, temporarily) is not necessarily zero-sum. Dictatorship is not a zero-sum order; it could even be more productive in the short run than democracy.

[-]Roman Leventov3y20

P2B seems related to the planning step in the Active Inference loop.

I mean that power-seeking and cooperation are mutually exclusive, and if the world selects for cooperation more strongly than for agency, the instrumental convergence arguments for power-seeking may not go through.

Power-seeking, cooperation, and agency are all such vague behaviour patterns that I think it makes little sense to talk about what "the world selects for".

I think cooperation should be considered as the construction of a higher-level system (whether this system is an "agent" or not is an unrelated question, if this question is scientifically meaningful at all, which I doubt). For example, cells in the human body cooperate to create a human. Also, using the examples from the post, humans form communities (also companies and societies) and ants form ant colonies in this way, all higher-level systems relative to individual people or ants.

Power-seeking is similar, in fact. Power-seeking can be conceived either as the power-seeking agent pretending to be a higher-level agent itself, or as that agent forming a higher-level system together with some other systems which it dominates. So it leads to the creation of a higher-level system with a different communication/control/governance structure than in the case of cooperation.

Then, which type of system (grassroots-cooperative or centrally controlled) is "[morally] better in the long-term" or "outcompetes" or "emerges from the current AI development trajectory, coupled with the economic, cultural, and political trajectories of our civilisation" is a totally separate question, or, rather, multiple different questions with possibly different answers. The answers depend on the features of our world, available for inquiry today, and the emergent properties of these systems: agility/adaptivity, raw information processing power, etc.

But at least based on simple trend extrapolation and the biological evidence, we should bet that the future belongs to entities that feature unusually high levels of cooperation, not unusually high levels of power-seeking.

From what I wrote above, I would say this bet doesn't make much sense, or at least is not properly sharpened. You should focus on the properties of the emergent systems.

Active Inference tells us that instrumental convergence is not about power per se; it's about predictability, of both oneself and one's environment. Power is just one of the good precursors of predictability, but not the only one: balanced systems with many feedback loops (see John Doyle's work on "diversity-enabled sweet spots", e.g. https://ieeexplore.ieee.org/abstract/document/9867859) can also be expected to be predictable, including to themselves.

Thinking about the trajectories which lead to the selection for a cooperative system, I think we should revisit Drexler's comprehensive AI services.
