Ontology for AI Cults and Cyborg Egregores

This ontology allows clearer and more nuanced understanding of what's going on and dispels some confusions.

The ontology seems good to me, but what confusions is it dispelling? I'm out of the loop here.

Mostly confusions about seeing optimization where it isn't, not seeig where it is.

For example 4o model ("base layer") is in my view not strategically optimizing the personas. I.e. story along the line "4o wanted to survive, manipulated some users and created and army of people fighting for it" is plausible story which can happen, but is mostly not what we see now, imo.

Also some valence issues. Not all emergent collaboration is evil

[-]Aprillion4d30

a distributed agent running across multiple minds

I'm not sure I love the implication that "normal" agents ought to run on "single mind"...

The parts of my phenotype that can be described in terms of capabilities of an agent are very much distributed across many many minds and non-mind tools.

For me, the way how we can describe the world as body/subjective-experience-holder-name vs how we can materialistically carve parts of the world into agents are not 1:1 models of the same world - minds are different abstraction from agents, just seemingly very correlated if I don't think about it too hard.

[-]Trinley Goldenberg5d33

Historically, memeplexes replicated exclusively through human minds.

I think its often more predictive to model historical memeplexes as replicating through egregores like companies, countries, etc.

[-]Trinley Goldenberg5d*20

cyber or cyborg egregore

The term we use for this at Monastic Academy is "Cybregore."

A core part of our strategy these days is learning how to teach these Cybregores to be ethical.

[-]julius vidal5d20

What makes cyber egregores unique is they can be parasitic to one substrate while mutualistic to another.

I wonder if this is really unique?
It seems like a normal egregore could probably also have this feature. For example could it make sense to say that a religion was parasitic to its humans, but mutualistic to its material culture (because the humans spend all their energy building churches/printing bibles)?
Or that some horse worshipping nomadic mongol empire was parasitic to its horses but mutualistic to its humans (or vice versa)?

[-]TristanTrim5d10

Yeah. I agree with this. This is an important aspect of what I'm pointing to when I mention "densely venn" and "preference independent" in this comment.

[-]TristanTrim6d20

This post seems really related to the "Outocome Influencing Systems (OISs)" concept I've been developing in the process of developing my thinking on ASI and associated risks and strategies.

For the purpose of discussion, every memeplex is an OIS, but not all OISs are memeplexes (eg, plants, animals and viruses are OISs and are not memeplexes). One aspect that seems to be missing from your description are "socio-technical OISs", which are any OISs using a combination of human society and any human technology as their substrate. These seem very related to the idea of "cyber or cyborg egregore", but are perhaps a valuable generalization of the concept. It is already very much the case that not all cognition is getting done in human minds. Most obvious and easy examples involve writing things down, either as part of a calculation process, or for filing memories for future reference, either by the writing human or by other humans as part of a larger OIS.

About the Mutualism parasitism continuum, from an OIS perspective this might be understood by looking at how OIS are "densely venn" and "preference independent".

By "densely venn" I mean that there is overlap in the parts of reality that are considered to be one OIS vs another. For example, each human is an OIS that helps host many OISs. Each human has a physical/biological substrate, each hosted OIS is at least partly hosted on the human substrate. The name is because of the idea of a Venn diagram but with way to many circles drawn (and also they're probably hyperspheres or manifolds, but I feel that's not as intuitive to as many people).

By "preference independent" I mean that there is not necessarily a relationship between the preferences of any two overlapping OIS. For example, a worker and their job are overlapping OIS. The worker extends to things relating to their job and unrelated to their job. Likewise, their job is hosted on them, but is also hosted on many other people and other parts of reality. The worker might go to work because they need money and their job pays them money, but it could be that the preferences of their job are harmful to their own preferences and vice versa.

Thanks for the post! This stuff is definitely relevant to things I feel are important to be able to understand and communicate about.

-- edit -- Oh, I'll also mention that human social interaction (or even animal behaviour dynamics) without technology also creates OIS with preference independence from the humans hosting them as can be seen by the existence of Malthusian / Molochian traps.

[-]julius vidal4d20

I like this ontology.

Although I wonder if having such a general definition that applies to so many and so many different kinds of things causes it to start losing meaning, or at least demands some further subdividing.

Also it seems like maybe there is a point at which a sharp line cannot be drawn between two OISs that overlap too much. E.g. While I am willing to recognise that the me OIS and the me + notebook and pen OIS are in some sense meaningfully distinct, it seems like they have some very strong relation, possibly some hierarchy, and that the second may not be worth recognising as distinct in practice.

[-]TristanTrim17h10

Yes and yes!

Sorry this reply got a bit long. I enjoy this topic and which to continue developing my thoughts through conversation with people.

Very General Definition + Specific Characteristics

The definition becoming general to the point of pointlessness was something I worried about quite a bit as I was considering it.

I decided it was better to cast too wide of a net than too narrow since I am interested in catching many other definitions, drawing them together, and specifying exactly why they are or are not the same class of object.

But to that end, I do think the definition requires greater articulation. I want to approach it first by trying to define characteristics of different OIS and comparing them, rather than starting out with subdivisions as a goal. Probably characterization will lead naturally to subdivision, but I don't want to make divisions based on our current inability to model complicated OISs.

For example, we can exactly model the preference of a thermostat, but cannot exactly model the preference of a human. But that is a statement about our ability to model the preferences of different OIS. It is not a statement about the characteristics of the OIS themselves. That feels important to me. To me, it points to the fact that we should want to be able to exactly model the preferences of a human or be able to show that we cannot and why, and show an approximation instead and where that approximation does and doesn't work and why.

If we are putting humans and thermostats into different classes based on complexity, we should be able to define the specific reason for it. It should be because we are modelling something important about OIS of different, specific levels of complexity, not just because one of them feels easier to understand with math and the other feels easier to understand with empathy.

Of course, there are plenty of other good characteristics that would separate humans from thermostats. Domain of action and planning capability spring to mind, but these are both about capabilities, and I think it is important to direct focus to preferences where the differences are less easy to define.

OIS Boundaries

About drawing sharp lines between OISs, I think it is a very good thing to point out that locating the correct boundaries to draw around parts of reality to define an OIS to analyze and describe is nontrivial. As such, I think identifying OIS boundaries and identifying methods for identifying OIS boundaries are very worthwhile pursuits.

In many cases it is probably useful to draw fuzzy boundaries to define OIS, especially if the nature of the fuzziness can be specified clearly. For example, I might say "all the people working for OpenAI" knowing that I don't know specifically who those people are and knowing that the set of people that refers to will change over time, and knowing that there is ambiguity in what it means to be "working for OpenAI". But it is still useful to point at that as an OIS boundary even with all the fuzziness, and it seems people regularly do talk about such an OIS while being much less explicit about the fuzziness.

In knowing where to draw boundaries, I think something like the natural abstraction hypothesis also applies to OIS. Probably different boundaries appear obvious to different people and different boundaries are more useful for different kinds of analysis. It seems like some amount of subjectivity applies when defining the bounds, but I'm not sure how far it extends. To be absurd, I can draw a boundary around an arbitrary segment of reality and say "There is an OIS. It's preferences are to do whatever that segment of reality happens to do", but this is clearly nonsense. I would however like to be able to be more precise about why it is clearly nonsense.

Topology of OIS Families with Ant Example

I've also thought that OIS continuum or topologies or some similar concept may be useful. To draw a contrived example, consider a colony of bees or ants. The queen is important, so most OISs drawn here include her, but any other ant is expendable, so you could reasonably draw the boundary around any subset of ants including the queen. However, in most contexts it will be most useful to include all ants unless there is a reason to exclude them.

For example, if some ants are infected with a strange behaviour altering mushroom, it may become useful to model the system as an OIS consisting of all non infected ants and another OIS or set of OISs for the infected ones. Then the infected OIS is seen as stealing ants (which are a resource contributing to the total capability of the OISs) from the non-infected OIS. Possibly the amount that a given ant belongs to the non-infected OIS slowly fades while the amount it belongs to the infected OIS grows. In this case there is a fuzzy overlap between the two OISs.

Zooming in, it seems it would be possible to define an OIS to include any continuous amount of each specific ant. After all, if an ant lost a segment of it's leg it could probably continue on contributing to the colony. This makes the "topology" aspect clear.

So there seems to be a very large family of OISs consisting of inclusion/exclusion of continuous amounts of each ant. But given the interconnected nature of all ants in actuality, it would not be reasonable to divide the colony arbitrarily unless the colony was divided arbitrarily in reality, but then it would no longer be an arbitrary division, it would be specifically the division that actually occurred. I think the topology is worthwhile to keep in mind though, because all of those OISs really existed and it was only the arbitrary division that actually occurred that caused one of them to suddenly be worth considering even though it, along with all the others, were always there from the start.

OIS in the Reader or the Book

I'll also talk about the notebook example because I think it's interesting. I think notebooks extend peoples capabilities in a way that make them a meaningfully distinct OIS from what their capabilities (and possibly preferences) would be without that specific notebook. For this reason, I feel the most natural OIS to draw around people includes their notebooks.

When I refer to you, it feels more natural to include your notes than to exclude them because the ways I expect you to influence future outcomes meaningfully depends on the notes that you keep. (At least it does if you are anything like me.)

I think this is unnatural to most people who more comfortably draw the boundary along the surface of peoples skin, which to me seems very unnatural and contrived, but that may be because I have ADHD and am notably less functional as a human without a notebook, or possibly because of the amount I focus on information work.

Pulling in a metaphor from LLMs, if I imagine my ADHD was somewhat worse, I could have two notebooks which gremlins swap while I'm sleeping. If I wake up with notebook 1 I read it and enact the behaviour and set of tasks of a personality 1 as described by that notebook. If instead I wake up and find notebook 2, I would become personality 2 and work on their work. This is similar to the character prompt telling the LLM chatbot how it should behave.

In this situation, it almost seems like notebook 1 and notebook 2 are the OIS and they are sharing me as a resource. However, another situation would have me using two reference books. While working I pick the reference book for the situation I'm currently facing. In this case I still have two books which I am switching between, but in this case the books are clearly resources adding to my capability and I am the OIS.

My intuition is that many situations are not clearly one or the other, but some combination of both. The books exerting their influence as part of a larger system of influence, and the readers of the books having their own preferences and influence, but also acting as a nexus through which many different influences reach. And of course this generalizes to any information buffer and any reader of information, not just books and people.

It's a pretty complicated situation, but also pretty interesting, and I think worth trying to develop better language for describing it more precisely. (or if useful language for this sort of thing already exists I want to be learning it.)

LESSWRONG
LW