Following up on My current research questions for «membranes/boundaries», this post lists my current hunches for how some of those questions might turn out to be answered. As I continue this research, below are (just) my starting hypotheses/intuitions to prove/disprove.
Ultimately, I think that boundaries/membranes are what will help us develop a theory of sovereign agents. I'm also hopeful that formalizing "respecting «boundaries/membranes»" will turn out to be the Deontology we always wanted. That is: hopefully it results in a framework that is 1) consistent; 2) where the 'rules' weren't arbitrarily chosen, but have real grounding.
Meta: this post definitely isn't especially well-written. I don't actually expect anyone to read this and pass most of my ITTs. Eventually, all of the ideas below deserve posts of their own.
update: this post is old, don't look at it
AI safety q hunches
Where are the boundaries/membranes between sovereign agents?
See: What is inside your own membrane? A brief abstract model
I might update that post later, but currently:
Would something like 'maximizing information processing' work for providing an objective definition for locating all of the boundaries that we want to locate?
Update: Actually I think this won't work. For example, if Alice has a decision to make, it's totally possible that Bob could make a better decision than Alice could. However, that doesn't mean Bob should make her decision for her.
Nonetheless, this still feels like an interesting frame to think about, at least a little.
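For concreteness, here's a toy sketch of what that frame could even mean operationally. This is entirely my own construction, not anything canonical, and everything in it (the scoring rule, the data, the names) is made up for illustration: score candidate subsets of variables by how much information flows within them minus how much leaks across their boundary, and call the best-scoring subset a "boundary".

```python
# Toy sketch of "locate boundaries by maximizing information processing":
# score each candidate subset by internal mutual information minus the
# information leaking across its boundary. Everything here is illustrative.
from itertools import combinations

import numpy as np


def mutual_information(x, y):
    """Empirical mutual information (bits) between two binary samples."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_xy = np.mean((x == a) & (y == b))
            p_x, p_y = np.mean(x == a), np.mean(y == b)
            if p_xy > 0:
                mi += p_xy * np.log2(p_xy / (p_x * p_y))
    return mi


def boundary_score(samples, inside):
    """Internal information flow minus information crossing the boundary."""
    outside = [j for j in range(samples.shape[1]) if j not in inside]
    internal = sum(mutual_information(samples[:, i], samples[:, j])
                   for i, j in combinations(inside, 2))
    cross = sum(mutual_information(samples[:, i], samples[:, j])
                for i in inside for j in outside)
    return internal - cross


rng = np.random.default_rng(0)
# Variables 0 and 1 are tightly coupled; variable 2 is independent noise.
base = rng.integers(0, 2, 1000)
noisy_copy = base ^ (rng.random(1000) < 0.1)
samples = np.column_stack([base, noisy_copy, rng.integers(0, 2, 1000)])

candidates = [(0, 1), (0, 2), (1, 2)]
print(max(candidates, key=lambda s: boundary_score(samples, s)))  # (0, 1)
```

(And the Alice/Bob case above is exactly where this kind of scoring rule misfires: Bob "processing more information" about Alice's decision doesn't put her decision inside his boundary.)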
How do we “open up our membranes” to others? How does it work, and how can it be formalized?
I tend to talk about this in the language of "contracts"/"agreements".
Cartesian Frames can represent subagents. But it’s not obvious how to do this with Markov blankets. So, how?
I don't think this will be hard, I just want to see it. (Also: I wouldn't be surprised if it's already been done; I need to look.)
For example: Davidad conceives of boundaries as a sort of bare-bones morality, but my own intuitions go against his particular implementation. What exactly is bad about it? And what alternative would be better?
For this, I have begun: 1) scoping out our disagreement; 2) looking for alternative implementations.
1) Hunches for why the [most extreme version of a] 'Night Watchman' won't work…
- Here’s an example that illustrates my fear:
- as you try to impregnate your wife (which could potentially be a boundary crossing, depending on consent), NW chimes in and says, “Hey, wife, do you consent to that?” Your wife says, “Yes, creep, I do.” And then NW says, “Okay, but how do I know he didn’t also manipulate you into giving consent?” […Infinite regress.]
- NW seems ridiculously computationally expensive, and potentially a big bet on capabilities. It seems to require that NW proactively approximate the entire Markov blanket of every interaction between every pair of moral patients (quadratically many pairs), in real time. Basically, ~omniscience.
- In general, I think being responsible for ensuring everyone's welfare (even if 'welfare' only means ensuring their boundaries don't get crossed) is literally impossible, because your subjects will fight you/NW. At the very least, NW should only protect people who want to be protected.
- NW would also be the biggest enabler ever (see: Karpman drama triangle)… which is bad. I think individuals should solve their own problems by default (lest they effectively die). My membranes, my problems!
2) Hunches for alternative implementations…
- Alfred Adler (father of Individual/Adlerian Psychology) held that boundaries/membranes (separation of tasks, mutual respect) derive from equality ("horizontal relationships"). So by instantiating «membranes/boundaries» in an AI system, we're essentially trying to ensure that AGI treats us as its equals.
- I think social conflicts and AI safety share a similar structure, insofar as «membranes/boundaries» go. So I'm optimistic about developing a framework that works whether you’re of human power or of superintelligent power. (E.g.: "Don’t you try to decide what’s good for me!")
- My alternative proposed implementation of membranes/boundaries might be: just model your own Markov blanket (you, the AI, versus the environment) correctly, and don't interfere in others' shit without permission. (Modeling just that one Markov blanket would be much easier than modeling the entire Markov blanket with everyone in it and all their pairwise interactions; see the toy sketch after this list.)
- This would be no different than individual humans using membranes/boundaries as a rationality technique. Prototype graphic Markov blanket:
- People in these communities talk about the “complexity” of human values. But I’m not even sure something like CEV exists. Still, I'm optimistic we can do alignment without assuming that it does: just respect others’ «boundaries/membranes»! Just respect others’ autonomy/sovereignty!
- ‘Protecting’ someone else for them is often violating a boundary. See: ‘Rescuing’
- Probably… democratic-rather-than-authoritarian instantiations of AGI
- "What if there are other AGIs that don't respect boundaries?"
- NW is certainly nice here because it could proactively intervene in problems that don't belong to it. (E.g.: protect people before they need saving.)
- However, so far my thinking is to just keep the problems of AI alignment and intentional misuse separate. E.g., maybe the police have their own AGI, or a country (or countries) enlists the AGI as just another part of its defense program (note: I mean pure defense, no offense). In general, I think we'd probably continue with the same structure as normal society today, just supercharged with AI agents.
- My eventual alternative proposal might require understanding the boundaries and contracts that exist in society today, and then using our AGI to enforce the structure of society as it already exists (instead of trying to simultaneously create AGI and design a new global system).
- "But what if any single actor could destroy the world with easy-to-create, hard-to-defend-against technology?"
- Yeah idk :(
- Again, I'm hopeful that "respecting «boundaries/membranes»" will turn out to be the Deontology we always wanted: a framework that is 1) consistent, and 2) where the 'rules' weren't arbitrarily chosen but have real grounding.
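To gesture at why modeling your own blanket is so much cheaper, here's the toy sketch promised above (entirely my own illustration; the world model and names are made up). In a Bayesian-network view, an agent's Markov blanket is just its parents, its children, and its children's other parents: a local neighborhood, linear in the agent's degree, versus the quadratically many pairwise interactions a Night Watchman would have to track.

```python
# Toy sketch (hypothetical illustration, not a worked-out proposal): the Markov
# blanket of a node in a DAG is its parents, its children, and its children's
# other parents. "Model your own Markov blanket" = track only this local set.

def markov_blanket(node, parents):
    """Markov blanket of `node`, given a {child: set_of_parents} map of a DAG."""
    children = {c for c, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]} - {node}
    return parents.get(node, set()) | children | co_parents

# Made-up toy world: the AI acts through a tool; the tool and Alice jointly
# affect an outcome.
dag = {
    "tool": {"AI"},
    "outcome": {"tool", "Alice"},
    "Alice": set(),
}

print(markov_blanket("AI", dag))    # {'tool'}: the AI's own local boundary
print(markov_blanket("tool", dag))  # {'AI', 'outcome', 'Alice'}
```

The point of the sketch: the AI only needs to get the first line right (its own boundary with the environment), not the blanket of every agent-agent interaction in the graph.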
Should boundaries/membranes ever be violated? If so, when?
I think a good definition of boundaries will certainly minimize the number of 'violations we think are still net good'.
However, I think the answer to this question could still be (unfortunately) yes. In which case: how do we want boundary violations to be minimized? How can counterfactual boundary violations be compared to find the one of lowest impact?
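To make "comparing counterfactual boundary violations" slightly more concrete, here's a hedged toy (purely illustrative; the scenario, the crossing names, and especially the severity weights are all made up, and picking those weights is the actual hard problem):

```python
# Toy comparison of counterfactual plans, each listed with the boundary
# crossings it involves and a made-up severity weight. The "lowest impact"
# plan is just the one whose weighted crossings sum to the least.

options = {
    "shout_warning": [("startled_bystander", 0.1)],
    "grab_their_arm": [("physical_contact", 1.0)],
    "grab_arm_and_phone": [("physical_contact", 1.0), ("property", 0.5)],
}

def impact(crossings):
    return sum(weight for _name, weight in crossings)

print(min(options, key=lambda name: impact(options[name])))  # shout_warning
```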
Rationality q hunches
Notably, I have been unable to find any pre-existing explanation of the core concept, as applied to rationality, that is consistent and satisfying.
The discussion of "separation of tasks" in The Courage to be Disliked (about Adlerian Psychology) by Ichiro Kishimi and Fumitake Koga is okay, but not great, IMO. I completely missed the idea the first three times I read the book.
In this case, people are the sovereign agents in question. In everyday life, where are the boundaries/membranes?
Sit tight for my upcoming post:)
Also: I predict the rationality technique will come down to "staying on your side of the Markov blanket / knowing what is on your side of the Markov blanket and what isn't" (ie: not intervening in others' problems / knowing what belongs to you and what doesn't)
Again, prototype graphic:
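In the meantime, here's a toy sketch of the technique (my own illustrative framing; the tasks and the `owner` labels are made up). The Adlerian move is to ask, for each problem, "who ultimately bears the consequences of this decision?", and then only act on the problems that are yours:

```python
# Toy "separation of tasks": keep only the problems whose consequences land
# on you; everything else is on the other side of the blanket by default.

problems = [
    {"task": "my diet",                 "owner": "me"},
    {"task": "friend's career choice",  "owner": "friend"},
    {"task": "my reaction to critics",  "owner": "me"},
    {"task": "whether critics like me", "owner": "critics"},
]

mine = [p["task"] for p in problems if p["owner"] == "me"]
print(mine)  # ['my diet', 'my reaction to critics']
```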
What are the 'boundaries/membrane answers' to common conflicts?
Sit tight for my upcoming sequence:)
Would something like 'maximizing information processing' work for providing an objective definition for locating all of the boundaries/membranes that we want to locate?
When applied to rationality, I think this will look like, ~"I shouldn't make decisions for anyone else (by default) because the person who has the problem has the most information (and agency) to manage what they want."
Update: Actually I think this won't work. For example, if Alice has a decision to make, it's totally possible that Bob could make a better decision than Alice could. However, that doesn't mean Bob should make her decision for her.
How do we “open up our membranes” to others? How does it work, and how can it be formalized? (And, precisely how does this answer differ from 'consent'? Because it definitely does differ.)
E.g. one way it differs: the victim in the drama triangle is claiming consent to be rescued by the rescuer, but the boundaries are still cursed and that's why they're in drama triangle drama. It would be better for both people if the rescuer ignored their consent and didn't enable the victim.
Additionally, I suspect that this question will bring in philosophy of social contracts, and that we will realize that enforcing this is the core job of society and its legal systems. Also see: What is inside your own membrane? A brief abstract model#more about contracts.
Should boundaries/membranes ever be violated? If so, when?
If someone else is in extreme danger, perhaps? — Also, this will probably go back to the question before this one. (In this case, probably: What social contracts do we enter into by being a member of society? (E.g.: Duty to Rescue))
Additionally, by the fact that we humans have to live and e.g. eat, we're violating the boundaries of other creatures: animals, plants, bacteria… But there's some kind of "me living and eating (not excessively) animals and plants is a lesser sum of global boundary violations" argument? I dunno; probably something that looks like speciesism.
Miscellaneous q hunches
Ethics/morality: Do boundaries/membranes help resolve moral dilemmas? If so, how?
- As said above: I'm hopeful that "respecting «boundaries/membranes»" will turn out to be the Deontology we always wanted: a framework that is 1) consistent, and 2) where the 'rules' weren't arbitrarily chosen but have real grounding.
- I expect this to mostly be in the vein of 'utilitarianism vs deontology'. See the example below.
Hot take example: consider this trolley problem (wiki):
a trolley is hurtling down a track towards five people. You are on a bridge under which it will pass, and you can stop it by putting something very heavy in front of it. As it happens, there is a very fat man next to you – your only way to stop the trolley is to push him over the bridge and onto the track, killing him to save five. Should you proceed?
I suspect that the boundaries-illuminated answer to this would be essentially:
No, the risk to the people on the track is not your problem (in the absence of other pre-negotiated societal contracts). But if you want to tell the fat man that he can save them by jumping, and let him choose, that’s fine. You should not make *someone else's choice* for them.
——
Overall, I'm excited about a morality based on minimizing «membrane/boundary» violations. That is, roughly a form of deontology where the rule is ~"respect the «membranes/boundaries» of sovereign agents". (Again: I think this works because I think «membranes/boundaries» are universally observable.)
I'm excited about this because in one swoop «membranes/boundaries» captures everything or almost everything that intuitively seems bad, but is otherwise hard to describe, about the examples here: https://arbital.com/p/low_impact/
For example, from the rule ~"respect the «membranes/boundaries» of sovereign agents", I claim that you naturally derive:
- Don't kill people
- Don't control people / violate sovereignty
- Don't interfere in other people's problems without permission
- Don't coddle people
- The things listed at https://arbital.com/p/low_impact/
Anyways, I don't have moral philosophy experience, so if anyone wants to take this on (e.g., in a formal moral philosophy direction), I would be eager to help you!
Bonus: List of interdisciplinary connections
Interdisciplinary: What can other disciplines tell us about how boundaries/membranes work?
Oh, how far «boundaries» go!
The boundaries/membranes idea directly connects to tons of things. I've linked some below. Maybe one day I'll know enough about all of these to write something sweeping and coherent, but I think it'll be a while.
- Alfred Adler’s Individual Psychology
- Best introduction: The Courage to Be Disliked by Ichiro Kishimi and Fumitake Koga. (Sequel: The Courage to Be Happy)
- Biology
- Free energy principle, Active Inference
- Michael Levin's work (Critch mentions this here)
- Perhaps especially: The Computational Boundary of a “Self”
- Geopolitics
- Sovereignty
- Interventionism and Foreign Imposed Regime Change
- lmao: "A 2021 review of the existing literature found that foreign interventions since World War II tend overwhelmingly to fail to achieve their purported objectives." (No shit: it's a big boundary/membrane/sovereignty violation. That (almost?) never works!)
- Other philosophy / politics
- A cursory glance, so many things... Self-determination, Territorial Integrity, Self-ownership, Sovereigntism, Westphalian system, Autonomy#Philosophy
- Systems Theory
- Other Psychology
- Codependency, Enmeshment
- A philosophical counselor that I know
- Self-Determination Theory
- Security
- Program design
- Cybernetics
- And I haven't even touched on prior LW-work. Updateless Decision Theory? Discovering Agents? ...
This stuff is the most interesting stuff in the world to me right now. I'm also currently working together with someone who helped with the math for Cartesian Frames. Also, we are currently seeking funding.
Related (H/T @Roman Leventov - comment):