Context for this post: «Boundaries/Membranes» and AI safety compilation
This post lists what I see as the interesting open questions for understanding «membranes/boundaries» as they relate both to AI safety and to rationality.
(Note: I recently switched to using "membranes/boundaries" terminology.)
Ultimately, I think that «membranes/boundaries» are what will help us develop a theory of sovereign agents. I'm also hopeful that formalizing "respecting «membranes/boundaries»" will turn out to be the Deontology we always wanted.
Note: This post has a follow-up where I list all of my current hunches/intuitions for the questions I list here.
Note: there might also be a question of inner alignment.
And here are the current low-level questions I'm considering:
I believe that this research also has important implications for rationality, and will result in important rationality techniques. (I’m writing a series of posts about this.)
For example, I believe that a proper understanding of boundaries/membranes could prevent a type of social conflict that commonly occurs in EA/Rationality communities.
Notably, I have been unable to find any existing explanation of the core concept, as applied to rationality, that is consistent and satisfying. I intend to write that explanation myself, and I have a late-stage draft that I will probably publish in the next 1–2 weeks.
Note: my 'hunches' follow-up post discusses what my hunches are for the answers to these questions. It also discusses how I think these questions relate across rationality and AI safety.
This is the most interesting stuff in the world to me right now. I'm currently working with someone who helped with the math for Cartesian Frames. We are also seeking funding.
Note: See this post for how I currently think some of these questions will likely be answered: Hunches for my current research questions for membranes/boundaries.
Note: Critch («boundaries» sequence author) and I might disagree on «boundaries/membranes» being the key thing that separates sovereign agents. In Part 1 he says:
I want to focus on boundaries of things that might naturally be called "living systems" but that might not broadly be considered "agents", such as a human being that isn't behaving very agentically, or a country whose government is in a state of internal disagreement. (I thought of entitling this sequence "membranes" instead, but stuck with 'boundaries' because of the social norm connotation.)
Meanwhile, I think that «membranes/boundaries» are inherently homeostatic and autopoietic. (For this reason, I'm pretty confused about where the «boundaries/membranes» are supposed to be in some of the examples Critch gives in «Boundaries», Part 2. E.g.: Where are the membranes in "work/life balance"?)
Hopefully it results in a framework that is 1) consistent and 2) whose 'rules' aren't arbitrarily chosen but have some kind of non-subjective grounding.
For purposes of morality or decision making, environments that border membranes are a better building block for scopes of caring than whole (possible) worlds, which traditionally fill this role. So it's not quite a particular bare-bones morality, but more of a shared scaffolding that different agents can paint their moralities on. Agreement on boundaries is a step towards cooperation in terms of scopes of caring delimited by these boundaries. Different environments then get optimized according to different preferences, according to coalitions of agents that own them.
The preposterous thing that this frame seemingly requires is not-caring about some environments, or caring about different environments in different ways. Traditionally this is expressed in terms of possible worlds, so that instead of different environments you work with different (sets of) possible worlds, and possible worlds can have different credences and utilities. But different environments more conspicuously coexist and interact with each other, they don't look like mutually exclusive alternatives. Newtonian ethics becomes a valid possibility, putting value on things depending on where they occur, not just on what they are.
This turns less preposterous in the context of updateless decision making, where possible worlds similarly coexist and interact, and we need to deal with this anyway, so regressing to coexisting and interacting environments becomes less of a loss to the utility of the frame. Unfortunately, you also get logical dependencies that muck up the abstraction of separation between environments. I don't know what to do about this, apart from declaring the things that introduce logical dependencies as themselves being membranes, but my preliminary impression is that membranes and environments might be more useful than agents and possible worlds for sorting this out.
To be clear, I think the bare-bones morality in that post comes from "observe boundaries and then try not to violate them" (or, in Davidad's case, "observe boundaries and then proactively defend them", which is stronger).
I'll need to think about the rest of your comment more, hm. If you think of examples, please lmk :) Also, wdym that a logical dependency could itself be a membrane? E.g.?
One thing: I think the «membranes/boundaries» generator would probably reject the "you get to choose worlds" premise, since that choice is not within your «membrane/boundary». Instead, there's a more decentralized (albeit much smaller) thing that is within your «membrane» that you do have control over. (Similar example here wrt Alex, global poverty, and his donations.)