Context for this post: «Boundaries/Membranes» and AI safety compilation

Update: this post is old; don't look at it.

This post lists what I see as the interesting open questions for understanding «membranes/boundaries» as they relate both to AI safety and to rationality.

(Note: I recently switched to using "membranes/boundaries" terminology.)

Ultimately, I think that «membranes/boundaries» are what will help us develop a theory of sovereign agents.[1] I'm also hopeful that formalizing "respecting «membranes/boundaries»" will turn out to be the Deontology we always wanted.[2] 

Note: This post has a follow-up where I list all of my current hunches/intuitions for the questions I list here.

High level questions

Where are the boundaries/membranes between sovereign agents?

What is the best way to use that understanding?

For AI safety

High level research questions:

  1. How can boundaries/membranes be formalized such that an AI system could locate them?
  2. What is the best way to use a proper understanding of boundaries/membranes for AI safety?

Note: there might also be a question of inner alignment.

Low-level research questions:

For #1 – formalizing boundaries/membranes:

  • Would something like 'maximizing information processing' provide an objective criterion for locating all of the boundaries we want to locate?
  • In previous work, Critch used Markov blankets to model the boundary between an agent and its environment. I've also been thinking about how an agent models its own boundary, and how that affects its actions. Within the framework of Cartesian Frames, how can we model the agent's model of its own boundary?
  • How do we “open up our membranes” to others? How does it work, and how can it be formalized? 
  • Cartesian Frames can represent subagents, but it's not obvious how to do the same with Markov blankets. How could it be done?
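As a concrete illustration (not from the post): the Markov blanket Critch uses has a simple graph-theoretic definition. In a Bayesian network, a node's blanket is its parents, its children, and its children's other parents; conditioned on the blanket, the node is independent of everything outside it. A minimal sketch, with hypothetical node names standing in for an agent/environment split:

```python
def markov_blanket(node, parents):
    """Markov blanket of `node` in a DAG.

    `parents` maps each node to the set of its parents.
    The blanket = parents ∪ children ∪ children's other parents.
    """
    children = {n for n, ps in parents.items() if node in ps}
    coparents = {p for c in children for p in parents[c]}
    return (parents.get(node, set()) | children | coparents) - {node}


# Toy agent-environment DAG (illustrative names, not from the post):
# environment -> sensor -> agent_state -> action -> environment_next
dag = {
    "sensor": {"environment"},
    "agent_state": {"sensor"},
    "action": {"agent_state"},
    "environment_next": {"action", "environment"},
}

print(sorted(markov_blanket("agent_state", dag)))  # ['action', 'sensor']
```

Here the agent's "boundary" is just {sensor, action}: the interface through which everything else influences, or is influenced by, its internal state. The open question above is what plays this role inside a Cartesian Frame, where the agent/environment split is a choice of factorization rather than a fixed graph.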

For #2 – using an understanding of boundaries/membranes:

  • Davidad, for example, conceives of boundaries as a sort of bare-bones morality, but my own intuitions go against this particular implementation. What is bad about it, and what alternative would be better?
  • Should boundaries/membranes ever be violated? If so, when? 

For rationality

I believe that this research also has important implications for rationality, and will result in important rationality techniques. (I’m writing a series of posts about this.)

For example, I believe that a proper understanding of boundaries/membranes could prevent future social conflicts of a type that commonly occur in EA/Rationality communities. 

Crucially, I have been unable to find any existing explanation of the core concept as applied to rationality that is consistent and satisfying. I intend to write that explanation myself, and I have a late-stage draft that I will probably publish in the next 1-2 weeks.

Low level questions:

  • In this case, people are the sovereign agents in question. In everyday life, where are the boundaries/membranes? 
  • What are the 'boundaries/membrane answers' to common conflicts?
  • Many questions from the AI safety section also apply here: 
    • Would something like 'maximizing information processing' provide an objective criterion for locating all of the boundaries/membranes we want to locate?
    • How do we “open up our membranes” to others? How does it work, and how can it be formalized? 
      • (And, precisely how does this answer differ from 'consent'? Because it definitely does differ.)
    • Should boundaries/membranes ever be violated? If so, when? 
  • I'm also interested in why a proper understanding of the boundaries/membranes concept seems to be almost antimemetic in these communities.

Note: my 'hunches' follow-up post discusses what my hunches are for the answers for these questions. It also discusses how I think these questions will relate between rationality and AI safety.

Miscellaneous research questions

  • Ethics/morality: Do boundaries/membranes help resolve moral dilemmas? If so, how?
  • Interdisciplinary: What can other disciplines tell us about how boundaries/membranes work? 

Other lists of research questions

  • John Wentworth has a list of boundaries-related questions in this comment.
  • Let me know if you have any research questions to add! 
    • Whether something you're researching yourself, or something you'd like me to consider thinking about

This is the most interesting stuff in the world to me right now. I'm currently working with someone who helped with the math for Cartesian Frames, and we are seeking funding.

Note: See this post for how I currently think some of these questions will likely be answered: Hunches for my current research questions for membranes/boundaries.

  1. ^

    Note: Critch («boundaries» sequence author) and I might disagree on «boundaries/membranes» being the key thing that separates sovereign agents. In Part 1 he says:

    I want to focus on boundaries of things that might naturally be called "living systems" but that might not broadly be considered "agents", such as a human being that isn't behaving very agentically, or a country whose government is in a state of internal disagreement. (I thought of entitling this sequence "membranes" instead, but stuck with 'boundaries' because of the social norm connotation.)

    Meanwhile, I think that «membranes/boundaries» are inherently homeostatic and autopoietic. (For this reason, I'm pretty confused about where the «boundaries/membranes» are supposed to be in some of the examples Critch gives in «Boundaries», Part 2. E.g., where are the membranes in "work/life balance"?)

  2. ^

    Hopefully it results in a framework that is 1) consistent, and 2) whose 'rules' weren't arbitrarily chosen but have some kind of non-subjective grounding.


Comments

For purposes of morality or decision making, environments that border membranes are a better building block for scopes of caring than whole (possible) worlds, which traditionally fill this role. So it's not quite a particular bare-bones morality, but more of a shared scaffolding that different agents can paint their moralities on. Agreement on boundaries is a step towards cooperation in terms of scopes of caring delimited by these boundaries. Different environments then get optimized according to different preferences, according to coalitions of agents that own them.

The preposterous thing that this frame seemingly requires is not-caring about some environments, or caring about different environments in different ways. Traditionally this is expressed in terms of possible worlds, so that instead of different environments you work with different (sets of) possible worlds, and possible worlds can have different credences and utilities. But different environments more conspicuously coexist and interact with each other, they don't look like mutually exclusive alternatives. Newtonian ethics becomes a valid possibility, putting value on things depending on where they occur, not just on what they are.

This turns less preposterous in context of updateless decision making, where possible worlds similarly coexist and interact, and we need to deal with this anyway, so regressing to coexisting and interacting environments becomes less of a loss to utility of the frame. Unfortunately, you also get logical dependencies that muck up the abstraction of separation between environments. I don't know what to do about this, apart from declaring things that introduce logical dependencies as themselves being membranes, but my preliminary impression is that membranes and environments might be more useful than agents and possible worlds for sorting this out.

To be clear, I think the bare-bones morality in that post comes from "observe boundaries and then try not to violate them" (or, in Davidad's case: and then proactively defend them (which is stronger)). 

I'll need to think about the rest of your comment more, hm. If you think of examples please lmk:)
Also, wdym that a logical dependency could be itself a membrane? Eg?

One thing: I think the «membranes/boundaries» generator would probably reject the "you get to choose worlds" premise, for that choice is not within your «membrane/boundary». Instead, there's a more decentralized (albeit much smaller) thing within your «membrane» that you do have control over. (Similar example here wrt Alex, global poverty, and his donations.)