Let’s start with some definitions to make sure that we are all on the same page. I have no idea what the formal definitions are in this space, but hopefully these will be enough for mutual understanding.

Agent: Anything that can (seemingly) act to affect the universe around it. Set aside questions of determinism/free will: they are related, but they shouldn’t matter in this context.

Meta-Agent: An agent that is self-aware. In other words: an agent that recognizes its own agent-hood.

State: A configuration of the universe at a given time.

Most Preferred State (MPS): A state over which the agent would not choose (rank higher) any other state.

Least Preferred State (LPS): A state over which the agent would choose (rank higher) any other state.

 

My claims:

  1. Humans are meta-agents.
  2. Other meta-agents can (do?) exist.
  3. Every meta-agent has a state graph over time: for any time (t), its state S(t) ranks somewhere between its LPS and its MPS.
  4. Meta-agents attempt to maximize the function S(t) through their actions. One approach would be to aim for maximal area under the curve, with a discount factor applied as time gets more distant from the present. However, it seems to me there could be many different strategies here.
  5. While it is possible to have ethical/moral frameworks that only consider meta-agents that are humans it is more useful[1] to talk about frameworks that consider all meta-agents (animals, aliens, AI, etc).
  6. Any system of ethics/morals (that applies to all meta-agents) should provide a framework for meta-agents to accomplish (4) and should therefore be judged on its ability to do that.
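
To make the "area under the curve with a discount factor" idea in claim 4 concrete, here is a minimal sketch. It rests on assumptions that are not part of the claims above: states can be scored numerically from 0.0 (LPS) to 1.0 (MPS), time comes in discrete steps, and discounting is exponential. The function name and the two trajectories are hypothetical.

```python
from typing import Sequence


def discounted_area(scores: Sequence[float], discount: float = 0.9) -> float:
    """Sum of S(t) weighted by discount**t, where t = 0 is the present.

    Assumes each score lies between 0.0 (LPS) and 1.0 (MPS).
    """
    if not 0.0 < discount <= 1.0:
        raise ValueError("discount must be in (0, 1]")
    return sum(s * discount**t for t, s in enumerate(scores))


# Two hypothetical trajectories of S(t) over four time steps.
steady = [0.6, 0.6, 0.6, 0.6]   # moderately good throughout
delayed = [0.2, 0.4, 1.0, 1.0]  # worse now, much better later

for discount in (1.0, 0.5):
    best = max(
        ("steady", steady), ("delayed", delayed),
        key=lambda pair: discounted_area(pair[1], discount),
    )
    print(f"discount={discount}: prefers {best[0]}")
```

With no discounting the delayed trajectory has the larger area, while a steep discount favors the steady one. This is one way to see why there could be many different strategies even for the same underlying preferences.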

 

OK this is great and all but why should we care?

I’ve never seen anyone lay out this sort of framework. Someone may have done it, but if they have, it certainly is not well known. I find that when you view ethical/moral questions through this framework, a lot of those questions become either trivial to answer or meaningless. For example, let’s take a look at the definition of Moral Universalism per Wikipedia:

Moral universalism (also called moral objectivism) is the meta-ethical position that some system of ethics, or a universal ethic, applies universally, that is, for "all similarly situated individuals", regardless of culture, race, sex, religion, nationality, sexual orientation, gender identity, or any other distinguishing feature.

What is hopefully clear here is that this moral framework only makes sense if you consider a set of meta-agents that have very similar state preferences. It cannot apply broadly since the set of all meta-agents includes things such as meta-agents that have reverse state orderings to one another (and other such incompatible things).
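
As a rough illustration of the reverse-ordering case, the sketch below models each meta-agent's preferences as a ranked list of states, most preferred first. The three-state universe and all names are hypothetical simplifications, assuming states can be strictly ranked at all.

```python
from typing import List, Tuple


def prefers(ranking: List[str], a: str, b: str) -> bool:
    """True if state a is ranked above state b (earlier means more preferred)."""
    return ranking.index(a) < ranking.index(b)


def agreed_pairs(r1: List[str], r2: List[str]) -> List[Tuple[str, str]]:
    """All ordered pairs (a, b) where both rankings prefer a to b."""
    return [
        (a, b)
        for a in r1
        for b in r1
        if a != b and prefers(r1, a, b) and prefers(r2, a, b)
    ]


# Hypothetical rankings over three states, most preferred first.
agent_x = ["A", "B", "C"]
agent_y = list(reversed(agent_x))  # exactly the reverse ordering

print(agreed_pairs(agent_x, agent_x))  # an agent agrees with itself on every pair
print(agreed_pairs(agent_x, agent_y))  # reversed agents agree on no pair
```

Under these assumptions the two agents do not share a single pairwise preference, so any rule that ranks states the same way for both of them must contradict at least one of them on every comparison.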

 

Anyways, this is my small attempt at putting this out into the world and opening it up to discussion. Thanks for reading!

  1. ^

    Regardless of whether you agree that it is more useful, everyone should be very careful to clarify which set of meta-agents they are discussing (not everyone’s assumptions will be the same).

4 comments

I disagree that there is a MPS or a LPS, that preferences are consistent over time, and also disagree that actual agent actions maximize over reachable state preferences. Even fairly simple real agents are much messier than that, and meta-agents a snarl of immense complexity beyond that.

It may be to some extent a decent first cut at simplifying things enough to start to explore the ideas and make (initially poor) predictions about how agents might behave, but not suitable as a shared foundation to build everything else on.

You disagree with MPS/LPS in what way? In that there are cyclical states or in that it is impossible to rate states against each other or something else?

Completely agree that preferences are not consistent over time but I'm not sure about the relevance of that here.

Agent actions definitely do not maximize over reachable state preferences. My only point there was that they make some attempt to improve states. If you disagree with that what would be an example? Totally agree with your point that it can get very messy.

You could get your framework by adapting existing frameworks to fit your meta-agent utility function. Examples:

  1. The utilitarianism framework which seeks to maximize the sum of utility over all agents.
  2. The Rawlsian maximin framework which seeks to maximize the utility of the worst-off agent.
  3. The Nozickian entitlement framework which seeks to give each agent the maximal entitlement they could have, given the constraints of the system.
  4. The Nussbaumian capability approach which seeks to give each agent the maximal capability they could have, given the constraints of the system.
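
As a rough sketch of how the first two of these differ, assume each option assigns every agent a numeric utility (a strong simplification; the option names and numbers below are hypothetical):

```python
from typing import Dict, List


def utilitarian_choice(options: Dict[str, List[float]]) -> str:
    """Pick the option with the largest sum of utility over all agents."""
    return max(options, key=lambda name: sum(options[name]))


def maximin_choice(options: Dict[str, List[float]]) -> str:
    """Pick the option whose worst-off agent has the highest utility."""
    return max(options, key=lambda name: min(options[name]))


# Hypothetical utilities for three agents under two options.
options = {
    "unequal": [0.9, 0.9, 0.1],
    "equal": [0.5, 0.5, 0.5],
}

print(utilitarian_choice(options))  # larger total
print(maximin_choice(options))      # better worst case
```

The two rules select different options from the same inputs, which is a small instance of the tension between individual and collective needs.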

I think in the end you would get stuck on the unsolved problem of balancing the needs of individuals and the collective.

Correct me if I'm wrong, but aren't those all ethical frameworks rather than meta-ethical frameworks? My post was an attempt to create a framework with which to discuss those, not to be an alternative to them.