Statistics for objects with shared identities

3rd Oct 2022



Your example 1 can be easily modeled as a 100-point power source, with two lamps that share it. You don't need statistics or uncertainty for most calculations there. Your example 2 doesn't click with me - why is the abstraction/interpretation/modeling limited in that way? Can you give a more concrete example, as with the lights?

The analogy to probability makes some sense - for a given outcome, all probabilities add up to 1. But that's about your knowledge, not the universe; once it's resolved, the probability for one outcome is 1 and the rest are 0.

Your coin example loses me again - SOMETHING will happen, and both the end result and the intermediate states ("merging"?) are observable and enumerable. Making a table of every state will allow you to put probabilities to them, and then you're back to classical probability, without any reference to sharing or quantum weirdness.

Like a Dirichlet distribution. The two lamps split weights 0-1, and then get that percentage of some absolute quantity, like lumens. A real-world use case might be lighting an interior room, where you want a certain absolute amount of brightness, like 1000 lumens, but maybe the ratio changes so the lamp behind you gets brighter and the one in front gets dimmer so you can read.
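A minimal sketch of the Dirichlet idea above, using only the Python standard library. The function names, the concentration parameters `[2.0, 2.0]`, and the fixed seed are illustrative assumptions, not part of the comment. Dirichlet weights are drawn by normalizing independent gamma draws, then multiplied by the fixed 1000-lumen budget.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one point from a Dirichlet distribution by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def split_brightness(total_lumens, alphas, rng):
    """Split a fixed brightness budget between lamps using Dirichlet weights."""
    shares = sample_dirichlet(alphas, rng)
    return [total_lumens * s for s in shares]

rng = random.Random(0)  # fixed seed, only so the run is repeatable
lamps = split_brightness(1000.0, [2.0, 2.0], rng)
print(lamps, sum(lamps))  # two non-negative values that sum to the budget
```

Whatever ratio is drawn, the total stays at the budget, so one lamp getting brighter necessarily means the other gets dimmer.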

I think this statement is true: there are two types of systems *(among other types)*, the ones where objects do share properties/identities, and the ones where they don't. I believe this is true regardless of how the systems are modeled. If you model both systems with the same mathematical theory, it doesn't mean there is no difference between the systems. (Maybe some important details are different, maybe some assumptions are different.)

And those two systems lead to different predictions. Even if they're described in terms of the same mathematical theory.

Imagine you ask *"Is it easy to be smarter than a human?"*. Then you encounter beings much smarter than humans. In the classical situation, you just update towards answering "yes". In the situation where "intelligence" is shared between all beings, something more complicated may happen: you may partially update towards "yes", but also partially update towards "no" (because you re-evaluated how smart humans are or how much intelligence is left for other beings).

Sorry if I don't have a specific example. I was asking my questions hoping that some interesting/important examples of systems with shared properties/identities already exist.

> The analogy to probability makes some sense - for a given outcome, all probabilities add up to 1. But that's about your knowledge, not the universe; once it's resolved, the probability for one outcome is 1 and the rest are 0.

I think that sometimes "uncertainty" is the property of the universe (I'm not talking about quantum mechanics right now). See fuzzy logic, fuzzy sets. Predictive probability is about uncertainty in the map (usually). Descriptive "probability" (fuzziness) is about uncertainty in the territory.

I think uncertainty related to shared properties/identities is interesting because it's a mix between "uncertainty in the map" and "uncertainty in the territory".

> Your coin example loses me again - SOMETHING will happen, and both the end result and the intermediate states ("merging"?) are observable and enumerable. Making a table of every state will allow you to put probabilities to them, and then you're back to classical probability, without any reference to sharing or quantum weirdness.

I think this argument may have a problem. Here's an analogy: you may translate **C++** into bytecode and even movement of particles, but it doesn't mean that **C++** doesn't exist.

Consider a chess game. Each move channels the property of being made by a certain player (not both), and intent to determine a position that can be won by that player. There is a limited resource of the eventual outcome of the rollout being central to the concept of either fulfilled intent, winning for one player or the other.

Similarly, other scenes channel other concepts that describe them in the process of being elaborated from fragmented incomplete descriptions, and there is a limited resource of a completed scene being central to either concept. Each detail added to a scene channels influence of only some of the concepts, contributing to the outcome being a more optimal exemplar of their intent.

The analogy with chess is interesting. But I'm not sure "who is making the move/who won the game" is a shared property, because it's binary. You can't "win 80%" and give "20% of the win" to the other player.

> Similarly, other scenes channel other concepts that describe them in the process of being elaborated from fragmented incomplete descriptions, and there is a limited resource of a completed scene being central to either concept. Each detail added to a scene channels influence of only some of the concepts, contributing to the outcome being a more optimal exemplar of their intent.

Could you drop **all** terminology and describe something that's interesting by itself? An interesting example of "intent" or something?

The thing I'm describing, which happens to resemble some points in your posts, is about using self-supervised learning (SSL) as a setting for formulating decision making. Decision theory often relies on a notion of counterfactuals, which are a weird simulacrum of reality where laws of physics or even logical constraints on their definition can fail upon undue examination, but that need to be reasoned about somewhat to make sensible decisions in reality. SSL trains a model to fill in the gaps in episodes, to reconstruct them out of fragments. This ends up giving plausible results even when prompted by fragments that upon closer examination are not descriptions of reality or don't make sense in stronger ways. So the episodes generated by SSL models seem like a good fit for counterfactuals of decision theory.

You are talking in realist language, that's an interesting exercise. An SSL model trained on real observations can be thought of as a map of the world, and maps that are good enough to rely on in navigating the territory can take up the role of ontologies, with things they should claim (if they did everything right) becoming a form of presentation of the world. This way we can talk about Bayesian probability of a rare or one-off event as an objective expression of some prior, even though it's not physically there. So if similarly we develop a sufficiently good notion of an SSL map of the world, we might talk about things it should conclude as objective facts.

The way I'm relating SSL episodes to decision making is by putting agents into larger implied episodes (in principle as big as whole worlds), without requiring the much smaller fragments of episodes that SSL models actually train on to explicitly contain those agents. The training episodes only need to contain stories of how they act, and the outcomes. But character of the agents that shape an episode from the outside is important to how the episode turns out, and what outcomes the more complete (but still small) episode settles into. So agents should be partially described, by their intents (goals) and influence (ability to perform particular actions within the episode), and these descriptions should be treated as parts of the episode. Backchaining from outcomes to actions according to intent is decision making, backchaining from outcomes to intent according to influence is preference elicitation. A lot of this might benefit from automatically generated episodes, as in chess (MCTS).

The agents implied by a small episode can have similarly small intents, caring about simpler properties of outcomes like color or tallness or chairness. This is about counterfactuals, the implied world outside the episodes can be weird. But such an intent might also be an aspect of human decision making (as a concept), and can be shared by multiple implied humans around an episode, acausally coordinating them (by virtue of the decisions of the intent channeled through the humans taking place in the same counterfactual; as opposed to a different counterfactual where the intent coordinates the humans in making different choices, or in following a different policy). So this way we should be able to ask what the world would look like if a given concept meant something a bit different, suggested different conclusions in thinking that involves it. Or backchaining from what the world actually looks like, we can ask what a concept means, if it is to coordinate the minds of humanity to settle the world into the shape it has.

You can represent a human/AI as multiple incomplete desires fighting for parts of the world? I agree, this is related to the post.

Interesting idea about counterfactual versions of *concepts*.

Could you help to clarify how probability should work for objects with shared properties/identities?

I want to know if there exist statistics for objects that may "share" properties and identities. More specifically, I'm interested in this principle:

Properties of objects aren't contained in specific objects. Instead, there's a common pool that contains all properties. Objects take their properties from this pool. But the pool isn't infinite. If one object takes 80% of a certain property from the pool, other objects can take only 20% of that property.

How can an object take away properties from other objects? What does it mean?

Example 1. Imagine you have two lamps. Each has 50 points of brightness. You destroy one of the lamps. Now the remaining lamp has 100 points of brightness. Because brightness is limited and shared between the two lamps.

Example 2. Imagine there are multiple interpretations of each object. You study the objects' sizes. Interpretation of one object affects interpretations of all other objects. If you choose "extremely big" interpretation for one object, then you need to choose smaller interpretations for other objects. Because size is limited and shared between the objects.

Different objects may have different "weights", determining how much of the common property they get.
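To make the pool idea concrete, here is a minimal sketch under one assumed rule: each object receives a slice of the pool proportional to its weight. The helper `share_pool` and the object names are made up for illustration; other sharing rules are possible.

```python
def share_pool(total, weights):
    """Give each object a slice of `total` proportional to its weight."""
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("at least one object needs a positive weight")
    return {name: total * w / weight_sum for name, w in weights.items()}

lamps = {"lamp_a": 1.0, "lamp_b": 1.0}
before = share_pool(100.0, lamps)   # two equal lamps split the pool evenly

del lamps["lamp_b"]                 # destroy one lamp
after = share_pool(100.0, lamps)    # the remaining lamp takes the whole pool

# Unequal weights: one object takes 80% of the property, the other 20%.
uneven = share_pool(100.0, {"big": 4.0, "small": 1.0})
print(before, after, uneven)
```

Under this rule nothing is ever "contained" in an object: its share is recomputed from the pool whenever the set of objects or their weights change.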

Do you know any statistical concepts that describe situations when objects share properties like this?

## Analogy with probability

I think you can compare the common property to probability:

But I have never seen Bayes' rule used for something like this: for distributing a property between objects.

## Probability 2

You can apply the same principle of "shared properties/identities" to probability itself.

Example. Imagine you throw 4 weird coins. Each coin has a ~25% chance to land heads or tails and a ~75% chance to merge with some other coin. At least one coin always remains. Not sure how the outcome of "merged" coins is determined, different rules are possible.

Here's an illustration of some possible outcomes of throwing 4 weird coins: image. Disappeared coins affect the remaining ones in some way. (If disappeared coins don't affect the remaining ones then "disappeared" is just the third state of the coin, it's the most boring possibility.)

This system as a whole has the probability 100% to land heads or tails (you'll see at least one heads or tails). But each particular coin has a weird probability that doesn't add up to 100%.

Imagine you take away one coin from the system. You throw the remaining three. Now each coin has a ~33% chance to land heads or tails and a ~67% chance to merge with some other coin.

You can compare this system of weird coins to a Markov process. A weird coin has a probability to land heads or tails, but also a probability to merge with another coin. This "merge probability" is similar to transition probability in a Markov process. But we have an additional condition compared to general Markov chains: the probabilities of staying in a state (of keeping your identity) of different states should add up to 100%.
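A rough sketch of the Markov comparison, under assumptions the post does not fix: coin `i` keeps its identity with probability equal to its normalized weight, and otherwise merges into one of the other coins chosen uniformly. The function name `transition_matrix` is hypothetical. The extra condition then says the diagonal of the transition matrix sums to 1.

```python
def transition_matrix(weights):
    """Row-stochastic matrix where coin i keeps its identity with probability w_i."""
    n = len(weights)
    if n < 2:
        raise ValueError("need at least two coins to allow merging")
    total = sum(weights)
    w = [x / total for x in weights]  # normalized weights, sum to 1
    matrix = []
    for i in range(n):
        merge_each = (1.0 - w[i]) / (n - 1)  # assumed: merge target is uniform
        row = [w[i] if j == i else merge_each for j in range(n)]
        matrix.append(row)
    return matrix

matrix = transition_matrix([1, 1, 1, 1])
stay_total = sum(matrix[i][i] for i in range(len(matrix)))
print(matrix[0], stay_total)
```

With four equal weights each coin stays with probability 0.25 and merges with probability 0.75 in total; with three equal weights the same construction gives roughly 0.33 and 0.67.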

Do you know statistics that can describe events with mixed identities? For example, how to calculate conditional probabilities for the weird coins? What if the coins have different "weights" (take more of the probability)? By the way, I don't imply that any "new math" should be necessary for that.
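One possible way to get conditional probabilities, as a sketch only. It assumes a specific rule the post leaves open: on each throw exactly one coin keeps its identity, chosen with probability proportional to its weight. The function names and the integer weights are illustrative. Conditioning on "coin `d` merged" is then ordinary renormalization over the remaining coins.

```python
from fractions import Fraction

def survival_probs(weights):
    """P(coin keeps its identity), assuming exactly one coin survives,
    chosen with probability proportional to its (integer) weight."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}

def survival_given_merged(weights, merged):
    """P(coin survives | coin `merged` did not), by renormalizing the rest."""
    probs = survival_probs(weights)
    remaining = 1 - probs[merged]
    return {name: p / remaining for name, p in probs.items() if name != merged}

coins = {"a": 1, "b": 1, "c": 1, "d": 1}
print(survival_probs(coins)["a"])              # 1/4
print(survival_given_merged(coins, "d")["a"])  # 1/3

heavy = {"a": 2, "b": 1, "c": 1}               # coin "a" takes more probability
print(survival_given_merged(heavy, "c")["a"])  # 2/3
```

Under this assumed rule, learning that one coin merged has the same effect as removing it from the system: four equal coins at 1/4 each become three at 1/3 each, which matches the numbers in the example above. No new math is involved, only a constraint on which joint outcomes are allowed.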

## Motivation (in general)

Imagine a system in which elements "share" properties (compete for limited amounts of a property) and identities (may transform into each other). Do you want to know statistics of such a system?

I do. Because shared properties/identities of elements mean that elements are more correlated with each other. If you study a system, that's very convenient. So, in a way, a system with shared properties/identities is the best system to study. So, it's important to study it as the best possible case.

Are you interested in objects that share properties and identities?

I am. Because in mental states things often have mixed properties/identities. If you can model it, that's cool.

"Priming is a phenomenon whereby exposure to one stimulus influences a response to a subsequent stimulus, without conscious guidance or intention. The priming effect refers to the positive or negative effect of a rapidly presented stimulus (priming stimulus) on the processing of a second stimulus (target stimulus) that appears shortly after."

It's only one of the effects of this. However, you don't even need to think about any of the "special" psychological effects. Because what I said is self-evident.

Are you interested in objects that share properties and identities? (2)

I am. At least because of quantum mechanics where something similar is happening: see quantum entanglement.

Note: the connection with QM isn't a conjecture, but it doesn't imply any insight about QM. It's like saying that map of a city and a network of computers may be examples of a graph. You don't need deep knowledge about cartography or networks to draw the connection.

There are two important ways to model uncertainty: probability and fuzzy logic. One is used for prediction, another is used for describing things. Do you want to know other ways to model uncertainty for predictions/descriptions?

I do! What I describe would be a mix between modeling uncertain predictions and uncertain descriptions. This could unify predicting and describing things.

Are you interested in objects competing for properties and identities? (3)

I am. Because it is very important for the future of humanity. For understanding what is true happiness. Those "competing objects" are humans.

Do you want to live forever? In what way? Do you want to experience any possible experience? Do you want to maximally increase the amount of sentient beings in the Universe? Answering all those questions may require trying to define "identity", finding boundaries between your identity and identities of other people, splitting the universe of experiences between different people. Otherwise you risk running into problems: for example, if you experience everything, then you may lose your identity. If you want to live forever, you probably need to reconceptualize your identity. And avoid (or embrace) dangers of losing your identity after infinite amounts of time.

Are your answers different from mine? Are you interested?

## Motivation (specific)

A more specific motivation: there are Bayesian and probabilistic theories of perception, pattern recognition. Hidden Markov models are used for recognizing speech and many other things.

And I do believe that perception is based on probability... but I believe it's based on a special (not very standard) usage of probability. I believe that systems with shared identities are relevant here.

So, I want to know some general properties of such systems to verify/falsify my idea or explore the implications of the idea.