Gunnar_Zarncke

Software engineering, parenting, cognition, meditation, other
Linkedin, Facebook, Admonymous (anonymous feedback)


Comments


I'm not clear on which research by Michael Levin you mean. I found him mentioned here: «Boundaries», Part 3b: Alignment problems in terms of boundaries, but his research seems to be about cellular computation, not related to alignment.

Cannell has also talked about "empowerment of other"

Do you mean this? Empowerment is (almost) All We Need

folks who have been discussing boundaries ... zone of agency

and this: Agent membranes/boundaries and formalizing “safety”

I recently heard the claim that successful leaders make themselves easy for other people to predict or model. This was stated in contrast to successful makers and researchers, who need to predict or model something outside of themselves.

It depends not only on the age/maturity of the child but also on the country/area. 

Here in Hamburg, Germany, it is common for groups of fourth-grade students (age 9-10) to use the public transportation system and complete tasks (go to some station, find a memorial, etc., competing over who finishes first), and this was already the case before mobile phones. I did this too at that age.

Completed.

I think it is not clear how the P(Simulation) question interacts with some of the other questions, such as P(God).

In the section about existing theories that could be related, I was missing Luhmann's Social Systems Theory. It is not a mathematical theory but otherwise fits the desiderata. This is what o1-preview says about it (after being fed the above dialogue up to the theories suggested by Claude):

Introducing Luhmann's Social Systems Theory into our discussion is an intriguing idea. Let's explore how it might help us address the challenge of modeling hierarchical agency with a suitable mathematical formalism.

Overview of Luhmann's Social Systems Theory:

Niklas Luhmann's theory conceptualizes society as a complex set of self-referential social systems composed not of individuals or actions, but of communications. In his framework:

  1. Autopoietic Systems: Social systems are autopoietic, meaning they are self-producing and maintain their boundaries through their own operations. They continuously reproduce the elements (communications) that constitute them.
  2. Functional Differentiation: Society is differentiated into various subsystems (e.g., economy, law, politics), each operating based on its own binary code (profit/non-profit, legal/illegal, power/no power).
  3. Communication as the Fundamental Unit: Communications are the basic elements, and meaning is generated through the network of communications within a system.
  4. Operative Closure and Cognitive Openness: Systems are operationally closed—they can only refer to themselves—but cognitively open, as they observe and are influenced by their environment.

Relating Luhmann's Theory to Hierarchical Agency:

  1. Type Consistency Across Levels:
    • Agents as Systems: If we consider both subagents and superagents as autopoietic systems of communications, we maintain type consistency. Both levels are constituted by the same fundamental processes.
    • Scale-Free Structure: Since Luhmann's theory doesn't privilege any particular scale, it aligns with your desideratum for a scale-free formalism.
  2. Expressiveness for Real-World Situations:
    • Conflict and Integration: Luhmann's concept of structural coupling explains how different systems interact and influence each other without losing their autonomy. This could model conflicts between layers or how a superagent gains agency at the expense of subagents.
    • Emergent Properties: The emergent nature of social systems in Luhmann's theory parallels how superagents might exhibit properties not reducible to their subagents.
  3. Representation of Intentionality:
    • System Intentionality: While individual intentions are abstracted away, systems exhibit purposive behavior through their operations. This could be adapted to represent the goals and beliefs of agents at different levels.
    • Communication of Intentions: Intentionality can be embedded in the communications that constitute the system, allowing us to model beliefs and desires.

Towards a Mathematical Formalism:

While Luhmann's theory is primarily sociological and lacks a mathematical framework, we can attempt to formalize its concepts:

  1. Agent-Based Modeling:
    • Agents as Communication Processes: Model agents (both subagents and superagents) as processes that generate and respond to communications.
    • Network Dynamics: Use graph theory to represent communications as edges and agents as nodes, allowing us to analyze the structure and dynamics of interactions.
  2. Category Theory:
    • Objects and Morphisms: Utilize category theory to model agents as objects and communications as morphisms, preserving type consistency and composability.
    • Functorial Relationships: Define functors that map between categories of agents at different hierarchical levels.
  3. Dynamic Systems Theory:
    • Differential Equations: Represent the evolution of communications and agent states over time using differential equations.
    • Attractors and Stability: Analyze how certain communication patterns lead to stable superagent formations.
  4. Information Theory:
    • Entropy and Redundancy: Measure the information content of communications, which could correlate with the intentionality and decision-making processes of agents.
    • Mutual Information: Assess how much information is shared between subagents and superagents, indicating the degree of coupling.

Potential Benefits:

  • Unified Framework: By treating communications as the foundational elements, we can model both agents and superagents within the same formal structure.
  • Emergence and Self-Organization: The formalism could capture how complex behaviors emerge from simple interactions, which is crucial for understanding hierarchical agency.
  • Applicability to AI Systems: For AI alignment, this approach might help in designing AI agents that can form or integrate into higher-level systems without losing alignment with human values.

Challenges:

  • Abstract Nature: Luhmann's theory is highly abstract, which might make direct mathematical translation difficult.
  • Intentionality Representation: Modeling intentionality purely through communications may not capture all aspects of beliefs and desires inherent in agents.
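
To make the agent-based / graph-theoretic suggestion above a bit more concrete, here is a minimal Python sketch (my own toy formalization, not from o1-preview or the original post; all names and numbers are illustrative): agents are nodes, communications are directed edges, and a candidate "superagent" is a group of agents that communicates more internally than with its environment.

```python
# Toy formalization (illustrative only): agents as nodes, communications as
# directed edges; a "superagent" is a group whose internal communication
# outweighs its communication across the boundary.
from itertools import combinations

# communications: (sender, receiver) pairs; names are made up
communications = [
    ("alice", "bob"), ("bob", "alice"), ("alice", "carol"),
    ("carol", "alice"), ("bob", "carol"), ("carol", "dave"),
]
agents = {a for pair in communications for a in pair}

def internal_external_ratio(group: set[str]) -> float:
    """Ratio of communications inside `group` to those crossing its boundary."""
    internal = sum(1 for s, r in communications if s in group and r in group)
    external = sum(1 for s, r in communications if (s in group) != (r in group))
    return internal / max(external, 1)

# Enumerate candidate superagents: groups that communicate more internally
# than with their environment (a crude stand-in for "operative closure").
for size in range(2, len(agents)):
    for group in combinations(sorted(agents), size):
        ratio = internal_external_ratio(set(group))
        if ratio > 1:
            print(group, "looks like a superagent, ratio =", round(ratio, 2))
```

This deliberately ignores intentionality and most of what makes Luhmann's systems autopoietic; it only illustrates the type-consistency point: subagents and superagents are the same kind of object (sets of communicating nodes), so the construction can be iterated across levels.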

Maybe create a Quotes Thread post with the rule that quotes have to be upvoted and if you like them you can add a react.

I guess you got downvoted because it sounded like an ad.

But I think Lean Prover is a programming language with a lot of potential for AI alignment and is mentioned in Provably safe systems: the only path to controllable AGI. It would be good to have more knowledge about it on LessWrong.
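
For readers who haven't seen it, here is a minimal, illustrative Lean 4 snippet (my own toy example, not from the linked post): a statement whose proof is checked by the compiler rather than trusted, which is the property that makes it interesting for provably safe systems.

```lean
-- Illustrative only: a machine-checked proof that addition on the natural
-- numbers is commutative, reusing the standard library lemma Nat.add_comm.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```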

agents that have preferences about the state of the world in the distant future

What are these preferences? For biological agents, these preferences are grounded in some mechanism - what you call the Steering System - that evaluates "desirable states" of the world in some more or less directly measurable way (grounded in perception via the senses) and derives a signal of how desirable the state is, which the brain is optimizing for. For ML models, the mechanism is somewhat different, but there is also an input to the training algorithm that determines how "good" the output is. This signal is called reward, and it drives the system toward outputs that lead to states of high reward. But the path there depends on the specific optimization method, and the algorithm has to navigate such a complex loss landscape that it can get stuck in areas of the search space that correspond to imperfect models for a very long time, if not forever. These imperfect models can be off in significant ways, and that's why it may be useful to say that Reward is not the optimization target.
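
As a minimal illustration of that last point (my own toy example, not from the post), gradient descent on a hand-made non-convex loss settles into whichever basin it starts in, so the model it converges to need not be the one the reward/loss signal would ultimately favor:

```python
# Toy example (illustrative only): gradient descent stuck in a local minimum.

def loss(x: float) -> float:
    # Two basins: a shallow local minimum near x ~ 0.95 and a deeper
    # global minimum near x ~ -1.05 (shape chosen by hand).
    return x**4 - 2 * x**2 + 0.4 * x

def grad(x: float) -> float:
    # Analytic derivative of the loss above.
    return 4 * x**3 - 4 * x + 0.4

x = 1.0    # start in the basin of the shallow local minimum
lr = 0.01  # learning rate
for _ in range(2000):
    x -= lr * grad(x)

print(f"converged to x = {x:.3f}, loss = {loss(x):.3f}")
# Prints roughly x = 0.95, loss = -0.61, although the global minimum near
# x = -1.05 has loss ~ -1.41: the optimizer ends up with the imperfect
# "model" of the basin it happens to sit in, not the best achievable one.
```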

The connection to Intuitive Self-Models is that even though the internal models of an LLM may be very different from human self-models, I think it is still quite plausible that LLMs and other models form models of the self. Such models are instrumentally convergent. Humans talk about the self, and the LLM does things that match these patterns. Maybe the underlying processes in humans that give rise to this are different, but humans learning about this can't know the actual process either. And in the same way, the approximate self-model the LLM forms is not maximizing the reward signal but can be quite far from it, as long as it is useful (in the sense of having higher reward than other such models/parameter combinations).

I think of my toenail as “part of myself”, but I’m happy to clip it.

Sure, the (body of the) self can include parts that can be cut/destroyed without that "causing harm", instead having an overall positive effect. An AI in a compute center would, by analogy, also consider decommissioning failed hardware. And when defining humanity, we do have to be careful about what we mean when these "parts" could be humans.

About conjoined twins and the self:

Krista and Tatiana Hogan (Wikipedia) are healthy, functional craniopagus conjoined twins who are joined at the head and share parts of the brain: their thalami are connected via a thalamic bridge. They can report on perceptions of the other and share affects.
I couldn't find scientific papers that studied their brain function rigorously, but the paper A Case of Shared Consciousness looks at evidence from documentaries and discusses it. Here are some observational details:
 

Each is capable of reporting on inputs presented to the other twin’s body. For example, while her own eyes are covered, Tatiana is able to report on visual inputs to both of Krista’s eyes. Meanwhile, Krista can report on inputs to one of Tatiana’s eyes. Krista is able to report and experience distaste towards food that Tatiana is eating (the reverse has not been reported, but may also be true). An often repeated anecdote is that while Tatiana enjoys ketchup on her food, Krista will try to prevent her eating it. Both twins can also detect when and where the other twin’s body is being touched, and their mother reports that they find this easier than visual stimuli.
fMRI imaging revealed that Tatiana’s brain ‘processes signals’ from her own right leg, both her arms, and Krista’s right arm (the arm on the side where they connect). Meanwhile Krista’s brain processes signals from her own left arm, both her own legs and Tatiana’s left leg (again on the side where they connect). Each twin is able to voluntarily move each of the limbs corresponding to these signals.
The twins are also capable of voluntary bodily control for all the limbs within their ordinary body plans. As their mother Felicia puts it, “they can choose when they want to do it, and when they don’t want to do it.”
The twins also demonstrate a common receptivity to pain. When one twin’s body is harmed, both twins cry.
The twins report that they talk to each other in their heads. This had previously been suspected by family members due to signs of apparent collusion without verbalisation.

The popular article How Conjoined Twins Are Making Scientists Question the Concept of Self contains many additional interesting bits:

when a pacifier was placed in one infant’s mouth, the other would stop crying.

About the self:

 Perhaps the experience of being a person locked inside a bag of skin and bone—with that single, definable self looking out through your eyes—is not natural or given, but merely the result of a changeable, mechanical arrangement in the brain. Perhaps the barriers of selfhood are arbitrary, bendable. This is what the Hogan twins’ experience suggests. Their conjoined lives hint at myriad permeations of the bodily self.

About qualia:

Tatiana senses the greeniness of Krista’s experience all the time. “I hate it!” she cries out, when Krista tastes some spinach dip.

(found via FB comment)
