What do you think about my arguments in Thou art rainbow: Consciousness as a Self-Referential Physical Process?
I have read almost all of this dialog, and my half-serious upshot is:
An agent A can't prove that another agent B is correct in both its reasoning and its semantics, but that doesn't matter, because A can't trust its own reasoning to that degree either.
This glosses over a lot of details in the long and charitable comment thread above. I tried to get an overview of it with ChatGPT, and I'm surprised at how well that worked:
ChatGPT 5.2 extended-thinking summary of the misunderstanding:
Let L be the agent's internal proof system, S the intended semantic interpretation that maps L-sentences onto claims about the world, and □_L φ the claim that φ is provable in L.
A natural "soundness schema relative to S" is:
Sound(L,S) := ∀φ (□_L φ → True_S(φ)).
The Löbian obstacle setup (as Morgan summarizes it) is that a designer agent A wants to rely on proofs produced by a subordinate B, and this seems to demand something like a schema □_L φ → φ (or its intended-world analogue) for arbitrary φ, which is blocked by Löb-ish reasoning.
So far: aligned.
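For reference, the blocking argument is short. For any suitable L (say, an extension of PA satisfying the derivability conditions), Löb's theorem states:

If ⊢_L (□_L φ → φ), then ⊢_L φ.

So if L proved the soundness schema □_L φ → φ for every φ, then instantiating φ := ⊥ would make L prove ⊥. A consistent L can therefore prove □_L φ → φ only for those φ it already proves outright.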
Demski treats “escape the obstacle” as: produce an agent design whose decision procedure can rationally delegate mission-critical tasks under something like the formal constraints studied in the Löbstacle literature.
What he keeps asking for, concretely, is an explicit modification of the agent design: a changed action rule whose willingness to delegate is justified within the agent's own formal apparatus.
Hence his repeated demand: “what modification of the agent designs do you have in mind?” and later: “I still don’t understand how you propose for the agent to reason/behave differently.”
Morgan's post primarily diagnoses the source of the apparent paradox as a category error: treating a semantic claim ("my proofs track the world") as if it were a syntactic theorem inside L.
His move is to relocate Sound(L,S) out of L and into a separate layer of "metalogical belief" that lives at the interface between proofs and world-interpretation. He says the agent need not (and cannot) prove that the semantic mapping is sound; rather, the agent assumes/believes it when applying proofs to the world.
This is why he says "the agent doesn't change its reasoning" (meaning: the proof calculus inside L is unchanged).
Demski hears:
“Allowing for metalogical beliefs immediately dissolves the obstacle.”
and translates it into the most standard formal move: add Sound(L,S) to L as an axiom, obtaining a new system L′ = L + Sound(L,S).
Then he points out the familiar reflection-hierarchy problem: L′ can at best endorse L, but not itself, so you get "trust weaker but not equal-strength" all over again. He explicitly frames this as "you just get a new system L′… which doesn't believe its own soundness," and distinguishes "L is sound" from "L plus me is sound."
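Spelled out (and glossing over how True_S gets expressed at each level), the hierarchy he is pointing at is:

L_0 := L,  L_{n+1} := L_n + Sound(L_n, S).

Each L_{n+1} endorses the soundness of the layer below it, but by the same Löbian argument a consistent L_{n+1} cannot prove Sound(L_{n+1}, S) about itself, so the regress never closes.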
Morgan, meanwhile, is not proposing “add Sound(L,S) as an axiom used inside the proof system that generates actions.” He keeps insisting the semantic map is “defined on L, not on L′,” and that for the delegation story you can restrict attention to accepting L-proofs.
So the mismatch is: Demski reads the move as a change inside the proof system (L becomes L′), while Morgan intends it as a stance at the interface between proofs and the world that leaves L itself untouched.
Formally, Morgan is closer to specifying a two-layer agent: an inner layer that is just the unchanged proof system L, and an outer acceptance bridge Acc that takes L-proofs and, relying on the assumed (never proven) soundness of the semantic map S, treats the proven sentences as action-relevant.
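A toy sketch of that two-layer shape (all names hypothetical; this is only an illustration of the architecture, not a design taken from either post):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Claim:
    phi: str    # an L-sentence, as text
    proof: str  # a purported L-proof of phi


# Inner layer: pure syntax. A proof checker for L, injected as a black box so the
# example stays runnable; a real one would verify L-derivations step by step.
ProofChecker = Callable[[str, str], bool]


def make_acc(check_l_proof: ProofChecker) -> Callable[[Claim], bool]:
    """Outer layer: the acceptance bridge Acc.

    Acc adds no proving power to L. It only checks L-proofs, and it silently
    carries the unproven stance "L-proofs track the world S".
    """
    def acc(claim: Claim) -> bool:
        return check_l_proof(claim.proof, claim.phi)
    return acc


def act_on(claims_from_b: list[Claim], acc: Callable[[Claim], bool]) -> list[str]:
    """The action rule lives outside L: A acts on B's output iff Acc accepts it."""
    return [c.phi for c in claims_from_b if acc(c)]


if __name__ == "__main__":
    # Trivial stand-in checker for the demo: proofs tagged VALID count as checked.
    toy_checker: ProofChecker = lambda proof, phi: proof.startswith("VALID:")
    accept = make_acc(toy_checker)
    outputs = [Claim("safe(action_1)", "VALID:..."), Claim("safe(action_2)", "bogus")]
    print(act_on(outputs, accept))  # -> ['safe(action_1)']
```

The point of the sketch is only that Sound(L,S) never appears as a premise anywhere; it is implicit in the decision to route actions through Acc at all, which is exactly what the next objection targets.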
Demski’s objection is then immediate: if Acc is where the action happens, you have changed the decision architecture; and unless Acc is itself formally grounded, you haven’t solved the original agent-design problem, only renamed it. This is exactly his “back pocket” complaint: “fine, I grant the agent ‘has’ the semantic map; how does it use it?”
They overload "reasoning" differently: for Morgan it means the proof calculus inside L; for Demski it means the whole decision procedure, including how proofs get turned into actions.
This is why Demski keeps hearing a contradiction in Morgan’s stance:
“No, the agent doesn’t change its reasoning… The change is that we don’t require A to prove that their semantic mapping is sound…”
Under Demski's formal lens, "not requiring proof" is already a behavioral modification, because the original Löbstacle agent was (by assumption) constrained to act only on what it could justify internally.
Morgan tries to narrow "trust" to "trusting L-proofs."
Demski pushes back that the Löbstacle was about mission-critical delegation, not merely proof checking; restricting to a weaker fragment can amount to “make the delegate dumber,” which is the classic non-solution.
Formally, Demski is insisting on a property like:
A delegates ⟹ ∃ nontrivial tasks where B’s full cognition matters, not just L-proof output.
Morgan is (at least in that subthread) defending a weaker property:
A trusts B ⟹ A accepts any syntactically valid L-proof produced by B as action-relevant.
Those are different “trust” relations, and each is natural given a different target.
Morgan phrases the bridge as normative:
if φ is provable in L, the interpretation of φ “ought to be true”.
Demski hears “ought” as moral/normative in the usual sense and suggests rewriting it as a prediction (“will probably be true”).
But structurally, this is the same dispute: does the bridge from L-provability to the world have to live inside the agent's formal apparatus, or can it sit outside it?
If you compress all of the above into the two "theorems" they are implicitly trying to establish:
1. (Morgan) If an agent's action rule is allowed to use an extra-logical acceptance bridge Acc that is not required to be provable inside L, then Löb's prohibition "L can't prove its own soundness schema" does not block the agent from in fact relying on L-proofs in the world (conditional on S being sound). This is basically: "Löb blocks internal proof of Sound(L,S), not the practical stance of treating proofs as reliable."
2. (Demski) If we keep the Löbstacle-style constraint that delegation must be justified within the agent's own formal apparatus (or equivalently, we forbid a free external Acc oracle), then Morgan's move either (i) reappears as L′ and triggers the reflection hierarchy, or (ii) reduces to trusting only weaker fragments, which is not the desired kind of delegation.
So: Morgan is dissolving a philosophical inference (“therefore no agent can trust its reasoning”), while Demski is defending a formal obstacle statement about a particular class of reflective agent designs.
Both can be simultaneously right—because the quantifiers differ.
It gets the core point:
the Löbstacle was about mission-critical delegation, not merely proof checking; restricting to a weaker fragment can amount to “make the delegate dumber,” which is the classic non-solution.
It is a non-solution in the sense that it doesn't let the sub-agent B run free on the grounds that it can be fully trusted. But Morgan's move does seem to enable a safe kind of delegation. So practically, the different approaches come down to:
1. Verified delegation: A lets B search however it likes, but only acts on outputs that B ships together with an L-proof A can check.
2. Full delegation: A trusts B's cognition outright, so B can run free on mission-critical tasks without per-output verification.
Clearly, 1 is weaker than 2. But we can't get 2 anyway, so getting 1 seems like a win.
And maybe we can extend 1 into a full agent by wrapping B in a verifier that checks its L-proofs. That construction would nest for repeated delegation.
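Roughly what I have in mind, with the caveat that all names are hypothetical and this is only a sketch of the shape, not a worked-out design:

```python
from typing import Callable, Optional

# A delegate maps a task to (claim, proof), or to None if it has nothing to offer.
Delegate = Callable[[str], Optional[tuple[str, str]]]
ProofChecker = Callable[[str, str], bool]


def wrap_in_verifier(delegate: Delegate, check_l_proof: ProofChecker) -> Delegate:
    """Return a delegate that forwards only outputs carrying a checkable L-proof."""
    def verified(task: str) -> Optional[tuple[str, str]]:
        result = delegate(task)
        if result is None:
            return None
        claim, proof = result
        return result if check_l_proof(proof, claim) else None
    return verified

# Because the wrapper returns another Delegate, it nests: A can use
# wrap_in_verifier(B, check), while B internally uses wrap_in_verifier(C, check),
# and so on down a chain of sub-delegates.
```

This still only buys delegation in sense 1: B's unverified cognition is free to search however it likes, but only its proof-carrying outputs ever reach the world.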
I'm not sure we can directly apply solid-state physics to NNs, but we may be able to approximate some parts of the NN with a physical model and transfer theorems that way. I'm thinking of Lorenzo Tomaz's work on Momentum Point-Perplexity Mechanics in Large Language Models (disclaimer: I worked with him at AE Studio).
What is the relative cost between Aerolamp and regular air purifiers?
For regular air purifiers, ChatGPT 5.2 estimates 0.2 €/1000 m³ of filtered air.
From the Aerolamp website:
How many Aerolamps do I need?
Short answer: 1 for a typical room, or about every 250 square feet
Long answer: It's complicated
Unlike technologies like air filters, the efficacy of germicidal UV varies by pathogen. Some pathogens, like human coronaviruses, are very sensitive to far-UVC. Others are more resistant. However, there is significant uncertainty in just how sensitive various pathogens are to UV light.
The key metric to look for in all air disinfection technologies is the Clean Air Delivery Rate (CADR), usually given in cubic feet per minute (cfm). A typical high-quality portable air-cleaner has a CADR of around 400 cfm - a more typical one will deliver 200 cfm.
For a typical 250 square foot room with 9 foot ceilings, Aerolamp has an expected CADR of 200-1500 cfm, depending on the pathogen and the study referenced.
And ChatGPT estimates 0.02 to 0.3 €/1000 m³ for the Aerolamp, which is quite competitive, especially given that it is quieter.
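A quick sanity check of the units, with made-up inputs (the power draw, electricity price, and filter cost below are placeholder assumptions, not figures from either source; only the 200 cfm CADR comes from the quote above):

```python
# Back-of-the-envelope €/1000 m³ for a filter-based purifier running 24/7.
CFM_TO_M3_PER_H = 1.699            # 1 cfm ≈ 1.699 m³/h

cadr_cfm = 200                     # "a more typical one will deliver 200 cfm"
power_w = 60                       # assumed power draw
electricity_eur_per_kwh = 0.30     # assumed electricity price
filter_eur_per_year = 60           # assumed filter replacement cost
hours_per_year = 8760

air_per_year_m3 = cadr_cfm * CFM_TO_M3_PER_H * hours_per_year       # ≈ 3.0 million m³
energy_cost_eur = power_w / 1000 * hours_per_year * electricity_eur_per_kwh
cost_per_1000_m3 = (energy_cost_eur + filter_eur_per_year) / air_per_year_m3 * 1000
print(f"≈ {cost_per_1000_m3:.2f} €/1000 m³")                        # ≈ 0.07 with these inputs
```

The result is very sensitive to duty cycle, filter cost, and whether you amortize the device itself, which presumably accounts for the gap between this toy number and ChatGPT's 0.2 €/1000 m³ estimate.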
I'm not arguing either way. I just note this specific aspect that seems relevant. The question is: is the baby's body more susceptible to alcohol than an adult's body? For example, does the baby's liver work better or worse than an adult's? Are there developmental processes that can be disturbed by the presence of alcohol? By default I'd assume that the effect is proportional (except that maybe the baby "lives faster" in some sense, so the effect may be proportional to metabolism or growth speed or something). But all of that is speculation.
From DeJong et al. (2019):
Alcohol readily crosses the placenta with fetal blood alcohol levels approaching maternal levels within 2 hours of maternal consumption.
https://scispace.com/papers/alcohol-use-in-pregnancy-1tikfl3l2g (page 3)
I have pointed at least half a dozen people (all of them outside LW) to this post in an effort to help them "understand" LLMs in practical terms. More so than to any other LW post in the same time frame.
Related: Unexpected Conscious Entities
Both posts approach personhood from orthogonal angles, one via legal/social personhood and the other via consciousness-ish attributes. This suggests a matrix:
|  | High legal / social personhood | Low / no legal personhood |
|---|---|---|
| High consciousness-ish attributes | Individual humans | Countries |
| Low / unclear consciousness-ish attributes | Corporations, Ships, Whanganui River | LLMs (?) |
Thanks for writing this! I finally got around to reading it, and I think it is a great reverse-engineering of these felt human motivations. I'm buying much of it, but I have been thinking about aggregation cases and counterexamples, and would like to hear your take on them.
Envy toward a friend’s success
A friend wins an award; I like them, but I feel a stab of envy (sometimes I may even wish they'd fail). That is negative valence without an "enemy" label, and not obviously about their attention to me.
Is the idea that the "friend/enemy" variable is actually more like "net expected effect on my status," so a friend’s upward move can locally flip them into a threat?
Admiration for a rival or enemy
I can dislike a competitor and still feel genuine admiration for their competence or courage. If "enemy" is on, why doesn’t it reliably route through provocation or schadenfreude? Do you think admiration is just a different reward stream, or does it arise when the "enemy" tag is domain-specific?
Compassion for a stable enemy’s suffering
E.g., an opposing soldier or a political adversary is injured and I feel real compassion, even if I still endorse opposing them.
This feels like “enemy × their distress” producing sympathy rather than schadenfreude. Is your take that “enemy” isn’t a stable binary at all—that vivid pain cues can transiently force a “person-in-pain” interpretation that overrides coalition tagging?
Gratitude / indebtedness
Someone helps me. I feel gratitude and an urge to reciprocate. It doesn’t feel like "approval reward" (I’m not enjoying being regarded highly). It feels more like a debt.
Do you see gratitude as downstream of the same "they’re thinking about me" channel, or as a separate ledger?
Private guilt
People often report guilt as a direct response to "I did wrong," even when they’re confident nobody will know.
I'm not sure that fits guilt from "imagined others thinking about me." It looks like a norm-violation penalty that doesn’t need the “about-me attention” channel. Do you have a view on which way it goes?
Aggregation Cases
I have been wondering whether the suggested processing matches what we would expect for larger groups of people (who could each be friend/enemy and/or thinking of me or not). There seem to be at least two different processes going on:
Compassion doesn't scale with the number of people attended to. This seems to be well established for the identifiable-victim effect and psychic numbing. When harm is spread over many victims, affect often collapses into numbness unless one person becomes vivid. That matches your attentional bottleneck.
But evaluation does seem to scale with headcount, at least in stage fright and other audience effects.
Maybe a roomful of people can feel strongly like “they’re thinking about me,” even if you’re not tracking anyone individually? But then the “about-me attention” variable would be computed at the group level, which complicates your analysis.