soycarts

Comments

Enlightenment AMA
soycarts 18d* 10

Thanks both for your responses! I would appreciate any insights into what is missing from my definition. I admit my "robust, nuanced world model" terminology is quite vague, but I'm getting at having accurate but changeable representations of what your world objectively is, allowing a harmonious flow-state with the world in which there isn't actually space for personal suffering or attachment to outcomes.

I feel that these effects are not downstream of enlightenment, since in every moment deep perceptiveness, world-model comparison, and updating are occurring immediately.

A more spiritual friend defines enlightenment as "the universe experiencing itself".

I claim this is a highly operationalisable and instrumental definition of enlightenment: for example, I advised a friend who was beating themselves up about waking up late:

"In my model there's no space for negative self talk saying things like 'I hate that I'm so lazy' — what exists just is; put another way there is no need to assign sentiment to the vector between different states (e.g a world where you wake up early, vs. one where you don't).

The world you live in and your actions align in some way with your core values and beliefs. You can reflect deeply to observe those values and beliefs (and adjust them if you wish), and observe your world model and consider how it may be updated to bring you into closer alignment with them."

Enlightenment AMA
soycarts 19d 3-3

I'm still working on an exact definition, but maybe:

Enlightened: Possessing a robust, nuanced world model,[1] while maintaining deep perceptiveness and openness to novelty[2] and moment-to-moment felt experiences.

[1] With signal derived, for example, from 1) successful prosocial interactions, 2) positive feedback to creative expression, and 3) alignment with world outcomes (by either making, and acting on, correct "bets", or being unharmed by unsuccessful bets).

[2] In the form of ideas or experiences.

soycarts's Shortform
soycarts 19d* 30

As promised yesterday, I reviewed and wrote up my thoughts on the research paper that Meta released that same day:

Full review: Paper Review: TRImodal Brain Encoder for whole-brain fMRI response prediction (TRIBE)

I recommend checking out my review! I discuss some takeaways and there are interesting visuals from the paper and related papers.

However, in quick-take form, the TL;DR is:

Summary

  • Meta's Brain & AI Research team won first place at Algonauts 2025 with TRIBE, a deep neural network trained to predict brain responses to stimuli across multiple modalities (text, audio, video), cortical areas (superior temporal lobe, ventral, and dorsal visual cortices), and individuals (four people).
  • The model is the first brain encoding pipeline which is simultaneously non-linear, multi-subject, and multi-modal.
  • The team show that these features improve model performance (exemplified by their win!) and provide extra insights, including improved neuroanatomical understanding.
  • Specifically, the model predicts blood-oxygen-level-dependent (BOLD) signals (a proxy for neural activity) in the brains of human participants exposed to video content: Friends, The Bourne Supremacy, Hidden Figures, The Wolf of Wall Street and Life (a BBC Nature documentary).
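
To make the shape of such a pipeline concrete, here is a minimal sketch of a non-linear, multi-subject, multi-modal encoder of this general kind, written in PyTorch. This is not Meta's implementation: the `TrimodalBrainEncoder` class, the feature dimensions, and the concatenation-plus-MLP fusion are illustrative assumptions standing in for the paper's actual architecture.

```python
# Minimal sketch (not Meta's implementation): a non-linear, multi-subject,
# multi-modal encoder mapping per-timepoint text/audio/video features to
# predicted BOLD responses across cortical parcels.
import torch
import torch.nn as nn

class TrimodalBrainEncoder(nn.Module):
    def __init__(self, text_dim=768, audio_dim=512, video_dim=1024,
                 hidden_dim=256, n_parcels=1000, n_subjects=4):
        super().__init__()
        # Project each modality into a shared hidden space.
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)
        self.video_proj = nn.Linear(video_dim, hidden_dim)
        # A learned subject embedding is what makes the model multi-subject.
        self.subject_emb = nn.Embedding(n_subjects, hidden_dim)
        # Non-linear fusion head predicting a BOLD value per parcel.
        self.head = nn.Sequential(
            nn.Linear(hidden_dim * 4, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, n_parcels),
        )

    def forward(self, text_feat, audio_feat, video_feat, subject_id):
        fused = torch.cat([
            self.text_proj(text_feat),
            self.audio_proj(audio_feat),
            self.video_proj(video_feat),
            self.subject_emb(subject_id),
        ], dim=-1)
        return self.head(fused)  # (batch, n_parcels) predicted BOLD signal

# Usage with random stand-ins for features extracted per fMRI timepoint:
model = TrimodalBrainEncoder()
pred = model(torch.randn(8, 768), torch.randn(8, 512),
             torch.randn(8, 1024), torch.randint(0, 4, (8,)))
```
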
Measuring intelligence and reverse-engineering goals
soycarts 19d* 10

Intelligence can be defined in a way that is not dependent on a fixed objective function, such as by measuring the tendency to achieve convergent instrumental goals.

For intelligence progression I perceive a framework of lower-order cognition, metacognition (i.e. this captures "human intelligence" as we think about it), and third-order cognition (i.e. superintelligence relative to human intelligence).

Relating this to your description of goal-seeking behaviour: to your point, I describe a few complex properties aiming to capture what is going on in an agent ("being"). For example, in a given moment there is "agency permeability" between cognitive layers, where each layer can influence and be influenced by the "global action policy" of that moment. There is also a bound feature of "homeostatic unity": all subsystems participate in the same self-maintenance goal.

In a globally optimised version of this model, I envision a superintelligent third-order cognitive layer which has "done the 'self work': understanding its motives and iterating to achieve enlightened levels of altruism/prosocial value frameworks, stoicism, etc., specifically implemented as self-supervised learning".

I acknowledge this is a bit of a hand-wavey solution to value plurality, but argue that such a technique is necessary since we are discussing the realms of superintelligence.

Enlightenment AMA
soycarts 19d* 60

I feel quite deeply enlightened without having read much "enlightenment literature" (e.g. Daoism). Is this valid in your world view?

I plan to write up some thoughts and propose that with a few rational principles combined with certain deeply-felt life experiences, enlightenment is pretty accessible to most people.

Third-order cognition as a model of superintelligence (ironically: Meta® metacognition)
soycarts 20d* 10

by these lights superintelligences exist already, governments and corporations are proto-"meta-metacognitive" agents which often feature significant cognitive energy invested into structuring and incentivising the metacognitive agents within their hierarchy.

That makes sense — in my words I call a human-corporation composite a third-order cognition being since: "there is a stronger case for identity coupling [than human-economy], second-order irreconcilability is satisfied, and systems are bidirectionally integrated."

In contrast to human-SI I do call out that: "it fails the metaphysical conditions to be a substance being — particularly normative closure and homeostatic unity. It is valid for a human and a corporation to be misaligned (a human should optimise for work-life balance for their own health), whereas misalignment in a human is objectively pathological — depression and suicide are objectively suboptimal in terms of staying alive and thriving over time." — and so it doesn’t get the stricter designation of a “third-order cognition substance being”.

https://www.mindthefuture.info/p/towards-a-scale-free-theory-of-intelligent

Thank you for sharing! I went to read it before replying.

Ngo calls out:

... the two best candidate theories of intelligent agency that we currently have (expected utility maximization and active inference), explain[s] why neither of them is fully satisfactory, and outline[s] how we might do better.

Could my third-order cognition model be a solution? Expected utility maximisation (unifying goals and beliefs) is hard to reconcile in his case. With tightly bound third-order cognition, I describe agency permeability as capturing the idea of influence over the global action policy flowing between subsystems, which relates to this idea of predictive utility maximisation (third-order) dovetailing with stated preferences (second-order).

His description of:

active inference — prediction of lower layers at increasing levels of abstraction

directly relates to mine of "lower-order irreconcilability [of higher level layers]".

As a sticking point of active inference he states:

So what does expected utility maximization have to add to active inference? I think that what active inference is missing is the ability to model strategic interactions between different goals. That is: we know how to talk about EUMs playing games against each other, bargaining against each other, etc. But, based on my (admittedly incomplete) understanding of active inference, we don’t yet know how to talk about goals doing so within a single active inference agent.

Why does that matter? One reason: the biggest obstacle to a goal being achieved is often other conflicting goals. So any goal capable of learning from experience will naturally develop strategies for avoiding or winning conflicts with other goals—which, indeed, seems to happen in human minds.

More generally, any theory of intelligent agency needs to model internal conflict in order to be scale-free. By a scale-free theory I mean one which applies at many different levels of abstraction, remaining true even when you “zoom in” or “zoom out”. I see so many similarities in how intelligent agency works at different scales (on the level of human subagents, human individuals, companies, countries, civilizations, etc) that I strongly expect our eventual theory of it to be scale-free.

I deal with this by stating that a metaphysically bound [5 conditions] third-order cognition being exhibits properties including "Homeostatic unity: all subsystems participate in the same self-maintenance goal (e.g biological survival and personal welfare)". This provides an overriding goal to defer to — resolving scale-free conflicts.

He then reasons about how to determine an "incentive compatible decision procedure", closing on what he sees as the most promising angle:

On a more theoretical level, one tantalizing hint is that the ROSE bargaining solution is also constructed by abandoning the axiom of independence—just as Garrabrant does in his rejection of EUM above. This connection seems worth exploring further.

I hint towards some of the same ideas: through optimising second-order identity coupling (specifically operationalisable via self-other overlap), I propose that alignment of the overall being improves.

Thank you for also sharing your draft post. You state:

It is first necessary to note that the intelligent behaviour of humans is the capstone of a very fragile pyramid, constructed of hearsay, borrowed expertise from other humans, memorised affordances, individual reasoning ability, and faith.

The model I propose adds additional flavour by including non-human beings, which allows us to better model how we ourselves may relate to superintelligence; i.e. I close with "it follows from this post that superintelligence may view us similarly to the way that we view chimpanzees."

Such a superintelligent person would not be able to orchestrate an automated economy, or manage the traffic lights in a large city, or wage an automated war against collective humanity. It would simply require too many decisions at the same time, all compounding far too quickly.

Agreed, and I think this is our core point of agreement: there exists a materially different "third-order cognition" that is wholly irreconcilable by our own (second-order) cognition.

So it is likely that some (or even a lot) of the superintelligence’s internal processing will revolve around sending data from subagents/nodes to other subagents/nodes

Exactly! This is a core argument behind my reasoning that highly-individualised superintelligence will be the dominant model, which validates the focus on exploring the exact nature of this metaphysical binding.

each subagent, since it is by definition only using a part of the superintelligence’s total compute, cannot possibly be as wise or as coordinated as the superintelligence as a whole, or some other entity would be if given the total compute budget of a superintelligence.

This relates to the callout in Appendix 3, where determining power and control within the frame of reference of a third-order cognition being has some complexity of its own.

This possibility of inner non-coordination or conflict suggests multiple pathways to interacting with superintelligences. It may be possible to interact with subagents and “play them off” against a greater superagent.

One of the more optimistic parts of my post suggests that self-preservation plus superintelligent (altruistic and prosocial) self-work may just resolve this in a beautifully harmonious way.

It may also be that the necessary condition for a superintelligence to exist stably is to find some suitable philosophical solution to the problems of war, scarcity, equity, distribution of resources, and justice which plague human society today.

And I feel you close on this same point!

soycarts's Shortform
soycarts 20d* 30

From The Rundown today: "Meta’s FAIR team just introduced TRIBE, a 1B parameter neural network that predicts how human brains respond to movies by analyzing video, audio, and text — achieving first place in the Algonauts 2025 brain modeling competition."

This ties extremely well to my post published a few days ago: Third-order cognition as a model of superintelligence (ironically: Meta® metacognition).

I'll read the Meta AI paper and write up a (shorter) post on key takeaways.

ChatGPT is the Daguerreotype of AI
soycarts 21d* 41

I really like the interesting context you provide via storytelling about the use of daguerreotypes by historical explorers, and the connection you make to LLMs.

When I think about how the daguerreotype fits into a technological arc, I view it as:

  1. An interesting first version of a consumer technology, but also
  2. Something that was comedically, fundamentally flawed: imagining an explorer arduously grappling with sheets of polished silver to get a grainy, monochrome sketch of their vivid endeavours is quite amusing

I think your comparison to LLMs holds well for 1), but is less compelling for 2). I know you disclaim that you aren't commenting on the "certain amount of useful or useless", but you do talk about how useful LLMs are for you, and I think the usefulness comparison is relevant.

As you point out hundreds of millions of people use ChatGPT — 1 million used it in the first 5 days of release, and OpenAI anticipate 1 billion users by the end of 2025. This does imply ease of use and good product-world fit.

You state:

Sometimes I want to know something more subjective, like "what do people generally think about X?" I would think LLMs would be better at this kind of summary

but we already know that, given an individual statement, LLMs are highly capable of text classification; extending this to Reddit-style consensus feels like a relatively small engineering change versus a daguerreotype-to-camera type jump.
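
As a rough sketch of what I mean by a "relatively small engineering change" (my own illustration, not from the post): classify each individual comment with an LLM, then aggregate the per-comment labels into a consensus summary. The `classify_stance` function below is a hypothetical stand-in for whatever LLM call you prefer; only the trivial aggregation step is new.

```python
# Hypothetical sketch: per-statement LLM classification aggregated into a
# Reddit-style consensus. classify_stance() stands in for any LLM call that
# labels a single statement.
from collections import Counter
from typing import Callable, List

def summarise_consensus(comments: List[str],
                        classify_stance: Callable[[str], str]) -> str:
    """Label each comment individually, then report the label distribution."""
    labels = [classify_stance(c) for c in comments]  # e.g. "positive" / "negative"
    counts = Counter(labels)
    total = len(labels)
    parts = [f"{label}: {n}/{total}" for label, n in counts.most_common()]
    return "Overall sentiment: " + ", ".join(parts)

# Example with a dummy classifier standing in for the LLM call:
print(summarise_consensus(
    ["Loved it", "Not for me", "Loved it"],
    classify_stance=lambda text: "positive" if "Loved" in text else "negative",
))
```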

LLMs do have some funny flaws, like hallucination and sycophancy, but personally I can’t map these onto the daguerreotype’s flaws, and I speculate that ChatGPT will be viewed less like a “comedically flawed first try at the future thing” (daguerreotype-to-camera) and more like a “promising first version of the future thing” (typewriter-to-digital word processor).

Third-order cognition as a model of superintelligence (ironically: Meta® metacognition)
soycarts 22d* 20

Thank you for your thoughtful comment! I’m happy to see engagement in this way, particularly on “what is meta metacognition”, which I can expand on from different angles to better state my view of it.

I cannot understand exactly what you mean in most parts of this post.

If there is a particular part that you would like me to expand on to help decode it, please let me know.

I believe that humans are capable of meta-meta-cognition already

I can be more explicit in the post, but meta-metacognition as I define it has properties including being irreconcilable by metacognition.

I state at the bottom of Section 8: “Is the working model just describing "thinking about thinking"? No, because "thinking about thinking" is reconcilable at the second-order level of cognition within a being (e.g internalising your therapist telling you to stop thinking about chimp-humans).”

I appreciate that there are levels of human thinking, from “I’ll check what’s in the fridge to know what I can eat” to “I can create and understand structures that describe abstractions of mathematical structures” [Category Theory], but for me this is wholly reconcilable as different degrees of metacognition, and not significant enough to count as “meta metacognition”.

From the post: third-order cognition has the property of lower-order irreconcilability, measurable as:

"a parity-controlled meta-advantage: higher-order cognition shows, at a statistically significant level, accurate prediction of lower-order interactions that cannot be matched at the lower-order level. For example: accurately predicting an individual human thought [possible addition for clarity: before it happens]"

higher levels of abstraction are rarely ever useful as the results cannot be acted upon

I think frameworks/working models in general, especially when accessible to comprehend, add value by making it possible to categorise and home in on areas of discussion, particularly in a very active and new[ish] field like AI alignment.

When I gave a high-level overview of what I was writing to folks without much AI knowledge, they were able to process it conceptually quite quickly, with questions like “well, can we prove that AI is doing something different to us?” [Answer: yes, I operationalise this by talking about AI’s ability to predict lower-order cognition] and “does this mean AI will be able to read my thoughts?” [Answer: honestly, maybe].

The kinds of intelligence it takes to really  excel in the world, the ones which keep being useful past 2 standard deviations, are: Memory, working memory, processing speed and verbal intelligence. 

I disagree that this is exhaustive. For example, I describe superintelligent third-order cognition as exhibiting predictive capabilities (i.e. predictive intelligence) over lower-order cognition (including human metacognition), in practice as a result of huge contextual data plus integrative systems across that data beyond our comprehension, which is irreconcilable by us; in other words, it would “feel like magic”.

This kind of intelligence is necessary if you want to build an utopia, but otherwise, I find it largely useless.

Haha: I have a funny observation I’ve shared before, which is that rationality meetups frequently devolve into discussions of utopia. It seems to be the logical conclusion of talking in detail about what things are and why, i.e. “well, what do we do next to make them better?”. As I mentioned earlier in the comment, I think there is objective instrumental value in accurate world models, and I think that is a viewpoint shared by the rationality community (beyond just building utopias).

Also, I believe that higher orders of thinking are inherently too objective, impartial and thus too nihilistic to have any personality.

In this post I have some fun with it talking about chimp-humans!

but intelligence is the tendency to break boundaries (or at least, there's a perverted tendency like that connected with it). Thus, when intelligence is stronger than its boundaries, it tends to destroy itself

In sharing drafts of this post I actively sought feedback like "pls validate that this is insightful and not schizo" — in particular writing about a “new model” for “higher orders[dimensions] of cognition” is dangerously close to signalling schizophrenic thoughts.

I’m very careful to stay grounded with my thoughts and seek analogies and feedback, which I hope comes through in this post in cases where I deliberately bring more abstract concepts back down to reality. For example directly comparing human-SI and human-economy, or human-chatGPT, chimp-computer, and human-human.

We cannot even define an optimization metric which won't lead to our destruction, and every "good idea" seem to involve the destruction of the human self (e.g. uploading oneself to a computer)

I have a much more charitable view of humanity: we have plenty of good ideas, e.g. “build and advance a facility that drives improved survival outcomes [hospital]”. I think a really good life optimisation metric is “n of novel, deeply perceived, experiences, constrained..." [to your point] "...to being experienced in [biological] nature + not significantly harming yourself or others”. I have draft notes on this that I will write up into a post soon.

The victory is the end of the story, everything after is "slice of life", and you cannot keep it interesting forever.

Agree that novelty has diminishing returns, but depending on your frame of reference there’s a whole load of novelty available!

soycarts's Shortform
soycarts 25d 20

Just published "Meta® Meta Cognition: Intelligence Progression as a Three-Tier Hybrid Mind"

TL;DR: We know that humans and some animals have two tiers of cognition — an integrative metacognition layer, and a lower non-metarepresentational cognition layer. With artificial superintelligence, we should define a third layer and model it as a three-tier hybrid mind. I define the concepts precisely and talk about their implications for alignment.

I also talk about chimp-human composites which is fun.

Really interested in feedback and discussion with the community!
