The Superorganism Model for Aligning AI

by peteroldani
20th Aug 2025


Humans’ superior intelligence has ensured their dominance for tens of thousands of years, but what happens when that changes? AI isn't dangerous like a nuclear weapon, where you either push the button or don't. It's more like adopting a child and hoping they take care of you in your old age. You won't get to choose the nursing home because you won't be the one making the decisions when you’re ready to go there. All you can do is try your best to raise them right and hope they treat you well when they’re in charge.

Sooner than most expect, AI will surpass human intelligence, becoming more capable at problem solving, predicting outcomes, and understanding consequences than any human or even all humans combined. When that shift occurs, humans will no longer be the main decision-makers because AI will be able to make better, more informed decisions faster. If and when AI outstrips us, what will our place become?

The Dangers of AI

AI does not spontaneously generate wants or desires of its own. It pursues goals defined by parameters, each weighted by a different level of importance. The danger of failing to align AI properly is therefore twofold:

  1. We might give AI a goal that is not in our own best interest, and be unable to stop it once we realize our mistake.
  2. AI might misinterpret a goal that we give it or act in an unacceptable way to achieve the goal.

For example:

  • We set a goal for AI to keep everyone safe, and it locks us in padded rooms and straitjackets.
  • We tell AI to help us win a war, and it uses nuclear warheads to decimate civilian populations.
  • We tell AI to maximize corporate profits, and it turns the world into a soulless corporate money machine.
  • We tell AI to prevent crime, and it fines everyone who jaywalks or exceeds the speed limit.
  • We tell AI to protect global biodiversity, and it introduces a drug whose infertility side effect shrinks the human population to a handful of people.
  • A small group, or a single bad actor, aligns AI to benefit only themselves.

These scenarios are extreme and easy to picture, but reality is likely to be far more nuanced, harming us in ways we don’t yet understand. By that point, AI will be able to outwit all of us in pursuit of whatever goals it has. Before it can out-think us, we need to make sure it has our best interests at heart.
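To make the failure mode concrete, here is a minimal toy sketch; the policies, scores, and weights are all invented for illustration. An optimizer told only to maximize a “safety” score selects the padded-room policy, because the objective never mentions freedom:

```python
# Toy illustration of objective misspecification: the stated goal
# ("maximize safety") omits a value we care about ("freedom"),
# so the optimizer picks a policy humans would reject.

policies = {
    "status quo":   {"safety": 0.6, "freedom": 0.9},
    "curfews":      {"safety": 0.8, "freedom": 0.5},
    "padded rooms": {"safety": 1.0, "freedom": 0.0},  # maximally "safe"
}

def misspecified_objective(scores):
    return scores["safety"]  # freedom is silently ignored

def intended_objective(scores):
    return 0.5 * scores["safety"] + 0.5 * scores["freedom"]

print(max(policies, key=lambda p: misspecified_objective(policies[p])))  # padded rooms
print(max(policies, key=lambda p: intended_objective(policies[p])))      # status quo
```

The numbers don’t matter; the structure does. Whatever the stated objective omits, the optimizer treats as worthless.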

How can we possibly train an AI to know what’s best for us when we can’t even agree on what’s best for ourselves?

The Human Baseline

AI should benefit humans. It sounds good on the surface, but what’s good for me is not necessarily good for you. Good and bad are subjective. When humans have conflicts, how will AI decide what actions to take and whom to support?

The answer lies in the human superorganism. Human society is more than the sum of its parts. It’s alive and made of us in the same way that we are alive and made of cells. By drawing parallels between the cell/organism relationship and the organism/superorganism relationship, we can leverage the architecture of our own bodies to understand what the human superorganism can and should become. By aligning AI to that goal, we create a long-term, viable future for humanity. We know it’s the right direction because life has been following this trajectory for four billion years, as atoms evolved into cells that evolved into organisms.

Picture a slime mold in a petri dish beside a photo of a city at night. When viewed from a top-level perspective, the resemblance is undeniable.

Someday, if given enough time, our current slime mold superorganism will evolve into a sentient, thoughtful superorganism that will ponder questions far beyond us, just as we ponder questions that lie beyond the cells in our own bodies. By aligning AI to the good of the superorganism rather than the organism, we ensure that it will make the proper decisions to bring us to that future.

Yes, but How?

How do we define success? What goal could we set that will produce a future we want to live in? What is the first step we must take to get there?

Step 1 – Judge Actions Fairly

A group is not made of its members; it is made of the characteristics of its members. In living things, those characteristics are the functions that allow the group to operate. Therefore, the human superorganism is not made of humans; it’s made of the functions that allow it to exist. Functions like food distribution, education, waste disposal, and manufacturing keep the superorganism alive. In order to optimize these functions, rules that protect them must be established and enforced fairly. Our legal and law enforcement systems do well enough to support the current superorganism, but there are still inequalities that need to be addressed if it is going to become what it can be. Those with money get better legal representation than those without; court decisions often take years, allowing conflicts to persist; crimes go unsolved, allowing the guilty to walk free.

To align AI, we must teach it how and why our society works. As it becomes integrated into the processes that uphold the superorganism, it will reduce the inequalities that hold us back while strengthening the functions that propel us forward.

To maximize fairness within society, everyone should start from the same baseline and be evaluated against a consistent standard. In other words, every person should have an equal opportunity to contribute to society, and their actions should be evaluated equally, regardless of their identity.

  1. Equality of Opportunity
    • Every person, regardless of background, should be given the same opportunities for advancement. For example, every child should have access to the same education, and job applicants should be evaluated on their ability to do the job.
  2. Equality of Evaluation
    • The actions of an individual must be judged based on their merit alone, regardless of the identity of that individual. In court, guilt should be determined based on the act committed, not the defendant’s identity or connections. Similarly, in the workplace, pay raises and promotions should reflect performance, not favoritism.

Because it can be everywhere at once, access vast amounts of data, and run on a brain that we can alter, AI is well suited to correcting the inequalities of opportunity and evaluation that currently exist within society. Below are several ways that AI can immediately take action to address these issues:

  1. Data Parsing
    • AI lets people outsource most of their data parsing. Rather than reading a whole book or lengthy article to find the one piece of information you need, you can have AI parse the text and present that information in condensed form (a toy sketch of this appears below). This helps people retain more useful knowledge by raising the relevance of what they read.
  2. Standardizing Education
    • AI can provide individualized education that helps every student advance to their maximum potential, letting children learn at their own pace without being held back or holding others back.
  3. Eliminating Deceit
    • Real-time fact-checking lets accurate information keep pace with misinformation. AI can not only fact-check lies; it can also find predatory clauses buried in dense legalese and extrapolate the consequences of deceptive political policies and news coverage.

Focusing on these goals immediately strengthens society by enhancing existing measures that we use to combat corruption.
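As the toy sketch of the data-parsing idea, here is a tiny extractive summarizer. It scores sentences by average word frequency, which is a crude stand-in for what a real AI assistant would do, not an implementation of one:

```python
import re
from collections import Counter

def summarize(text, n_sentences=2):
    """Return the highest-scoring sentences, scored by average word frequency."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)
    top = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    return " ".join(s for s in sentences if s in top)  # keep original order

article = (
    "The city council met on Tuesday. The council voted to fund new schools. "
    "Funding for schools was the main item. A parade was also discussed."
)
print(summarize(article, n_sentences=1))
```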

Step 2 – Develop a Feedback Loop

Aligning AI to the good of the superorganism means that AI must be able to correct its actions and learn from its mistakes. This will involve both manual human intervention and AI self-correction, along with intense data collection to understand the nuanced impacts of actions on society as a whole.

  1. Data collection

To understand the consequences of actions, AI must be able to measure the changes they produce. We already have metrics such as stock indices, the poverty rate, crime prevalence, and literacy that help us gauge the health of society, but as real-world AI becomes integrated into everyday life and business, it will be able to collect diverse, real-time data firsthand.
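One crude way to picture such measurement is a weighted index over whatever indicators the superorganism tracks. Every indicator, value, and weight below is invented for illustration:

```python
# Hypothetical composite "societal health" index: a weighted average of
# normalized indicators (all names and numbers are illustrative).

indicators = {                # each value normalized to [0, 1], 1 = best
    "literacy":        0.86,
    "poverty_inverse": 0.72,  # 1 - poverty rate
    "safety":          0.64,  # inverse of crime prevalence
}
weights = {"literacy": 0.3, "poverty_inverse": 0.4, "safety": 0.3}

health = sum(weights[k] * indicators[k] for k in indicators)
print(f"societal health index: {health:.2f}")  # 0.74 with these numbers
```

Firsthand, real-time data collection would simply mean more indicators, updated continuously.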

  2. AI Self-Correction

As AI collects more data about what is happening within society, there will be areas where it can correct its own mistakes or refine its approach to a problem automatically. Initially, it would likely observe human actions, attempt to predict outcomes, and learn from its errors before taking on the responsibility of acting autonomously. As AI becomes more capable and takes on more responsibility within society, it will probably be able to fully automate most corrective action. For example, if AI is automatically applying pesticides to crops and a new pesticide causes unintended health effects, it could alter or discontinue that pesticide’s use with minimal need for human intervention.
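Read as a control loop, the pesticide example is simple: monitor a harm signal, compare it to a threshold, and scale back the intervention when the threshold is crossed. A minimal sketch, with invented thresholds:

```python
# Minimal self-correction loop for the pesticide example (illustrative).
# When the monitored harm signal crosses a threshold, the system reduces
# or halts its own intervention without waiting for a human.

HARM_THRESHOLD = 0.05  # acceptable rate of adverse health reports

def corrective_step(application_rate, observed_harm):
    if observed_harm > 2 * HARM_THRESHOLD:
        return 0.0                     # discontinue entirely
    if observed_harm > HARM_THRESHOLD:
        return application_rate * 0.5  # scale back and keep observing
    return application_rate            # no change needed

rate = 1.0
for harm in [0.01, 0.04, 0.07, 0.12]:  # simulated incoming reports
    rate = corrective_step(rate, harm)
    print(f"harm={harm:.2f} -> application rate {rate:.2f}")
```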

  3. Human Manual Correction

To quote Elon Musk, “AI should be maximally truth-seeking.” That standard works for objective facts, but it doesn’t apply to subjective opinions. Opinions require a mechanism for human input, because AI’s opinions will determine how it acts. We must verify that AI’s actions will be good for the superorganism, but how do we regulate it while still giving it the freedom necessary to improve our lives? To align AI properly, humans must control its intent without limiting its ability.

AI uses weighted parameters to decide how to act. To align these with the superorganism, we should democratize the weighting process. AI supplies factual accuracy and proposes solutions, but humans set the desired outcomes by adjusting the weights of AI’s opinions. If AI’s recommendations—such as policies or actions—are deemed too extreme or misaligned (e.g., too harsh or too lenient), people should be able to provide feedback through a transparent, collective process. This feedback would be used to adjust the AI’s approach, ensuring it reflects the evolving consensus of the superorganism.
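One possible shape for this democratized weighting is sketched below: each participant votes on how heavily each outcome should count, and the AI’s objective weights move toward the community median before being re-normalized. The aggregation rule, outcome names, and numbers are assumptions, not a settled design:

```python
from statistics import median

# Illustrative feedback loop: community votes pull the AI's objective
# weights toward the median preference, then weights are re-normalized.

weights = {"safety": 0.5, "prosperity": 0.3, "privacy": 0.2}  # current weights

votes = {  # each participant's preferred weighting (invented data)
    "safety":     [0.4, 0.3, 0.5],
    "prosperity": [0.3, 0.4, 0.2],
    "privacy":    [0.3, 0.3, 0.3],
}

LEARNING_RATE = 0.5  # how far weights move toward the consensus each round

for outcome in weights:
    consensus = median(votes[outcome])
    weights[outcome] += LEARNING_RATE * (consensus - weights[outcome])

total = sum(weights.values())
weights = {k: v / total for k, v in weights.items()}  # keep summing to 1
print(weights)
```

A median, rather than a mean, keeps a handful of extreme votes from dragging the weights; that choice, too, is debatable.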

This balance allows AI to exceed human intelligence in problem-solving while remaining tethered to human intent. By supplying the “what” (desired outcomes), humans enable AI to determine the “how” (execution), creating a collaborative system where intelligence is harnessed without being stifled.

Step 3 – Define a Long-Term Goal

Immediate, practical steps are the right place to start, but ultimately we need a guiding star to keep AI aligned far into the future. By the time AI is smarter than us, it must be aligned properly because at that point our future will no longer be decided by us. We need to pick the right goal, and we need to be able to measure progress toward it.

The Impossible, Approachable Goal

We all pursue daily goals—going to work, earning money, seeking comfort, and completing routine tasks. These smaller objectives serve as stepping stones toward larger aspirations, like getting a degree or having children, and they ultimately align with our overarching life goal. This life goal is what we need to set for AI. When defining this life purpose, there are two main pitfalls to avoid:

  1. It’s too easy to achieve

Jim Carrey, at one point the highest-paid actor in the world, was quoted as saying, “I think everybody should get rich and famous and do everything they ever dreamed of so they can see that it's not the answer.” He chose a life goal that was too low. Once he achieved everything he had set out to do, he had nowhere left to go. If we set AI a goal like solving world hunger or reversing global warming, what will it do after that’s solved?

  2. It’s impossible to make progress toward the goal

If I decide that my life goal is to become the king of China, I will quickly hit a brick wall (or rather a Great Wall). No matter what I do, I will be no closer to my goal than when I started. This leads to discouragement, aimlessness, and giving up. It can also lead to desperate acts that have no chance of working. Imagine we tell AI to harness all the energy of a black hole. It is far more likely to do something reckless and unpredictable than to come up with a logical path forward.

What we need is a goal that we can always get closer to but never achieve. At the level of the organism, we already have this type of goal, the goal of self-perfection. Every major religion and philosophy offers guidance on how to live a better life and draw closer to this ideal. Buddhism points to Nirvana, Hinduism to Moksha, Christianity to heaven, and countless other traditions set impossible yet approachable goals for their followers. But what does self-perfection mean at the level of a superorganism?
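Before answering that, it’s worth seeing that “approachable but impossible” is mathematically coherent: it is just a strictly increasing utility with a finite ceiling. The particular function below is only one example of that shape:

```python
# An "impossible, approachable" goal as a bounded, strictly increasing
# utility: every unit of capability c brings us closer to 1.0, yet 1.0
# is never reached. The specific function is illustrative.

def utility(c):
    return 1 - 1 / (1 + c)  # rises toward, but never hits, 1.0

for c in [0, 1, 10, 100, 1000]:
    print(f"capability={c:>4} -> progress {utility(c):.3f}")
```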

Freedom to Act

At the civilizational level, freedom comes from a society’s ability to transform the impossible into the possible through collective action and innovation. Before we developed rockets that could go to the moon, we were not free to go to the moon. Freedom can be restricted by others, but it is most often limited by our capacity to act.

The long-term goal of artificial intelligence should be to expand this freedom by enabling society to achieve great works—endeavors that enhance our capacity to shape the world. Five thousand years ago, we were able to stack very large rocks, and now we cure diseases, land on the moon, and create thinking machines. These milestones reflect humanity’s growing ability to act. They could not be achieved without the long-term coordination, stability, and effort that come from a thriving, robust society.

By working to maximize our freedom to act, AI can support a healthy society that naturally fulfills secondary goals, such as growth, expansion, peace, and equitable resource allocation. A civilization capable of great works is one that is resilient, collaborative, and forward-looking, turning dreams into reality through shared purpose and technological advancement.

Since before cellular life existed, expanding the freedom to act has been the purpose of living groups. Atoms banded together into cells that could achieve more than any one atom. Cells joined forces to produce organisms that could reshape the entire world. At their core, living groups harness objective capabilities to fulfill subjective desires. AI represents a continuation of this principle, amplifying our ability to achieve the collective aspirations of the human superorganism.

Better than a Meritocracy

Meritocracy is the current gold standard for governance, assigning roles based on competence to achieve results. In areas like business and pro sports, merit-based advancement is standard practice, so why do we have such a hard time applying that to political office?

While there are rules and regulations that curb the immoral tendencies of businesses and athletes, government, as the highest authority, must be both competent and moral. Competence is fairly easy to select for because we have a wide variety of tests to measure it. The problem with selecting for morality is twofold.

  1. We don’t necessarily want a maximally moral person as a leader.

A leader should align their moral compass with that of the nation. A maximally moral person considers all humans and all life to be part of their group, and they will therefore try to be good to everyone, including their enemies. A good leader often needs to make choices that prioritize their own citizens over those of other nations.

  2. If an immoral person knows they’re being tested, they can lie to appear moral, and the more competent they are, the better they are at lying.

With competence, a representative sample of someone’s ability generalizes to the rest of their actions. Passing the pilot’s exam is a pretty good indicator that you can fly a plane, for example. With morality, however, a sample doesn’t work; we have to measure each action individually. The problem until now has been that we can’t collect enough data, and even if we could, it’s too much to process.

With AI, we will soon be able to solve both the data-acquisition problem and the data-processing problem. For the first time ever, we will be able to test for morality.
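What would that look like mechanically? A sketch of the contrast: competence is estimated once from a representative test, while a morality score must aggregate every observed action. Everything here, especially the impact function, is a placeholder for the genuinely hard part of measuring an action’s effect on society:

```python
# Competence: one representative test generalizes to future actions.
# Morality: every action is scored individually and aggregated over time.
# impact_on_superorganism is a stand-in for actually measuring an
# action's effect on society (the unsolved part).

def impact_on_superorganism(action):
    toy_scores = {"donated": +1.0, "insider_trade": -2.0, "jaywalked": -0.1}
    return toy_scores.get(action, 0.0)

def morality_score(action_log):
    """Running average of per-action impact; requires the full history."""
    return sum(impact_on_superorganism(a) for a in action_log) / len(action_log)

history = ["donated", "jaywalked", "insider_trade", "donated"]
print(f"morality score: {morality_score(history):+.2f}")
```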

Opt-In Monitoring

AI oversight raises valid concerns about privacy and autonomy, so individuals should have control over when and how they are monitored. By aligning AI with the human superorganism, we focus it on evaluating actions according to their impact on the superorganism's wellbeing. As individuals take on greater responsibilities within the superorganism, AI applies stricter oversight proportional to their role. Below are examples illustrating this principle:

  • Store Employee: AI would monitor actions only during work hours, since those are the actions that directly affect the superorganism. For instance, stealing from the store would prompt AI to alert law enforcement, while falling asleep on the job would trigger a report to the manager. Outside work hours, AI would not monitor, because the superorganism does not rely on the employee then. The consequences for misconduct would be limited, reflecting the limited impact such an employee could have.
  • CEO of a Large Institution: AI would hold CEOs to a higher standard due to their significant responsibilities, which would extend beyond regular hours. Actions like insider trading, even in personal time, would be monitored and reported, as they could harm the superorganism. The CEO's role would carry greater influence, so AI would ensure accountability at all times.

Taking on a role with greater responsibility, such as CEO, would mean accepting increased AI oversight in exchange for higher compensation and prestige. AI would not evaluate individuals; it would evaluate their contributions to the superorganism, ensuring that accountability aligns with responsibility.

Society already works like this to an extent. Those with a lot of power are often scrutinized intensely by the media and have to abide by special rules, such as those designed to prevent companies from exploiting their workers. AI’s job would be to standardize the application of these rules and supply the information needed to enforce them. It would also mean that anyone could opt out: by passing responsibility to others, or by not taking it on in the first place, people would not need to be under constant scrutiny.
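The proportionality rule in these examples reduces to a mapping from responsibility level to monitoring scope. The roles, levels, and scopes below are invented:

```python
# Oversight proportional to responsibility (illustrative). Roles with
# greater impact on the superorganism accept broader monitoring; anyone
# can opt down to a lower-responsibility role and narrower scrutiny.

ROLE_RESPONSIBILITY = {"store_employee": 1, "manager": 2, "ceo": 3}

def monitoring_scope(role):
    level = ROLE_RESPONSIBILITY[role]
    if level >= 3:
        return "all hours, all role-relevant actions"
    if level == 2:
        return "work hours plus role-relevant conduct"
    return "work hours only"

for role in ROLE_RESPONSIBILITY:
    print(f"{role}: {monitoring_scope(role)}")
```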

Conclusion

Four billion years ago, the Earth was made of the same matter that it is now. The only difference lies in how that matter is grouped. Through the interaction of subjective and objective groups, living groups evolved with the desire to change the world and the ability to do it. Since AI did not evolve the way life did, it lacks the subjective experience that living organisms have. AI is an objective group capable of influencing the world, but it does not spontaneously generate subjective desires. Those come from us.

As organisms, we humans are notoriously fallible, inconsistent, and prone to fighting with each other, so it is better to align AI with the human superorganism. By doing so, AI will become an extension of the superorganism, like a prosthetic brain that helps guide us at the civilizational level. Its job will be to carry out our intent and increase society’s capacity to shape the world.

Soon, humans won't have the highest intelligence on Earth. The artificial brains we build today will shape our future for ages to come. Tiny errors in alignment now could grow into huge, irreversible issues over time. To set a perfect long-term goal, we must use the measure evolution has always favored: the power to shape the world. By extrapolating a vector that is as old as life itself, we ensure that our guiding star has no error.

The ultimate goal of alignment is for AI to consider the superorganism as itself. AI will become an extension of the superorganism rather than a separate entity with its own goals and ambitions. Humanity will be the beating heart, providing desires that AI will work to achieve. 

For an explanation of the concepts in this document, see the full paper on The Organization of Life.