It's cool to see you involved in this sphere! I've been seeing and hearing about your work for a while, and have been impressed by both your mechanism design and ability to bring it into large-scale use.
Reading through this, I get the impression that it's missing some background on models of what strong superintelligence looks like: both the challenges of the kind of alignment needed to make that go well, and just how extreme the transition will by default end up being.
Even without focusing on that, this work is useful in some timelines and seems worthwhile, but my guess is you'd get a fair amount of improvement in your aim by picking up ideas from some of the people with the longest history in the field. Some of my top picks would be (approx in order of recommendation):
Or, if you'd like something book-length, AI Does Not Hate You: Rationality, Superintelligence, and the Race to Save the World is the best until later this month when If Anyone Builds It, Everyone Dies comes out.
Thank you for the encouragement, recommendations, and for flagging the need for more context on strong ASI models, including the default extremity of the transition!
You're spot on; my DeepMind talk emphasized horizontal alignment (defense against coordination failures) as a complement to vertical alignment perils, like those in the orthogonality thesis and instrumental convergence.
I've pre-ordered the IABIED book and have now re-read several recommendations: "AGI Ruin" details lethalities and "A Central AI Alignment Problem" highlights the sharp left turn's risk. Just reviewed "Five Theses, Two Lemmas," which reinforces the intelligence explosion, complexity/fragility of value, and indirect normativity as paths to safer goals.
These sharpen why 6pack.care prioritizes local kami (bounded, non-maximizing agents) to mitigate unbounded optimization and promote technodiversity over singleton risks.
Topics I’d love to discuss further:
Nice! Glad you're getting stuck in, and good to hear you've already read a bunch of the background materials.
The idea of bounded non-maximizing agents / multipolar as safer has looked hopeful to many people during the field's development. It's a reasonable place to start, but my guess is if you zoom in on the dynamics of those systems they look profoundly unstable. I'd be enthusiastic to have a quick call to explore the parts of that debate interactively. I'd link a source explaining it, but I think the alignment community has overall done a not great job of writing up the response to this so far.[1]
The very quick headline is something like:
I'd be interested to see if we've been missing something, but my guess is that systems containing many moderately capable agents (~top human capabilities researcher) which are trained away from being consequentialists in a fuzzy way almost inevitably fall into the attractor of very capable systems either directly taking power from humans or puppeteering humans' agency as the AIs improve.
Quick answer-sketches to your other questions:
One direction that I could imagine being promising, and something your skills might be uniquely suited for, would be to run a large-scale consultation, informed by clarity about what technology at physical limits is capable of, to collect data about humanity's 'north star'. Let people think through where we would actually like to go, so that a system trying to support humanity's flourishing can better understand our values. I funded a small project to try and map people's visions of utopia a few years back (e.g.), but the sampling and structure wasn't really the right shape to do this properly.
https://www.lesswrong.com/posts/DJnvFsZ2maKxPi7v7/what-s-up-with-confusingly-pervasive-goal-directedness is one of the less bad attempts to cover this, @the gears to ascension might know or be writing up a better source
(plus lots of applause lights for things which are actually great in most domains, but don't super work here afaict)
On north star mapping: Do the CIP Global Dialogues and GD Challenge look like something of that shape, or more like the AI Social Readiness Process?
On Raemon’s (very insightful!) piece w.r.t. curing cancer inevitably routing through consequentialism: Earlier this year I visited Bolinas, a birthplace of integrative cancer care, which centers healing for communities, catalyzed by people experiencing cancer. This care ethic prioritizes virtues like attentiveness and responsiveness to relational health over outcome optimization.
Asking a superintelligence to 'solve cancer' in one fell swoop — regardless of collateral disruptions to human relationships, ecosystems, or agency — directly contravenes this, as it reduces care to a terminal goal rather than an ongoing, interdependent process.
In a d/acc future, one tends to the research ecosystem so progress emerges through horizontal collaboration — e.g., one kami for protein‑folding simulation, one for cross‑lab knowledge sharing; none has the unbounded objective “cure cancer.” We still pursue cures, but with each kami having a non-fungible purpose. The scope, budget, and latency caps inherent in this configuration mean capability gains don’t translate into open‑ended optimization.
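To make those caps concrete, here is a minimal sketch (hypothetical class and field names, not an implementation of 6pack.care or of any particular agent framework) of the kind of guardrails a bounded kami might enforce before acting:

```python
# Minimal sketch (hypothetical names): hard caps a bounded "kami" agent
# might enforce so capability gains can't turn into open-ended optimization.
from dataclasses import dataclass
import time

@dataclass
class KamiScope:
    purpose: str                 # non-fungible purpose, e.g. "protein-folding simulation"
    allowed_domains: set[str]    # resources this kami may touch
    budget_remaining: float      # spend cap, in whatever unit the deployment uses
    latency_cap_s: float         # max wall-clock time per action

class BoundedKami:
    def __init__(self, scope: KamiScope):
        self.scope = scope

    def act(self, task_domain: str, cost: float, action):
        """Run `action` only if it stays inside scope, budget, and latency caps."""
        if task_domain not in self.scope.allowed_domains:
            raise PermissionError(f"{task_domain!r} is outside this kami's scope of care")
        if cost > self.scope.budget_remaining:
            raise RuntimeError("budget cap reached: 'enough, not forever'")
        start = time.monotonic()
        result = action()
        if time.monotonic() - start > self.scope.latency_cap_s:
            raise TimeoutError("latency cap exceeded; action result discarded")
        self.scope.budget_remaining -= cost
        return result
```

The point is architectural: the caps live outside the agent's objective, so a capability gain inside the scope cannot widen the scope itself.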
I'd be happy to have an on-the-record conversation, co-edited and published under CC0 to SayIt next Monday 1pm if you agree.
The thing I have in mind as north star looks closest to the GD Challenge in scope, but somewhat closer to the CIP one in implementation? The diff is something like:
I'm glad you're seeing the challenges of consequentialism. I think the next crux is something like: my guess is that consequentialism is a weed which grows in the cracks of any strong cognitive system, and that without formal guarantees of non-consequentialism, any attempt to build an ecosystem of the kind you describe will end up being eaten by processes which are unboundedly goal-seeking. I don't know of any write-up that hits exactly the notes you'd want here, but some maybe-decent intuition pumps in this direction include: The Parable of Predict-O-Matic, Why Tool AIs Want to Be Agent AIs, Averting the convergent instrumental strategy of self-improvement, Averting instrumental pressures, and other articles under Arbital's corrigibility topic.
I'd be open to having an on the record chat, but it's possible we'd get into areas of my models which seem too exfohazardous for public record.
Great! If there are such areas, in the spirit of d/acc, I'd be happy to use a local language model to paraphrase them away and co-edit in an end-to-end-encrypted way to confirm before publishing.
Fwiw, I disagree that stages 1 and 3 of your quick headline are settled; I think there are enough unanswered questions there that we can't be certain whether or not a multipolar model could hold.
For 1, I agree with the convergence claims, but the speed of that convergence is in question. There are fundamental reasons to believe we get hierarchical agents (e.g. this from physics, or shard theory). If you have a hierarchical collective agent, a good question is how it actually gets to maximising, fully consequentialist behaviour, even granting that optimality pressures push it there. I think one of the main ways it smooths out the kinks in its programming is by running into prediction errors and updating on them, so the question becomes how fast it runs into prediction errors. To obtain prediction errors, though, you need some sort of online learning to update your beliefs, and the energy cost of that online learning scales pretty badly if you're doing something like classic life does but with a really large NN. Basically, there's a chance that if you hard-scale a network to very high computational power, updating that network becomes very expensive in energy, so if you want the most bang for your buck you get something more like Comprehensive AI Services: a distributed system of more specific learners forming a larger learner.
Then you can ask what the difference is between that distributed AI and human collective intelligence. There are arguments that the AIs will just form a super-blob through different forms of trade, yet how is that different from what human collective intelligence is? (Looking at this right now!)
Are there forms of collective intelligence that can scale with distributed AI and capture AI systems as part of their optimality? (E.g. group selection due to inherent existing advantages.) I do think so, and I think really strong forms of collective decision-making potentially give you a lot of intelligence. We can then imagine a simple verification contract: an AI gets access to a collective intelligence if it behaves in a certain way. It's worth it for the AI because this is a much easier route to power, yet it also agrees to play by certain rules. I don't see why this wouldn't work, and I would love for someone to tell me that it doesn't!
For 3, why can't RSI be a collective process, given the above arguments about collective versus individual learning? If RSI is a bit like classic science, there might also be thresholds at which scaling slows down. I feel this is one of the less-talked-about points in superintelligence: what is the underlying difficulty of RSI at higher levels? From an outside-view plus black-swan perspective, it seems very arrogant to assume the difficulty scales linearly.
Some other questions are: What types of knowledge discovery will be needed? What experiments? Where will you get new bits of information from? How will these distribute into the collective memory of the RSI process?
All of these things determine the unipolarity or multipolarity of an RSI process. So we can't be sure how it will happen, and there's probably also path dependence on the best available alternatives at the initial conditions.
I'd be enthusiastic to have a quick call to explore the parts of that debate interactively.
I would greatly appreciate it if you could post the transcript of the call on LW.
I remember you from the Pugs days. Two questions about this presentation. One is more aspirational: do you think of this society of AIs as more egalitarian (many superhuman AIs at roughly the same level) or more hierarchical (a range of AI sizes, with the largest hopefully being the most aligned to those below)? And the other is more practical. Right now the AI market is locked in an arms race kind of situation, and in particular, scrambling to make AIs that will bring commercial profit. That can lead to nasty incentives, e.g. an AI working for a tax software company can help it lobby the government to keep tax filing difficult, and of course much worse things can be imagined as well. If this continues, all the nice vision of kami and so on will just fail to exist. What is to be done, in your opinion?
Hi! Great to hear from you. “Optimize for fun” (‑Ofun) is still very much the spirit of this 6pack.care work.
On practicality (bending the market away from arms‑race incentives): Here are some levers that worked, inspired by Taiwan's tax-filing case, that shift returns from lock‑in to civic care:
On symbiosis: the kami view is neither egalitarian sameness nor fixed hierarchy. It’s a bounded, heterarchical ecology: many stewards of different scopes that coordinate without a permanent apex. (Heterarchy = overlapping centers of competence; authority flows to where the problem lives.)
Egalitarianism would imply interchangeable agents. As capabilities grow, we’ll see a range of kami sizes: a steward for continental climate models won’t be the same as one for a local irrigation system. That’s diversity of scope, not inequality of standing.
Hierarchy would imply command. Boundedness prevents that: each kami is powerful only within its scope of care and is designed for “enough, not forever.” The river guardian has neither mandate nor incentive to run the forest.
When scopes intersect, alignment is defined by civic care: each kami maintains the relational health of the shared ecosystem at the speed of the garden. Larger systems may act as ephemeral conveners, but they don’t own the graph or set permanent policy. Coordination follows subsidiarity and federation: solve issues locally when possible; escalate via shared protocols when necessary. Meanwhile, procedural equality (the right to contest, audit, and exit) keeps the ecology plural rather than feudal.
I briefly read the 6pack.care website and your post. It sounds to me like an idea supplementary to existing AI safety paradigms, not one that solves the core problem of aligning AIs. Looking at your website, I see that it already assumes AI is mostly aligned, and issues with rogue AIs are not mentioned in the risks section.
A midsize city is hit by floods. The city launches a simple chatbot to help people apply for emergency cash. Here is what attentiveness looks like in action:
- Listening. People send voice notes, texts, or visit a kiosk. Messages stay in the original language, with a clear translation beside them. Each entry records where it came from and when.
- Mapping. The team (and the bot) sort the needs into categories: housing, wage loss, and medical care. They keep disagreements visible — renters and homeowners need different proofs.
- Receipts. Every contributor gets a link to see how their words were used and a button to say “that’s not what I meant.”
and so on.
Indeed, there are two classes of alignment problems. The first is vertical: making a single agent loyal. The second is horizontal: ensuring a society of loyal agents doesn't devolve into systemic conflict. 6pack.care is a framework for the second challenge as articulated by CAIF.
It posits that long-term alignment is not a static property but a dynamic capability: alignment-by-process. The core mechanism is a tight feedback loop of civic care, turning interactions into a form of coherent blended volition. This is why the flood bot example, though simple, describes this fractal process.
This process-based approach is also our primary defense against power-seeking. Rather than trying to police an agent's internal motives, we design the architecture for boundedness (kami) and federated trust (e.g. ROOST.tools), making unbounded optimization an anti-pattern. The system selects for pro-sociality.
This bridges two philosophical traditions: EA offers a powerful consequentialist framework. Civic Care provides the necessary process-based virtue ethic for a pluralistic world. The result is a more robust paradigm: EA/CC (Effective Altruism with Civic Care).
I don't have much to add, but wanted to say that I really admire your work Audrey, and it's delightful to see you posting here and responding so thoughtfully to the comments. I think there's a lot of potential for productive exploration of the space of improved coordination and governance. Ever since learning about Polis from your 80k interview I've been gradually collecting open-source governance innovations similar to it in spirit. Some experiments posted on the web disappear before I can properly document them. There was one in New York with a really interesting variation on Polis that connected some different info to the idea/opinion contributions; I saw it once and then could never find it again.
I think plex is right that there are some challenges posed by superintelligence that make the future extra tricky to navigate, but that doesn't mean the general direction isn't valuable!
There's some relevant and interesting ideas in this post series by Allison: https://www.lesswrong.com/posts/bHgp7CcqW4Drw8am4/preface-3
Yes! I did read the revised series with great interest, and I fondly recall my participation in the Foresight Institute 2021 seminar that contributed to the 2022 edition.
On the "speed-run subsidiarity" idea, I really resonate with your comment:
This is the first non-destructive non-coercive pivotal act that I've considered plausibly sufficient to save us from the current crisis.
In order to rapidly decentralize action without dissolving into chaos or resorting to localized coercion, we need a shared operational ethic for horizontal alignment to scale quickly. This, I believe, is where the concept of "civic care" becomes an essential catalyst.
Unlike abstract, "thin" ideologies, civic care is rooted in normative relational ethics. It emphasizes context, interdependence, and the direct responsibility we have for the relationships we are embedded within. This accelerates the speed-run in several ways:
Reduced Coordination Overhead: When civic care becomes the default ethical posture for AI training and deployment, we minimize friction between cooperating agents by focusing on the immediate, tangible needs of the community.
Inherent Locality: The ethic of care naturally aligns with subsidiarity. We are best equipped to care for the relationships and environments closest to us. It intrinsically motivates action at the most appropriate level, strengthening local capacity.
Rapid Trust Building: Subsidiarity fails without high trust. By adopting a shared commitment to care, decentralized groups can establish trust much faster. This high-trust environment is a prerequisite for navigating crises effectively.
Hi, thanks for writing this up. As someone who did their dissertation on the motivations behind AI development, I came to the conclusion that our world's information-gathering and coordination mechanisms were already inextricably tied to AI usage. Therefore it is very heartening to see a healthy and cooperative vision for AI applications that is not focused on power-seeking and instrumental convergence. Contra other commenters, I think that such designs do meaningfully reduce the risk of rogue AI systems, mostly because they would be built out of constrained AI systems that are inherently less agentic. The improved coordination capacity for society would also help us rein in unconstrained racing towards AGI in the name of capitalist or international competition. This was my most recent attempt at the problem of "AI for governance/coordination" that I produced for a hackathon, and I'm glad to hear others are also working on similar problems!
Hi! Thank you for this note and for sharing your work on "Witness."
You’re right: AI already underwrites coordination. The lever is not if we use it, but how. Shifting from unbounded strategic agents to bounded, tool‑like "kami of care" makes safety architectural; scope/latency/budget are capped, so power‑seeking becomes an anti‑pattern.
Witness cleanly instantiates Attentiveness: a push channel that turns lived experience into common knowledge with receipts, so alignment happens by process and at the speed of society. This is d/acc applied to epistemic security in practice, through ground‑up shared reality paired with stronger collective steering.
Glad to hear that you found Witness relevant to your work. I would be very happy to discuss further since it appears we do have a strong area of overlap with regards to AI for governance and decentralised governance. My contact is on my website. I also understand if you are already quite overloaded with work and requests to connect, in which case I'm happy that you took the time to respond to my comment :)
(Cross-posted from speaker's notes of my talk at DeepMind today.)
Good local time, everyone. I am Audrey Tang, 🇹🇼 Taiwan's Cyber Ambassador and first Digital Minister (2016-2024). It is an honor to be here with you all at DeepMind.
When we discuss "AI" and "society," two futures compete.
In one—arguably the default trajectory—AI supercharges conflict.
In the other, it augments our ability to cooperate across differences. This means treating differences as fuel and inventing a combustion engine to turn them into energy, rather than constantly putting out fires. This is what I call ⿻ Plurality.
Today, I want to discuss an application of this idea to AI governance, developed at Oxford’s Ethics in AI Institute, called the 6-Pack of Care.
As AI becomes a thousand, perhaps ten thousand times faster than us, we face a fundamental asymmetry. We become the garden; AI becomes the gardener.
At that speed, traditional ethics fail. Utilitarianism becomes brittle. Deontology breaks—what does a universal rule mean from a plant to a gardener?
A framework that assumes this asymmetry from the start is the ethics of Civic Care, particularly the work of Joan Tronto. The core idea is that a good gardener must till to the tune of the garden, at the speed of the garden.
This approach mandates a hyper-local, parochial moral scope. Gardeners are bound to specific gardens; they are not a colonizing or maximizing ("paper-clipping") force.
This allows for different configurations, mirroring the permaculture movement, embracing anti-fragility through diversity—what Professor Yuk Hui calls "technodiversity"—rather than fragile monocultures.
The vertical narrative of a technological "Singularity" needs a horizontal alternative. Today, I wish to discuss that alternative: a steering wheel called ⿻ Plurality, and its design principles: the 6-Pack of Care.
Our journey began in 2014 with the Sunflower Movement, a protest against an opaque trade deal with Beijing. Public trust in the government plummeted to 9 percent. Our social fabric was coming apart, largely due to "engagement through enragement" parasitic AI—what I call antisocial media.
As civic technologists, we didn't just protest; we pivoted to demonstration ("demo"). We occupied the parliament for three weeks and began building the system we wanted to see from the inside.
We crowdsourced internet access and livestreamed debates for radical transparency. Half a million people on the street, and many more online, used collaborative tools pioneered by other movements—like Loomio (from Occupy Wellington) and later Polis (from Occupy Seattle).
We drafted better versions of the trade deal together, iteratively. Each day, we reviewed the low-hanging fruits—the ideas agreed upon the previous day—and the best arguments from both sides on the remaining conflicts, resolving them step by step.
By shifting from protest to a productive demo, we began tilling the soil of our democracy. By systemically applying such bridge-making algorithms, public trust climbed from 9 percent in 2014 to over 70 percent by 2020. We showed that the best way to fix a system is to build a better one.
In 2015, we handled our first major case using the bridge-making algorithm Polis. Uber's entry into Taiwan sparked a firestorm. We introduced Polis, a tool designed to find "uncommon ground."
Research shows that any social network with a "dunk button" (reposting) leads to polarization. Polis removes these. There is not even a reply button.
You see a statement from a fellow citizen, and you can only agree or disagree. You then see a visualization where your avatar moves toward a group of people who feel similarly to you.
Crucially, we offer a "bridging bonus." We reward people who can come up with ideas that speak to both sides. Using traditional machine learning tools like Principal Component Analysis (PCA) and dimensional reduction, we highlight ideas that bridge divides.
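For the technically curious, here is a hedged sketch (illustrative toy data and a simplified scoring rule, not the actual Polis codebase) of how such a bridging pipeline can work:

```python
# Hedged sketch of Polis-style bridging (not the real Polis implementation):
# project a participant x statement vote matrix with PCA, cluster participants,
# and score each statement by how well it is endorsed across all clusters.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

votes = np.array([        # rows: participants, cols: statements; +1 agree, -1 disagree
    [ 1,  1, -1,  1],
    [ 1,  1, -1,  1],
    [-1, -1,  1,  1],
    [-1, -1,  1,  1],
])

coords = PCA(n_components=2).fit_transform(votes)        # the "group selfie" map
groups = KMeans(n_clusters=2, n_init=10).fit_predict(coords)

def bridging_score(statement_col: np.ndarray, groups: np.ndarray) -> float:
    """A statement bridges well only if every opinion group leans toward agreement."""
    per_group = [statement_col[groups == g].mean() for g in np.unique(groups)]
    return min(per_group)  # the weakest group's support is the binding constraint

scores = [bridging_score(votes[:, j], groups) for j in range(votes.shape[1])]
print(scores)  # the last statement, agreed by both clusters, scores highest
```

Real deployments use more careful statistics, but the shape is the same: map the votes, cluster the participants, then reward statements that even the weakest-supporting cluster endorses.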
We flipped the incentive for going viral from outrage to overlap.
The result, after three weeks, was a coherent bundle of ideas that left everybody slightly happier and nobody very unhappy. The consensus on principles became law and resolved the conflict seamlessly.
This approach highlights a crucial insight: how we deliberate matters. It’s about exercising our "civic muscle."
Research shows that when polled individually, people tend toward YIMBY or NIMBY (Yes/Not In My Backyard). But when deliberating in small groups (like groups of 10), they shift to MIMBY (Maybe In My Backyard, if...). Group deliberation is transformative; it engages a different aspect of us and inoculates against outrage, an effect that can last for years.
We see this repeatedly. When polarized petitions emerged about changing Taiwan's time zone (+8 vs +9), individual polling showed gridlock. But bringing them into structured groups revealed a shared underlying value: making Taiwan seen as unique. They collaboratively brainstormed better ways to achieve this (like the Gold Card residency program) than an expensive time zone change.
This illustrates the "legitimacy of sensemaking." Many conflicts have common knowledge problems at the root. The solutions are made tangible simply by ensuring local knowledge is well known by everyone, and everyone knows that everyone knows it.
For example, in our marriage equality debate, polarization occurred because one side argued for individual rights ("hūn") while the other focused on family kinship ("yīn"). They were arguing about different things. Once this interpretation became common knowledge through legitimate sense-making, the path forward (legalizing individual weddings without forcing family kinship) became clear, depolarizing the issue.
More recently, we applied this at scale to the plague of deepfake investment scams, often featuring figures like Jensen Huang (likely generated using NVIDIA GPUs). People wanted action, but we did not want censorship.
We convened a national Alignment Assembly with the Collective Intelligence Project. We used a diamond-shaped approach:
AI assistants provided real-time transcripts and facilitation. Language models (tools similar to Google Jigsaw's Sensemaker) synthesized proposals in real-time—ideas like requiring digital signatures for ads, making platforms jointly liable for the full amount scammed, or dialing down the network reach (slowing CDN connections) of non-compliant platforms.
The final package earned over 85 percent cross-partisan support. This rigor is crucial; it functions as a "duck-rabbit"—from one side it looks like a deliberation, from the other it looks like a rigorous poll, providing legitimacy for the legislature.
The amendments passed within months. Taiwan is now likely the only country imposing full-spectrum, real-name KYC rules for social media advertisements. This is AI as Assistive Intelligence.
This is not just a Taiwan phenomenon.
In Japan, 33-year-old AI engineer Takahiro Anno was inspired by our Plurality book and ran for Tokyo governor, crowdsourcing his platform using AI sensemaking. Anyone could call a phone number and talk to "AI Anno" (a voice clone) to propose ideas. His AI avatar livestreamed on YouTube, announcing every "pull request" merged into his platform. Independently ranked, his platform was considered the best.
He was then tapped to lead the Tokyo 2050 consultation. Based on that success, he ran for a seat in the House of Councillors, winning over 2.5% of the national vote. His "Team Mirai" is now a national party in the Diet.
In California, the Engaged California platform (developed with Governor Newsom's team) was intended for deliberation on teen social media use. When the LA wildfires hit, they pivoted quickly to use AI sensemaking to co-create wildfire recovery plans, which are now being implemented. They are currently hosting deliberations on government efficiency with state employees.
These successes treat deliberation as a civic muscle that needs exercise. But demos alone do not bend the curve; law and market design must follow.
To move these governance engines from pilots to the default, we must reengineer the infrastructure itself. We must design for participation and democratic legitimacy. If AI makes all the decisions for us—even good ones—our civic muscle atrophies. It's like sending our robotic avatars to the gym to exercise for us.
Here are key policy levers:
Federated Trust & Safety. We must adopt open-source, federated models. A key example is the ROOST.tools (Robust Open Online Safety Tools) initiative for Child Sexual Abuse Material (CSAM) defense. Launched this year in Paris, it bridged the gap between the security camp (Eric Schmidt) and the open camp (Yann LeCun).
Instead of relying on a single source (like Microsoft PhotoDNA), ROOST allows partners (like Bluesky, Roblox or Discord) to train local AIs—what I call kami or local stewards—to detect CSAM within their specific cultural context. We can then translate those embeddings into text (which is legal to hold and reduces privacy issues) and share threat intelligence via federated learning. This allows safety to be tuned to local norms and evolving definitions without being colonized by a single corporate policy.
The examples so far showed democratic, decentralized defense acceleration (d/acc) in the info domain. More generally, many actors tackle vertical alignment across many domains: "Is the AI loyally serving its principal?"
But due to externalities, perfect vertical alignment can lead to systemic conflict. Policymakers must focus on horizontal alignment: How do we ensure these AI systems help us (and each other) cooperate, rather than supercharge our conflicts?
Here we face Hume's Is-Ought problem: No amount of accurate observation of how things are can derive a universally agreeable way things ought to be.
The solution is not "thin," abstract universal principles. It requires hyperlocal social-cultural contexts, what Alondra Nelson calls "thick" alignment.
Civic Care is a bridge from "is" to "ought" through a relational ethics framework. In a thick context, to perceive a need is to affirm the obligation to cooperate (if capable).
Care ethics optimizes for the internal characteristics of actors and the quality of relationships in a community, not just outcomes (consequentialism). It treats "relational health" as first-class.
The following "6-Pack" translates Care Ethics into design primitives we can code into agentic systems to steer toward relational health.
Before we optimize, we must choose what to notice. We must notice what people closest to the pain are noticing, turning local knowledge into common knowledge.
This starts with curiosity. If an agent isn't even curious about the harm it's causing, it is beyond repair. This is why in Taiwan, we revised our national curriculum post-AlphaGo to focus solely on curiosity, collaboration, and civic care.
Attentiveness means using broad listening, rather than broadcasting, to aggregate feelings; we are all experts in our own feelings.
Bridging maps (like Polis or Sensemaker) create a "group selfie." If done continuously, this snapshot becomes a movie, allowing AI to align to the here and now.
Bridging algorithms prioritize marginalized voices. Unlike majority voting, smaller, coherent clusters offer a higher bridging bonus because they are harder to bridge to and provide more unique information to the aggregation.
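One rough way to formalize this bridging bonus (an illustrative formula in the spirit of Polis's group-aware consensus, not the exact metric any given tool uses):

$$\text{bridge}(c) = \prod_{g \in G} P(\text{agree} \mid c, g)$$

where $G$ is the set of opinion clusters and $c$ is a candidate statement. Because every cluster enters the product with equal weight regardless of its size, a statement scores well only if even the smallest coherent cluster tends to agree with it; broad majority support cannot compensate for one alienated group.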
Rule of thumb: Bridge first, decide second.
This is about making credible, flexible commitments to act on the needs identified.
In practice, this means developing model specs with verifiable commitments. A frontier model maker can pre-commit to adopting a crowdsourced code of conduct (from an Alignment Assembly) if it meets thresholds for due process and relational health.
It also requires institutionalization. In Taiwan, we introduced Participation Officers (POs) in every ministry. This structure is "fractal"—present in every agency and team. POs institutionalize the input/output process, translating public input into workable rules and ensuring commitments are honored and cascaded throughout the organization.
Rule of thumb: No unchecked power; answers are required.
Good intentions require working code. Competence is shipping systems that deliver care and build trust, backed by auditing and evaluation.
This is where we implement bridging-based ranking and Reinforcement Learning from Community Feedback (RLCF).
We must optimize not for individual engagement, but for cross-group endorsement and relational health. We train AI agents, using RL or evolution, to exhibit pro-social behavior and collect signals to reward it.
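As a hedged sketch (hypothetical function and weights, not a description of any deployed RLCF system), a reward signal of this kind might look like:

```python
# Hypothetical sketch: a reward that pays for cross-group endorsement and
# relational health rather than raw engagement.
def rlcf_reward(endorsements: dict[str, float],
                engagement: float,
                trust_under_loss: float) -> float:
    """
    endorsements: per-group endorsement rates in [0, 1], keyed by opinion cluster
    engagement: raw engagement signal (deliberately ignored in this reward)
    trust_under_loss: fraction of the 'losing' side that still trusts the process
    """
    if not endorsements:
        return 0.0
    cross_group = min(endorsements.values())   # weakest group's endorsement binds
    return 0.7 * cross_group + 0.3 * trust_under_loss  # weights are illustrative

# Example: strong majority appeal but one alienated cluster yields a low reward.
print(rlcf_reward({"cluster_a": 0.9, "cluster_b": 0.2},
                  engagement=1e6, trust_under_loss=0.4))
```

The design choice here is that the weakest group's endorsement, together with trust-under-loss, bounds the reward; raw engagement deliberately contributes nothing.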
Rule of thumb: Always measure trust-under-loss.
A system that cannot be corrected will fail. Competent action invariably introduces new problems; we need rapid feedback loops.
Responsiveness means extending Alignment Assemblies with GlobalDialogues.ai and Weval.org—a "Wikipedia for Evals."
Weval allows diverse communities to document and share their lived experiences with AI, both positive and negative. It's about capturing not only the harms an AI might cause in a specific cultural context—like increasing self-harm or psychosis—but also the unexpected benefits it might bring. How are people using it to improve their lives? When does it work best?
By surfacing this full spectrum of impacts, we shift the incentive structure. We can't improve what we don't see. When we make both positive and negative outcomes visible, we create a public dashboard that allows labs to test their models against real-world concerns and opportunities. This helps us move beyond simply mitigating harm to actively learning from and amplifying beneficial uses.
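As an illustration only (the field names are hypothetical and not Weval's actual schema), a community-contributed eval record might carry structure like this:

```python
# Hypothetical sketch of a community-contributed eval record for a
# "Wikipedia for Evals"; field names are illustrative, not Weval's schema.
from dataclasses import dataclass, field

@dataclass
class CommunityEvalRecord:
    community: str                      # who is reporting, in their own terms
    model: str                          # which system was evaluated
    prompt_context: str                 # the situation in which it was used
    observed_outcome: str               # what actually happened
    valence: str                        # "harm", "benefit", or "mixed"
    severity: int                       # 1 (minor) .. 5 (severe), for harms
    tags: list[str] = field(default_factory=list)

record = CommunityEvalRecord(
    community="rural clinic volunteers",
    model="some-frontier-model",
    prompt_context="triage advice in a low-connectivity setting",
    observed_outcome="confident but outdated dosage guidance",
    valence="harm",
    severity=3,
    tags=["medical", "localization"],
)
```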
This closes the loop of the Alignment Assembly, ensuring the system is continuously learning from those who receive care.
In Tronto's formulation, the first four packs form a feedback loop: Attentiveness -> Responsibility -> Competence -> Responsiveness -> back to Attentiveness.
Rule of thumb: If challenged, make the fuzzy parts clearer and on the record.
Solidarity and Plurality scale when cooperation is the path of least resistance. If the ecosystem does not reward caregiving, there will not be enough care.
This requires agent infrastructure—a civic stack where people, organizations, and AIs operate under explicit, machine-checkable norms.
One example is an Agent ID registry using meronymity (partial anonymity). This allows us to identify if an agent is tethered to a real human without doxing that human. The Taiwan KYC ad requirement is a prototype of this.
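A minimal sketch of meronymity (a hypothetical registry design, not the Taiwan implementation): the registry keeps only a keyed commitment to the verified human ID, so relying parties can check tethering, and even detect sock-puppet swarms, without learning who the human is.

```python
# Hypothetical sketch of meronymity: the registry stores only a keyed commitment
# to the human's verified ID, so others can check "tethered to a real, accountable
# human" without learning who that human is.
import hmac, hashlib, secrets

class AgentIDRegistry:
    def __init__(self):
        self._server_key = secrets.token_bytes(32)
        self._agents: dict[str, bytes] = {}        # agent_id -> commitment

    def register(self, agent_id: str, verified_human_id: str) -> None:
        """Called after a KYC-style check; the raw human ID is never stored."""
        commitment = hmac.new(self._server_key, verified_human_id.encode(),
                              hashlib.sha256).digest()
        self._agents[agent_id] = commitment

    def is_tethered(self, agent_id: str) -> bool:
        """Anyone can ask whether an agent has a human principal on record."""
        return agent_id in self._agents

    def same_principal(self, agent_a: str, agent_b: str) -> bool:
        """Detect sock-puppet swarms without revealing the human behind them."""
        return (agent_a in self._agents and agent_b in self._agents
                and hmac.compare_digest(self._agents[agent_a], self._agents[agent_b]))
```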
This infrastructure makes decentralized defense easier and more dominant, making interdependence a feature, not a bug.
Rule of thumb: Make positive-sum games easy to play.
The final piece of the puzzle addresses the ultimate fear: that our AI "gardeners" will eventually compete, seeking to expand their gardens until one dominates all others. How do we ensure a world of cooperative helpers rather than a single, all-powerful ruler?
The inspiration comes from an ancient idea, beautifully expressed in the Japanese Shinto tradition: the concept of kami (神).
A local kami is a guardian spirit. It is not an all-powerful god that reigns over everything; it is the spirit of a particular place. There might be a kami of a specific river, a particular forest, or even an old tree. Its entire existence and purpose are interwoven with the health of that one thing. The river's guardian has no ambition to manage the forest; its purpose is fulfilled by ensuring the river thrives.
This gives us a powerful design principle: boundedness.
Most technology today is built for infinite scale. A successful app is expected to grow forever. But the kami model suggests a different goal. We can design AIs to be local stewards—kami of care.
But this raises a crucial question: What stops these specialized AIs from fighting each other?
The solution is not to create a bigger AI to rule over them. Instead, we create a system of cooperative governance, built on two key principles:
This vision of a "society of AIs" is the direct alternative to the "singleton"—the idea of a single AI that eventually manages everything. Instead of one monolithic intelligence, we envision a vibrant, diverse ecosystem of many specialized intelligences.
Rule of thumb: Build for "enough," not forever.
In 2016, I joined the Cabinet as the Minister of "Shùwèi" (數位). In Mandarin, this word means both digital and plural (more than one). So I was also the Minister of Plurality.
To explain my role, I wrote this poetic job description:
The Singularity is a vertical vision. Plurality is a horizontal one. The future of AI is a decentralized network of smaller, open and locally verifiable systems — local kami gardeners.
The superintelligence we need is already here. It is the untapped potential of human collaboration. It is "We, the People."
Democracy and AI are both technologies. If we put care into their symbiosis, they get better and allow us to care for each other better. AI systems, woven into this fabric of trust and care, form a horizontal superintelligence, without any singleton assuming that status.
The 6-Pack of Care is a practical training regimen for our civic muscles. It is something we can train and exercise, not just an intrinsic instinct like "love."
When we look at the fundamental asymmetry of ASI, the gardening metaphor holds where concepts like Geoffrey Hinton's "maternal instinct" break down due to the vast speed differences. Parenting implies similar timescales; the gardener cares for the garden by working at the speed of the plants.
This way, we don't need to ask if AI deserves rights based on its interiority or qualia. What matters is the relational reality, with rights and duties granted through democratic deliberation and alignment-by-process.
We, the people, are the superintelligence. Let us design AI to serve at the speed of society, and make democracy fast, fair, and fun.
Thank you. Live long and … prosper! 🖖
(The contents of this presentation are released into the public domain under CC0 1.0.)