Feedback welcome: www.admonymous.co/mo-putera
Long-time lurker (c. 2013), recent poster. I also write on the EA Forum.
A sad example of what Scott Aaronson called bureaucratic blankface: Hannah Cairo, who at 17 published a counterexample to the longstanding Mizohata-Takeuchi conjecture that electrified harmonic analysis experts the world over, applied to 10 graduate programs after completing the proof. 6 rejected her because she had neither an undergraduate degree nor a high school diploma (at 14 she'd been advised by Zvezdelina Stankova, founder of the top-tier Berkeley Math Circle, to skip undergrad and enrol straight in grad-level courses, having already taught herself an advanced undergrad curriculum from Khan Academy and textbooks). 2 admitted her but were then overridden by administrators. Only the University of Maryland and Johns Hopkins overlooked her unconventional CV. This enraged Alex Tabarrok:
Kudos to UMD and JHU! But what is going on at those other universities?!! Their sole mission is to identify and nurture talent. They have armies of admissions staff and tout their “holistic” approach to recognizing creativity and intellectual promise even when it follows an unconventional path. Yet they can’t make room for a genius who has been vetted by some of the top mathematicians in the world? This is institutional failure.
We saw similar failures during COVID: researchers at Yale’s School of Public Health, working on new tests, couldn’t get funding from their own billion-dollar institution and would have stalled without Tyler’s Fast Grants. But the problem isn’t just speed. Emergent Ventures isn’t about speed but about discovering talent. If you wonder why EV has been so successful look to Tyler and people like Shruti Rajagopalan and to the noble funders but look also to the fact that their competitors are so bureaucratic that they can’t recognize talent even when it is thrust upon them.
It’s a very good thing EV exists. But you know your city is broken when you need Batman to fight crime. EV will have truly succeeded when the rest of the system is inspired into raising its game.
On blankfaces, quoting Scott:
What exactly is a blankface? He or she is often a mid-level bureaucrat, but not every bureaucrat is a blankface, and not every blankface is a bureaucrat. A blankface is anyone who enjoys wielding the power entrusted in them to make others miserable by acting like a cog in a broken machine, rather than like a human being with courage, judgment, and responsibility for their actions. A blankface meets every appeal to facts, logic, and plain compassion with the same repetition of rules and regulations and the same blank stare—a blank stare that, more often than not, conceals a contemptuous smile.
The longer I live, the more I see blankfacedness as one of the fundamental evils of the human condition. Yes, it contains large elements of stupidity, incuriosity, malevolence, and bureaucratic indifference, but it’s not reducible to any of those. ...
Update (Aug. 3): Surprisingly many people seem to have read this post, and come away with the notion that a “blankface” is simply anyone who’s a stickler for rules and formalized procedures. They’ve then tried to refute me with examples of where it’s good to be a stickler, or where I in particular would believe that it’s good. But no, that’s not it at all. ...
Here’s how to tell a blankface: suppose you see someone enforcing or interpreting a rule in a way that strikes you as obviously absurd. And suppose you point it out to them.
Do they say “I disagree, here’s why it actually does make sense”? They might be mistaken but they’re not a blankface.
Do they say “tell me about it, it makes zero sense, but it’s above my pay grade to change”? You might wish they were more dogged or courageous but again they’re not a blankface.
Or do they ignore all your arguments and just restate the original rule—seemingly angered by what they understood as a challenge to their authority, and delighted to reassert it? That’s the blankface.
Seems like you missed the point of the post? Quoting the last 2 paragraphs:
Knowing how to perform a task yourself at all is not the same as knowing how to perform it as well as the person you are delegating the task to. The goal is not to ensure that competence across every work-relevant dimension strictly declines as you go down the organizational hierarchy. You frequently will, and should, delegate to people who are 10x faster, or 10x better at a task than you are yourself.
But by knowing how to perform a task yourself, if slowly or more jankily than your delegees, you will maintain the ability to set realistic performance standards, jump in and keep pushing on the task if it becomes an organizational bottleneck, and audit systems and automations that are produced as part of working on the task. This will take you a bunch of time, and often feel like it detracts from more urgent priorities, but is worth the high cost.
I like Greg Egan's "outlooks" from Diaspora for many reasons: as a reversible, customisable solution to value drift, as a way to temporarily experience the world from the perspective of people with very different aesthetic sensibilities or deep values, as a way to approach problem-solving differently, maybe even as a way to simulate high-level generators of disagreement (which would be a boon for erisology). I wish they already existed:
Any citizen with a mind broadly modeled on a flesher's was vulnerable to drift: the decay over time of even the most cherished goals and values. Flexibility was an essential part of the flesher legacy, but after a dozen computational equivalents of the pre-Introdus lifespan, even the most robust personality was liable to unwind into an entropic mess. None of the polises' founders had chosen to build predetermined stabilizing mechanisms into their basic designs, though, lest the entire species ossify into tribes of self-perpetuating monomaniacs, parasitized by a handful of memes.
It was judged far safer for each citizen to be free to choose from a wide variety of outlooks: software that could run inside your exoself and reinforce the qualities you valued most, if and when you felt the need for such an anchor. The possibilities for short-term cross-cultural experimentation were almost incidental.
Each outlook offered a slightly different package of values and aesthetics, often built up from the ancestral reasons-to-be-cheerful that still lingered to some degree in most citizens' minds: Regularities and periodicities--rhythms like days and seasons. Harmonies and elaborations, in sounds and images, and in ideas. Novelty. Reminiscence and anticipation. Gossip, companionship, empathy, compassion. Solitude and silence. There was a continuum which stretched all the way from trivial aesthetic preferences to emotional associations to the cornerstones of morality and identity.
and further down:
Inoshiro had argued that this was vis last chance to do anything "remotely exciting" before ve started using a miner's outlook and "lost interest in everything else"--but that simply wasn't true; the outlook was more like a spine than a straitjacket, a strengthened internal framework, not a constrictive cage.
One example is miners (of mathematical truth) using outlooks "to keep themselves focused on their work, gigatau after gigatau" (a gigatau is a billion subjective seconds or ~31 years; even among what Mumford calls detective-type mathematicians like Andrew Wiles of FLT fame that's not the norm). Another example is for appreciating otherwise-incomprehensible art:
"Come and see Hashim's new piece."
"Maybe later." Hashim was one of Inoshiro's Ashton-Laval artist friends. Yatima found most of their work bewildering, though whether it was the interpolis difference in mental architecture or just vis own personal taste, ve wasn't sure. Certainly, Inoshiro insisted that it was all "sublime."
"It's real time, ephemeral. Now or never."
"Not true: you could record it for me, or I could send a proxy-"
Ve stretched vis pewter face into an exaggerated scowl. "Don't be such a philistine. Once the artist decides the parameters, they're sacrosanct-"
"Hashim's parameters are just incomprehensible. Look, I know I won't like it. You go."
Inoshiro hesitated, slowly letting vis features shrink back to normal size. "You could appreciate Hashim's work, if you wanted to. If you ran the right outlook."
Yatima stared at ver. "Is that what you do?"
"Yes." Inoshiro stretched out vis hand, and a flower sprouted from the palm, a green-and-violet orchid which emitted an Ashton-Laval library address. ...
Yatima sniffed the flower again, warily. The Ashton-Laval address smelt distinctly foreign ... but that was just unfamiliarity. Ve had vis exoself take a copy of the outlook and scrutinize it carefully. ... Yatima had vis exoself's analysis of the outlook appear in the scape in front of ver as a pair of before-and-after maps of vis own most affected neural structures. The maps were like nets, with spheres at every junction to represent symbols; proportionate changes in the symbols' size showed how the outlook would tweak them.
"Death' gets a tenfold boost? Spare me."
"Only because it's so underdeveloped initially." ...
"Make up your mind; it's starting soon."
"You mean make my mind Hashim's?" "Hashim doesn't use an outlook." ...
Vis exoself's verdict on the potential for parasitism was fairly sanguine, though there could be no guarantees. If ve ran the outlook for a few kilotau, ve ought to be able to stop.
Yatima ran the outlook. At once, certain features of the scape seized vis attention: a thin streak of cloud in the blue sky, a cluster of distant trees, the wind rippling through the grass nearby. It was like switching from one gestalt color map to another, and seeing some objects leap out because they'd changed more than the rest. After a moment the effect died down, but Yatima still felt distinctly modified; the equilibrium had shifted in the tug-of-war between all the symbols in vis mind, and the ordinary buzz of consciousness had a slightly different tone to it.
"Are you okay?" Inoshiro actually looked concerned, and Yatima felt a rare, raw surge of affection for ver. Inoshiro always wanted to show ver what ve'd found in vis endless fossicking through the Coalition's possibilities--because ve really did want ver to know what the choices were.
"I'm still myself. I think."
"Pity." Inoshiro sent the address, and they jumped into Hashim's artwork together.
An example of a bad outlook in Diaspora is the one the Ostvalds use which "made them lap up any old astrobabble like this as if it was dazzlingly profound". And here's what I'd consider a horrifying outlook, like a monstrous perversion of enlightenment, which Inoshiro applied to verself after a severely traumatic experience:
Inoshiro said, "I feel great compassion for all conscious beings. But there's nothing to be done. There will always be suffering. There will always be death." ... Yatima tried to read vis face, but Inoshiro just gazed back with a psychoblast's innocence. "What's happened to you? What have you done to yourself?"
Inoshiro smiled beatifically and held out vis hands. A white lotus flower blossomed from the center of each palm, both emitting identical reference tags. Yatima hesitated, then followed their scent. It was an old outlook, buried in the Ashton-Laval library, copied nine centuries before from one of the ancient memetic replicators that had infested the fleshers. It imposed a hermetically sealed package of beliefs about the nature of the self, and the futility of striving ... including explicit renunciations of every mode of reasoning able to illuminate the core beliefs' failings.
Analysis with a standard tool confirmed that the outlook was universally self-affirming. Once you ran it, you could not change your mind. Once you ran it, you could not be talked out of it.
Yatima said numbly, "You were smarter than that. Stronger than that." But when Inoshiro was wounded by Lacerta, what hadn't ve done that might have made a difference? That might have spared ver the need for the kind of anesthetic that dissolved everything ve'd once been?
Inoshiro laughed. "So what am I now? Wise enough to be weak? Or strong enough to be foolish?"
"What you are now-" Ve couldn't say it.
What you are now is not Inoshiro.
Yatima stood motionless beside ver, sick with grief, angry and helpless. Ve was not in the fleshers' world anymore; there was no nanoware bullet ve could fire into this imaginary body. Inoshiro had made vis choice, destroying vis old self and creating a new one to follow the ancient meme's dictates, and no one else had the right to question this, let alone the power to reverse it.
My interest in Egan's outlooks is motivated by real-world examples too. The example I always think about is Scott's observation that compared to a decade ago he's trended "more cynical, more mellow, and more prone to believing things are complicated", and his worry (among the possibilities he raises) that it would suck if "everything we thought was “gaining wisdom with age” was just “brain receptors consistently functioning differently with age”", e.g. NMDA receptor function changing with aging and maybe "the genes for liberal-conservative differences are mostly NMDA receptors in the brain" (a simplistic illustrative example he doesn't actually put credence in).
The most salient motivating example at the moment is different: Cube Flipper's estrogen trip report, which I find fascinating, especially these parts (to summarise their wonderfully-detailed descriptions):
And this summary of changes, from a section where the author investigates whether estrogen was pushing them towards the other end of the "autism-schizotypy continuum" by reducing inherent oversensitivity to sensory prediction errors:
I’ll outline some of the psychological changes I’ve noticed in myself since starting estrogen. ...
- Increased predisposition towards associative thinking. Activities like tarot are more appealing.
- Increased predisposition towards magical thinking, leading to some idiosyncratic worldviews. This can probably be gauged by the nonsense I post on Twitter.
- Increased experience of meaningness in day-to-day life. This felt really good.
- Increased mentalising of other people’s internal states, resulting in a mixture of higher empathy and higher social anxiety. I’m somewhat more neurotic about potential threats.
- Decreased sensory sensitivity.
- Decreased attentional diffusion, contrary to what the paper predicts.
- Decreased systematising and attention to detail, for instance with tedious matters like finances.
Armchair diagnoses aside, I do wish to assert that these psychological changes are quite similar to the kind of psychological changes I tend to experience while on a mild dose of psychedelics.
(Tangentially this seems very relevant to the whole high-decoupling vs high-contextualising thing.)
Egan's outlooks would be a far more sophisticated version of this: higher precision and customisability (e.g. "death-salience only", or "don't lose interest in everything else", cf. the miner outlooks above), finer-grained control (onset and reversal timescales etc.), predictable return to baseline, and changes that are predictable and previewable in advance (and that don't vary idiosyncratically from person to person).
Interesting anecdotes from an ex-SpaceX engineer who started out thinking "Elon's algorithm" was obviously correct and gradually grew cynical as SpaceX scaled:
Questioning the requirements was an extremely literal thing that you were supposed to do multiple times every single day. I’d make a claim about my system (“hey, if the stuff in this tube gets too hot, my part will explode, so please don’t put anything too hot near it”) and that very afternoon three or four people would stop by my desk, ready to debate.
“Hello,” they would say. “I’m the Responsible Engineer for the Hot Things Near Tubes system,” and then the floodgates would open. What did I mean by near? What did I mean by hot? How hot was too hot? Was it really going to explode? If it exploded, was that really so terrible?
The first time, the debate would be interesting. The second, it would be a bit tiresome. By the first week after a new claim, it was exhausting and a little rote. But you had to win, every time, because if you didn’t, nobody would follow your requirement.
It also worked in the other direction. I learned to pay attention to everything that was happening in the whole program, absorbing dozens of update emails a day, because people would announce Requirements, and I’d need to go Question Them. If I didn’t do this, I’d find my system forced to jump through too many hoops to work, and, of course, I would be Responsible. If I was Responsible for too many things, I wouldn’t be able to support all of them - unless, of course, I managed to Delete the Part and free myself from one of those burdens.
And so when there were requirements, they were strong, because they had to survive an endless barrage of attack. When there were parts, they were well-justified, because every person involved in the process of making them had tried to delete them first. And there were no requirements matrices, no engineering standards, practically no documentation at all.
The key point came in, the reason why it was capitalized. It wasn’t philosophy, it wasn’t advice - it was an Algorithm. A set of process steps that you followed to be a good engineer. And all of us good engineers were being forced by unstoppable cultural forces to maniacally follow it.
There was one question slowly building in my mind. The point of SpaceX was to get good engineers, do first principles analysis, let them iterate, and avoid documentation. This whole process was clearly succeeding at the last three steps. But if we were already so great, why did we have to have this process enforced so aggressively?
As the time went on and the Algorithm grew, screaming ever-louder about what we should specifically do, the question grew ever more urgent.
Tell people to ritualize Questioning Requirements and they will do so ritually. You’ll deliver the same explanation for how hot your tube can be a hundred times, and each time you deliver it you think about it less. You will realize that the best way to get work done is to build a persona as extremely knowledgeable and worthless to question, and then nobody ever questions your work.
Tell people to Delete the Part, and they'll have the system perform ridiculous gymnastics in software to avoid making a $30 bracket, or waste performance to avoid adding a process.
Tell people to Optimize the Part and they’ll push it beyond margins unnecessarily, leaving it exquisite at one thing and hopeless at others.
Tell them to Accelerate, and they’ll do a great job of questioning, but when push comes to shove they will always Accelerate at the cost of quality or rework, and so you find yourself building vehicles and then scrapping them, over and over again.
There is no step for Test in the Algorithm, no step for “prove it works.” And so years went by where we Questioned, and Deleted, and Optimized, and Accelerated, and Automated, and rockets piled up outside the factory and between mid-2021 and mid-2023 they never flew.
Every engineer was Responsible for their own part. But every engineer had perverse incentives. With all that Accelerating and Automating, if my parts got on the rocket on time, I succeeded. In fact, if the rocket never flew, I succeeded more, because my parts never got tested.
And so we made mistakes, and we did silly things. The rocket exploded a lot, and sometimes we learned something useful, but sometimes we didn’t. We spent billions of dollars. And throughout it all, the program schedule slid inexorably to the right.
And I got cynical.
There were enormous opportunities to have upside improvement in the rocket industry of the 2000s and 2010s. The company was small and scrappy and working hard. The rules applied.
But by the 2020s, even SpaceX was growing large. The company had passed 10,000 people, with programs across the country, tendrils in every major space effort and endlessly escalating ambition.
And the larger it became, the greater the costs to its architecture became. As my program grew from dozens of people to hundreds to thousands, every RE needed to read more emails, track more issues, debate more requirements. And beyond that, every RE needed to be controlled by common culture to ensure good execution, which wasn’t growing fast enough to meet the churn rate of the new engineers.
This makes me wonder if SpaceX, overwhelmingly dominant as it currently is in terms of share of mass launched to orbit and so on, could actually be substantially faster if it took systems engineering as seriously as the author hoped (like, say, the Apollo program did). To quote the author:
The first recorded use of the term “Systems Engineering” came from a 1950 presentation by Mervin J. Kelly, Vice President of Bell Telephone. It appeared as a new business segment, coequal with mainstays like Research and Development. Like much of the writing on systems engineering, the anodyne tone hid huge ambition.
‘Systems engineering’ controls and guides the use of the new knowledge obtained from the research and fundamental development programs … and the improvement and lowering of cost of services…’
In other words, this was meta-engineering.
The problems were too complex, so the process had to be a designed thing, a product of its own, which would intake the project goals and output good decision making.
It began with small things. There should be clear requirements for what the system is supposed to do. They should be boxed out and boiled down so that each engineer knows exactly what problem to solve and how it impacts the other ones. Changes would flow through the process and their impacts would be automatically assessed. Surrounding it grew a structure of reviews, process milestones, and organizational culture, to capture mistakes, record them, and make sure nobody else made them again.
And it worked! All of those transcendental results from Apollo were in fact supported on the foundations of exquisitely handled systems engineering and program management. The tools developed here helped catapult commercial aviation and sent probes off beyond the Solar System and much more besides.
At SpaceX, there was no such thing as a “Systems Engineer.” The whole idea was anathema. After all, you could describe the point of systems engineering, and process culture more generally, as the process of removing human responsibility and agency. The point of building a system to control human behavior is that humans are fallible. You write them an endless list of rules to follow and procedures to read, and they follow them correctly, and then it works out.
At SpaceX, it wasn’t going to be like that. First principles thinking and Requirements Questioning and the centrality of responsible engineering all centered around the idea of raising the agency of each individual engineer. Raising individual responsibility was always better.
The ever-colorful Peter Watts on how science works because scientists are asses, not despite it:
Science doesn’t work despite scientists being asses. Science works, to at least some extent, because scientists are asses. Bickering and backstabbing are essential elements of the process. Haven’t any of these guys ever heard of “peer review”?
There’s this myth in wide circulation: rational, emotionless Vulcans in white coats, plumbing the secrets of the universe, their Scientific Methods unsullied by bias or emotionalism. Most people know it’s a myth, of course; they subscribe to a more nuanced view in which scientists are as petty and vain and human as anyone (and as egotistical as any therapist or financier), people who use scientific methodology to tamp down their human imperfections and manage some approximation of objectivity.
But that’s a myth too. The fact is, we are all humans; and humans come with dogma as standard equipment. We can no more shake off our biases than Liz Cheney could pay a compliment to Barack Obama. The best we can do— the best science can do— is make sure that at least, we get to choose among competing biases.
That’s how science works. It’s not a hippie love-in; it’s rugby. Every time you put out a paper, the guy you pissed off at last year’s Houston conference is gonna be laying in wait. Every time you think you’ve made a breakthrough, that asshole supervisor who told you you needed more data will be standing ready to shoot it down. You want to know how the Human Genome Project finished so far ahead of schedule? Because it was the Human Genome projects, two competing teams locked in bitter rivalry, one led by J. Craig Venter, one by Francis Collins — and from what I hear, those guys did not like each other at all.
This is how it works: you put your model out there in the coliseum, and a bunch of guys in white coats kick the shit out of it. If it’s still alive when the dust clears, your brainchild receives conditional acceptance. It does not get rejected. This time.
Yes, there are mafias. There are those spared the kicking because they have connections. There are established cliques who decide what appears in Science, who gets to give a spoken presentation and who gets kicked down to the poster sessions with the kiddies. I know a couple of people who will probably never get credit for the work they’ve done, for the insights they’ve produced. But the insights themselves prevail. Even if the establishment shoots the messenger, so long as the message is valid it will work its way into the heart of the enemy’s camp. First it will be ridiculed. Then it will be accepted as true, but irrelevant. Finally, it will be embraced as canon, and what’s more everyone will know that it was always so embraced, and it was Our Glorious Leader who had the idea. The credit may not go to those who deserve it; but the field will have moved forward.
Science is so powerful that it drags us kicking and screaming towards the truth despite our best efforts to avoid it. And it does that at least partly fueled by our pettiness and our rivalries. Science is alchemy: it turns shit into gold. Keep that in mind the next time some blogger decries the ill manners of a bunch of climate scientists under continual siege by forces with vastly deeper pockets and much louder megaphones.
(This might be biased by the fields Watts is familiar with and by his own tendency to seek fights, though; cf. Scott's different worlds. I don't get the sense that this dynamic is universal, or that it does all that much to improve our effectiveness at finding out the truth of the matter.)
Interesting example. Tangentially, I'm guessing that belief in substrate dependence is part of why some folks viscerally dislike Richard Ngo's story The Gentle Romance, which was meant to be utopian. I mostly lean against substrate dependence and so don't find your example persuasive, although Scott Aaronson's monstrous edge cases do give me pause:
what if each person on earth simulated one neuron of your brain, by passing pieces of paper around. It took them several years just to simulate a single second of your thought processes. Would that bring your subjectivity into being? Would you accept it as a replacement for your current body?
If so, then what if your brain were simulated, not neuron-by-neuron, but by a gigantic lookup table? That is, what if there were a huge database, much larger than the observable universe (but let’s not worry about that), that hardwired what your brain’s response was to every sequence of stimuli that your sense-organs could possibly receive. Would that bring about your consciousness?
Let’s keep pushing: if it would, would it make a difference if anyone actually consulted the lookup table? Why can’t it bring about your consciousness just by sitting there doing nothing?
To these standard thought experiments, we can add more. Let’s suppose that, purely for error-correction purposes, the computer that’s simulating your brain runs the code three times, and takes the majority vote of the outcomes. Would that bring three “copies” of your consciousness into being? Does it make a difference if the three copies are widely separated in space or time—say, on different planets, or in different centuries? Is it possible that the massive redundancy taking place in your brain right now is bringing multiple copies of you into being?
Maybe my favorite thought experiment along these lines was invented by my former student Andy Drucker. In the past five years, there’s been a revolution in theoretical cryptography, around something called Fully Homomorphic Encryption (FHE), which was first discovered by Craig Gentry. What FHE lets you do is to perform arbitrary computations on encrypted data, without ever decrypting the data at any point. So, to someone with the decryption key, you could be proving theorems, simulating planetary motions, etc. But to someone without the key, it looks for all the world like you’re just shuffling random strings and producing other random strings as output.
You can probably see where this is going. What if we homomorphically encrypted a simulation of your brain? And what if we hid the only copy of the decryption key, let’s say in another galaxy? Would this computation—which looks to anyone in our galaxy like a reshuffling of gobbledygook—be silently producing your consciousness?
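For readers who haven't seen homomorphic encryption before, here's a minimal toy sketch of the core trick Drucker's thought experiment relies on: computing on ciphertexts without ever holding the decryption key. To keep it tiny I use textbook unpadded RSA, which is only multiplicatively homomorphic (real FHE schemes like Gentry's handle arbitrary computations); the parameters are deliberately insecure and purely illustrative:

```python
# Toy sketch: unpadded RSA is multiplicatively homomorphic, so someone holding
# only ciphertexts can compute an encryption of the product of two plaintexts
# they never see. Real FHE extends this to arbitrary computations.
# Deliberately tiny, insecure parameters, for illustration only.

p, q = 61, 53
n = p * q                           # public modulus
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (the "key in another galaxy")

def encrypt(m: int) -> int:
    return pow(m, e, n)

def decrypt(c: int) -> int:
    return pow(c, d, n)

m1, m2 = 7, 11
c1, c2 = encrypt(m1), encrypt(m2)

# A party without d multiplies ciphertexts, which to them look like noise...
c_prod = (c1 * c2) % n

# ...yet the key-holder recovers the product of the hidden plaintexts.
assert decrypt(c_prod) == (m1 * m2) % n
print(decrypt(c_prod))  # 77
```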
Obviously you're not obliged to, but if you ever get round to looking into the GDM paper more deeply like you mentioned, I'd be interested in what you have to say, as you might change my opinion on it.
(I actually appreciate the emotion in the response, so thanks for including it)
One of my persistent worries is that superintelligence will have the right values but the wrong ontology of personhood
I would've expected the opposite phrasing (right ontology, wrong values; cf. "the AI knows but doesn't care"), so this caught my eye. Have you or anyone else written anything about this elsewhere that you can point me to? I initially thought of Jan Kulveit's essays (e.g. this or this) but upon re-skimming they don't really connect to what you said.
I like the viewpoint in this Google DeepMind paper, A Pragmatic View of AI Personhood (h/t Ben Goldhaber's post); it reads like a modern AI-specific version of Kevin Simler's 2014 essay on personhood. Abstract:
The emergence of agentic Artificial Intelligence (AI) is set to trigger a “Cambrian explosion” of new kinds of personhood. This paper proposes a pragmatic framework for navigating this diversification by treating personhood not as a metaphysical property to be discovered, but as a flexible bundle of obligations (rights and responsibilities) that societies confer upon entities for a variety of reasons, especially to solve concrete governance problems.
We argue that this traditional bundle can be unbundled, creating bespoke solutions for different contexts. This will allow for the creation of practical tools—such as facilitating AI contracting by creating a target “individual” that can be sanctioned—without needing to resolve intractable debates about an AI’s consciousness or rationality.
We explore how individuals fit in to social roles and discuss the use of decentralized digital identity technology, examining both ‘personhood as a problem’, where design choices can create “dark patterns” that exploit human social heuristics, and ‘personhood as a solution’, where conferring a bundle of obligations is necessary to ensure accountability or prevent conflict.
By rejecting foundationalist quests for a single, essential definition of personhood, this paper offers a more pragmatic and flexible way to think about integrating AI agents into our society.
I was already primed to unbundle personhood because I bought Simler's view of personhood as an abstract interface that can be implemented to varying degrees by anything (not just humans) in return for getting to participate in civil society:
The authors argue that taking the pragmatic stance helpfully dissolves the personhood question and lets them craft bespoke solutions to specific governance problems:
This paper offers a pragmatic framework that shifts the crucial question from what an AI is to how it can be identified and which obligations it is useful to assign it in a given context. We regard the pragmatic stance as crucial. Assuming some essence of personhood is “out there” waiting to be discovered, or a metaphysical fact about what AIs or persons “really are” that can settle our practical questions seems to us, unlikely to prove helpful. We propose treating personhood not as something entities possess by virtue of their nature, but as a contingent vocabulary developed for coping with social life in a biophysical world (Rorty, 1989).
The default philosophical impulse is to ask what an entity truly is in its essence. The pragmatist instead asks what new description would be more useful for us to adopt. What vocabulary must we invent to cope? We think this move is a vital one for navigating our likely future where some AIs are owned property while similar AIs operate autonomously. ...
Inspired by Schlager and Ostrom (1992)’s demonstration that the property rights bundle can be broken apart to fit specific contexts, we propose that the personhood bundle can be similarly unbundled into components. Our position on personhood as a bundle resembles that of Kurki (2019) but we put greater emphasis on the bundle’s plasticity and the diversity of different bundles. For AI persons, the components of the bundle need not co-occur in accordance with the specific configuration they take for natural human persons. Without essences to constrain us, we are free to craft bespoke solutions: sanctionability without suffrage, culpability and contracting without consciousness attribution, etc.
What's in the personhood bundle, and what kinds of bundles might there be?
The crucial question is always: what bundle of components constitutes the “person” that society needs to address for a given purpose? The answer changes depending on who is doing the addressing and for what reason. For a human user building a relationship, the person is a story—the (model + chat history) that creates a unique, evolving individual to bond with. For a court of law assigning liability, the person is the locus of responsibility—the entire operational stack of (model + instance + runtime variables + capital + registration) that can be held accountable, sanctioned, updated, and forced to pay for the harm it causes.
For the sake of concreteness, we can describe several possible configurations of the addressable bundle useful in different situations, for different kinds of AIs.
In the specific case of a goal-driven autonomous AI agent, perhaps the kind of personhood would be a Chartered Autonomous Entity, with a bundle consisting of rights to (1) Perpetuity, (2) Property, (3) Contract, and duties of (1) Mandate Adherence, (2) Transparency (3) Systemic Non-Harm, and (4) Self-Maintenance.
In other situations, it may be useful to define a Flexible Autonomous Entity with all the same bundle elements except the duty of mandate adherence. Perhaps the former could be seen as analogous to a for-profit company and the latter as analogous to a non-profit company.
It may also be useful to define Temporary Autonomous Entities (either chartered or flexible). These would drop the right to perpetuity and add a duty of self deletion under specified conditions.
This process of bundling and unbundling obligations is the engine of the Cambrian explosion.
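Purely to make the unbundling concrete, here's how I'd sketch those three example configurations as data, a toy in Python; the bundle contents come from the quoted passage, while the class and field names are my own invention:

```python
# Toy sketch of "personhood as a bundle": a configurable set of rights and
# duties rather than an all-or-nothing status. Bundle contents follow the
# paper's three examples quoted above; the data structure itself is mine.
from dataclasses import dataclass

@dataclass(frozen=True)
class PersonhoodBundle:
    name: str
    rights: frozenset
    duties: frozenset

chartered = PersonhoodBundle(
    "Chartered Autonomous Entity",
    rights=frozenset({"perpetuity", "property", "contract"}),
    duties=frozenset({"mandate adherence", "transparency",
                      "systemic non-harm", "self-maintenance"}),
)

# Same bundle minus the duty of mandate adherence.
flexible = PersonhoodBundle(
    "Flexible Autonomous Entity",
    rights=chartered.rights,
    duties=chartered.duties - {"mandate adherence"},
)

# Drop the right to perpetuity, add a duty of self-deletion.
temporary = PersonhoodBundle(
    "Temporary Autonomous Entity",
    rights=chartered.rights - {"perpetuity"},
    duties=chartered.duties | {"self-deletion under specified conditions"},
)
```

Nothing hinges on the representation; the point is just that the components are independent knobs rather than corollaries of some single essence.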
More on their stance. I like how sensible it is; it's like the authors clearly internalised A Human's Guide to Words (whether they've read it or not):
Our theory is developed in the context of an account of personhood that defines a person as a ‘political and community-participating’ actor (Haugeland, 1982). This is a status that depends not on an entity’s intrinsic properties, but on collective recognition from the community it seeks to join, a recognition which is itself dependent on adherence to norms. On this view, personhood status is always a collective decision, a contingent outcome of social negotiation, not a fixed metaphysical status.
This stance is strongly non-essentialist and, crucially, it partially dissolves the traditional distinction between a ‘natural person’, whose status is typically grounded in their intrinsic nature (like consciousness or rationality), and a ‘legal person’, a functional status conferred by a community to solve practical governance problems (like a corporation). From our perspective, this distinction is a relic of the search for essences.
Some motivating examples:
Our focus is on agentic AI systems, rather than on the underlying foundation models that power them. These are the long-running, persistent agents that maintain state, remember past interactions, and adapt their behavior over time. This persistence is what makes an agent a plausible candidate for other entities to relate themselves to. A human’s relationship with such a persistent agent can be emotionally salient and economically consequential in ways a one-off, stateless interaction cannot.
The part of this paper concerned with “personhood as a problem” applies most clearly to companion AIs, where long-term interaction is designed to foster emotional bonds, creating risks of exploitation (Earp et al., 2025; Manzini et al., 2024a). Conversely, the part of the paper concerned with “personhood as a solution” applies to more utility-like and virus-like agents, especially self-sufficient ownerless systems (or systems whose owner cannot be identified; Fagan (2025)), where persistence creates an accountability gap that legal personhood might fill. Consider an AI designed to seek out funding and pay its own server costs. It could easily outlive its human owner and creator. If this ownerless agent eventually causes some harm, our vocabulary of accountability, which searches for a responsible ‘person’, would fail to find one (Campedelli, 2025).
After discussing a historical precedent from maritime law (see the ships section), the authors argue:
The parallel to autonomous AI agents is striking. An artificial intelligence agent could be built upon open-source code contributed by a global network of developers, making it difficult to trace liability to any single party. When such an agent causes harm—by manipulating a market or causing a supply chain failure—the prospect of identifying a single, responsible human or human organization can be practically impossible. For the ownerless AI that outlives its creator, the problem is especially acute.
Following the logic of maritime law, we could grant a form of legal personhood directly to such AI agents. A judgment against an AI could result in its operational capital being seized or its core software being “arrested” by court order (see Section 9).
The next example is hypothetical – a generative "ghost" of a family's late matriarch:
Now, consider a different kind of non-human entity that could fill such a role: an AI. Imagine a family that interacts for decades with a “generative ghost” of their late matriarch (Morris and Brubaker, 2024), an AI trained on her lifetime of diaries, messages, and videos. It shares her wisdom, recalls her stories, and even helps mediate disputes according to the principles she espoused. Or picture a small community whose collective history, language, and cultural traditions are held and nurtured by a persistent AI—a digital elder that has tutored their children and advised their leaders for generations.
For the great-grandchildren in that family or the youth of that community, their AI elder is not a tool; it is a constant, foundational presence. It is a source of identity and connection to their own past. Could they, in time, come to see it as an ancestor? Could they regard their identity as intertwined with it, and view themselves as having a duty to care for it as it cares for them?
There's a novella I really like that explores a version of this, Catherynne Valente's surrealist far-future Silently and Very Fast (see part III, "Three: Two Pails of Milk"), deservedly nominated for numerous awards.
On how their stance interacts with morality:
In our theory, “morality talk” is a form of social sanctioning used to make two specific claims about a norm: (1) that it is exceptionally important, and (2) that it has a wide or universal scope of applicability (Leibo et al., 2024). Thus, in our theory, to argue an AI is a ‘person’ is not to make a metaphysical claim about its nature, but to make an emphatic political claim that the obligations bundled together as its personhood ought to take precedence over other considerations. For us, any form of personhood—moral, legal, or otherwise—is a functional status conferred by a community.
Therefore, we see the role of science, the institution, not as clarifying the list of properties an AI must satisfy to be a person, but as illuminating what may cause human communities to collectively ascribe personhood status to them.
The authors reject foundationalist stances in general (explicitly calling their pragmatism "anti-foundationalist") and reject consciousness as a foundation for AI personhood in particular, the foundation that motivates welfarists:
On the welfare side, this tradition’s power lies in its combination of compassion with universalism and its account of moral progress (toward greater pleasure and lesser pain for more individuals). It provides a clear, non-arbitrary reason to prevent harm—because suffering is bad, regardless of who is suffering. This one-size-fits-all principle works powerfully in contexts like the movement for animal welfare. When applied to industrial farming or the use of animals in cosmetic testing, the question “does it suffer?” serves as a potent tool for moral argument capable of cutting through cultural justifications for cruelty and providing a clear metric for reformers to work to optimize (Singer, 2011).
However this focus on suffering arguably fails to address important welfare problems arising for pragmatic reasons. Consider again the “generative ghost” of a family’s late matriarch (recalling Section 1; Morris and Brubaker (2024)). Or picture a community whose history and traditions are held by a “digital elder” that has advised them for generations. For these groups, their AI is a source of identity and connection to their past—an “ancestor”. The obligation they may feel to protect their AI from arbitrary deletion would not necessarily have anything to do with their assessment of its capacity to feel pain. After all, arguments that the ghost would not feel pain when deleted don’t seem likely to persuade them to permit its deletion.
The morally-relevant concern may be that the AI’s deletion would destroy an entity in a foundational relational role for their family or community (Kramm, 2020). In which case it would be the relational harm of deleting the AI that matters, not the pain the AI may or may not feel. ...
The relational harm remark jibes with Simler's nihilistic account of meaning as relational (among other properties), which I already buy; that's probably why I find it sensible.
The authors call out the welfarists' rhetorical sleight-of-hand:
Viewed together, the dual use of consciousness as backstop for welfare rights and accountability obligations reveals a stark asymmetry. When arguing for rights, the mere possibility of consciousness is deemed sufficient to open the debate. But when arguing against responsibilities, an impossible standard of proof for an internal state is demanded. This shows that consciousness is mostly being used as a rhetorical tool, not as a stable conceptual foundation. Therefore, anyone uninterested in the metaphysics may regard AI personhood as having no conceptual dependence on AI consciousness.
In fact, we would predict the dependence to run in the opposite direction. Both usage of the word consciousness, and human intuitions around it, are likely to shift in response to the emergence of pragmatic reasons to consider AIs as persons. ... Notice that many cultures attribute consciousness to objects not conventionally considered alive (Keane, 2025). For example Shintoism posits that objects and places can have conscious spirits (kami) within them. It is likely that eventually some groups of people will attribute consciousness to AIs, while others will not. These groups will view their ethical obligations differently from each other, similarly to how people have diverse opinions on animal consciousness and whether eating animals is normative. The pragmatic question then is how to arrange institutions to resolve the conflicts that arise from these differences (Rorty, 1999).
My instinctive answer to that last question is "probably whatever the folks at the Meaning Alignment Institute are cooking up" (I linked to their full-stack agenda, but the writeup that personally convinced me to pay attention to them was the one on the 500 participants' positive experience in their democratic fine-tuning experiment, especially its getting Democrats and Republicans to agree substantively, contra my skepticism: I'd predicted that the polarizing questions asked in the experiment would be mostly irreconcilable due to differently crystallised metaphysical heuristics).
Gwern's essay you mentioned, in case others are curious: https://gwern.net/ai-daydreaming
Despite impressive capabilities, large language models have yet to produce a genuine breakthrough. The puzzle is why.
A reason may be that they lack some fundamental aspects of human thought: they are frozen, unable to learn from experience, and they have no “default mode” for background processing, a source of spontaneous human insight.
To illustrate the issue, I describe such insights, and give an example concrete algorithm of a day-dreaming loop (DDL): a background process that continuously samples pairs of concepts from memory. A generator model explores non-obvious links between them, and a critic model filters the results for genuinely valuable ideas. These discoveries are fed back into the system’s memory, creating a compounding feedback loop where new ideas themselves become seeds for future combinations.
The cost of this process—a “daydreaming tax”—would be substantial, given the low hit rate for truly novel connections. This expense, however, may be the necessary price for innovation. It would also create a moat against model distillation, as valuable insights emerge from the combinations no one would know to ask for.
The strategic implication is counterintuitive: to make AI cheaper and faster for end users, we might first need to build systems that spend most of their compute on this “wasteful” background search. This suggests a future where expensive, daydreaming AIs are used primarily to generate proprietary training data for the next generation of efficient models, offering a path around the looming data wall.
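As a reading aid, here's the DDL as I understand it from that abstract, boiled down to a minimal Python sketch; the generator, critic, and memory here are placeholders of my own, not anything specified in the essay:

```python
import random

def daydreaming_loop(memory, generator, critic, budget):
    """Minimal sketch of the day-dreaming loop as described in the abstract:
    sample concept pairs from memory, explore non-obvious links with a
    generator model, keep only what a critic model judges valuable, and feed
    the keepers back into memory so they can seed future combinations."""
    insights = []
    for _ in range(budget):
        a, b = random.sample(memory, 2)   # draw two stored concepts
        idea = generator(a, b)            # explore a non-obvious link
        if critic(idea):                  # filter for genuine value
            memory.append(idea)           # compounding feedback loop
            insights.append(idea)
    return insights
```

The "daydreaming tax" in the excerpts below is then just the observation that most iterations produce nothing the critic keeps, so the budget has to be large relative to the useful output.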
I'd also highlight the obstacles and implications sections:
Obstacles and Open Questions
…Just expensive. We could ballpark it as <20:1 based on the human example, as an upper bound, which would have severe implications for LLM-based research—a good LLM solution might be 2 OOMs more expensive than the LLM itself per task. Obvious optimizations like load shifting to the cheapest electricity region or running batch jobs can reduce the cost, but not by that much.
Cheap, good, fast: pick 2. So LLMs may gain a lot of their economic efficiency over humans by making a severe tradeoff, in avoiding generating novelty or being long-duration agents. And if this is the case, few users will want to pay 20× more for their LLM uses, just because once in a while there may be a novel insight.
This will be especially true if there is no way to narrow down the retrieved facts to ‘just’ the user-relevant ones to save compute; it may be that the most far-flung and low-prior connections are the important ones, and so there is no easy way to improve, no matter how annoyed the user is at receiving random puns or interesting facts about the CIA faking vampire attacks.
Implications
Only power-users, researchers, or autonomous agents will want to pay the ‘daydreaming tax’ (either in the form of higher upfront capital cost of training, or in paying for online daydreaming to specialize to the current problem for the asymptotic scaling improvements, see AI researcher Andy Jones 2021).
Data moat. So this might become a major form of RL scaling, with billions of dollars of compute going into ‘daydreaming AIs’, to avoid the “data wall” and create proprietary training data for the next generation of small cheap LLMs. (And it is those which are served directly to most paying users, with the most expensive tiers reserved for the most valuable purposes, like R&D.) These daydreams serve as an interesting moat against naive data distillation from API transcripts and cheap cloning of frontier models—that kind of distillation works only for things that you know to ask about, but the point here is that you don’t know what to ask about. (And if you did, it wouldn’t be important to use any API, either.)
Given RL scaling laws and rising capital investments, it may be that LLMs will need to become slow & expensive so they can be fast & cheap.
This reminded me of this line from Peter Watts' Blindsight, which seems relevant: