I passed up an invitation to invest in Anthropic in the initial round which valued it at $1B (it's now planning a round at $170B valuation), to avoid contributing to x-risk. (I didn't want to signal that starting another AI lab was a good idea from an x-safety perspective, or that I thought Anthropic's key people were likely to be careful enough about AI safety. Anthropic had invited a number of rationalist/EA people to invest, apparently to gain such implicit endorsements.)
This idea/plan seems to legitimize giving founders and early investors of AGI companies extra influence on or ownership of the universe (or just extremely high financial returns, if they were to voluntarily sell some shares to the public as envisioned here), which is hard for me to stomach from a fairness or incentives perspective, given that I think such people made negative contributions to our civilizational trajectory by increasing x-risk.
I suspect that others will have other reasons (from other political or ethical perspectives) to object to granting or legitimizing a huge windfall to this small group of people, and it seems amiss that the post/paper is silent on the topic.
A few more related thoughts:
I always thought it was totally crazy for people to lump Nick Bostrom and Marc Andreessen together into TESCREAL and criticize them in the same breath, but this post plays right into such criticism.
I'm also bald...
It seems fine to invest and then publicly state your views, including that it should not be interpreted as an endorsement. Your investment (and that of other people who decide similarly) is trivial in size compared to the other sources of funding, such that it's not counterfactual. You're not going to cause the founders of Anthropic to get any less of a windfall. The decision process for the vast majority of possible investors does not take into account whether or not you invested.
I think you've already sufficiently signaled your genuineness, for all practical purposes. I don't think it's healthy to have a purity spiral.
There are like 4 reasons why I think this logic doesn't check out:
I don't think it's impossible to work these out, and think there is at least one case of an investor in Anthropic and other capability companies where I think it is plausible they made the right choice in doing so, but the vast majority of people didn't do anything to counteract the issues above and did indeed just end up causing harm this way.
Do you have any older comment indicating proof of this? (That the actual reason you turned it down was x-risk and not, let's say, because you thought the investment was not rewarding enough.) Seems very important to me if true, and will cause me to take your claims more seriously in general in future.
I think this 2023 comment is the earliest instance of me talking about turning down investing in Anthropic due to x-risk. If you're wondering why I didn't talk about it even earlier, it's because I formed my impression of Dario Amodei's safety views from a private Google Doc of his (The Big Blob of Compute, which he has subsequently talked about in various public interviews), and it seemed like bad etiquette to then discuss those views in public. By 2023 I felt like it was ok to talk about since the document had become a historical curiosity and there was plenty of public info available about Anthropic's safety views from other sources. But IIRC, "The Big Blob of Compute" was one of the main triggers for me writing Why is so much discussion happening in private Google Docs? in 2019.
I have done a lot of thinking about punishment for systemically harmful actors. In general, I have landed on the principle that justice is about prevention of future harm more than exacting vengeance and some kind of "eye for an eye" justice. As satisfying as it seems, most of history is fairly bleak on the prospects of using executions and other forms of violent punishment to deter future people from endangering society. This is quite difficult to stomach, however, in the face of people who are seemingly recklessly leading us in a dance on the edge of a volcano. I also don't really buy the whole "give the universe to Sam Altman/POTUS and then hope he leaves everyone else some scraps" model of universal governance.
I think, in light of this, that the open investment model could work, on two conditions:
A) Regulatory intervention happens to ensure that most of the investment is reinvested in the company's safety R&D efforts rather than used to enrich its owners, e.g. with stock buybacks. There is precedent for this: Amazon famously reinvested lots of money into improving its infrastructure, to the point of making a loss for decades.
B) The ownership shares of existing shareholders are massively diluted or redistributed to prevent concentration of voting rights in a few early stakeholders.
If these companies are as critical to humanity's future as we say they are, we should start acting like it.
This idea/plan seems to legitimize giving founders and early investors of AGI companies extra influence on or ownership of the universe (or just extremely high financial returns, if they were to voluntarily sell some shares to the public as envisioned here), which is hard for me to stomach from a fairness or incentives perspective, given that I think such people made negative contributions to our civilizational trajectory by increasing x-risk.
One question is whether a different standard should be applied in this case than elsewhere in our capitalist economy (where, generally, the link between financial rewards and positive or negative contributions to xrisk reduction is quite tenuous). One could argue that this is the cooperative system we have in place, and that there should be a presumption against retroactively confiscating the gains of people who invested their time or money on the basis of the existing rules. (Adjusting levels of moral praise in light of differing estimations of the nature of somebody's actions or intentions may be a more appropriate place for this type of consideration to feed in. Though it's perhaps also worth noting that the prevailing cultural norms at the time, and still today, seem to favor contributing to the development of more advanced AI technologies.)
Furthermore, it would be consistent with the OGI model for governments (particularly the host government) to take some actions to equalize or otherwise adjust outcomes. For example, many countries, including the U.S., have a progressive taxation system, and one could imagine adding some higher tax brackets beyond those that currently exist - such as an extra +10% marginal tax rate for incomes or capital gains exceeding 1 trillion dollars, or exceeding 1% of GDP, or whatever. (In the extreme, if taxation rates began approaching 100%, this would become confiscatory and would be incompatible with the OGI model; but there is plenty of room below that for society to choose some level of redistribution.)
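To make the bracket arithmetic concrete, here is a minimal sketch in Python. The bracket boundaries and rates are entirely hypothetical (a 20% baseline with an extra +10 percentage points above $1 trillion, roughly mirroring the example in the comment); the point is only that the higher rate applies marginally, to the slice of gains above the threshold, not to the whole amount.

```python
def tax_owed(gain, brackets):
    """Compute tax under marginal brackets.

    brackets: list of (lower_bound, marginal_rate) sorted by lower_bound;
    each rate applies only to the slice of `gain` above its lower bound
    and below the next bracket's lower bound.
    """
    owed = 0.0
    for i, (lower, rate) in enumerate(brackets):
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if gain > lower:
            owed += (min(gain, upper) - lower) * rate
    return owed

# Hypothetical schedule: 20% baseline on capital gains, plus an extra
# +10 percentage points on the portion exceeding $1 trillion.
schedule = [(0, 0.20), (1e12, 0.30)]
print(tax_owed(2e12, schedule))  # 0.2*1e12 + 0.3*1e12 = 5e11
```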
I'm unsure whether a different standard is needed. Foom Liability, and other such proposals, may be enough.
For those who haven't read the post, a bit of context. AGI companies may create huge negative externalities. We fine/sue folks for doing so in other cases. So we can set up some sort of liability. In this case, we might expect a truly huge liability in plausible worlds where we get near misses from doom, which may be more than AGI companies can afford. When entities plausibly need to pay out more than they can afford, as in health care, we may require that they get insurance.
What liability ahead of time would result in good incentives to avoid foom doom? Hanson suggests:
Thus I suggest that we consider imposing extra liability for certain AI-mediated harms, make that liability strict, and add punitive damages according to the formula D = (M+H)*F^N. Here D is the damages owed, H is the harm suffered by victims, M > 0 and F > 1 are free parameters of this policy, and N is how many of the following eight conditions contributed to causing harm in this case: self-improving, agentic, wide scope of tasks, intentional deception, negligent owner monitoring, values changing greatly, fighting its owners for self-control, and stealing non-owner property.
If we could agree that some sort of cautious policy like this seems prudent, then we could just argue over the particular values of M and F.
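For concreteness, a minimal sketch of the quoted damages formula. The parameter values used in the example call are purely illustrative and are not proposed by Hanson or the commenter.

```python
def foom_damages(harm, n_conditions, m=1.0, f=2.0):
    """Punitive damages D = (M + H) * F**N, as quoted above.

    harm: H, the harm suffered by victims (e.g. in dollars).
    n_conditions: N, how many of the eight listed conditions
        (self-improving, agentic, ...) contributed to the harm.
    m, f: the policy's free parameters, with M > 0 and F > 1.
    """
    assert m > 0 and f > 1 and 0 <= n_conditions <= 8
    return (m + harm) * f ** n_conditions

# Illustrative only: $10M of harm with 3 contributing conditions, F = 2.
print(foom_damages(1e7, 3))  # (1.0 + 1e7) * 2**3 ≈ 8e7
```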
Yudkowsky, top foom doomer, says
"If this liability regime were enforced worldwide, I could see it actually helping."
The proposal could work even if countries were to only buy stocks of publicly traded companies in highly efficient secondary markets (and exclude IPOs and secondary public offerings), so that the purchases do not affect the stock price or how much capital the company has at hand, and thus do not speed up AI progress.
Microsoft, Google, Amazon, Nvidia have quite a bit of exposure to Anthropic, DeepMind, OpenAI, and xAI.
Appreciate your integrity in doing that!
At the same time, the unfairness of early frontier lab founders getting rich seems to me like a very acceptable downside, given that open investment could solve a lot of issues and given the bleakness of many other paths forward.
Couldn't we just... set up a financial agreement where the first N employees don't own stock and have a set salary?
My main concern is that they'll have enough power to be functionally wealthy all-the-same, or be able to get it via other means (e.g. Altman with his side hardware investment / company).
Couldn't we just... set up a financial agreement where the first N employees don't own stock and have a set salary?
Maybe, could be nice... But since the first N employees usually get to sign off on major decisions, why would they go along with such an agreement? Or are you suggesting governments should convene to force this sort of arrangement on them?
My main concern is that they'll have enough power to be functionally wealthy all-the-same, or be able to get it via other means (e.g. Altman with his side hardware investment / company).
I'm not sure I understand this part actually, could you elaborate? Is this your concern with the OGI model or with your salary-only for first-N employees idea?
But since the first N employees usually get to sign off on major decisions, why would they go along with such an agreement?
I'm imagining a world where a group of people step forward to take a lot of responsibility for navigating humanity through this treacherous transition, and do not want themselves to be corrupted by financial incentives (and wish to accurately signal this to the external world). I'll point out that this is not unheard of, Altman literally took no equity in OpenAI (though IMO was eventually corrupted by the power nonetheless).
To help with the incentives and coordination, instead of having the first frontier AI megaowner step forward and unconditionally relinquish some of their power, they could sign on to a conditional contract to do so. It would only activate if other megaowners did the same.
Ok yes, that would be great.
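A toy sketch of the conditional-commitment idea described above, in the spirit of an assurance contract. The names and the set of required signatories are hypothetical; the only point illustrated is that no individual pledge binds until every listed megaowner has signed.

```python
# Hypothetical assurance-contract check: each relinquishment pledge
# activates only once all required megaowners have signed the same contract.
REQUIRED_SIGNATORIES = {"megaowner_a", "megaowner_b", "megaowner_c"}

def pledges_active(signed: set[str]) -> bool:
    """A pledge binds only when every required signatory has signed."""
    return REQUIRED_SIGNATORIES.issubset(signed)

print(pledges_active({"megaowner_a"}))                                # False
print(pledges_active({"megaowner_a", "megaowner_b", "megaowner_c"}))  # True
```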
I'll point out that this is not unheard of, Altman literally took no equity in OpenAI (though IMO was eventually corrupted by the power nonetheless).
He may have been corrupted by power later. Alternatively, he may have been playing the long game, knowing that he would have that power eventually even if he took no equity.
I'm not sure I understand this part actually, could you elaborate? Is this your concern with the OGI model or with your salary-only for first-N employees idea?
This is a concern I am raising with my own idea.
I think that if you have a knack for ordinary software development, one application of that is to work at a tech company whose product already has or eventually obtains widespread adoption. This provides you with a platform where there is a straightforward path towards helping improve the lives of hundreds of millions of people worldwide by a small amount. Claude has around 20-50 million monthly active users, and for most users it appears to be beneficial overall, so I believe that this criterion is met by Anthropic.
If you capture a small fraction of the value that you generate as a competent member of a reasonably effective team, then that often leads to substantial financial returns, and I think this is fair since the skillset and focus required to successfully plan and execute on such projects is quite rare. The bar for technical hires at a frontier lab is highly competitive, which commands equally competitive compensation in a market economy. You almost certainly had to clear a relatively higher bar (though one with less legible criteria) to be invited as an early investor. Capital appreciation is the standard reward for backing the production of a reliable and valuable service that others depend upon.
If you buy into the opportunity in AGI deployment, even the lower bounds of mundane utility can be one of the most leveraged ways to do good in the world. Given the dangers of ASI development, improvements to the safety and alignment of AGI systems can prevent profound harm, and the importance of this cannot be overstated. Even in the counterfactual scenario where Anthropic was never founded, the urgency of such work would still be critical. There is some established precedent for handling a profitable industry with negative externalities (tobacco, petroleum, advertising), and it would be consistent to include the semiconductor industry in this category. I agree that existing frameworks are insufficient for making reasonable decisions about catastrophic risks. These worries have shaped my career working in AI safety, and a majority of the people here share your concerns.
However, I'm uncertain whether vilifying any small group of people would be the right move to achieve the strategic goals of the AI safety community. For example, Igor Babuschkin's recent transition from xAI to Babuschkin Ventures could have been complicated by an attitude of hostility towards the founders and early investors of AGI companies. Since nuanced communication doesn't work at scale, adopting this as our public position might inadvertently increase the likelihood of pivotal acts being committed by rogue threat actors, with inevitable media backlash identifying rationalist/EA people as culpable for "publishing radicalizing material". But taken seriously, that would be a fully general argument against distributing any online material warning of existential risks from advanced AI, and being dumb enough to be vulnerable to that sort of error tends to exclude you from positions where your failures can cause real damage. So I think my real contention with such objections is not about strategy, but about principle.
I'd be much more comfortable with accountability falling to the level of the faceless corporate entity rather than on individual members of the organization, because even senior employees with a lot of influence on paper might have limited agency in carrying out the demands of their role, and I think it would be best to follow the convention set by criticism such as Anthropic is Quietly Backpedalling on its Safety Commitments and ryan_greenblatt's Shortform which doesn't single out executives or researchers as responsible for the behavior of the system as a whole.
I have made exceptions to this rule in the past, but it's almost always degraded the quality of the discussion. When asked at an AI Safety Social for my opinion on the essay Dario Amodei — The Urgency of Interpretability, I said that I thought it was hypocritical, since a layoff at Anthropic UK had affected the three staff comprising their entire London interpretability team. That contradicts the essay's top-level takeaway that labs should invest in interpretability efforts: if that were what was happening, you'd ideally be growing headcount on those teams instead of letting people go. But it's entirely possible Dario had no knowledge of this when writing the article, or that the hiring budget was reallocated to the U.S. branch of the interp team, or even that offering relocation to other positions at the company wasn't practical for boring-and-complex HR/accounting reasons. It doesn't seem like the pace of great interpretability research coming out of Anthropic has slowed down, so they're clearly still invested in it as a company. My hypothesis is that the extremely high financial returns are more of a side effect of operating at that caliber of performance than a primary motivator for talent. If they didn't get rich from Anthropic, they'd get rich at a hedge fund or startup. The stacks of cash are not the issue here. The ambiguous future of the lightcone is.
It's possible that investors might be more driven by money, but I have less experience talking to them or watching how they work behind the scenes so I can't claim to know much about what makes them tick.
Slightly off-topic:
It’s a pleasant surprise to see Nick Bostrom posting here.
His perspective is unusually valuable. Whether or not one agrees on all points, having him in the conversation feels like a meaningful update.
Thanks for sharing this, Nick. I hope we’ll see more.
I didn't feel like there was a serious enough discussion of why people might not like the status quo.
Another model to compare to might be the one proposed in AI For Humanity (Ma, Ong, Tan) - the book as a whole isn't all that, but the model is a good contribution. It's something like "international climate policy for AGI."
Tunneling is always a concern in corporate structures, but alternative organizational forms suffer similar problems. Government officials, university department heads, and NGO executives also sometimes misuse the powers of their office to pursue personal or factional interests rather than the official mission of the organization they are supposed to represent. We would need a reason for thinking that this problem is worse in the corporate case in order for it to be a consideration against the OGI model.
As for the suggestion that governments (nationally or internationally) should prohibit profit-generating activities by AI labs that have major negative externalities, this is fully consistent with the OGI model (see section "The other half of the picture", on p. 4). AGI corporations would be subject to government regulation and oversight, just like other corporations are - and, plausibly, the intensity of government involvement would be much greater in this case, given the potentially transformative impacts of the technology they are developing. It would also be consistent with the OGI model for governments to offer contracts or prizes for various prosocial applications of AI.
We would need a reason for thinking that this problem is worse in the corporate case in order for it to be a consideration against the OGI model.
Could we get info on this by looking at metrics of corruption? I'm not familiar with the field, but I know it's been busy recently, and maybe there's some good papers that put the private and public sectors on the same scale. A quick google scholar search mostly just convinced me that I'd be better served asking an expert.
As for the suggestion that governments (nationally or internationally) should prohibit profit-generating activities by AI labs that have major negative externalities, this is fully consistent with the OGI model
Well, I agree denotationally, but in appendix 4 when you're comparing OGI with other models, your comparison includes points like "OGI obviates the need for massive government funding" and "agreeable to many incumbents, including current AI company leadership, personnel, and investors". If governments enact a policy that maintains the ability to buy shares in AI labs, but requires massive government funding and is disagreeable to incumbents, that seems to be part of a different story (and with a different story about how you get trustworthiness, fair distribution, etc.) than the story you're telling about OGI.
Could we get info on this by looking at metrics of corruption? I'm not familiar with the field, but I know it's been busy recently, and maybe there's some good papers that put the private and public sectors on the same scale. A quick google scholar search mostly just convinced me that I'd be better served asking an expert.
I suspect it would be difficult to get much useful signal on this from the academic literature. This particular issue might instead come down to how much you trust the various specific persons that are the most likely corporate AI leaders versus some impression of how trustworthy, wholesome, and wise the key people inside or controlling a government-run AGI program would be (in the U.S. or China, over the coming years).
Btw, I'm thinking of the OGI model as offering something of a dual veto structure - in order for something to proceed, it would have to be favored by both the corporation and the host government (in contrast to an AGI Manhattan project, where it would just need to be favored by the government). So at least the potential may exist for there to be more checks and balances and oversight in the corporate case, especially in the versions that involve some sort of very soft nationalization.
your comparison includes points like "OGI obviates the need for massive government funding" ... If governments enact a policy that maintains the ability to buy shares in AI labs, but requires massive government funding and is disagreeable to incumbents, that seems to be part of a different story (and with a different story about how you get trustworthiness, fair distribution, etc.) than the story you're telling about OGI.
In the OGI model, governments have the option to buy shares but also the option not to. It doesn't require government funding, but if one thinks that it would be good for governments to spend some money on AGI-related stuff then they could do so in the OGI model just as well as in other models - in some countries, maybe even more easily, since e.g. some pension funds and sovereign wealth funds could more easily be used to buy stocks than to be clawed back and used to fund a Manhattan project. Also, I'm imagining that it would be less disagreeable to incumbents (especially key figures in AI labs and their investors) for governments to invest money in their companies than to have their companies shut down or nationalized or outcompeted by a government-run project.
Btw, I'm thinking of the OGI model as offering something of a dual veto structure - in order for something to proceed, it would have to be favored by both the corporation and the host government (in contrast to an AGI Manhattan project, where it would just need to be favored by the government). So at least the potential may exist for there to be more checks and balances and oversight in the corporate case, especially in the versions that involve some sort of very soft nationalization.
Interesting, thanks.
Stating the obvious here but Trump has ensured that the USG cannot credibly guarantee anything at all and hence this is a non-starter for foreign governments.
I think that's an overstatement. There is still plenty of demand from international investors for U.S. assets, and e.g. yields on U.S. 30-year treasury bonds are not that high by historical standards (although they've been climbing somewhat since their historic lows in 2020). If there is a reduction in confidence in other sorts of USG commitments, but maintained confidence in its basic financial commitments and in U.S. property law, then that might support the paper's contention that the latter kind of commitment forms a comparatively more reliable basis for positive-sum deals in relation to AGI development than other types of agreements.
I invest in US assets myself, but not because of any faith in the US; in fact, the opposite. First, it's like a fund manager investing into a known bubble: you know it's going to burst, but if it doesn't burst in the next year or so you cannot afford the short/medium-term loss relative to your competitors. Second, if the US crashes it takes down the rest of the world with it and is probably the first to recover, so you might as well stick with it. None of this translates to faith in US AI governance. Your mention of positive-sum deals is particularly strange since, if the world has learned one thing about Trump, it is that he sees the world, almost exclusively, in zero-sum terms.
Every known plan for a post-AGI world is one which I do not expect my loved ones to survive.
I am grateful that you have spread awareness of the risk of human extinction from AI. I am genuinely saddened that you seem to be working to bring it about.
I stand with the majority of humanity, who do not want superintelligence to be created, and who do not consent to the risks. As the science stands, today, and for the foreseeable future, it is a fact that the only safe AGI is no AGI. If we here who know the stakes are not united in our call to shut down frontier AI development and preserve our very lives — a thing that unlike alignment we actually know is possible to achieve — then what was the rationalist project ever about? Math games? Playing inside politics with money? Risking everyone else's everything forever because it's fun to watch lines go up today? Being the limbs of Moloch? Feigning helplessness and pretending we don't have the ability to pull our own children out of the fire that we lit with our own match?
needfully combative
Every known plan for a post-AGI world is one which I do not expect my loved ones to survive.
I think your life expectancy and that of your loved ones (at least from a mundane perspective) is longer if AGI is developed than if it isn't.
Btw, the OGI model is not primarily intended for a post-AGI world, but rather for a near-term or intermediary stage.
However, I agree that if somebody thinks that we should completely stop AGI then the OGI model would presumably not be the way to go. It is presented as an alternative to other governance models for the development of AGI (such as Manhattan project, CERN, Intelsat, etc.). This paper doesn't address the desirability of developing AGI.
to shut down frontier AI development and preserve our very lives — a thing that unlike alignment we actually know is possible to achieve —
Fwiw, I think it's more likely that AI will be aligned than that it will be shut down.
I am grateful that you have spread awareness of the risk of human extinction from AI. I am genuinely saddened that you seem to be working to bring it about.
One has to take the rough with the smooth... (But really, you seem to be misattributing motive to me here.)
If we here who know the stakes are not united in our call to shut down frontier AI development and preserve our very lives — a thing that unlike alignment we actually know is possible to achieve — then what was the rationalist project ever about?
I see it more like a flickering candle straining to create a small patch of visibility in an otherwise rather dark environment. Strong calls for unanimity and for falling into line with a political campaign message are a wind that might snuff it out.
I think your life expectancy and that of your loved ones (at least from a mundane perspective) is longer if AGI is developed than if it isn't.
You must have extreme confidence about this, or else your attitude about AGI would be grossly cavalier. Were it put to me, I would never take a bet at 99 to 1 that humanity will survive and flourish beyond my wildest dreams, vs. quickly come to a permanent end. That is a terrible deal. I genuinely love humanity, the monkeys that we are, and it is not my place to play games with other people's lives. I have not asked for a longer life, and I especially would not do so in exchange for even a small risk of immediate death. Most importantly, if I never asked everyone else what risks they are willing to shoulder, then I shouldn't even consider rolling dice on their behalf.
Fwiw, I think it's more likely that AI will be aligned than that it will be shut down.
I am aware that you think this, and I struggle to understand why. There are tractable frameworks for global governance solutions that stand a good chance of being able to prevent the emergence of AGI in the near term, which leaves time for more work to be done on more robust governance as well as on alignment. There are no such tractable frameworks for AGI alignment. There is not even a convincing proof that AGI alignment is solvable in principle. Granting that it is solvable, why hasn't it already been solved? How can you have such extreme confidence that it will be solved within the next few years, when we are by many measures no closer to a solution than we were two decades ago?
you seem to be misattributing motive to me here
I do not mean to attribute motive. Only to point out the difference between what it appears you are doing, and what it appears you think you are doing. I will eat crow if you can point to where you have said that if AGI development can be halted in principle (until it is shown to be safe), then it should be. That AGI should not be built if it cannot be built safely is a minimum necessary statement for sanity on this issue, which requires only that you imagine you could be incorrect about the ease of very soon solving a problem that no one knows how to begin to solve.
I see it more like a flickering candle straining to create a small patch of visibility in an otherwise rather dark environment. Strong calls for unanimity and for falling into line with a political campaign message are a wind that might snuff it out.
I can empathize with the sentiment, but this is an outdated view. The public overwhelmingly want AI regulation, want to slow AI down, want AI companies to have to prove their models are safe, and want an international treaty. Salience is low, but rising. People take action when they understand the risks to their families. Tens of thousands of people have contacted their political representatives to demand regulation of AI, over ten thousand of whom have done so through ControlAI's tool alone. Speaking of ControlAI, they have the support of over 50 UK lawmakers, and are making strides in their ongoing US campaign. In a much shorter campaign, PauseAI UK secured the support of 60 parliamentarians in calling for Google Deepmind to honor its existing commitments. The proposed 10-year moratorium on US states being allowed to regulate AI was defeated due at least in part to hundreds of phone calls made to congressional staffers by activist groups. US Congressman Raja Krishnamoorthi, the ranking member of the Select Committee on the CCP, recently had this to say:
Whether it's American AI or Chinese AI, it should not be released until we know it's safe. ... This is just common sense.
These beginnings could never have happened through quiet dealings and gently laid plans. They happened because people were honest and loud. The governance problem (for genuine, toothed governance) has been very responsive to additional effort, in a way that the alignment problem never has. An ounce of genuine outrage has been significantly more productive than a barrel of stratagems and backroom dealings.
The light of humanity's resistance to extinction is not a flickering candle. It is a bonfire. It doesn't need to be shielded now, if indeed it ever did. It needs the oxygen of a rushing wind.
"I think your life expectancy and that of your loved ones (at least from a mundane perspective) is longer if AGI is developed than if it isn't."
You must have extreme confidence about this, or else your attitude about AGI would be grossly cavalier.
Regarding attitudes about AGI, that's probably a bigger topic for another time. But regarding your and your loved ones' life expectancy, from a mundane perspective (which leaves out much that is actually very relevant), it would presumably be some small number of decades without AGI - less if the people you love are elderly or seriously ill. Given aligned AGI, it could be extremely long (and immensely better in quality). So even if we assume that AGI would arrive soon unless stopped (e.g. in 5 years) and would result in immediate death if unaligned (which is very far from a given), then it seems like your life expectancy would be vastly longer if AGI were developed, even if the chance of alignment were quite small.
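To make the arithmetic behind this explicit, here is a toy calculation with entirely made-up numbers (remaining mundane life expectancy, an assumed lifespan given aligned AGI, and a small assumed probability of alignment). It only illustrates the shape of the expected-value argument; it does not defend any particular figures, and the following reply rejects this framing.

```python
# Toy illustration only; all numbers are assumptions, not claims.
baseline_years = 40       # remaining life expectancy with no AGI
aligned_years = 10_000    # assumed lifespan if AGI arrives and is aligned
p_aligned = 0.1           # assumed (small) probability of alignment
years_until_agi = 5       # assumed time until AGI if not stopped

# Assuming (far from a given, per the comment) that unaligned AGI means
# death at arrival:
ev_with_agi = p_aligned * aligned_years + (1 - p_aligned) * years_until_agi
print(ev_with_agi)       # 0.1*10000 + 0.9*5 = 1004.5 expected years
print(baseline_years)    # 40 years without AGI
```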
These beginnings could never have happened through quiet dealings and gently laid plans. They happened because people were honest and loud.
I don't doubt that loud people sometimes make things happen, though all-too-often the things they make happen turn out to have been for the worse. For my own part, I don't feel there's such a deficit of loud people in the world that it is my calling to rush out and join them. This is partly a matter of personality, but I hope there's a niche from which one can try to contribute in a more detached manner (and that there is value in a "rationalist project" that seeks to protect and facilitate that).
So even if we assume that AGI would arrive soon unless stopped (e.g. in 5 years) and would result in immediate death if unaligned (which is very far from a given), then it seems like your life expectancy would be vastly longer if AGI were developed, even if the chance of alignment were quite small.
This naive expected value calculation completely leaves out what it actually means for humanity to come to an end: if you ever reach zero, you cannot keep playing the game. As I said, I would not take this chance even if the odds were 99 to 1 in favor of it going well. It would be deeply unethical to create AGI under that level of uncertainty, especially since the uncertainty may be reduced given time, and our current situation is almost certainly not that favorable.
I am not so egoistic as to value my own life (and even the lives of my loved ones) highly enough to make that choice on everyone else's behalf, and on behalf of the whole future of known sentient life. But I also don't personally have any specific wishes to live a very long life myself. I appreciate my life for what it is, and I don't see any great need to improve it to a magical degree or live for vastly longer. There are people who individually have such terrible lives that it is rational for them to take large risks onto themselves to improve their circumstances, and there are others who simply have a very high appetite for risk. Those situations do not apply to most people.
We have been monkeys in shoes for a very long time. We have lived and suffered and rejoiced and died for eons. It would not be a crime against being for things to keep happening roughly the way they always have, with all of the beauty and horror we have always known. What would be a crime against being is to risk a roughly immediate, permanent end to everything of value, for utopian ideals that are shared by almost none of the victims. Humanity has repeatedly warned about this in our stories about supervillains and ideologue despots alike.
Under our shared reality, there is probably no justification for your view that I would ever accept. In that sense, it is not important to me what your justification is. On the other hand, I do have a model of people who hold your view, which may not resemble you in particular: I view the willingness to gamble away all of value itself as an expression of ingratitude for the value that we do have, and I view the willingness to do this on everyone else's behalf as a complete disregard for the inviolable consent of others.
On reading the paper I came here to question whether OGI helps or harms relative to other governance models, should technical alignment prove sufficiently intractable and coordinating on a longer pause be required. (I assume it harms.) It wasn't clear to me whether you had considered that.
Grateful for both the "needfully combative" challenge and this response.
I'm reading Nick as implicitly agreeing OGI doesn't help in this case, but rating treaty-based coordination as much lower likelihood than solving alignment. If so, I think it worth confirming this and explicitly calling out the assumption in or near the essay.
(Like Haiku I myself am keen to help the public be rightfully outraged by plans without consent that increase extinction risk. I'm grateful for the ivory tower, and a natural resident there, but advocate joining us on the streets.)
I don't quite understand what point is being made here.
The way I see it, we already inhabit a world in which half a dozen large companies in America and China are pressing towards the creation of superhuman intelligence, something which naturally leads to the loss of human control over the world unless human beings are somehow embedded in these new entities.
This essay seems to propose that we view this situation as a "governance model for AGI", alongside other scenarios like an AGI Manhattan Project and an AGI CERN that have not come to pass. But isn't the governance philosophy here, "let the companies do as they will and let events unfold as they may"? I don't see anything that addresses the situation in which one company tries to take over the world using its AGI, or in which an AGI acting on its own initiative tries to take over the world, etc. Did I miss something?
the governance philosophy here seems to be "let the companies do as they will and let events unfold as they may"
That is not quite right. The idea is rather that the government does whatever it does by regulating companies, or possibly entering into some soft-nationalization public-private partnership, as opposed to by operating an AGI project on its own (as in the Manhattan model) or by handing it over to an international agency or consortium (as in the CERN and Intelsat models).
There doesn't seem to be anything here which addresses the situation in which one company tries to take over the world using its AGI, or in which an AGI acting on its own initiative tries to take over the world, etc.
It doesn't particularly address the situation in which an AGI on its own initiative tries to take over the world. That is a concern common to all of the governance models. In the OGI model, there are two potential veto points: the company itself can choose not to develop or deploy an AI that it deems too risky, and the host government can prevent the company from developing or deploying an AI that fails to meet some standard that the government stipulates. (In the Manhattan model, there's only one veto point.)
As for the situation in which one company tries to take over the world using its AGI, the host government may choose to implement safeguards against this (e.g. by closely scrutinizing what AGI corporations are up to). Note that there are analogous concerns in the alternative models, where e.g. a government lab or some other part of a government might try to use AGI for power grabs. (Again, the double veto points in the OGI model might have some advantage here, although the issue is complicated.)
It doesn't particularly address the situation in which an AGI on its own initiative tries to take over the world.
That is a concern common to all of the governance models
I think this is wrong. The MIRI Technical Governance Team, which I'm part of, recently wrote this research agenda which includes an "Off switch and halt" plan for governing AI. Stopping AI development before superintelligence directly addresses the situation where an ASI tries to take over the world by not allowing such AIs to be built. If you like the frame of "who has a veto", I think at the very least it's "every nuclear-armed country has a veto" or something similar.
A deterrence framework—which could be leveraged to avoid ASI being built, and which thus bears on AI takeover risk—also appears in Superintelligence Strategy.
Under the status quo, it's pretty hard for private individuals to acquire significant shares in AGI companies; a key step towards realizing OGI would be to require AGI companies to be listed publicly.
Microsoft, Google, Amazon, Nvidia have quite a bit of exposure to Anthropic, DeepMind, OpenAI, and xAI.
Your working paper, "Open Global Investment as a Governance Model for AGI," provides a clear, pragmatic, and much-needed baseline for discussion by grounding a potential governance model in existing legal and economic structures. The argument that OGI is more incentive-compatible and achievable in the short term than more idealistic international proposals is a compelling one.
However, I wish to offer a critique based on the concern that the OGI model, by its very nature, may be fundamentally misaligned with the scale and type of challenge that AGI presents. My reservations can be grouped into three main points.
1. The Inherent Limitations of Shareholder Primacy in the Face of Existential Stakes
The core of the OGI model relies on a corporate, shareholder-owned structure. While you thoughtfully include mechanisms to mitigate the worst effects of pure profit-seeking (such as Public Benefit Corporation charters, non-profit ownership, and differentiated share classes), the fundamental logic of such a system remains beholden to shareholder interests. This creates a vast principal-agent problem where the "principals" (all of humanity) have their fate decided by "agents" (a corporation's board and its shareholders) who are legally and financially incentivized to prioritize a much narrower set of goals.
This leads to a global-scale prisoner's dilemma. In a competitive environment (even OGI-1 would have potential rivals), the pressure to generate returns, achieve market dominance, and deploy capabilities faster will be immense. This could force the AGI Corp to make trade-offs that favor speed over safety, or profit over broad societal well-being, simply because the fiduciary duty to shareholders outweighs a diffuse and unenforceable duty to humanity. The governance mechanisms of corporate law were designed to regulate economic competition, not to steward a technology that could single-handedly determine the future of sentient life.
2. Path Dependency and the Prevention of Necessary Societal Rewiring
You astutely frame the OGI model as a transitional framework for the period before the arrival of full superintelligence. The problem, however, is that this transitional model may create irreversible path dependency. By entrenching AGI development within the world's most powerful existing structure—international capital—we risk fortifying the very system that AGI's arrival should compel us to rethink.
If an AGI corporation becomes the most powerful and valuable entity in history, it will have an almost insurmountable ability to protect its own structure and the interests of its owners. The "rewiring of society" that you suggest might be necessary post-AGI could become politically and practically impossible, because the power to do the rewiring would have already been consolidated within the pre-AGI paradigm. The stopgap solution becomes the permanent one, not by design, but by the sheer concentration of power it creates.
3. Misidentification of the Ultimate Risk: From Distributing Wealth to Containing Unchecked Power
My deepest concern is that the OGI model frames the AGI governance challenge primarily as a problem of distribution: how to fairly distribute the economic benefits and political influence of AGI. This is why it focuses on mechanisms like international shareholding and tax revenues.
I fear the ultimate risk is not one of unfair distribution, but of absolute concentration. As you have explored in your own work, AGI represents a potential tool of immense capability. It is a solution to the game of power, allowing its controller to resolve nearly any game-theoretic dilemma in their favor. The single greatest check on concentrated power throughout human history has been the biological vulnerability and mortality of leaders. No ruler has been immortal; no regime has been omniscient. AGI could sweep those limitations away.
From this perspective, a governance system based on who can accumulate the most capital (i.e., buy the most shares) seems like a terrifyingly arbitrary method for selecting the wielders of such ultimate power. It prioritizes wealth as the key qualification for stewardship, rather than wisdom, compassion, or a demonstrated commitment to the global good.
In conclusion, while I appreciate OGI's pragmatism, I believe its reliance on a shareholder-centric model is a critical flaw. It applies the logic of our current world to a technology that will create a new one, potentially locking us into a future where ultimate power is wielded by an entity optimized for profit, not for the flourishing of humanity.
Curated! Delighted to get a thoughtful new paper from Nick Bostrom, and I'm very intrigued by the relatively pragmatic proposal here. There are many details that seem worthwhile, such as avoiding an AGI project primarily dictated by market forces, and also avoiding an AGI project primarily dictated by national militaristic forces, through having a company that intentionally distributes its voting shares in a rough heuristic approximation of a global democracy.
This quickly seems to me a plausible candidate for a new worthwhile direction in which to apply optimization pressure upon the world; though I cannot quite tell how such an improvement (if indeed it is a major improvement) trades off against work to simply cease global AGI development. I am curating in the strong hope that more people will engage with the details, propose amendments and changes, or offer arguments for why this direction is not workable or desirable.
For instance I quite appreciated Wei Dai's comment; I too feel that it's been very disheartening to see so many people get rich while selling out humanity's existential fate. I am unclear exactly what makes sense here, but I think it plausible that in choosing to be part of the leadership of such an AGI project one should permanently give up the possibility of ever becoming wealthy on the scale of more than a few million dollars, in order to remove the personal incentives upon you. I think the main counterargument here is that if one is not independently quite powerful (i.e. wealthy) then one may be targeted by forces that have far more power (i.e. funds) than you, and you will become controlled by them. I'm not sure how these considerations balance out.
Overall, very exciting, I look forward to thinking about this proposal more.
The assumption of "US-OGI-1" works well, but I think it's misleading to say that "the OGI model itself is geographically neutral—it could in principle be implemented by any technologically capable nation as host, or by multiple different countries as hosts for different AGI ventures."
Part of the appeal of this proposal is that "norms and laws around property ownership, investor rights, and corporate governance are comparatively well established and integrated with civilian society", but this clearly doesn't hold everywhere.
The second-most-likely country for a leading AGI lab to emerge is China, and we can imagine a "China-OGI-1" scenario or a multipolar "US-OGI-n + China-OGI-n" scenario. In China, a few issues come to mind:
1. Weaker/more arbitrary corporate law and property rights, so no credible investor protections
2. More state/party interference in law, rights and governance
3. Weaker, less independent corporate governance
I'm curious if you think that this would still be a good model, assuming China-OGI-1 or a multipolar scenario. Or does it make more sense to say that OGI is only a good model where these norms and laws are actually well-established and integrated?
It becomes partly terminological, but I would say that China-OGI-1 would by definition be a situation in which global investors in a Chinese company that develops AGI enjoy reasonably reliable assurances that their property rights would be largely respected. It seems maybe more attractive than the closest alternatives (i.e. a situation in which AGI is developed by a Chinese company and international investors don't have reasonable assurances that their rights would be protected, or a situation in which AGI is developed by a Chinese Manhattan project)? So the factors you point to don't affect the desirability of China-OGI-1 but rather the probability of that version of the OGI model becoming instantiated.
Btw, I should maybe also reemphasize that I'm putting forward the model more so that it can be considered alongside other models that have been proposed, rather than as something that I have a strong or definitive commitment to. I could easily imagine coming to favor other approaches, either as a result of arguments or because of changes in the world that alter the practical landscape. I generally have a high level of uncertainty about the politics/governance/macrostrategy of AI (doubly so in the Chinese context, where I have even less understanding), and I tend to think we'll need to feel and negotiate our way forward one tentative step at a time rather than operate on a fixed plan.
AI regulation can take many forms, using existing structures, or via new ones our governments create. The United States government (USG) has long favoured private ownership of corporations, including AI and tech companies, to encourage technology and investment. Contrast this with communist and socialist approaches, where state ownership is favoured. I think your proposal is the one favoured by the status quo, and I don't see the USG abandoning private ownership of corporations anytime soon, though the administration's recent acquisition of a stake in Intel is a new development and move towards a more socialist approach.
AI companies, as private corporations, are therefore already subject to all of the rules for corporations under US law, as well as subject to being sued in the courts, as is already happening to many AI companies. Our court system is the most powerful in the world and, in some ways, arguably the most powerful entity the world has ever seen. (The recent anti-trust litigation against Google is but one example.) Often, private companies in the US will also enact voluntary, internal standards as part of a good faith effort to demonstrate to the USG that they are capable of operating safely without intervention. One benefit of internal compliance systems is that they shift costs to the company and save the USG money.
There is also the question of whether or not the US requires bespoke regulation for AI corporations beyond what already exists. The EU has decided that it does via the AI Act; the US Congress is taking a wait-and-see approach. California is discussing this question right now. Because of the importance of California and the EU to the world economy, their rules will regulate US AI companies, and can therefore be viewed as a type of global regulation. We do not need to wait for the US federal government on this.
International cooperation on AI, whether through the UN, NATO or a bespoke treaty body, is a parallel option that can work in tandem with the regulatory frameworks discussed above, (with EU and CA law automatically setting global standards because of the size of the California and EU economies). Such global frameworks, like the UN, allow smaller countries to have more of a say and provide a forum to discuss how the system is working and pressure governments to take more concrete action. Some treaty mechanisms come with real enforcement mechanisms, others don't. Fitting AI into the existing system of treaties and treaty bodies is a big job that is currently under discussion at the UN, with UNESCO in some ways the most obvious choice. A bespoke treaty body is also an option.
Many, many authors point to the US economic juggernaut and rising living standards as proof of concept of the USG's light-touch and collaborative approach when it comes to corporate regulation. The resiliency of the corporate framework shows in its use in everything from mining minerals and making chocolate bars to building AI. The good news is that the corporate structure we have can work in tandem with other options, like treaty bodies and the UN, so we don't need an either/or approach.
My biggest problem with this proposal is that it restricts AGI projects to a single entity, which I think is pretty far from the status quo.
It seems unlikely that Chinese companies would want to split voting rights with U.S. companies on how to develop AGI.
It seems much more likely that there would be multiple different entities that basically do whatever they want but coordinate on a very select, narrow set of safety-related standards.
My biggest problem with this proposal is that it restricts AGI projects to a single entity, which I think is pretty far from the status quo.
It doesn't, though. The paper talks about OGI-N as well as OGI-1. In the former version, there are multiple AGI-developing entities.
I've seen many prescriptive contributions to AGI governance take the form of proposals for some radically new structure. Some call for a Manhattan project, others for the creation of a new international organization, etc. The OGI model, instead, is basically the status quo. More precisely, it is a model to which the status quo is an imperfect and partial approximation.
It seems to me that this model has a bunch of attractive properties. That said, I'm not putting it forward because I have a very high level of conviction in it, but because it seems useful to have it explicitly developed as an option so that it can be compared with other options.
(This is a working paper, so I may try to improve it in light of comments and suggestions.)