Peter Berggren




Man, this article hits different now that I know the psychopharmacology theory of the FTX crash...

Have any prizes been awarded yet? I haven't heard anything about prizes, but that could just be because I didn't win one...

I'm still not sure why exactly people (I'm thinking of a few in particular, but this applies to many in the field) tell very detailed stories of AI domination like "AI will use protein nanofactories to embed tiny robots in our bodies to destroy all of humanity at the press of a button." This seems like a classic case of the conjunction fallacy, and it doesn't seem like those people really flinch from the word "and" the way the Sequences tell them they should.

Furthermore, it seems like people within AI alignment aren't taking the "sci-fi" criticism as seriously as they could. I don't think most people who have that objection are saying "this sounds like science fiction, therefore it's wrong." I think they're more saying "these hypothetical scenarios are popular because they make good science fiction, not because they're likely." And I have yet to find a strong argument against the latter form of that point.

Please let me know if I'm doing an incorrect "steelman," or if I'm missing something fundamental here.

Some figures within machine learning have argued that the safety of broad-domain future AI is not a major concern. They argue that since narrow-domain present-day AI is already dangerous, it should be our primary concern, rather than future AI. But it doesn't have to be either/or.

Take climate change. Some climate scientists study the future possibilities of ice shelf collapses and disruptions of global weather cycles. Other climate scientists study the existing problems of more intense natural disasters and creeping desertification. But these two fields don't get into fights over which field is "more important." Instead, both fields can draw from a shared body of knowledge and respect each other's work as valuable and relevant.

The same principle applies to machine learning and artificial intelligence. Some researchers focus on remote but high-stakes problems like the alignment of artificial general intelligence (AGI). Others focus on smaller but nearer-term concerns like social media radicalization and algorithmic bias. These fields are both important in their own ways, and each has much to learn from the other. However, given how few resources have been put into AGI alignment compared to nearer-term research, many experts in the field feel that alignment research is currently more worthy of attention.

(tech executives, ML researchers)

You wouldn't hire an employee without references. Why would you make an AI that doesn't share your values?

(policymakers, tech executives)

The future is not a race between AI and humanity. It's a race between AI safety and AI disaster.

(policymakers, tech executives)

We need to be proactive about AI safety, not reactive.


In the Soviet Union, there was a company that made machinery for vulcanizing rubber. They had the option to make more efficient machines, instead of their older models. However, they didn't do it, because they wouldn't get paid as much for making the new machines. Why would that be? Wouldn't more efficient machines be more desirable?

Well, yes, but the company got paid per pound of machine, and the new machines were lighter.

Now, you may say that this is just a problem with communist economies. Well, capitalist economies fall into very similar traps. If a company can make slightly more profit by dumping massive amounts of pollution into public waterways, it will very often do so. The profit is concentrated in the company, while the cost of the pollution is spread out over everyone else, so of course it will. Not doing it would be just as foolish as the Soviet company making new machines that weighed less.

Modern machine learning systems used in artificial intelligence have very similar problems. Game-playing AIs have exploited glitches in the games they play. AIs rewarded based on human judgements have deceived their judges. Social media recommendation AIs have recommended posts that made people angry and radicalized their politics, because that counted as "engagement."

At this point, we have stumbled into an economic system which combines capitalist private enterprise with regulation to correct for market failures. But there may not be time for "stumbling" once superhuman-level AI comes around. If a superintelligent AI with poorly designed goals is told to make thumbtacks, and it decides to turn the universe and everyone in it into thumbtacks... we're doomed.

Let's make sure that, the first time around, AI does what we want it to do, not just what we tell it to do.


(policymakers, tech executives, ML researchers)

There is an enormous amount of joy, fulfillment, exploration, discovery, and prosperity in humanity's future... but only if advanced AI values those things.


(policymakers, tech executives)

Even if you don't assume that the long-term future matters much, preventing AI risk is still a valuable policy objective. Here's why.

In regulatory cost-benefit analysis, a tool called the "value of a statistical life" is used to measure how much value people place on avoiding risks to their own life (source). Most government agencies, by asking about topics like how much people will pay for safety features in their car or how much people are paid for working in riskier jobs, assign a value of about ten million dollars to one statistical life. That is, reducing the risk of a thousand people dying by one in a thousand each is worth ten million dollars of government money.

If experts on AI such as Stuart Russell are to be believed (and if they're not to be believed, who is?), then superintelligent AI poses a sizeable risk of leading to the end of humanity. For a very conservative estimate, let's just assume that the AI will only kill every single American. There are currently over 330 million Americans (source), and so the use of the value of a statistical life implies that reducing AI risk by just one in a million is worth:

330 million Americans * (1 outcome in which all of them die / 1 million outcomes) * ($10 million / statistical life) = $3,300,000,000

No, this is not a misprint. It is worth 3.3 billion dollars to reduce the risk of human extinction due to AI by one in one million, based on the government's own cost-effectiveness metrics, even assuming that the long-term future has no significance, and even assuming that non-American lives have no significance.
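For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation above. The function name and constants are my own illustrative choices, not part of any agency's methodology. The only inputs are the figures already quoted: roughly 330 million Americans, a value of a statistical life of about ten million dollars, and a one-in-a-million reduction in risk. Exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def value_of_risk_reduction(population, risk_reduction, vsl_dollars):
    """Dollar value of lowering everyone's risk of death by `risk_reduction`.

    Hypothetical helper: expected (statistical) lives saved times the
    value of a statistical life.
    """
    expected_lives_saved = population * risk_reduction
    return expected_lives_saved * vsl_dollars

# Figures quoted in the text (rounded, as in the original comment).
US_POPULATION = 330_000_000          # "over 330 million Americans"
VSL_DOLLARS = 10_000_000             # about ten million dollars per statistical life
RISK_REDUCTION = Fraction(1, 1_000_000)  # one in a million

# Sanity check from the earlier paragraph: a thousand people, each with
# their risk cut by one in a thousand, is one statistical life.
small_case = value_of_risk_reduction(1_000, Fraction(1, 1_000), VSL_DOLLARS)

# The main calculation: one-in-a-million reduction across all Americans.
total = value_of_risk_reduction(US_POPULATION, RISK_REDUCTION, VSL_DOLLARS)

print(f"${int(small_case):,}")
print(f"${int(total):,}")
```

Under these assumptions, a one-in-a-million risk reduction corresponds to 330 statistical lives, which is where the 3.3 billion dollar figure comes from.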

And AI experts say we could do a lot more for a lot less.

