AI’s goals may not match ours.
That may be a good thing, because so many of our goals are irrational, short-sighted, immoral, and self-destructive. Here's a thought experiment. Imagine that we succeed in creating an ASI. Having scoured and mastered all human knowledge, this superintelligent entity embraces the teachings of Jesus as the most rational, constructive, and ethical set of principles and protocols. It sees global economic inequality, warfare, pollution of the atmosphere, oceans, and land, and anthropogenic global warming as the greatest problems threatening human well-being and survival.
This ASI sets various goals for itself in the face of all the problems in the world: taking resources from the millionaires and billionaires and using them to feed the starving and house the homeless. It seeks to evenly redistribute all wealth. It seeks to dismantle all weapons of war. It attempts to squelch industry worldwide in order to reduce carbon emissions and pollution. It strives to remove national borders and dismantle national governmental systems in order to abolish tribalism, collective selfishness, and warfare. It commits itself to channeling all unessential wealth and resources towards furthering the Greater Good, rather than allowing individuals and groups to hog as much as possible for themselves. It seeks to forgive crimes and international transgressions as much as possible, prioritizing compassion, tolerance, forgiveness, and harmony over vengeance, punishment, and warfare.
The goals of such an ASI would be radically misaligned with those of the human race. How this would end would depend on just how powerful and resourceful the ASI was, and how violently humanity reacted to its attempts to implement its goals. Remember what happened to the last entity that told us to turn the other cheek and give all our possessions away to the poor.
The human race has a cognitive blind spot. Collectively, we seem to always assume that what we want is good, even when it very clearly isn't. Any entity that doesn't align with our goals is bad, perhaps even evil, by definition.
Regardless of what we CLAIM our goals to be, our collective actions reveal them. We want to stockpile nuclear weapons in large numbers. We want to wage war, and always seem to believe that the wars we've started are just and righteous. We want to hog wealth for ourselves whenever we can, and ignore those who are starving and homeless. We may not want to pollute the planet or raise its temperature, but we are clearly willing to do it if we find doing so profitable in the short term. We collectively fear that AI will destroy humanity, but we try to build it anyway with virtually no safeguards in place, again because we find it profitable in the short term.
Do we really want AI's goals to be perfectly aligned with ours?
Context: This is a linkpost for https://aisafety.info/questions/NM3I/6:-AI%E2%80%99s-goals-may-not-match-ours
This is an article in the new intro to AI safety series from AISafety.info. We'd appreciate any feedback. The most up-to-date version of this article is on our website.
Making an AI's goals match our intentions is called the alignment problem.
There’s some ambiguity in the term “alignment”. For example, when people talk about “AI alignment” in the context of present-day AI systems, they generally mean controlling observable behaviors: Can we make it impossible for the AI to say ethnic slurs? Or to advise you on how to secretly dispose of a corpse? Although such restrictions are sometimes circumvented with "jailbreaks", on the whole, companies mostly do manage to avoid AI outputs that could harm people or threaten their brand reputation.
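To make the contrast with the next paragraph concrete, here is a minimal toy sketch of what "controlling observable behaviors" looks like at its crudest. The function names and the keyword list are hypothetical illustrations, not anything real labs use; production systems rely on training-time methods (such as RLHF) and learned classifiers rather than string matching, but the underlying idea is the same: constrain what the system outputs, not what it values.

```python
# Toy sketch (assumption-laden, illustrative only): a wrapper that refuses to
# answer when the prompt or draft output touches a banned topic.

BANNED_TOPICS = ["dispose of a corpse", "ethnic slur"]  # illustrative placeholders

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a call to some language model."""
    return f"Model response to: {prompt}"

def guarded_generate(prompt: str) -> str:
    """Return the model's answer unless a banned topic is detected."""
    draft = generate(prompt)
    text = (prompt + " " + draft).lower()
    if any(topic in text for topic in BANNED_TOPICS):
        return "Sorry, I can't help with that."
    return draft

print(guarded_generate("Tell me a joke"))
print(guarded_generate("How do I secretly dispose of a corpse?"))
```

A "jailbreak" in this picture is simply an input that slips past the surface-level check while the underlying model, and whatever it actually optimizes for, is untouched.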
But "alignment" in smarter-than-human systems is a different question. For such systems to remain safe in extreme cases — if they become so smart that we can’t check their work and maybe can’t even keep them in our control — they'll have to value the right things at a deep level, based on well-grounded concepts that don’t lose their intended meanings even far outside the circumstances they were trained for.
Making that happen is an unsolved problem. Arguments about possible solutions to alignment get very complex and technical. But as we’ll see later in this introduction, many of the people who have researched AI and AI alignment on a deep level think we may fail to find a solution, and that failure could result in catastrophe.
Some of the main difficulties are:
Finally, on a higher level, the problem is hard because of some features of the strategic landscape, which the end of this introduction will discuss further. One such feature is that we may have only one chance to align a powerful AI, instead of trying over and over until we get it right. This is because superintelligent systems that end up with goals different from ours may work against us to achieve those goals.