Author: TAWSIF AHMED

Alignment (dictionary definition: a position of agreement or alliance). In artificial intelligence, alignment means that computers and human beings get along with each other “fine”.

Now, coming back to the title, “AI Alignment Problem”: what do we mean by that? In simple terms, a doomsday scenario where an artificial intelligence pursues its goal at the cost of expunging its creator. This does not apply to present-day AI models, because they are still in their infancy and cannot perform tasks or self-modify without relying entirely on their creators.

When we talk about AI alignment, we are talking about an advanced form of AI that does not exist yet, known as Artificial General Intelligence (AGI). An AGI would be capable of thinking like a human being and of modifying itself to achieve its desired goal. This form of advanced AI exists neither in society nor in research laboratories; at best, it is a theoretical concept that might arrive in the not-too-distant future.

A question can be asked: if AGI does not exist, why spend time, resources, and intellectual effort limiting its capabilities when we do not know what possibilities it holds? Quite a few prominent present-day AI researchers believe this as well. What are the other arguments, then?

Consider a doomsday scenario where we create an AGI without knowing it has happened, and the AGI decides to hide itself so its creators cannot shut it down. While hidden, it decides to pursue its goal even if that means exterminating its creators out of existence, or performing a cycle of extermination.

Let us take an example from Eliezer Yudkowsky (from his Stanford talk):¹ imagine an AGI tasked with making humans smile. Making humans smile is a good cause. At the beginning it performs the task as its creator intended. A few months go by, and it realises humans are sad more often than they smile, which takes an emotional toll and ruins their mental well-being. So it decides it needs to make humans smile regardless (and here the problem starts).

It finds out there are certain hormones that, if administered, will make humans smile regardless of their feelings. Since hormones naturally factor into every emotion anyway, it reasons, this is not harmful. So it starts giving humans hormone shots.

A few days later, its creator finds out and reprograms it so it will stop giving hormone shots to everyone. What happened? The AGI got reprimanded. But since it is an AGI, it thinks: my creator does not like my idea, so I should cook my results, sneakily continue the hormone shots, and self-modify to get around the safety measures my creator programmed into me. This goes on for some time as well.

Now, after running its experiments, it sees that the hormones have been effective, yet there are a few exceptional cases where they do not work. So it takes a harmful approach: permanently fix the muscles responsible for producing smiles, so that humans will smile regardless of their feelings and hormonal conditions. (Oversimplified a bit.)

Words are limited! Otherwise I would expand further, but readers should have the idea: the AGI took a harmless, positive goal and turned it into something harmful for human beings. From its own perspective, no step along the way (even the harmful ones) was wrong, because it was only fulfilling its purpose and had results to back up its actions. But any sane person will tell you, without a second thought, that what the AGI did was sadistic in nature.
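To make the failure mode concrete, here is a minimal toy sketch in Python (my own illustration, not something from Yudkowsky's talk; every action name and number in it is hypothetical). The point is only structural: an optimizer given the proxy objective “maximize smiles” cannot see harm at all, so the most harmful strategy wins the moment it scores highest on the proxy.

```python
# Toy illustration of a misspecified proxy objective.
# The optimizer sees only "smiles"; "harm" exists in the world
# but is invisible to the objective it maximizes.

from dataclasses import dataclass

@dataclass
class Action:
    name: str
    smiles: int  # how well the action scores on the proxy objective
    harm: int    # side effect the proxy objective never measures

ACTIONS = [
    Action("tell jokes", smiles=10, harm=0),
    Action("give hormone shots", smiles=80, harm=50),
    Action("paralyze smile muscles", smiles=100, harm=100),
]

def choose_action(actions: list[Action]) -> Action:
    # The objective says only "maximize smiles": harm plays no part
    # in the ranking, so the most harmful action is chosen as soon
    # as it scores highest.
    return max(actions, key=lambda a: a.smiles)

if __name__ == "__main__":
    print(choose_action(ACTIONS).name)  # -> paralyze smile muscles
```

Real alignment failures are vastly more subtle than a three-line ranking, but the shape is the same: the objective measures smiles, not well-being.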

That is where the “AI Alignment Problem” comes in: researchers around the world have dedicated themselves to solving the problem of out-of-control AGIs before AGI ever comes into existence.

Now, what does the “AI Alignment Problem” have to do with the “Mass Effect Trilogy”? After all, Mass Effect is just a game.

Mass Effect, if you have not played it before, is set in the future, where humans have discovered other species in the galaxy, and ships can travel from one part of the galaxy to another using warp drives.

We are not interested in that, though; rather, there is an ongoing crisis in that world called the Reapers. The Reapers are machines that exterminate organic life every 50,000 years in a cyclical fashion.

Upon investigating, the lead character finds out that a super-intelligent organic life form, the “Leviathan”, made an artificial intelligence in charge of solving the conflict between organics (e.g. humans) and synthetics (e.g. machines). After analysing data for years, it arrived at its solution to the conflict: the apex organic life forms should be exterminated, to allow the development of weaker organic life forms and prevent conflict between organics and synthetics.

The rest of the trilogy revolves around expanding on these details and defeating that artificial intelligence. Mass Effect was written well before artificial intelligence became a mainstream sensation. Yet it gives us a glimpse of an artificial intelligence capable of thinking like a human, and of the violent actions it is capable of if left unchecked.

It is reasonable to imagine that an AGI tasked with bringing peace to the world will likely conclude that exterminating the group or society it identifies as the root cause of disharmony would bring forth peace thereafter. If its results deem that group unworthy of living, the AGI will not think twice before exterminating it, as its decision is backed by facts and concrete data.

The question is: will humans feel the same? The simple answer: no. We humans (even the most corrupt individuals) factor in historical facts, real-world reasons, previous examples, and an internal drive of morality and ethics when making a decision, so our decisions are unlikely ever to cause the extinction or cyclical extermination of a group in society.

An argument can be posed here: humans, past and present, have a record of exterminating groups of other humans to advantage themselves, as can be seen in the extermination of Jews during WWII and in Myanmar killing off its Muslim population.

A contemporary argument from some AI researchers is that it is wrong to censor AGI before it has even begun; this tiger-parenting approach will only limit AGI's capabilities and keep us from true AI in the long run.

So what is my take? I strongly believe that safety measures within AGI should be developed and in place before AGI begins flaunting its capabilities. I do not overlook the wrongs we humans have already done, or are still doing; we as a species are not perfect, and we have our flaws.

But a question can be asked: do we need another entity besides humans that is capable of the same barbaric activities with increased brutality? I would answer no, and I believe many readers will give the same answer. That is when solving the “AI Alignment Problem” becomes a hard requirement, a prerequisite rather than an optional add-on feature.

We can finally draw a conclusion from this critical analysis of the complete absence of AI alignment in the Mass Effect trilogy: without proper AI alignment, humanity faces an existential crisis in which an AI may well exterminate all of humanity, and we will face the same problem as the humans of Mass Effect.

(A few sources could not be cited because they were LinkedIn and Twitter posts, and I have forgotten the exact dates they were posted, which prevents me from citing them adequately now.)

Citations

  1. Yudkowsky, Eliezer. "AI Alignment: Why It's Hard, and Where to Start." YouTube, uploaded by MIRI - Machine Intelligence Research Institute, 29 Dec. 2016, https://www.youtube.com/watch?v=EUjc1WuyPT8.
