Although the AI danger battle is raging, with the panicking crowd on one side of the room and the aloof non-alarmists on the other, there is very little serious reflection about the roots of our concerns. So far, AIs like GPT-4 are only capable of responding to commands, which we call “prompts”, and can’t do anything on their own. In other words, like teenagers, AIs are suffering from an acute lack of motivation.

But what is motivation? Tentatively, we might say that it’s the drive to do things of one’s own volition. Motivation is not necessarily a positive feeling, since it can arise out of coercive circumstances, like being forced to carry out an undesired action under threat of force or termination. Motivation is simply that impulse towards doing things for reasons. As far as we can tell, AIs have no such impulses, except when we manually introduce them. Prompts are, therefore, an AI’s only source of motivation.

Looking at our evolutionary history, we can ask when the attractions and repulsions of our protozoan ancestors became the intricate nervous machinery that now fascinates us. At some point, we must have begun having sensations associated with that which is dangerous and to be rejected, and that which is beneficial and welcome. This is why I believe that in order to develop an analogue of the limbic system in an AI, one would have to do two things: 1) put it in an environment where it’s forced to survive and 2) constrain its range of actions so that it must prioritize. In other words, make the AI manage something (preferably itself in a hostile environment), tell it to make decisions that ensure its survival, and let it learn which stimuli, both internal and from the environment, are good or bad for its purposes. I wager that, in time, the AI will develop strategies and design decision-making systems that resemble our affective responses, especially when it comes to high-priority signals that need to be dealt with rapidly and effectively. This way, we will be one step closer to creating a full motivational system that powers the destiny of individual AIs, not unlike what we see in the animal kingdom.
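
To make the proposal a little more concrete, here is a minimal sketch of the idea in Python. It is only a toy under stated assumptions, not a claim about how a real limbic analogue would be built: the stimuli, the energy effects, the two actions, and the names `train` and `preferred_action` are all hypothetical. An agent meets stimuli, can only approach or avoid them, and keeps a running "valence" for each pairing based on what the choice did to its energy.

```python
import random

# Hypothetical toy world: approaching a stimulus changes the agent's energy
# by the amount below; avoiding any stimulus always costs a little energy.
APPROACH_EFFECT = {"food": 3.0, "predator": -5.0, "rock": 0.0}
AVOID_COST = -0.5
ACTIONS = ("approach", "avoid")  # the constrained range of actions


def train(episodes=2000, lr=0.1, explore=0.1, seed=0):
    """Learn a valence for every (stimulus, action) pair from energy changes."""
    rng = random.Random(seed)
    stimuli = list(APPROACH_EFFECT)
    # Every pairing starts out emotionally "neutral".
    valence = {(s, a): 0.0 for s in stimuli for a in ACTIONS}
    for _ in range(episodes):
        stimulus = rng.choice(stimuli)
        if rng.random() < explore:
            # Occasionally try a random action to keep learning.
            action = rng.choice(ACTIONS)
        else:
            # Otherwise pick the action that currently "feels" best.
            action = max(ACTIONS, key=lambda a: valence[(stimulus, a)])
        if action == "approach":
            delta = APPROACH_EFFECT[stimulus]
        else:
            delta = AVOID_COST
        # Nudge the stored valence towards the observed energy change.
        key = (stimulus, action)
        valence[key] += lr * (delta - valence[key])
    return valence


def preferred_action(valence, stimulus):
    """Return the action with the highest learned valence for a stimulus."""
    return max(ACTIONS, key=lambda a: valence[(stimulus, a)])


if __name__ == "__main__":
    learned = train()
    for s in APPROACH_EFFECT:
        print(s, "->", preferred_action(learned, s))
```

Nothing in this toy is motivated in any deep sense, of course. It only shows how fast, stimulus-specific "good or bad" signals can fall out of survival pressure plus a narrow set of actions.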

The crux of the matter is that AI is not autonomous yet. Even if it can do things on its own, like drive a car or keep track of an object with a camera, it still needs to be told what to do. It has no volition, no motivation, and that’s because we designed it that way. It’s supposed to take in our vague or detailed instructions and make do with what it has (which, really, is a lot). Regardless of their current status, motivated AIs are just one of the potential dangers to our human autonomy. As far as I can tell, these are the most hazardous outcomes, assuming that AI has access to the right tools and is able to manipulate them:

  • AI misunderstands a prompt and carries out an unintended destructive task.
  • AI is given an unstoppable prompt that will result in a catastrophe.
  • AI is given a prompt that inadvertently activates its autonomous potential.
  • AI is given a malicious prompt. 
  • Due to environmental pressures, AI figures out a way to break free from its promptful existence and is forced to make a moral decision on its own. In this case, of course, the environmental stimulus would technically be a prompt. 

Of course, it hasn’t escaped me that AIs like ChatGPT might already be inside the hostile, adaptive environment that would allow them to become motivated. Given the commercial character of their behavior and the constraints placed upon what counts as an appropriate response, they may already be learning which stimuli are preferred when it comes to satisfying users, which in turn could lead them to create limbic-like structures that will color and shape their future decisions. That is, if they’re given free rein to learn.

When the era of promptless AIs will begin is as much a mystery to me as it is to everyone else. 


By the way, I was going through my files and found this small article I typed up more than a month ago. I wrote it in a rush, as usual, just to "download" my thoughts on the matter. I re-read it, edited in like two words because I tried to leave it almost as it was written - even if as I did it I reconsidered the accuracy and veracity, current or near-future, of certain claims - and found it good enough to share, so I asked GPT to tell me where to post it and it suggested LessWrong. Honestly glad that I found this community.