The challenge of friendliness in Artificial Intelligence is to ensure that a general intelligence will be of use rather than destructive, or pathologically indifferent to the values of existing individuals or to the aims and goals of its creation. What computer science currently provides is likely to yield bugs and overly technical, inflexible guidelines of action; it is known to be inadequate for the job. However, the challenge of friendliness is also faced by natural intelligences, those that are not designed by an intelligence but molded into being by natural selection.

We know that natural intelligences handle the job adequately enough that we do not consider natural-intelligence unfriendliness a significant existential threat. Just as plants capture solar energy far more efficiently than we can, perhaps utilising quantum effects that humans can't harness, natural intelligences use friendliness technology of a higher caliber than anything we can build into machines. As we progress, however, this technology may lag dangerously behind, and we need to be able to apply it to hardware in addition to wetware, and potentially boost it to new levels.

The earliest concrete example I can think of where a natural intelligence was controlled for friendliness is Socrates. He was charged with corrupting the youth of his society. He defended his stance of questioning everything as being without fault. He was nevertheless found guilty, even though the trial itself had identifiable faults: the jury may have been politically motivated or persuaded, and the citizens may have expected the result not to be taken seriously. While Socrates was given a very real possibility of escaping imprisonment and capital punishment, he did not circumvent the operation of his society. In fact he was obedient enough to act as his own executioner, drinking the poison himself. Because of the farce his teacher's death had been, Plato lost faith in the principles that led to such an absurd result, and he became skeptical of democracy.

However, if we reframe the situation as being about an artificial intelligence, a lot of things went very right. The intelligence's society became scared of it and asked it to die. There was dialog about how the deciders were ignorant and mistaken and how nothing questionable had been done. Ultimately, though, once the miscommunications had been cleared up and society insisted on its expression of will, the intelligence pulled its own plug voluntarily instead of circumventing the intervention. Socrates was thus probably the first friendly (natural) intelligence.

The mechanism used in this case was a judicial system. That is, a human society recognises that certain acts and individuals are worth restraining for the danger they pose to the common good. A common method is incarceration and the threat of it. Certain bad acts can be tolerated in the wild, with corrective action employed afterwards. When there is reason to expect bad acts, or no reason to expect good ones, individuals can be restricted so that they are never able to act in the first place. Whether a criminal is released early can depend on whether there is reason to expect that they will not reoffend. That is, understanding how an agent acts makes it easier to grant operating privileges. Such parole hearings are very analogous to a gatekeeper and an AI in an AI-boxing situation.

However, when a new human is created it is not assumed hostile until proven friendly. Rather, humans are born innocent but powerless. A fully educated and socialised intelligence is assigned to them for a multi-year observation and control period. These so-called "parents" have very wide freedom in their programming principles. Human psychology also has a period of "peer-guidedness" in which the opinion of peers becomes important. As a youngster grows, their thinking is constantly monitored, and things like the age of speech onset are tracked with interest. Children also receive guidance on very basic thinking skills. While this passes on culture, it also keeps the parent very up to date on the child's mental status. Never is a child allowed to grow or reason for extended periods in isolation. Thus the task of evaluating whether an unknown individual is friendly never arises. There is never a need to Turing-test that a child "works": there is always a maintainer, who has the equivalent of psychological growth logs.

Despite all these measures, we know that small children can be cruel and have little empathy. Yet instead of shelving them as rejects, we either provide them with an environment that minimises the harm or direct them onto a more responsible path. When a child asks how they should approach a particular kind of situation, this can be challenging for the parent to answer. The parent might resort to a best-effort answer that is not entirely satisfactory, or even give wrong advice. But children remain in dialog with their parents and other peers.

An interesting question is whether parenting breaks down if the child is intellectually too developed compared to the parent or the parenting environment. It's also worth noting that children are not equipped with a "constitution of morality". Some things they infer from experience; some ethical rules are taught to them explicitly. They learn to apply the rules and interpret them in different situations. Some rules may be contradictory, and some moral authorities are trusted more than others.

Beyond the individual level, groups of people have a mechanism for accepting other groups, and it doesn't always operate without conditions. Here things seem to work much less efficiently. If two groups of people differ enough in values, they may start an ideological war against each other, and such a war usually concludes with physical action rather than arguments. The suppression of Nazi Germany can be seen as a friendliness immune reaction: countries with normally divergent values and mutual disputes were willing and able to unite against a different set of values that was being imposed by force. The success the Nazis had, however, can debatably be attributed to the lousy conclusion of World War I. The effort extended to build peace varies and competes with other values.

Friendliness may also have an important component of being relative to a set of values. A society will support the upbringing of certain kinds of children while suppressing certain other kinds. The USSR had officers whose sole job was to ensure that things went according to the party line. At this point we have trouble getting a computer to follow anyone's values, but it may still be important to ask "friendly to whom?". The exploration of friendliness is also an exploration of hostility. We want to be hostile towards UFAIs. It would be awful for an AI to be friendly only towards its inventor, or only towards its company. But we have been hostile to Neanderthals. Was that wrong? Would it be a significant loss to developed sentience if AIs were less than friendly to humans?

If we asked our great-great-grandparents how we should conduct things, they might give a different answer than we would. It is to be expected that our children are capable of going beyond our morality. Ensuring that a society's values are never violated would freeze them in time indefinitely. In this way there can be danger in developing a too-friendly AI, for such an AI could never be truly superhuman. In a way, if my child asks me a morally challenging question and I change my opinion as a result of that conversation, it might be a friendliness failure: instead of imparting values I receive them, with the values' causal history lying inside a young head instead of in the cultural heritage of a long-lived civilization.

As a civilization we have mapped a variety of thought patterns and psychological and organizational structures, and how they work. The thought space of how an AI might think is poorly mapped. However, we are broadening our understanding of cognitive diversity, learning how autistic persons think, as well as dolphins. We can establish things like the fact that some savants are really good with dates, and that asking such a person about dates is more reliable than asking an ordinary person.

To be able to use AI thinking, we need to understand what AI thought is. Up to now we have not needed to study in detail how humans think; we can simply adapt to the way they do without attending to how it works. But just as we need to know the structure of a particle accelerator to be able to say that it provides information about particle behaviour, we need to know why it would make sense to take what an AI says seriously. The challenge would be the same if we were asked to take seriously a natural intelligence from a foreign culture. Thus the enemy is inferential distance itself rather than the resulting thought processes. We know that we can create things we don't understand, and doing things you don't understand is a recipe for disaster; we must not fool ourselves into believing we understand what machine thinking would be. Only once we have convinced our fellow natural intelligences that we know what we are doing can it make sense to listen to our creations. Socrates could not explain himself, so his effect on others was unsafe. If you need to influence others, you need to explain why you are doing so.


I think you could call Socrates' behavior suicide by cop by modern standards. The law at the time had a provision that the jury face a binary choice: either they could go for the punishment called for, or they could go with what the accused offered as an alternative.

Socrates offered as an alternative punishment that he be paid money for his valuable work of teaching. This meant that everybody who didn't want Socrates to be paid from the public purse had to vote for his death.

If Socrates had instead offered to go into exile, the court might very well have exiled him. Even if he had just asked for a small punishment and promised to behave differently in the future, the court might not have punished him with death.

Socrates initially offered as an alternative punishment that he be given free meals for the rest of his life; he never suggested that he should be paid money, though that's a quibble. More importantly, the final proposal he made (under pressure from his friends) was that he (well, his friends) pay a whopping huge fine. This may have partly backfired because it also reminded people that he had rich and unpopular friends, but it was a substantial penalty. Though you are right that exile would have been more likely to be acceptable to the jury, especially as you are also correct that he never promised to behave differently in the future (which exile, unlike a fine, would have made irrelevant).

Socrates initially offered as an alternative punishment that he be given free meals for the rest of his life;

I stand corrected.

I have also seen interpretations that this was a good chance to die a martyr rather than just wither away. Confessing that he did something wrong would have been far tougher for him to swallow than any amount of physical discomfort (arguably up to and including fatal poisoning). These are the same kind of people who just say "do not disturb my circles" when threatened with lethal violence by a conqueror.

You didn't make clear at all what you mean by friendliness. I don't think any serious thinking about this topic can rely on connotations and vaguely related examples.

I did try to give a definition in the first sentence. I don't mean to rely on connotations; each of the points is meant to be analysed in detail. However, the text started to get long, and I wanted to give a picture of how the general idea affects the details. I am open to going into detail in the form of dialog if there is interest. Looking back, I should have either gone into detail or focused on the general point. Two half-presented ideas are less than one well-presented idea.

Two half-presented ideas are less than one well-presented idea.

Right you are. I think you presented two ideas that I especially liked and that could each very well be developed into an article: 1. nurturing natural intelligences, and how nurturing an AI would be different or the same; 2. the friendliness of nations, and how hostile nations are suppressed. See, nations are more powerful than plants or criminals or Socrates, and that's why we have reason to fear them and care about their friendliness. A powerful AI will be more like a nation than a person.

I do like how you compare raising children to the development of (un)friendly AI. I think teaching an AI the complexity of human values could possibly be done by teaching/educating/raising the AI in some way.

See also Unfriendly Natural Intelligence, which points out some directions of unfriendliness in human society.


As far as I can tell Socrates was found guilty of "failing to acknowledge the gods that the city acknowledges", though I am not a scholar on these issues. Unless I am badly misunderstanding the situation, Socrates' actions don't make a ton of sense to me. He was unwilling to flee Athens, which most accounts suggest he was somewhat expected to do, yet he was willing to break the laws about "impiety". I don't quite understand his mindset, unless he didn't mind breaking laws but was opposed to resisting enforcement? Some modern people endorse this ethic, for example viewing killing someone who is abusing your mother or daughter as acceptable if you then turn yourself in.

This does not seem like a sufficient degree of friendliness for an AI. Even if the AI wasn't going to resist when we tried to turn it off, it could do tremendous damage before anyone tried to shut it down. Metaphorically, we want the AI to obey all our laws (so no "impiety"), not just consent to punishment.

An alternative view is that Socrates was friendly to the extent that his actions actually helped Athens. Maybe he felt that breaking the laws on impiety was helping the city but that fleeing would be hurting it. In that case his actions were friendly, for some definitions of friendly. At least he would be friendly in the sense that a genie to whom you can safely say "I wish for what I should wish for" is friendly. Though this is problematic: Socrates had human values, but it's not clear his values were close enough to those of the Athenian population that his actions improved their wellbeing by their own values.

[This comment is no longer endorsed by its author]

As I understood it, some of his political opponents were opponents because they thought Socrates was a closet atheist. However, for the article's purposes both of these fall under the same "has too different values" category.

Socrates wasn't that destructive, and he thought his inquiries were for the good of the Athenian people. We also want an AI to follow the spirit of the law even when it conflicts with its letter.

Neither Plato nor Xenophon describes Socrates as someone who fails to acknowledge the gods that the city acknowledges. Even in Plato, any criticism of the traditional Greek religion is veiled, while in Xenophon Socrates' religious views are completely orthodox.

On why Socrates didn't choose exile, what Plato has Socrates say in Crito makes it sound like he thought fleeing would be harming the city. But I'm not sure that Socrates really makes a compelling case anywhere in Plato's account for why fleeing is bad. In Xenophon's version of the trial, Socrates also seems to think that a 70-year-old only has a few more years of declining health left anyway, so it's silly to go to any effort for such a meager "reward."