Charles Gaston — LessWrong

Dr. Yoshua Bengio wrote:

And there is another issue which was not much discussed (although the article does talk about the short-term risks of military uses of AI etc), and which concerns me: humans can easily do stupid things. So even if there are ways to mitigate the possibility of rogue AIs due to value misalignment, how can we guarantee that no single human will act stupidly (more likely, greedily for their own power) and unleash dangerous AIs in the world?

My take:

I am not sure we want to live in a world in which freedom is reduced to the extent that warrants or insures that the possibility that humans might do stupid things is non existent. It seems to me that there is a thick and straight line of philosophical knowledge, spanning at least from J.S Mill to the tradition of existentialists, and which has overall fought against dogmatism (in favor of humanism). I "dare" evoking philosophy in the present debate, as it seems to me that all "non scientific" concepts debated here are of a philosophical nature. By "non scientific" I mean "non empirical".

Science, preferably empirical science --including information or computing science-- remains a privileged source of information on which such important topics should be grounded. For instance, "the fear that an artificially intelligent system could, as a result of a very problematic training process, could suddenly develop its own sense of will, and that this sense of will could be opposed to that of its 'Creator'", has, so far as I know, no empirical justification.

To turn this conjecture into a scientific fact, science, computing science, i.e. "you", should aim at creating a toy model in which the goal would be that such state be achieved. You see?

In other words, the question we should ask ourselves, is not "How can we make sure that no humans will do stupid things", but rather, provided that we know that humans will do stupid things, how will your system react?

Just like social entities (governments etc) should count on the services of people capable of hacking them, I think it is not unreasonable to conjecture that an entity which operate a LLM should invest (in R&D) in trying to break a toy version, trying to elicit what you seem to fear.

For speculations and conjectures will only take you "so far".

CG

LESSWRONG
LW

LESSWRONG
LW

Posts

Wikitag Contributions

Comments

Posts

Wikitag Contributions

Comments