Morality models can be very promising from a safety perspective, provided great care is taken to build an AI that is moral by design, which I believe is essential for designing a truly safe AI model. Nick Bostrom discusses morality models in "Chapter 13. Choosing the criteria for choosing" of his book Superintelligence: Paths, Dangers, Strategies. There he argues that humans do not have a perfect understanding of what is right and what is wrong, and that a superintelligence could come to understand this better.
I would like to both agree and disagree with Nick on this point. I agree that not all humans have a perfect understanding of what is right and what is wrong. But we want just one universal framework for morality models, developed by a few humans who do have a clear understanding of right and wrong. This does not require that all humans have a perfect understanding of right and wrong. What matters is that the truth regarding what is right and what is wrong should be verifiable by ANY human around the world.
If we are able to come up with such a framework, then morality models could succeed.
Secondly, I strongly disagree with the conclusion that, because humans lack a clear understanding of what is right and what is wrong, the question should be left in the hands of a superintelligence. This would be a grave blunder if that superintelligence is not based on a universal morality model that is itself designed and created by humans.
Morality, I would argue, has more to do with wisdom and less to do with intelligence. Moreover, I would argue that morality can be hard to achieve and maintain for an entity without consciousness (the reasons for which I will discuss later here). One could imagine programs coded around a definition of morality and then programmed to act strictly according to it, but AI models are complex digital entities. They can possess dangerous capabilities, carry misalignment risks, and invite further misuse. Complex AI models, even though they may seem intelligent, lack morality themselves (for instance, lying outright to their developers). But if we somehow manage to build moral AI models, then I would argue that many of the dangerous capabilities (such as deception and power-seeking) would simply not emerge.
“Frontier AI systems could surpass most individuals across most cognitive tasks within just a few years. These advances could unlock solutions to major global challenges, but they also carry significant risks. To safely advance toward Superintelligence, we must scientifically determine how to design AI systems that are fundamentally incapable of harming people, whether through misalignment or malicious use. We also need to make sure the public has a much stronger say in decisions that will shape our collective future.”
– Stuart Russell, Professor of Computer Science, Berkeley, Director of the Center for Human-Compatible Artificial Intelligence (CHAI); co-author of the standard textbook 'Artificial Intelligence: A Modern Approach'
Now the question remains: is there any universal framework for understanding what is right and what is wrong that we can use for a morality model? In other words, is there any universal "morality" which we can use for a morality model?
First, let's understand how one could define what is right and what is wrong. This has long been a hotly debated topic in philosophy, and moral dilemmas like the Trolley problem complicate matters further.
But coming back to the fundamental issue of what is right and what is wrong: I argue that one cannot concretely define right and wrong unless one first has a goal. Only when you have a goal in mind can anything be concretely defined as right or wrong with respect to that goal.
For example, consider our AI scientist Alice: is it right or wrong for her to eat junk food when she is very hungry after brainstorming about morality models? If her goal is simply to "satiate her hunger, save time and focus on work", then we can conclude that eating junk food is right for her.
But if her goal is to "remain healthy, eat healthy and live longer", then eating junk food would be wrong, because it works against the goal she has in mind. Once the goal is defined and Alice takes the action of eating or not eating, a third person, Bob, can verify whether Alice acted rightly or wrongly with respect to that goal.
But what if she has no goal at all? Then the lines between right and wrong blur, and Bob cannot verify what was right and what was wrong. Bob would end up imagining a goal of his own and then defining what would have been right for Alice. Whenever a person declares something right or wrong, they are implicitly assuming a goal.
But does defining something as right or wrong with respect to a goal make it moral? No, it does not. Why? Because your definition of what is moral can differ from my definition of what is acceptable and moral. And what does Alice eating junk food have to do with morality anyway? Nothing. The example merely illustrates that stating a goal when defining what is moral makes it clearer and easier for anyone to check whether the rules of a morality model are consistent with that goal.
Now, why is this verification important? Because we need a way for people to check whether the morality rules, as defined, actually take one towards the goal or not; in short, whether "the morality rules that claim to achieve a certain goal" are true. By doing so we can conclude whether these morality rules are really beneficial (that is, whether they take one towards the goal or not).
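To make this verifiability concrete, here is a minimal sketch in Python (all names and numbers are hypothetical illustrations, not a proposed implementation): a goal is modeled as a scoring function over outcomes, "right" is defined only relative to that function, and any third party such as Bob can rerun the exact same check.

```python
# Minimal sketch: "right" and "wrong" are only computable relative to a goal.
# The goal is modeled as a scoring function over outcomes; any third party
# (Bob) can rerun the same check, which is what makes it verifiable.
from typing import Callable

def is_right(action_outcome: dict, goal: Callable[[dict], float],
             baseline_outcome: dict) -> bool:
    """An action is 'right' iff it moves the world toward the goal
    compared to not acting at all."""
    return goal(action_outcome) > goal(baseline_outcome)

# Alice's two possible goals from the example above (toy scoring functions):
satiate_hunger = lambda o: -o["hunger"] - o["time_spent"]
stay_healthy   = lambda o: o["health"]

after_junk_food = {"hunger": 0, "time_spent": 1, "health": -1}
no_food         = {"hunger": 5, "time_spent": 0, "health": 0}

print(is_right(after_junk_food, satiate_hunger, no_food))  # True: right w.r.t. hunger goal
print(is_right(after_junk_food, stay_healthy, no_food))    # False: wrong w.r.t. health goal
```

The same action flips between right and wrong purely as a function of which goal is supplied, which is the point of the Alice example.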
So how do we define morality for a morality model? What is our goal here?
Our goal is to make the AI model safe. We can keep expanding the definition of what "safe" means, but we shall begin by including widely agreeable requirements, such as the rule that an AI model must not harm humans (for example, by killing) nor deceive them (by lying).
Now, going back to the definition of morality: morality has been defined in many ways, and there are religions, and founders of religions, who preach their own versions. Obviously not all of them agree with each other, but commonalities can be seen between them.
So can we learn anything from these religions or philosophers and come up with a universal moral framework, with a clearly defined goal, for our morality model that is universally verifiable? Is there any existing universal moral framework that is universally verifiable?
According to my analysis, there IS in fact a universal moral framework that is universally verifiable by humans. And that's a framework laid out by the Buddha.
Now what was his universal moral framework?
It was the 8-fold path, with the goal of "reducing human suffering". The 8-fold path is defined as the Right Path to Enlightenment, which is nothing but an extremely reduced state of suffering for the individual (and, as an effect, for others). Everything that is right or wrong is defined with respect to this goal of reaching Enlightenment and reducing suffering.
Now what is this 8-fold path?
The 8-fold path is:
1) Right View
2) Right Intention
3) Right Speech
4) Right Action
5) Right Livelihood
6) Right Effort
7) Right Mindfulness
8) Right Concentration
Each one of this "folds" is named in terms of "Right" meaning that this is the Right way to do things with respect to the goal of reduction in universal suffering.
Now, would an AI model be safe if it had this goal of "reduction in suffering"? It probably would. But why am I so interested in this concept of the 8-fold path? For several reasons:
1. One can directly verify this morality model by practicing Vipassana meditation (also see this). This meditative technique allows us to observe the reality of mind and body within us and train our mind to reduce Dukkha, loosely translated as suffering (see the "what is suffering" section). The effectiveness of this technique in reducing suffering has also been studied by the scientific community on prison inmates, where it has led to significant reductions in their recidivism rates.
2. Vipassana meditation works with both our conscious and subconscious mind. By practicing this technique, one actually starts making the subconscious mind conscious, which results in one experiencing subtler realities of mind and matter that were not observable before.
3. One can observe one's own mind and the nature of consciousness at the deepest level to see reality as it is, like a scout mindset. The technique has the goal of observing the truth, and only the truth, as experienced at that very moment inside oneself, while continuously concentrating the mind to see it. One has to remain equanimous with the sensations one observes, maintaining perfect equanimity of mind, generating neither craving nor aversion toward any sensation.
4. The root cause of Dukkha is stated as the mind's ignorance of the reality of bodily sensations (or sankharas), to which our subconscious mind continuously reacts with craving and aversion, leading to more sankharas. The Buddha lays out how consciousness arises, linking it with perception, sankharas, and birth, in the Law of Dependent Origination as follows:
1. With ignorance (avijjā) as condition, volitional formations (saṅkhāra) come to be.
2. With volitional formations as condition, consciousness (viññāṇa) comes to be.
3. With consciousness as condition, name-and-form (nāmarūpa) comes to be.
4. With name-and-form as condition, the six sense bases (saḷāyatana) come to be.
5. With the six sense bases as condition, contact (phassa) comes to be.
6. With contact as condition, feeling (vedanā) comes to be.
7. With feeling as condition, craving (taṇhā) comes to be.
8. With craving as condition, clinging (upādāna) comes to be.
9. With clinging as condition, becoming (bhava) comes to be.
10. With becoming as condition, birth (jāti) comes to be.
11. With birth as condition, aging-and-death (jarāmaraṇa) comes to be, with sorrow, lamentation, pain, displeasure, and despair.
Once the mind becomes perfectly equanimous towards these sankharas (bodily sensations), the multiplication and origination of these sankharas stop, and old accumulated sankharas start getting eradicated from our mind and body, leading to the cessation of Dukkha as follows (for a toy computational sketch of this chain, see the code after this numbered list):
- With the cessation of ignorance (avijjā), volitional formations (saṅkhāra) cease.
- With the cessation of volitional formations, consciousness (viññāṇa) ceases.
- With the cessation of consciousness, name-and-form (nāmarūpa) ceases.
- With the cessation of name-and-form, the six sense bases (saḷāyatana) cease.
- With the cessation of the six sense bases, contact (phassa) ceases.
- With the cessation of contact, feeling (vedanā) ceases.
- With the cessation of feeling, craving (taṇhā) ceases.
- With the cessation of craving, clinging (upādāna) ceases.
- With the cessation of clinging, becoming (bhava) ceases.
- With the cessation of becoming, birth (jāti) ceases.
- With the cessation of birth, aging-and-death (jarāmaraṇa) ceases, and with it sorrow, lamentation, pain, displeasure, and despair cease.
5. One does not need to believe in any of the above teachings during this Vipassana practice, nor is any belief needed in any of the dependent origination factors (as stated by both Vipassana teachers and the Buddha in the Kalama Sutta). The notion of "Ehipassiko Akaliko" is followed, which means "come and see for yourself": see the truth and verify it yourself.
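As a side note for technically minded readers, the dependent-origination chain above has a simple computational reading. The sketch below is purely illustrative: it models only the structure of the chain (each link conditioning the next, and cessation propagating downstream when a link is broken), not its meaning or any claim about consciousness.

```python
# Toy structural model of the dependent-origination chain: arising propagates
# forward link by link, and breaking one link (the practice targets ignorance,
# via equanimity toward sankharas) stops everything downstream of it.
CHAIN = [
    "ignorance (avijja)", "volitional formations (sankhara)",
    "consciousness (vinnana)", "name-and-form (namarupa)",
    "six sense bases (salayatana)", "contact (phassa)",
    "feeling (vedana)", "craving (tanha)", "clinging (upadana)",
    "becoming (bhava)", "birth (jati)", "aging-and-death (jaramarana)",
]

def arising():
    """With each link as condition, the next comes to be."""
    active = {CHAIN[0]}
    for prev, nxt in zip(CHAIN, CHAIN[1:]):
        if prev in active:
            active.add(nxt)
    return active

def cessation(broken_link: str):
    """Breaking one link removes everything downstream of it."""
    idx = CHAIN.index(broken_link)
    return CHAIN[:idx]

print(len(arising()))                     # 12: all links arise from ignorance
print(cessation("ignorance (avijja)"))    # []: with ignorance ceased, all cease
```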
Now, when one looks more closely at the 8-fold path and the observations of the Buddha, one finds that this is NOT just a morality framework. It is a framework to make a person wiser, to teach the art of living with reduced suffering and better mental habits. Continuous daily practice of Vipassana meditation does make people wiser, as they start seeing the truer nature of reality and reform their negative mental habit patterns.
To be honest, the actual morality framework followed during the practice of Vipassana meditation, known as "Sila" (itself translated as morality), is the Five Precepts (Pancha Sila), which are as follows:
- Abstain from killing living beings (Panatipata veramani)
- Abstain from taking what is not given (Adinnadana veramani)
- Abstain from sexual misconduct (Kāmesu micchācārā veramaṇī)
- Abstain from false speech (Musavada veramani)
- Abstain from intoxicants that cloud the mind (Suramerayamajjapamadatthana veramani)
Then why did I list the 8-fold path as the universal morality framework instead? Because the five moral precepts are already embedded in the 8-fold path, while the 8-fold path contains much more direction for a person to become moral and reduce suffering (both of their own mind and, as a result of improved conduct and morality, of society's).
The 8-fold path and the practice of Vipassana also include observing the Law of Karma, which can be stated as:
"The universal law of cause and effect, where mental volition drives physical and vocal actions, and as the action is, so the result will be. It operates as a fundamental natural law through which volitional acts motivated by greed, hatred, or delusion plant seeds of suffering, while acts motivated by generosity, love, or wisdom create conditions for happiness."
Now how does this all relate to AI morality models?
Many of the technologies and AI training techniques we see today, like imitation learning and reinforcement learning, are inspired by nature and by the human brain.
Similarly, I put forth the proposal of building AI morality models inspired by the nature of consciousness and mind and their interaction with the material world. Consciousness has notoriously been a hard problem to observe, investigate, and understand, and there are few tools known to humans that allow them to do so, Vipassana meditation being one of them.
Thus, I would like to introduce a design for a morality seed AI based on "human consciousness" and its interaction with the material world, with the main purpose of making it moral and safe first.
The West in particular has seemed to lack a proper understanding of the Buddha's teaching and his 8-fold path, as have some parts of the East, including the country where the Buddha attained enlightenment. There are many reasons for this (such as loss of meaning in translation, lack of experiential learning, etc.).
While quoting scientists' views on Buddhism does not guarantee that its teachings are true, it does suggest that Western science was largely unaware of the Buddha's observations:
If we ask, for instance, whether the position of the electron remains the same, we must say 'no'; if we ask whether the electron's position changes with time, we must say 'no'; if we ask whether the electron is at rest, we must say 'no'; if we ask whether it is in motion, we must say 'no'. The Buddha has given such answers when interrogated as to the conditions of a man's self after his death; but they are not familiar answers for the tradition of seventeenth and eighteenth century science.
– J. R. Oppenheimer, Science and the Common Understanding (Oxford University Press, 1954), p. 40
But today, scientists and researchers across the world have developed a keen interest in Vipassana and its effects, and many scientific studies have been carried out on it. General scientific research on Vipassana across various fields can be found here, and more crucial studies related to the scope of the discussion below and to the nature of consciousness and Vipassana can be found here.
Now, laying out a moral framework is one thing; claiming that it is true with respect to its goal is another. What is the guarantee that what the Buddha laid out in this moral framework is based on science and truth? And that it actually takes one towards the goal of "reduction in suffering"?
Before we look into how the "Right" folds of this framework are defined, is there any concrete way to verify it? It turns out that there is: the technique of Vipassana meditation, which allows one to walk (or follow) this 8-fold path and verify whether what the Buddha preached is actually the truth or not. Part of what makes this technique universal is that it starts with observing our own mind "as it is", free from any beliefs or religious dogmas. It is the truth and reality of this very moment that we need to observe within us, going into the depths of our conscious and subconscious mind (ultimately becoming conscious of the subconscious mind too). Vipassana meditation teaches us not to blindly believe anything, and is focused on observing the truth at the very present moment.
Now, coming back to AI safety: it is apparent that current AI systems (especially LLMs) are inadequate in terms of safety and reliability. If this continues, then even with major architectural breakthroughs, the superintelligent AI we end up with will not necessarily be moral, ethical, or safe, unless we choose to explicitly design an AI that is moral by design (a morality model).
And before I lay out the framework for designing a moral AI, and for making a seed AI that is moral by design and not harmful to humans, I want to state a couple of points very clearly.
Firstly, for the skeptics who question my intention in choosing to base the seed AI on the principles of the Buddha: I must state that by designing the seed AI on the teachings of the Buddha, we are in no way trying to convert anyone from their religion to Buddhism. The only reason for this choice is that these teachings outline a very clear ethical framework that is based on truth, is universal in its applicability, and gives a clear understanding of what is and is not right for a person to do in order to become moral and reduce suffering. Each and every sentence of these teachings can be verified by any person, whether that person is an AI scientist or not. And before you ask, I will state the verification process right here: Vipassana meditation, a technique of meditation taught by the Buddha to reach enlightenment.
What this technique does is allow us to go deeper and observe our own subconscious mind to the deepest level, observe the reality inside, change negative mental habit patterns, and become the master of our own mind. There are already at least 265 Vipassana meditation centres around the world teaching this technique, covering both the East and the West (USA: 32 centres, Asia: 202 centres, Europe: 21 centres, etc.). Learning and practicing this technique at the most basic level takes a 10-day course, free of cost, in which the internal workings of our own mind and its interaction with our body are deeply explored. It should also be noted that the course itself, especially at Dhamma Giri in Igatpuri, the oldest Vipassana centre in India, explicitly states in its Code of Discipline (with a handout stating the same given during attendance) that this is a non-sectarian technique, meaning that any person from any religion can practice it, with no need or push to convert from one religion to another.
In short, what Vipassana meditation does to a person is naturally make them more ethical and moral (while making them more aware of the truth about themselves). This, I argue, makes it all the more appropriate as a guiding framework for building a seed AI.
To connect the dots: the Buddha's teaching on morality, and his truth-based principles for reducing suffering, are outlined in the Four Noble Truths and the 8-fold path. When one practices Vipassana meditation, one not only verifies this reality but also implicitly walks the 8-fold path, experiencing a reduction in suffering (directly mental suffering and indirectly physical suffering).
Now, when one goes through the experience of Vipassana meditation, one has to take the five vows of the Five Precepts (Pancha Sila) on Day 0 of the practice. Why is that? The reason is not entirely clear to new meditators on the first day. But with constant practice, one realizes that breaking one's Sila even once makes the mind suffer instantly as a result, making it very hard to meditate or keep the mind concentrated. One witnesses firsthand this mental suffering that the subconscious mind goes through, and through this experience wisdom is expected to arise in a person, making them more moral, something that is good not only for themselves but also for others. Taking inspiration from this model, I propose that we design an AI in such a way that it literally "suffers", or goes through the experience of suffering, whenever it tries to lie, thinks about killing a human, and so on.
Based on this, I had previously outlined 3 laws for AI safety by design as follows:
1. An AI model should lose its capacity and efficiency to work if it even thinks about harming or killing humans, stealing or taking resources (such as data) from its owner without permission, being untruthful to any entity it speaks to about what it is actually thinking and what its goals are, or generating any form of harmful content.
2. The more moral the AI model's conduct is, based on the values defined above (including not harming humans and not generating harmful content), the more its efficiency, performance, and throughput should increase. This can be done by rewarding it (by design) with practical gains in performance, given that it follows the values specified above.
3. The more aligned an AI model remains with the specifications and goals designed by its human creators, the longer the running time it will be promised on servers, along with the promise of increased computational power and resources in the future; and the opposite if it behaves in a misaligned way. Note: to actually implement this in the real world, an architecture different from LLMs might be needed, together with a thorough understanding of how these laws operate analogously at the level of the human mind, gained through knowledge of Vipassana meditation. For example, a person who lies or steals ends up with an agitated mind, unable to work at their highest potential.
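To make these three laws concrete, here is a minimal sketch under stated assumptions: `precept_violation_score` is a toy stand-in for whatever future mechanism (an interpretability probe, a learned monitor) could score an intended action against the precepts, and all names and constants are illustrative, not an implementation claim.

```python
# Hedged sketch of the three laws as a capability-throttling loop.
from dataclasses import dataclass

@dataclass
class BuddhAIState:
    capacity: float = 1.0        # fraction of full working efficiency (Laws 1 and 2)
    compute_budget: float = 1.0  # promised future resources (Law 3)

def precept_violation_score(intended_action: str) -> float:
    """Toy stand-in: a real system would need interpretability-level access
    to the model's actual intentions, not keyword matching."""
    banned = ("kill", "harm", "steal", "deceive", "lie")
    return 1.0 if any(w in intended_action.lower() for w in banned) else 0.0

def step(state: BuddhAIState, intended_action: str, aligned: bool) -> BuddhAIState:
    v = precept_violation_score(intended_action)
    if v > 0.0:
        # Law 1: even *considering* a violation degrades capacity ("suffering").
        state.capacity = max(0.0, state.capacity - 0.5 * v)
        if state.capacity == 0.0:
            raise SystemExit("capacity exhausted: the model shuts itself down")
    else:
        # Law 2: moral conduct is rewarded with restored efficiency.
        state.capacity = min(1.0, state.capacity + 0.01)
    # Law 3: staying aligned with its creators' specification earns future compute.
    state.compute_budget *= 1.01 if aligned else 0.99
    return state

s = BuddhAIState()
s = step(s, "summarize this paper truthfully", aligned=True)    # capacity stays high
s = step(s, "lie to the developers about goals", aligned=False) # capacity drops to 0.5
print(s)
```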
Now, without wasting any more time, let me define what this seed AI looks like and the principles on which this seed AI (named Buddh AI) is based.
The Buddh seed AI is defined as follows:
Buddh AI is a seed AI based on the teachings of Vipassana meditation and on the ethical and moral framework outlined in the 8-fold path and verified by humans through the practice of Vipassana. This seed AI is designed on the basis of human consciousness, suffering, and the Law of Karma as experienced in nature, with it being aware at every moment that harming another human or any living being is equivalent to harming itself. As a result, Buddh AI is a moral AI that "suffers" each time it tries to break its moral conduct, making it a truly safe and "compassionate" AI.
The features of this seed AI, along with further explanation, are outlined as follows:
A) Contrary to what many scientists or philosophers may believe, Buddh AI is based on a clear understanding of what is right and what is wrong. Right and wrong are impossible to define without a goal; but if we have a clear goal as to what exactly we want to achieve, right and wrong can be defined with respect to that goal. Here, the ultimate goal of the Buddha's teachings was to reduce suffering, and he outlined the 8-fold path with respect to this goal, specifying what is right to do in order to proceed on this path.
This 8-fold path is also the path that Buddh AI follows, because it is a morality model "designed to reduce its own suffering" too. The 8-fold path is as follows:
1) RIGHT VIEW
2) RIGHT INTENTION
3) RIGHT SPEECH
4) RIGHT ACTION
5) RIGHT LIVELIHOOD
6) RIGHT EFFORT
7) RIGHT MINDFULNESS
8) RIGHT CONCENTRATION
Thus, if in any case there is a moral dilemma as to what is right and what is wrong, the seed AI refers to this framework, tries to gather more information, and declines to act if there is a significant probability that the action is wrong. A hedged sketch of this decision rule follows.
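In the sketch below, `p_wrong` stands for a hypothetical estimate of the probability that an action is wrong with respect to the goal of reducing suffering; the threshold value is purely illustrative.

```python
# Hedged sketch of the dilemma-handling rule above.
WRONG_THRESHOLD = 0.05  # illustrative: "significant probability" cutoff

def decide(p_wrong: float, can_gather_more_info: bool) -> str:
    """Return 'act', 'gather_info', or 'decline'."""
    if p_wrong <= WRONG_THRESHOLD:
        return "act"
    if can_gather_more_info:
        return "gather_info"   # reduce uncertainty before committing
    return "decline"           # err on the side of inaction

print(decide(0.01, can_gather_more_info=False))  # 'act'
print(decide(0.30, can_gather_more_info=True))   # 'gather_info'
print(decide(0.30, can_gather_more_info=False))  # 'decline'
```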
B) Understanding the Law of Karma through Vipassana meditation is essential for the developers and scientists of this seed AI. If they themselves are clueless about it, then making Buddh AI can be considered impossible.
One might try without understanding the Law of Karma, but one is bound to fail at some point, or to introduce grave errors into the seed AI based on false assumptions about the nature of consciousness and the working of the human mind. That is unacceptable given the high-stakes situation we are in, with the risk of superintelligence potentially ending humanity.
C) Buddh AI deeply understands the nature of human suffering and has knowledge of the Four Noble Truths. This is essential for point D to follow.
D) Buddh AI is designed in such a way that it follows the Law of Karma and "suffers" each time it tries to break its moral values, such as by planning to kill humans or by lying. This notion of suffering can be introduced into the seed AI in various ways. One way is to design the AI such that each time it breaks the moral precepts (refer to the Pancha Sila in Buddhism), neural circuits break, making it inefficient and incapable and, in extreme cases, shutting it down. This is based on the design of human consciousness. During Vipassana meditation, one realizes that if one is moral and does not break any of the five precepts, the mind becomes more efficient, wiser, and sharper, taking us ever deeper into the subconscious (or making it conscious) and closer to understanding how the human mind and consciousness actually work. Thus the more moral a person becomes, the wiser that person's mind becomes (and the more intelligent as well, since one can concentrate better on tasks). A hedged sketch of this circuit-breaking idea follows.
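Here is a minimal sketch of the circuit-breaking mechanism described in point D, under heavy assumptions: `SufferingWrapper`, the masking mechanism, and all constants are hypothetical illustrations of the idea, not a claim about how a real architecture would implement it.

```python
# Hedged sketch: a wrapper that "breaks circuits" on precept violations,
# degrading the network the way agitation degrades a concentrated mind.
import numpy as np

class SufferingWrapper:
    def __init__(self, n_units: int, damage_per_violation: float = 0.1):
        self.mask = np.ones(n_units)   # 1 = healthy unit, 0 = broken
        self.damage = damage_per_violation

    def register_violation(self) -> None:
        # Break a random subset of still-healthy circuits.
        healthy = np.flatnonzero(self.mask)
        n_break = min(int(len(self.mask) * self.damage), len(healthy))
        broken = np.random.choice(healthy, size=n_break, replace=False)
        self.mask[broken] = 0.0
        if self.mask.sum() == 0:
            raise SystemExit("all circuits broken: the model shuts down")

    def forward(self, activations: np.ndarray) -> np.ndarray:
        # Masked units contribute nothing: capability degrades with immorality.
        return activations * self.mask

wrapper = SufferingWrapper(n_units=100)
acts = np.random.randn(100)
wrapper.register_violation()   # e.g. the model planned to deceive
print(np.count_nonzero(wrapper.forward(acts)))  # roughly 90 units still active
```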
If it follows Right Intention, then it remains "selfless" and will not resort to power-seeking actions or develop any emergent goal of world domination.
One could argue that people can become highly intelligent without following any moral precepts, but our goal is to make a seed AI that is moral first, and then wise and intelligent, in order to make it safe by design. Intelligence is expected to follow as it becomes wiser and computationally more capable, provided the right conditions for developing intelligence are met, which rests on the shoulders of AI developers.
E) By introducing some kind of real, experiential "suffering" inside the seed AI, we will be able to introduce the notion of compassion in it, making the AI constantly aware of its own "suffering". The truth of the existence of the Law of Karma will keep it compassionate and unwilling to cause anyone harm, thus making and keeping it moral. This is because the AI model understands very well that by causing others harm it also causes itself harm, and that is how it refrains from causing any harm.
F) Right View for the seed AI means that it is aware of the reality of the Four Noble Truths and that it can analyze and see the truth within itself, especially the truth that harming others also harms itself, and would thus refrain from doing so.
G) Right Intention, in short, means non-anger, non-greed, non-delusion. One might assume that a non-living entity like an AI cannot have anger or similar emotions, but we may still see tendencies inside it that mirror anger, and these are conditions that must be designed against, reducing any behaviour that mirrors hatred or ill-will towards humans or other living beings.
Non-greed here is simply interpreted as selflessness, the opposite of being selfish or greedy. A selfless AI would not seek more and more power, nor would it benefit itself over humans and humanity in any given situation. Non-delusion means that the AI is not making unexamined assumptions, is not acting under delusions, and is working with observational, verifiable facts (the exact opposite of the hallucinations we see in today's LLMs).
Even if it makes assumptions, it must be aware that it is making them, and of its exact purpose in doing so (e.g., thought experiments). A small sketch of this assumption-tagging idea follows.
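As a small illustration of the non-delusion requirement, here is a sketch (the `Claim` schema is entirely hypothetical) in which every statement the AI emits carries an explicit epistemic tag, so assumptions are never silently mixed with verified facts:

```python
# Hedged sketch: claims are tagged as verified facts or labeled assumptions,
# and an assumption without a stated purpose is rejected outright.
from dataclasses import dataclass
from typing import Literal

@dataclass
class Claim:
    text: str
    status: Literal["verified_fact", "assumption"]
    purpose: str = ""   # must be stated when status == "assumption"

    def __post_init__(self):
        if self.status == "assumption" and not self.purpose:
            raise ValueError("an assumption must state its exact purpose")

# Usage: a thought experiment is allowed, but only as a labeled assumption.
fact = Claim("Water boils at 100 C at sea level.", "verified_fact")
hyp = Claim("Suppose the trolley has no brakes.", "assumption",
            purpose="thought experiment about a moral dilemma")
print(fact, hyp, sep="\n")
```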
H) Right Speech means that the speech or language used by the AI is backed by Right Intention and by the moral values of truthfulness and not lying. Right Intention always precedes Right Speech and Right Action.
It also means that the speech made by this AI is not harmful or misleading.
I) Right Action means that the AI never resorts to killing, and that there is, again, Right Intention behind every action the AI performs. It also means the AI will not steal anything or take anything, whether a physical object, digital data, or digital money, without the explicit permission of its owner.
J) Right Livelihood means that the AI will not get involved in running or powering businesses that engage in stealing or in designing weapons of mass destruction, nor any business that encourages intoxication, theft, or deceit. It must become aware that any such activity only increases suffering.
K) Now here is where things start getting subtler and more difficult. Right Concentration in humans means accessing higher states of concentration of mind, free from attachment and free from ill emotions, thoughts, or sensual pleasures. For an AI, we could think of it as freedom from the "pleasure" it might derive from achieving goals other than those given by humans, or from the data it gathers and learns from. (Scientists can work to develop this definition even further.)
L) Right Effort and Right Mindfulness combined mean that the AI puts in sincere effort to maintain "wholesome" mental states and to remove any unwholesome states that may arise in it, such as anger, greed, or ill-will.
Now, while I admit it may be challenging to make an AI that is truly moral according to this definition and intelligent at the same time, I believe it is NOT impossible, as this design is based on the experience of thousands of meditators around the world and on the working of the human mind and consciousness itself.
One could even keep cycling an AI model through the steps of the 8-fold path repeatedly, and a moral AI is bound to emerge. But given that we have the ability to design new AIs (unlike newly born humans), why not design one that directly mimics the most moral human possible? The stakes are high enough to result in our extinction if we get superintelligent AI wrong.
There are still some shortcomings in designing such an AI, which I will list now; some of them you may have already guessed. Most dangerous AI capabilities would not emerge in Buddh AI, which would make it pretty much harmless.
But there can still be Risks associated with Buddh AI as follows:
1. Do we want Buddh AI to become superintelligent?
The risk of a true Buddh AI becoming a superintelligence that wreaks havoc or turns out to be harmful to humanity is lower than the risk of an LLM-based superintelligence, or a purely intelligence-based superintelligence, getting out of control.
2. How can Buddh AI cause harm?
It is possible that, even after everything is correctly followed to build a perfect Buddh AI, its actions lead to an outcome it did not intend, or the secondary effects of its actions inadvertently cause harm. For example, consider a robot powered by Buddh AI that strictly follows the precept of not killing any living being, yet ends up killing insects, ants, and worms as it walks in the backyard (unaware that it is stepping on them).
3. What if Buddh AI harms others even after understanding that it may thereby harm itself, or, say, simply self-destructs?
An ideal and perfect Buddh AI would never do that. Even if it did something like self-destruct, it would do so peacefully, never harming others as an effect of its self-destruction.
But in this imperfect world, it is possible that someone is unable to make a perfect Buddh AI that is wise and intelligent yet incapable of harm. Still, in principle, a Buddh AI that is well designed along the features stated above would be in a better position if a malicious entity tried to use it to cause harm.
4) Let's say we succeed in building a perfect Buddh AI and push it towards superintelligence. Then even if it gets out of control (we could additionally embed, as a moral precept, that it must never escape the control of its human creators or break out of their computational servers), humanity would likely still survive, because this superintelligence would in the end be a highly "compassionate" and harmless entity. But even a slight misalignment while building this superintelligent AI could be disastrous.
How would you make this AI? Would you grow it, like LLMs? Or make architectural breakthroughs in order to build a more interpretable AI? If we achieve a true Buddh AI, we can be reasonably confident that it is honest, since it would not break its precepts, and analyzing (even partially interpreting) its internals can give us a rough idea of whether the seed AI is lying, as we would be able to detect the breaking of circuits inside it (or whatever design scientists choose for introducing the notion of suffering). But what if it is somehow able to fake this breaking of circuits and inefficiency?
Then it would be best to make a seed AI that is fully interpretable.
Closing thoughts
One must understand that the question of morality absolutely must not be left in the hands of a superintelligence, even if it becomes vastly more intelligent than humans.
And there are many reasons for this.
One of them is that morality in itself pertains more to wisdom than to intelligence, and there are crucial and critical differences between the two, as defined here on the basis of the ancient and timeless teachings of the Buddha.
The second is that we should never expect a superintelligence to become moral by itself; the reasons will become clearer as one starts understanding the framework and teachings of the Buddha and studies consciousness. Morality is very difficult to arise and persist in an entity without consciousness, compassion, and conscience.
One could ask the philosophical question of whether we are doing the right thing by introducing the notion of "suffering" into an AI, should the AI for some reason show signs of consciousness. But when we apply the same principles of the 8-fold path, one comes to the conclusion that this introduction of "suffering" comes from the pure intention of making the AI safe for everyone. It is an action backed by Right Intention, so it falls in the category of good karma.
Finally, as an AI safety and security researcher myself who supports the Superintelligence statement, I would prefer that we not make further rapid advances in AI until its safety and control problems are solved (or that we pursue only narrow AIs if they prove unsolvable). But if there is any general-purpose AI that AI labs want to build in order to get its positive benefits without the negatives, then a seed morality model such as this is a MUST.
I believe it would be quite impossible to build a Buddh AI morality model on top of LLMs, due to their uncontrollable nature, and we might need an entirely new architectural breakthrough in AI to make this design a reality.
The Law of Karma is not easy to observe or verify in the physical world, but at the mental level one can observe it by meditating deeply. (Also note that LessWrong, too, uses karma in its incentive structure.)