"AI alignment" has the application, the agenda, less charitably the activism, right in the name. It is a lot like "Missiology" (the study of how to proselytize to "the savages") which had to evolve into "Anthropology" in order to get atheists and Jews to participate. In the same way, "AI Alignment" excludes e.g. people who are inclined to believe superintelligences will know better than us what is good, and who don't want to hamstring them. You can think we're well rid of these people. But you're still excluding people and thereby reducing the amount of thinking that will be applied to the problem.
"Artificial Intention research" instead emphasizes the space of possible intentions, the space of possible minds, and stresses how intentions that are not natural (constrained by evolution) will be different and weird.
And obviously "Artificial Intention" is an alliteration and a close parallel with "Artificial Intelligence", so it is very catchy. Catchiness matters a lot when you want an idea to catch on at scale!
Extremely superficially, it doesn't sound "tacked on" to Artificial Intelligence research, it sounds like a logical completion.
The necessity of alignment doesn't have to be in the name, because it logically follows from the focus on intention, with this very simple argument:
- Intention doesn't have to be conscious or communicable. It is just a preference for some futures over others, inferred as an explanation for behavior that chooses some futures over others. Even single-celled organisms have basic intentions if they move toward nutrients or away from bad temperatures.
- Therefore, anything that selectively acts in the world, including AI systems, can be modeled to have some intent that explains its behavior.
- So you're always going to get an intent, and if you don't design it thoughtfully you'll get an essentially random one.
- ...which is most likely bad (e.g. the paperclip maximizer) because it is random and different and weird.
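The bullets above treat "intent" as a preference that an observer reads off from behavior, with no claim about consciousness. A minimal toy sketch of that inference (my own illustration; the thermostat-like "agent" and all names are hypothetical): an agent whose hidden policy always moves toward warmth, and an observer who recovers "prefers warmer" purely from its choices, i.e. revealed preference.

```python
# Toy sketch of intent-as-revealed-preference (illustration only, not from
# the post). The agent's hidden policy: pick the action yielding the
# warmest next state. The observer never sees the policy, only choices.

def agent_step(state, actions):
    """Hidden policy: choose the action that maximizes temperature."""
    return max(actions, key=lambda a: state + a)

def infer_preference(observations):
    """Observer: from (state, actions, chosen) triples, list which
    outcomes the agent revealed a preference for over which others."""
    preferred_over = []
    for state, actions, chosen in observations:
        for a in actions:
            if a != chosen:
                preferred_over.append((state + chosen, state + a))
    return preferred_over

# Watch the agent act for a few steps.
obs = []
state = 20.0
for actions in [(-1.0, 0.5), (-0.5, 2.0), (1.0, -3.0)]:
    chosen = agent_step(state, actions)
    obs.append((state, actions, chosen))
    state += chosen

# Every revealed comparison shows the agent preferring warmer outcomes,
# so "wants to be warm" is a valid inferred intention for this behavior.
pairs = infer_preference(obs)
assert all(better > worse for better, worse in pairs)
```

The point of the sketch is only that an intention can be *attributed* from outside: nothing in `agent_step` is conscious, yet the observer's inferred preference ordering is a perfectly good explanation of its behavior.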
So this would continue to be useful for alignment, just as anthropology remained useful to the missionaries, and in fact proved more useful than the original missiology.
Having the Intelligence (the "I" in "AI Alignment") only implicitly present (via the alliteration and the close parallel) might lose some of the focus on how intelligence makes the intention so much more relevant, if that isn't obvious enough already. But it also lets us look at Grey Goo scenarios, another existential risk worth preventing.
Changing names will cause confusion, which is bad. But the shift from "friendly AI" to "AI alignment" went fine, because "AI alignment" just is a better name than "friendly AI". I imagine there wouldn't be much more trouble in a shift to an even better one. After all, "human-compatible" seems to be doing fine as well.
What do you think?
Do philosophers commonly use the word "intention" to refer to mental states that have intentionality, though? For example, from the SEP article on intentionality:
>intention and intending are specific states of mind that, unlike beliefs, judgments, hopes, desires or fears, play a distinctive role in the etiology of actions. By contrast, intentionality is a pervasive feature of many different mental states: beliefs, hopes, judgments, intentions, love and hatred all exhibit intentionality.
(This is specifically where it talks about how intentionality and the colloquial meaning of intention must not be confused, though.)
Ctrl+f-ing through the SEP article gives only one mention of "intention" that seems to refer to intentionality. ("The second horn of the same dilemma is to accept physicalism and renounce the 'baselessness' of the intentional idioms and the 'emptiness' of a science of intention.") The other few mentions of "intention" seem to use the colloquial meaning. The article seems to generally avoid the word "intention", using "intentional" and "intentionality" instead.
Incidentally, there's also an SEP article on "intention" that does seem to be about what one would think it to be about. (E.g., the first sentence of that article: "Philosophical perplexity about intention begins with its appearance in three guises: intention for the future, as I intend to complete this entry by the end of the month; the intention with which someone acts, as I am typing with the further intention of writing an introductory sentence; and intentional action, as in the fact that I am typing these words intentionally.")
So as long as we don't call it "artificial intentionality research" we might avoid trouble with the philosophers after all. I suppose the word "intentional" becomes ambiguous, however. (It is used >100 times in both SEP articles.)