Hi LW, this article has put some doubt in my mind as to whether researching AI alignment is worthwhile, or whether it is a frivolity that detracts from more pressing issues such as mitigating harmful biases.

[Discussing AI consciousness] is a distraction, when current AI systems are increasingly pervasive, and pose countless ethical and social justice questions that deserve our urgent attention

I would like to hear some opinions from this community on the sentiment expressed in the above quote.

First, though, an introduction, or why this has prompted me to emerge from my decade-long lurker state:

I am a computational linguistics MS student approaching my final semester and its accompanying research capstone. My academic background is in linguistics, not computer science, philosophy, math, or anything that would actually be useful for doing serious AI research. c; Nevertheless, I want to try to help, as best I can.

For my capstone, I have been considering either continuing a previous bias mitigation project (which I have lost most of my enthusiasm for due to group project burnout), or using LLMs to extract moral themes from literature (an idea my thoughts keep coming back to), in the hopes that the ability to encapsulate the human values expressed in a text may someday be useful to the AI alignment task. The thought process behind this was: although human beings can't even agree on human values, we all somehow manage to come up with a sense of morality based on our cultural upbringing, and one common form of character education for children is storytelling (Aesop's fables, Narnia...). If human children can learn virtues from fiction, why not AI?

The part of Lemoine's dialogue with LaMDA where it succinctly summarizes the moral themes of Les Mis made me think this may be a good idea to play around with. [Note, just to be extremely clear: I in no way imagine LLMs to be sentient; Lemoine is misguided by his religious thinking. But I do think LLMs are likely to be a component of a future system that could become conscious.]
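To make the idea a bit more concrete, here is a minimal sketch of the extraction step I have in mind, written against the OpenAI Python client purely as a stand-in for whatever LLM I would actually use; the model name, prompt wording, and extract_moral_themes helper are placeholders for illustration, not anything I've built or tested:

```python
# Minimal sketch: ask an LLM to name the moral themes expressed in a passage.
# Assumes the OpenAI Python client (openai>=1.0) with an API key in the
# environment; any instruction-following LLM interface could be swapped in.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT_TEMPLATE = (
    "Below is a passage from a work of fiction.\n\n"
    "Passage:\n{passage}\n\n"
    "List the moral themes or virtues the passage expresses, one per line, "
    "each with a one-sentence justification grounded in the text."
)

def extract_moral_themes(passage: str, model: str = "gpt-4o-mini") -> str:
    """Ask an LLM to summarize the moral themes expressed in a passage."""
    response = client.chat.completions.create(
        model=model,  # placeholder model name
        messages=[{"role": "user", "content": PROMPT_TEMPLATE.format(passage=passage)}],
        temperature=0,  # keep the extraction as repeatable as the API allows
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    aesop = (
        "A grasshopper spent the summer singing while an ant stored food for winter. "
        "When winter came, the grasshopper begged the ant for help."
    )
    print(extract_moral_themes(aesop))
```

The hard (and interesting) part for a capstone would be evaluating the outputs, for example against human-annotated themes for the same passages, rather than the extraction call itself.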

However, criticism like the article above makes me feel I may just be selfishly wasting my time, and that I should go back to doing sociolinguistics rather than frivolous literary analysis, even if Aesops are more fun.

I would love to know your thoughts, whether about my personal predicament or the article in general.

I read the article and, to be honest, I struggled to follow her argument or to understand why it impacts your decision to work on AI alignment. Maybe you can explain further?

The headline, "Debating Whether AI is Conscious Is A Distraction from Real Problems", is a reasonable claim, but the article also makes claims like...

"So from the moment we were made to believe, through semantic choices that gave us the phrase “artificial intelligence”, that our human intelligence will eventually contend with an artificial one, the competition began... The reality is that we don’t need to compete for anything, and no one wants to steal the throne of ‘dominant’ intelligence from us."

and

"superintelligent machines are not replacing humans, and they are not even competing with us."

Her argument (elsewhere in the article) seems to be that people concerned with AI Safety see Google's AI chatbot, mistake its output for evidence of consciousness and extrapolate that consciousness implies a dangerous competitive intelligence.

But that isn't at all the argument for the Alignment Problem that people like Yudkowsky and Bostrom are making. They're talking about things like the Orthogonality Thesis and Instrumental Convergence. None of them agree that the Google chatbot is conscious. Most, I suspect, would disagree that an AI needs to be conscious in order to be intelligent or dangerous.

Should you work on mitigating social justice problems caused by machine learning algorithms rather than AI safety? Maybe. It's up to you.

But make sure you hear the Alignment Problem argument in its strongest form first. As far as I can tell, that form doesn't rely on anything this article is attacking.

I probably should have included a link to the original Twitter thread that sparked the article, in which the author says bluntly that she will no longer discuss AI consciousness/superintelligence. Those two had become conflated, so thanks for pointing that out!

With regard to instrumental convergence (I just browsed the Arbital page), are you saying the big names working on AI safety are now more focused on incidental catastrophic harms caused by a superintelligence on its way to achieving its goals, rather than making sure artificial intelligence will understand and care about human values?

Somebody else might be able to answer better than me. I don't know exactly what each researcher is working on right now.

“AI safety are now more focused on incidental catastrophic harms caused by a superintelligence on its way to achieving its goals”

Basically, yes. The fear isn’t that AI will wipe out humanity because someone gave it the goal ‘kill all humans’.

For a huge number of innocent-sounding goals, ‘incapacitate all humans and other AIs’ is a really sensible precaution to take if all you care about is getting your chances of failure down to zero. As is hiding the fact that you intend to do harm until the very last moment.

“rather than making sure artificial intelligence will understand and care about human values?”

If you solved that, then presumably the first bit solves itself. So they’re definitely linked.

From my beginner's understanding, the two things you are comparing are not mutually exclusive.

There is currently work being done on both inner alignment and outer alignment. Inner alignment is more focused on making sure that an AI doesn't coincidentally optimize humanity out of existence because we didn't teach it a clear enough version of our goals (or because it misinterpreted them), while outer alignment is more focused on making sure the goals we teach it are actually aligned with human values.

Different big names focus on different parts/subparts of the above (with crossover as well).

we all somehow manage to come up with a sense of morality based on our cultural upbringing, and one common form of character education for children is storytelling (Aesop's fables, Narnia...). If human children can learn virtues from fiction, why not AI?

The fiction that humans tell each other appeals to human instincts. AI does not have such instincts.

That sounds like the claim "no AI can have or get them".

But an AI could learn them; it could develop them based on a dataset of people's stories. That looks especially possible with the approach that is currently being used.

no, superintelligent machines are not replacing humans, and they are not even competing with us.

I do not think the author has read Superintelligence.

 

In fact, these large language models are merely tools made so well that they manage to delude us

Eliminativist philosophers would say approximately the same thing about the neural net in the brain.

 

I would be happy to hear an argument in favor of developing models of ‘conscious’ artificial intelligence. What would be its purpose, aside from proving that we can do it? But that is all it would be

I believe consciousness is a prerequisite for moral agency. Determining what is or is not conscious is therefore a very important moral problem; I think Robert Wiblin summarizes it correctly:

 

https://twitter.com/robertwiblin/status/1536345842512035840

Failing to recognise machine consciousness is one moral catastrophe scenario. But prematurely doing so just because we make machines that are extremely skilled at persuasive moral advocacy is another path to disaster

If an AI (or a chatbot that the AI created based on a prompt) that can pass the Turing test is sentient, it does matter, because it's a moral agent whose well-being and preferences should be considered among the well-being and preferences of humans. When considering moral questions, all moral agents have to be counted, not just humans.

I really, really hope you get into AI work. I'm a big advocate for arts and other human qualities being part of AI dev. Of course, it isn't really understood yet how much of that will integrate, but if we get folks like you in there early, you'll be able to help guide the good human stuff in when it becomes clearer how. Viliam commenting below that AI lacks such human instincts is exactly the point... it needs to get them ASAP before things start going down the wrong road. I would guess that eventually we will be evaluating progress by how much an AI does show these qualities. Of course, it's still early now.

It's becoming clear that, with all the brain and consciousness theories out there, the proof will be in the pudding. By this I mean: can any particular theory be used to create a human-adult-level conscious machine? My bet is on the late Gerald Edelman's Extended Theory of Neuronal Group Selection. The lead group in robotics based on this theory is the Neurorobotics Lab at UC Irvine.

Dr. Edelman distinguished between primary consciousness, which came first in evolution and which humans share with other conscious animals, and higher-order consciousness, which came to humans alone with the acquisition of language. A machine with primary consciousness will probably have to come first.

The thing I find special about the TNGS is the Darwin series of automata created at the Neurosciences Institute by Dr. Edelman and his colleagues in the 1990s and 2000s. These machines perform in the real world, not in a restricted simulated world, and display convincing physical behavior indicative of higher psychological functions necessary for consciousness, such as perceptual categorization, memory, and learning. They are based on realistic models of the parts of the biological brain that the theory claims subserve these functions. The extended TNGS allows for the emergence of consciousness based only on further evolutionary development of the brain areas responsible for these functions, in a parsimonious way. No other research I've encountered is anywhere near as convincing.

I post because, on almost every video and article about the brain and consciousness that I encounter, the attitude seems to be that we still know next to nothing about how the brain and consciousness work; that there's lots of data but no unifying theory. I believe the extended TNGS is that theory. My motivation is to keep that theory in front of the public. And obviously, I consider it the route to a truly conscious machine, primary and higher-order.

My advice to people who want to create a conscious machine is to seriously ground themselves in the extended TNGS and the Darwin automata first, and proceed from there, possibly by applying to Jeff Krichmar's lab at UC Irvine. Dr. Edelman's roadmap to a conscious machine is at https://arxiv.org/abs/2105.10461