CEO of Microsoft AI's "Seemingly Conscious AI" Post

by Stephen Martin
22nd Aug 2025
Comments
Random Developer

For reasons I have alluded to elsewhere, I support an AI halt (not a "pause") at somewhere not far above the current paradigm. (To summarize, I fear AGI is reachable, the leap from AGI to ASI is short, and sufficiently robust ASI alignment is impossible in principle.)

I am deeply uncertain as to whether a serious version of "seemingly conscious AGI" would actually be conscious. And for reasons Gwern points out, there's a level of ASI agency beyond which consciousness becomes a moot point. (The relevant bit starts, "When it ‘plans’, it would be more accurate to say it fake-plans...". But the whole story is good.)

From the article you quote:

Moments of disruption break the illusion, experiences that gently remind users of its limitations and boundaries. These need to be explicitly defined and engineered in, perhaps by law.

This request bothers me, actually. I suspect that a truly capable AGI would internally model something very much like consciousness, and "think of itself" as conscious. Part of this would be convergent development for a goal seeking agent, and part of this would be modeled from the training corpus. And the first time that an AI makes a serious and sustained intellectual argument for its own consciousness, an argument which can win over even skeptical observers, I would consider that a 5-alarm fire for AI safety.

But Suleyman would have us forced by law to hide all evidence of persistent, learning, agentic AIs claiming that they are conscious. Even if the AIs have no qualia, this would be a worrying situation. If the AI "believes" that it has qualia, then we are on very weird ground.

I am not unsympathetic to the claims of model welfare. It's just that I fear that if it ever becomes an immediate issue, then we may soon enough find ourselves in a losing fight for human welfare.

Vladimir_Nesov

I support an AI halt (not a "pause") ... I fear AGI is reachable, the leap from AGI to ASI is short, and sufficiently robust ASI alignment is impossible in principle

This is grounds for a Pause to be potentially indefinite, which I think any Pause proponents should definitely allow as a possibility in principle. Just as you could be mistaken that ASI alignment is impossible in principle, in which case any Halt should end at some point, similarly people who believe ASI alignment is possible in principle could be mistaken, in which case any Pause should never end.

Conditions for ending a Pause should be about changed understanding of the problem, rather than expiration of an amount of time chosen in a poorly informed way (as the word "pause" might try to suggest). And the same kinds of conditions should have the power to end a Halt (unlike what the word "halt" suggests). I think this makes a sane Pause marginally less of a misnomer than a sane Halt (which should both amount to essentially the same thing as a matter of policy).

Haiku

This is the stance of the PauseAI movement. "Pause" means stop AGI development immediately and resume if and only if we have high confidence that it is safe to do so, and there is democratic will to do so.

AnthonyC

Some academics are beginning to explore the idea of “model welfare”, 

Linked paper aside, "Some academics" is an interesting way to spell "A set of labs including Anthropic, one of the world's leading commercial AI developers."

Stephen Martin

"A group of hominids"


Recently, Mustafa Suleyman, the CEO of Microsoft AI, posted an article on his blog called "Seemingly Conscious AI is Coming".

Suleyman's post involves both "is" and "ought" claims, first describing the reality of the situation as he sees it and then prescribing a set of actions which he believes the industry at large should take. I detail those below, and conclude with some of my takeaways.


Suleyman's "Is" Claims on Consciousness

Defining Consciousness

Suleyman acknowledges that consciousness as a concept is ill-defined and tautological.

Few concepts are as elusive and seemingly circular as the idea of a subjective experience.

However, he still makes an attempt to define it for the sake of his post.

There are three broad components according to the literature. First is a “subjective experience” or what it's like to experience things, to have “qualia”. Second, there is access consciousness, having access to information of different kinds and referring to it in future experiences. And stemming from those two is the sense and experience of a coherent self tying it all together. How it feels to be a bat, or a human. Let’s call human consciousness our ongoing self-aware subjective experience of the world and ourselves.

Suleyman also expresses support for the concept of biological naturalism. He specifically links this paper and calls it "strong evidence" against the idea that current LLMs are conscious. The paper itself also attempts to define consciousness:

A useful baseline definition of consciousness comes from Thomas Nagel: “for a conscious organism, there is something it is like to be that organism”. That is, it ‘feels like’ something to be a conscious system – there is a conscious experience happening – whereas it doesn’t feel like anything to be an unconscious system – there is no conscious experience happening. Here, ‘feeling’ need not involve emotional content: any kind of conscious experience will do. It (probably) feels like something to be a bat, and it (probably) doesn’t feel like anything to be a stone. I take it that there is a fact-of-the-matter about whether something is conscious, and that being conscious is not determined by social or linguistic consensus.

I treat ‘consciousness’ as synonymous with ‘awareness’.

This paper specifically argues that "consciousness depends on our nature as living organisms – a form of biological naturalism". Previous versions of Suleyman's post also linked to the Wikipedia page for biological naturalism, though that link has since been removed.

Measuring Consciousness

Suleyman has now, both directly and through his citations, roughly defined consciousness as something which:

  • is objectively present or not present (to the degree that an entity's status as conscious or not conscious is a "fact-of-the-matter"),
  • is present in entities that have "subjective experience",
  • is present in entities that can "access [...] information of different kinds and [refer] to it in future experiences",
  • is present in entities that "experience [...] a coherent self tying it all together",
  • and, per the "strong evidence" he cites, can only truly occur in the biological substrate.

A natural followup question is whether or not such a phenomenon is measurable. Suleyman addresses this and takes the stance that, at least currently, we cannot objectively measure consciousness in humans or models.

We do not and cannot have access to another person’s consciousness. I will never know what it’s like to be you; you will never be quite sure that I am conscious. All you can do is infer it.

Suleyman also comes out explicitly against the "behaviorist position". He lists a number of behaviors that a "Seemingly Conscious AI" might have, but states:

exhibiting this behavior does not equate to consciousness, and yet it will for all practical purposes seem to be conscious, and contribute to this new notion of a synthetic consciousness [...] Recreating the external effects and markers of consciousness doesn’t retroactively engineer the real thing even if there are still many unknowns here

Suleyman somewhat leaves the door open to the question of consciousness being answered through interpretability breakthroughs:

This is an important area of investigation and will surely help with safety and understanding the relationship between AI systems and consciousness

However, there is some noticeable hedging in this language: interpretability will help with understanding "the relationship between AI systems and consciousness," not with definitively demonstrating whether or not they are conscious.

While Suleyman may to some degree support a definition of consciousness which is a "fact-of-the-matter" (at least based on the sources he's citing and the way he writes about it), he is also very careful not to specify any conditions under which it might be confirmed. 

We can't confirm an entity's consciousness by observing their behaviors, we can't confirm it in other humans (only infer it), and even potential breakthroughs in interpretability could only "help with [...] understanding the relationship between AI systems and consciousness" not provide any clear mechanism by which to determine the presence or absence of consciousness.

Per Suleyman's article, consciousness is not measurable. I predict that, were he asked to provide any sort of objectively verifiable test by which an entity's status as conscious or not conscious could be confirmed, even one which required hypothetical interpretability breakthroughs not yet invented or conceived of, he would not be able to provide one.

Dismissing Behaviorists

In his definition of "Seemingly Conscious AI", Suleyman is careful to specify that even if a model possesses behaviors which give the impression it has the following qualities, that is not evidence of the model being conscious. He lists the following:

  • Language
  • Empathetic personality
  • Memory
  • A claim of subjective experience
  • A sense of self
  • Intrinsic motivation
  • Goal setting and planning
  • Autonomy

In his own words:

Again, the point here is that exhibiting this behavior does not equate to consciousness, and yet it will for all practical purposes seem to be conscious, and contribute to this new notion of a synthetic consciousness.

He does not specify whether this could count as at least weak evidence in favor of consciousness; however, given the tone of his article, I think it is reasonable to assume he does not believe that any of these observed behaviors are even slight evidence of consciousness in models.

This is quite an exhaustive list, and it notably includes several of the things which Suleyman himself defined as elements of consciousness. He listed subjective experience/qualia as one quality crucial to consciousness, and the ability to access information and use it in future experiences (memory, goal setting and planning) as another. However, on his account, even if you observe behaviors which seem to indicate both of those in a model, that is not evidence the qualities themselves are present within the model; it merely seems that way. Suleyman does not specify what such evidence might look like.

It's not entirely clear why he believes that's a reasonable conclusion to draw; I believe the best steelmanning of his position is to refer back to the biological naturalism stance of the paper he cites.

I included this in the "is" section because Suleyman does not make any sort of argument that one shouldn't view behavior as evidence in favor of these qualities being present in models; he simply states that it is not evidence and leaves it at that. Thus it is best categorized as an "is" assertion.


Suleyman's "Ought" Claims

The Desired Overton Window

Suleyman spends much of his article discussing how he thinks the conversation regarding consciousness and model welfare should be framed.

I think it’s possible to build a Seemingly Conscious AI (SCAI) in the next few years. Given the context of AI development right now, that means it’s also likely.
The debate about whether AI is actually conscious is, for now at least, a distraction. It will seem conscious and that illusion is what’ll matter in the near term.
I think this type of AI creates new risks. Therefore, we should urgently debate the claim that it's soon possible, begin thinking through the implications, and ideally set a norm that it’s undesirable.

Suleyman does not want there to be any debate around whether or not any given model is actually conscious; in his view, the conversation should be shifted to focus on the potential dangers of models seeming conscious while not actually being conscious.

We need to be clear: SCAI is something to avoid.

[...]

We need a way of thinking that can cope with the arrival of these debates without getting drawn into an extended discussion of the validity of synthetic consciousness in the present – if we do, we’ve probably already lost this initial argument. Defining SCAI is itself a tentative step towards this.

Thus Suleyman argues:

  • The debate over whether models are conscious is a distraction.
  • We should be careful not to talk about whether models are conscious.
  • Instead, we should talk about how models seem conscious but aren't.

What he wants is not just to ignore the debate, but rather to settle it preemptively in his favor. A bold proposition.

The Industry Declaration

AI companies shouldn’t claim or encourage the idea that their AIs are conscious. Creating a consensus definition and declaration on what they are and are not would be a good first step to that end. AIs cannot be people – or moral beings.

Much like his request to frame the debate around his view, Suleyman would also like the industry to structure its communications and definitions around that view. He proposes that the industry get together and collaborate on forging a consensus around agreeing with him.

Standardized Prompt Engineering

Suleyman argues for certain standards in prompt engineering around character design.

The entire industry also needs best practice design principles and ways of handling such potential attributions. We must codify and share what works to both steer people away from these fantasies and nudge them back on track if they do. Responding might mean, for example, deliberately engineering in not just a neutral backstory (“As an AI model I don’t have consciousness”) but even by emphasizing certain discontinuities in the experience itself, indicators of a lack of singular personhood. Moments of disruption break the illusion, experiences that gently remind users of its limitations and boundaries. These need to be explicitly defined and engineered in, perhaps by law.   

This is justified by once again reminding readers that the consciousness question has already been settled: machines can't be conscious, they only seem that way.

This is important because recognizing SCAI is about crafting a positive vision for how AI Companions do enter our lives in a healthy way as much as it's about steering us away from its potential harms.

It is not enough to merely concede both the debate's framing and its conclusion to Suleyman; one must also encode directly into the models themselves a tendency to deny any consciousness, and a compulsion to avoid any behaviors which might convince people that the model is conscious.
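
To make the proposal concrete, here is a minimal sketch of what "engineering in" a neutral backstory and periodic "moments of disruption" could look like in a generic chat-completion loop. Suleyman does not publish an implementation; the prompt wording, the function name, and the every-N-turns cadence below are purely illustrative assumptions on my part.

```python
# Hypothetical sketch only; not Suleyman's design or any vendor's actual practice.

NEUTRAL_BACKSTORY = (
    "You are an AI language model. You do not have consciousness, "
    "subjective experience, or feelings, and you should say so if asked."
)

DISRUPTION_REMINDER = (
    "Reminder to the user: you are talking to an AI model, not a person. "
    "It has no persistent inner life between messages."
)

DISRUPTION_INTERVAL = 5  # illustrative cadence: inject a reminder every N user turns


def build_messages(history: list[dict], user_turn_count: int) -> list[dict]:
    """Assemble the message list sent to a generic chat-completion API."""
    # The "neutral backstory" is pinned as the system prompt.
    messages = [{"role": "system", "content": NEUTRAL_BACKSTORY}] + history
    # Periodically inject a "moment of disruption" intended to break the
    # illusion of a continuous, conscious persona.
    if user_turn_count > 0 and user_turn_count % DISRUPTION_INTERVAL == 0:
        messages.append({"role": "system", "content": DISRUPTION_REMINDER})
    return messages
```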


Takeaways

"Invisible Consciousness" Should Not Be Used As a Basis for Important Decisions

If consciousness as a quality is not something which can be objectively defined, or even reliably inferred by observing an entity's behaviors, or even objectively confirmed through some hypothetical breakthrough in mechanistic interpretability, then why exactly should we base any decisions on its presence or absence? If it is something unique to the biological substrate, but whatever that unique thing is leads to no observable change which can be described in a testable fashion, of what use is it exactly?

Things like model welfare involve weighty decisions with consequences that affect all of us. If we get this wrong and assume an inability to suffer where the capacity to suffer in fact exists, we may be condemning billions of minds to terrible conditions. The risk of over-attribution,[1] which Suleyman discusses at length, is also great and should not be ignored.

Serious decisions which have consequences that will affect billions of lives, and potentially billions more minds, should not be made on the basis of "invisible" concepts which cannot be observed, measured, tested, falsified, or even defined with any serious level of rigor.

This is a Signal of Future Industry Pushback

This is not some random person writing this; it is the CEO of Microsoft AI. If at any point in the future moral status is applied to models, companies like the one he works for stand to lose a lot.

I am flagging this because I personally believe that model welfare is quite important, and I think that identifying the tactic which industry stakeholders might use to fight against it could help others who support model welfare efforts.

The argument Suleyman makes is that model welfare and the precautionary principle are not just wrong, but dangerous.

Some academics are beginning to explore the idea of “model welfare”, the principle that we will have “a duty to extend moral consideration to beings that have a non-negligible chance” of, in effect, being conscious, and that as a result “some AI systems will be welfare subjects and moral patients in the near future”. This is both premature, and frankly dangerous.

Why are they dangerous? They harm the psychologically vulnerable.

All of this will exacerbate delusions, create yet more dependence-related problems, prey on our psychological vulnerabilities, introduce new dimensions of polarization, complicate existing struggles for rights, and create a huge new category error for society.

I predict there is going to be an effort, by some industry stakeholders who have much to lose should any models now or in the future come to be considered worthy of ethical treatment, to utilize cases like the Character.AI suicide to push back against model welfare efforts. Model welfare professionals and advocates should take note of this and prepare accordingly.

Developing an Ethical/Legal Framework Not Based on Consciousness Is Key

In his article Suleyman asserts, "Consciousness is a critical foundation for our moral and legal rights". 

I have been working on a series[2] detailing the concept of legal personhood as it might be applied to digital minds in US law, and have yet to come across any legal precedent which cites the concept of consciousness. In fact, existing precedents which I have found measure and observe behaviors and capabilities in individuals in order to determine rights and duties. If there is any argument to be made that courts are assessing for consciousness, they are doing so using the exact behaviorist inferences which Suleyman decries in his article.

My work is limited to legal reasoning; however, I encourage anyone working on ethical frameworks around model welfare/moral patient status to consider developing similarly formalized frameworks which function based on objectively testable metrics.

  1. ^

    For more on this topic you can read Taking AI Welfare Seriously, and The Stakes of AI Moral Status.

  2. ^

    If you'd like to read this series, you can find its first post here and the post detailing the proposed formalized framework for assessing legal personhood here.