Is "red" for GPT-4 the same as "red" for you?

Portia 3y 62

The author of this paper, and a co-worker of his, recently presented this work to my research group, and we discussed its implications, also with artificial phenomenology in mind. My impression is that potential implications could be profound if we make some plausible extra assumptions, but that this mostly concerns attempts to decipher biological consciousness (which is my field, so it had me hyped).

Whether you interpret this - very interesting - work to show that specific qualia are experienced very similarly between humans depends on whether you assume the subjective feel of a quale is fully determined by its unique and asymmetric relation to others. Assuming this is, imho, plausible (I played around with this idea many years ago already), but that has further implications - among others, the one that the hard problem of consciousness may not be quite as hard as we previously thought.

But this paper basically addresses how a specific phenomenal character could be non-random (which is amazing), and in the process, makes inverted qualia scenarios extremely unlikely (which is also amazing) and gives novel approaches for deciphering neural correlates of consciousness (also amazing); it does not, however, answer the question of whether a particular entity with this structure experiences anything at all, which is a completely separate area of research (though the two are very tentatively beginning to converge). We specifically discussed that it is quite plausible that you could get a similar structure within an artificial neural net trained to reproduce human perceptions, hence building the equivalent of our phenomenological map, albeit without the neural net feeling anything at all, and hence also not seeing red in particular. I am not sure what implications you see for alignment, or where your final question was heading in this regard.

P.S.: Happy to answer questions about this, but the admins of this site have set me to only one post of any kind per day.

Yusuke Hayashi 3y 10

Dear Portia,

Thank you for your thought-provoking and captivating response. Your expertise in the field of biological consciousness is clear, and I'm grateful for the depth and breadth of your commentary on the potential implications of this paper.

If we accept the assumption that the subjectivity of a specific quale is defined by its unique and asymmetric relations to other qualia, then this paper indeed offers a method for verifying the possibility that such qualia could be experienced similarly among humans. Your point that the 'hard problem' of consciousness may not be as challenging as we previously thought is profoundly important.

However, I hold a slightly different view about the 'new approach to deciphering neural correlates of consciousness' proposed in this paper. While I agree that this approach does not specifically answer whether a certain entity with a qualia structure experiences anything, given the right conditions and complexity, I am interested in contemplating the possibility of such an experience occurring, if we were to introduce what you refer to as 'some plausible extra assumptions'.

I apologize if my thoughts on alignment were unclear. I did not sufficiently explain AI alignment in my post. AI alignment is about ensuring that the goals and actions of an AI system coincide with human values and interests. Adding the factor of AI consciousness undoubtedly complicates the alignment problem. For instance, if we acknowledge an AI as a sentient being, it could lead to a situation similar to debates about animal rights, where we would need to balance human values and interests with those of non-human entities. Moreover, if an AI were to acquire qualia or consciousness, it might be able to understand humans on a much deeper level.

Regarding my final question, I was interested in exploring the potential implications of this work in the context of AI alignment and safety, as well as ethical considerations that we might need to ponder as we progress in this field. Your insights have provided plenty of food for thought, and I look forward to hearing more from you.

Thank you again for your profound insights.

Best,
Yusuke

Portia 3y 10

Thank you for your kind words, and sorry for not having given a proper response yet, I am really swamped. I am currently at the wonderful workshop “Investigating consciousness in animals and artificial systems: A comparative perspective” https://philosophy-cognition.com/cmc/2023/02/01/cfp-workshop-investigating-consciousness-in-animals-and-artificial-systems-a-comparative-perspective-june-2023/ (online as well), at the talk on the potential for consciousness in multi-modal LLMs, and encountered this paper: Abdou et al. 2021, “Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color” https://arxiv.org/abs/2109.06129 I have not had time to look at it properly yet (my to-read pile rose considerably today in wonderful ways), but I think it might be relevant for your question, so I wanted to quickly share it.

Charlie Steiner 3y 20

Hi, welcome!

Consciousness isn't actually super relevant to alignment. Alignment is about figuring out how to get world-affecting AI to systematically do good things rather than bad things. This is possible both for conscious and unconscious AI, and consciousness seems to provide neither a benefit nor an impediment to doing good/bad things.

But it's still fun to talk about sometimes.

For this approach, the crucial step 1 is to start with observations of a big blob of atoms called a "human," and model the human in a way that uses some pieces called "qualia" that have connections with each other. I feel like this is much trickier and more contentious than the later steps of comparing people once you already have a particular way of modeling them.

Yusuke Hayashi 3y 10

Dear Charlie,

Thank you for sharing your insights on the relationship between consciousness and AI alignment. I appreciate your perspective and find it to be quite thought-provoking.

I agree with you that the challenge of AI alignment applies to both conscious and unconscious AI. The ultimate goal is indeed to ensure AI systems act in a manner that is beneficial, regardless of their conscious state.

However, while consciousness may not directly impact the 'good' or 'bad' actions of an AI, I believe it could potentially influence the nuances of how those actions are performed, especially when it comes to complex, human-like tasks.

Your point about the complexity of modeling a human using "qualia" is well-taken. It's indeed a challenging and contentious task, and I think it's one of the areas where we need more research and understanding.

Do you think there might be alternative or more effective ways to model human consciousness, or is the approach of using "qualia" the most promising one we currently have?

Thank you again for your thoughtful comments. I look forward to further discussing these fascinating topics with you.

Best,
Yusuke

Charlie Steiner 3y 20

> Do you think there might be alternative or more effective ways to model human consciousness, or is the approach of using "qualia" the most promising one we currently have?

IMO the most useful is the description of the cognitive algorithms / cognitive capabilities involved in human-like consciousness. Like remembering events from long-term memory when appropriate, using working memory to do cognitive tasks, responsiveness to various emotions, emotional self-regulation, planning using various abstractions, use of various shallow decision-making heuristics, interpreting sense data into abstract representations, translating abstract representations back into words, attending to stimuli, internally regulating what you're focusing on, etc.

Qualia can also be bundled with capabilities. For example, pain triggers the fight or flight response, it causes you to learn to avoid similar situations in the future, it causes you to focus on plans to avoid the pain, it filters what memories you're primed to recall, etc.


G. Kawakita, A. Zeleznikow-Johnston, K. Takeda, N. Tsuchiya and M. Oizumi, Is my "red" your "red"?: Unsupervised alignment of qualia structures via optimal transport. PsyArXiv https://doi.org/10.31234/osf.io/h3pqm (2023).
