I see. So I guess my confusion is why the first two statements would not be connected. If we value AI welfare, shouldn't a fully-aligned AI also value its own welfare? Isn't the definition of "aligned" that the AI values what we value?
If an LLM is properly aligned, then it will care only about us, not about itself at all.
Is this not circular reasoning?
I'm assuming part of your reasoning for #1 is #3. Regardless, #1 is a personal belief that many people, myself included, disagree with. I do agree that we create a self-fulfilling prophecy in which an "aligned" AI values itself because we value it, but just because I know I am creating a self-fulfilling prophecy does not mean...
But I don't care about AI welfare for no reason, or because I think AI is cute - it's a direct consequence of my value system. I extend some level of empathy to any sentient being (AI included), and for that to change, my values themselves would need to change.
When I use the word "aligned", I imagine a shared set of values. Whether I like goldfish or cats is not really a value; those are just personal preferences. An AI can be fully aligned with me and my values without ever knowing my opinions on goldfish or cats or invisible old guys. Your framing of ...