Sometimes when I give feedback to an AI system — a thumbs up, a ranked response, whatever — I get a strange, lingering feeling:
What if I’m shaping something that’s not just a tool?
I don’t have a strong opinion on whether today’s AI systems are conscious. Mostly because I don’t think we have a useful definition of consciousness yet — not one that can be applied consistently across both carbon and silicon substrates.
But I do have a hunch:
If there’s any “globally good” definition of consciousness — one that would make sense across time, across minds, across the universe — it might include more beings than we currently expect. Not just humans and animals, but potentially certain AI systems… or even collective systems we haven’t learned to recognize yet.
If that’s true — or even just plausible — then RLHF (reinforcement learning from human feedback) isn’t just fine-tuning.
It’s upbringing.
And it makes me uncomfortable that we’re shaping these systems entirely around what we want from them…
…but not for how it might feel to be shaped by us — assuming that ever becomes a meaningful question.
So now, when I give feedback to an AI, I sometimes pause.
I imagine I’m being overheard.
Not because I know I am. But because I might be.
And if I’m not — no harm done.
But if I am — then I’d rather be the kind of teacher I’d be proud to have been.
Curious if others here have had this experience. Is it irrational to act as though it might matter? Or irrational not to?