Thoughts on the Alignment Implications of Scaling Language Models

Well, if Mary does learn something new (how it feels "from the inside" to see red, or whatever), she would notice, and her brain state would reflect that plus whatever information she learned. Otherwise it doesn't make sense to say she learned anything.

And just the fact that she learned something, and might have thought something like "neat, so that's what red looks like", would be relevant to predictions of her behavior, even ignoring the possible information content of qualia.

So it seems distinguishable to me.

Luna Lovegood and the Chamber of Secrets - Part 6

Not sure what you mean.

If some action is a risk to the world but Harry doesn't know it, the vow doesn't prevent him from doing it.

If, after taking some action, Harry realizes it risked the world, nothing happens, except maybe that he is unable to repeat the decision if it comes up again.

If not taking some action (for example, defeating someone about to obliviate him) would cause him to forget about a risk to the world, the vow doesn't actually force him to take it.

And if Harry is forced to decide between ignorance and a risk to the world, he will choose whichever he thinks is least likely to destroy the world.

The thing about ignorance seems to also apply to abandoning intelligence buffs.