Inner Alignment

I think a better phrasing would be: "Is the model going to do what the humans trained (or told) it to do?" By contrast, specifying a goal you really want is outer alignment.
