I think a better phrasing would be "is the model going to do what the humans trained (or told) it to do?" (Specifying a goal you really want is a separate question: that is outer alignment.)