I wonder: "How it happens that people trust someone or something at all?" (in the context of: how can we ever get to trust an AI smarter than us?).
Epistemic status: this is just a dump of my thoughts (inspired by Babble & Prune Thoughts). Feel free to downvote if you disagree or don't like it. But, if that's not too much to ask, please leave a comment explaining what you are objecting to, as my goal in sharing these thoughts is to refine them.
I notice that (the list below is numbered only to make it easier to disagree with me by referring to a specific item; it is not meant to be a finite set of axioms or an ordering):
Hm, so far it looks like I, for one, would be unable to trust an AI more powerful than me unless I saw that it, too, has a stake to lose if I am unhappy with our interactions. When the two parties are near-equals, the path from here to there would usually go through alternating rounds of raising the stakes (sketched in the code below). The process doesn't have to be alternating and symmetrical when there is an asymmetry of power: in that case the sacrifices may be required only from the weaker side, and in the scenario considered here I would be playing the weaker party. I see no way I could trust something more powerful than me, except perhaps if it were a sum of smaller parts which can somehow be corrigible: because they are weaker, have their own interests, and coordinate with each other for the same reasons they might coordinate with me, namely because they found that this better advances their goals.
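To make that escalation dynamic concrete, here is a minimal Python sketch of the raise-the-stakes protocol. Everything in it is my own illustrative assumption, not something from the post: the `Party` fields, the doubling schedule, and the rule that a defector forfeits the current stake and the relationship ends.

```python
# A minimal sketch of trust-building via alternating, escalating stakes.
# All names, payoffs, and the doubling schedule are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class Party:
    name: str
    honest: bool = True        # whether this party honors the stakes it posts
    trust_earned: float = 0.0  # largest stake this party has honored so far
    forfeited: float = 0.0     # total stake lost by defecting


def escalate_trust(a: Party, b: Party, start: float = 1.0,
                   growth: float = 2.0, rounds: int = 8) -> None:
    """Alternating rounds: each mover posts a stake `growth` times the last.
    Honoring it raises that party's trust ceiling; defecting forfeits the
    stake and ends the relationship (the 'stake to lose' from the paragraph
    above)."""
    stake, parties = start, (a, b)
    for i in range(rounds):
        mover = parties[i % 2]
        if not mover.honest:
            mover.forfeited += stake
            print(f"{mover.name} defects at stake {stake:g} and forfeits it.")
            return
        mover.trust_earned = max(mover.trust_earned, stake)
        stake *= growth
    print(f"trust ceilings: {a.name}={a.trust_earned:g}, "
          f"{b.name}={b.trust_earned:g}")


if __name__ == "__main__":
    escalate_trust(Party("me"), Party("AI"))                # both honest
    escalate_trust(Party("me"), Party("AI", honest=False))  # AI defects early
```

The design choice worth noticing: trust here is never granted up front, it is the running maximum of stakes already honored, so a party can only be betrayed for roughly one increment more than it has already verified.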
Does the literature on the economics of reputation have ideas that are helpful?
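One textbook result from that direction seems relevant (this is the standard grim-trigger condition for a repeated prisoner's dilemma, with the usual payoff notation $T > R > P > S$ for temptation, reward, punishment, and sucker, and discount factor $\delta$; the notation is my choice, not from the post): cooperation backed by the threat of losing the relationship is sustainable exactly when the value of continued cooperation beats a one-shot defection followed by permanent punishment,

$$\frac{R}{1-\delta} \;\ge\; T + \frac{\delta P}{1-\delta} \quad\Longleftrightarrow\quad \delta \;\ge\; \frac{T-R}{T-P}.$$

That is the same "stake to lose" intuition in equation form: trust holds when what you forfeit by defecting (the discounted future relationship) outweighs what you grab by defecting once.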