I’d like an easy way to distinguish between two kinds of payoff-maximizing players in an extensive-form game: those who would, and those who would not, play a strategy that deviates from the __subgame-perfect__ (Bayesian) equilibrium strategy profile of the game (when their strategy is known to their opponent, and their opponent is also payoff-maximizing).

### Example

An example of what I’m interested in can be seen in an ultimatum game where a proposer presents a responder with an offer of the form (a, b), where a and b are the respective payoffs of the proposer and the responder.

For now, let’s call a strategy **subgame-optimal** if for every subgame of the game, the strategy’s restriction to that subgame is still optimal within the subgame. In other words, at each decision node of the game, a payoff-maximizing player with a subgame-optimal strategy chooses the action which maximizes their expected payoff (as calculated at that decision node, rather than as calculated according to their prior). A payoff-maximizing player who can commit to a **subgame-suboptimal** strategy will play the action which maximizes the payoff they expected at their initial decision node, without the need to play a strategy that holds up to __backward induction__.

Say that the responder’s strategy is known to the proposer ahead of time, and the proposer is restricted to only subgame-optimal strategies. What strategy should the responder use? A (subgame-suboptimal) strategy to reject all proposals with a > ε, for an arbitrarily small ε > 0, would force the proposer to make an offer arbitrarily in the responder’s favor. But this is impossible for a responder who can only play subgame-optimal strategies, since once an offer is on the table, accepting any positive payoff beats rejecting it. Updateless agents don’t have to worry about this problem, but updateful agents without commitment devices or values other than payoff maximization do.
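The contrast can be made concrete with a small sketch. Everything specific here is an assumption for illustration rather than part of the setup above: the pie is 10 indivisible units, every offer splits it exactly, rejection leaves both players with 0, and the helper names (`subgame_optimal_responder`, `committed_responder`, `proposer_best_offer`, `outcome`) are hypothetical. The proposer knows the responder's strategy and simply picks the offer that maximizes their own payoff against it.

```python
from typing import Callable, List, Tuple

PIE = 10  # assumed total to divide; the post does not fix a pie size

Offer = Tuple[int, int]  # (a, b): proposer's and responder's payoffs if accepted
Responder = Callable[[Offer], bool]  # maps an offer to accept (True) / reject (False)


def offers() -> List[Offer]:
    """All ways to split the pie into whole units (an assumed offer space)."""
    return [(a, PIE - a) for a in range(PIE + 1)]


def subgame_optimal_responder(offer: Offer) -> bool:
    """Accept whenever accepting strictly beats rejecting (assumed payoff 0).

    Ties are broken toward rejection here; that choice is arbitrary.
    """
    _, b = offer
    return b > 0


def committed_responder(epsilon: int) -> Responder:
    """Subgame-suboptimal strategy: reject every proposal with a > epsilon."""
    def respond(offer: Offer) -> bool:
        a, _ = offer
        return a <= epsilon
    return respond


def proposer_best_offer(respond: Responder) -> Offer:
    """Proposer knows the responder's strategy and maximizes their own payoff."""
    def proposer_payoff(offer: Offer) -> int:
        return offer[0] if respond(offer) else 0
    return max(offers(), key=proposer_payoff)


def outcome(respond: Responder) -> Offer:
    """Realized payoffs (proposer, responder) when the proposer best-responds."""
    offer = proposer_best_offer(respond)
    return offer if respond(offer) else (0, 0)


if __name__ == "__main__":
    print(outcome(subgame_optimal_responder))  # responder gets the smallest positive share
    print(outcome(committed_responder(1)))     # responder gets almost everything
```

Under these assumptions, the responder who re-optimizes at their own decision node ends up with one unit, while the responder whose rejection rule is fixed in advance ends up with nine, even though carrying out a rejection would be suboptimal at the moment it is called for.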

### Possible terms for this distinction

*Ex interim* or *ex post* rationality vs. *ex ante* rationality

I like the clarity of focusing on the perspective from which the expected payoff of an action is calculated. A big advantage of these terms is that they are already in use to some extent. For example, __Posner (1997)__ classifies retaliation in a non-repeated interaction as *ex ante* rational but *ex post* irrational. On the other hand, these terms might sound overly jargon-y or be confused with different uses of “*ex ante*” and “*ex interim/post*”. I prefer “*ex interim* rational” to “*ex post* rational” because in games of imperfect information, it sounds to me like the *ex post* strategy is what the player should have played with perfect information. But in games of perfect information, people would be more likely to understand what *ex post* means than *ex interim*.

*Updateful* rationality vs. *updateless* rationality

I’ve talked about situations like this before by reference to what a UDT agent would do versus what a CDT agent would do, which is probably clearest for people who have thought about updateless decision theory. Still, I think it might be useful to have a term that’s not tied to updateless decision theory. Rejecting suboptimal offers in the ultimatum game described above is not only the domain of updateless agents: for example, a CDT agent with a commitment device would do so as well.

*Episodic* rationality vs. *serial* rationality

I like this one, but it may be confused with rational play in one-shot vs. iterated games. “Serial rationality” seems to have been used a little in __discussing Sartre__, but this doesn’t seem to be a serious barrier to use.

*Subgame-perfect* rationality vs. *subgame-imperfect* rationality

I think this gets very close to capturing what I want here while using some kinda standard terminology, but it seems very jargon-y and is easy for people less familiar with game theory to mix up. Also, subgame-perfection seems to be pretty much only used to describe a profile of both players’ strategies, rather than the strategy of a particular agent, so extending it to cover individual strategies might be confusing.

*Subgame-optimal* rationality vs. *subgame-suboptimal* rationality

Similar to the above, but the meaning of “optimal” might be a little more obvious than that of “perfect”, and it can be more concise. For example, it is briefer to write that an agent’s strategy is “subgame-optimal” than that their strategy is “subgame-imperfect rational”. Having “suboptimal” in the name might also create the impression that I think subgame-suboptimal rational play is bad, which is not the case.

*Sequential* rationality vs. *non-sequential* rationality

“Sequential rationality” is an extension of subgame-perfection to games of incomplete information, prescribing that the action at a decision node X be optimal in expectation, where the expectation is calculated using the information set associated with that node. I like this a little better than “subgame-(im)perfect rationality” or “subgame-optimal rationality”.

*Action optimization vs. policy optimization*

This might be a little confusing, since in some contexts an agent that can only play subgame-optimal policies is still held to be selecting an “optimal” policy/strategy, just pursuant to its constraint on subgame-optimality. Maybe "action-based" vs. "policy-based" rationality would also work?

*Myopic rationality vs. non-myopic rationality*

This is out I think, since it would be confused with other uses of myopia, but an agent who is restricted to subgame-optimal play is myopic in the sense of having only partial agency.

*Backward-inductive rationality vs. non-backward-inductive rationality*

I think this shares most of its pros and cons with “subgame-(im)perfect rationality”.