What I'm saying is that there is no upper bound for real-world agents; the scale of "locally" in this weird sense can be measured in eons and galaxies.
The locally/globally distinction is suspicious, since "locally" here can persist at an arbitrary scale. If all the different embedded maps live within the same large legible computation, statistical arguments that apply to the present-day physical world will fail to clarify the dynamics of their interaction.
A norm is more effective when it acts at all the individual, relatively insignificant steps, so that they don't add up. The question of whether the steps are pointing in the right direction is the same for all the steps, so it could just as well be considered seriously at the first opportunity, even when it's not a notable event at the object level.
My point about computers-in-practice is that this is no longer an issue within the computers, indefinitely. You can outpace the territory within a computer using a smaller map from within the computer. Whatever "computational irreducibility" is, the argument doesn't apply to many computations that can be set up in practice; that is, they can be predicted by smaller parts of themselves. (Solar flares from the distant future can't be predicted, but even that is not necessarily an important kind of practical question in the real world, once the universe is overwritten with computronium and all the stars are dismantled to improve energy efficiency.)
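A toy illustration of "outpacing the territory with a smaller map" (my own example, not from the discussion): the territory is a step-by-step computation, and the map is a much shorter description that reaches the same answer in constant time, overtaking the full time evolution from within.

```python
def territory(n: int) -> int:
    """The 'territory': full step-by-step time evolution, summing 0..n-1."""
    total = 0
    for i in range(n):
        total += i
    return total

def map_prediction(n: int) -> int:
    """The 'map': a smaller description that predicts the final state
    without replaying every step."""
    return n * (n - 1) // 2

# The map answers the query long before the territory finishes evolving.
n = 10**6
assert map_prediction(n) == territory(n)
```

The point is only that reducibility is common for computations we actually set up, not that every computation admits such a shortcut.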
the smaller map cannot outpace the full time evolution of the territory
A program can predict another program regardless of when either of them is instantiated in the territory (neither needs to be instantiated for this to work, or they could be instantiated at many times simultaneously). Statistical difficulties need to be set up more explicitly; there are many ways of escaping them in principle (by changing the kind of territory we are talking about), or even in practice (by focusing on the abstract behavior of computers).
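A minimal sketch of the first claim (names and the toy target program are hypothetical): one program predicts another's output by simulating its source text, independently of when, or whether, the predicted program ever runs "for real" in the territory.

```python
# Source text of the target program; it never has to be instantiated
# on its own for the predictor to answer queries about it.
TARGET_SOURCE = """
def step(state):
    return state * 3 + 1

state = 5
for _ in range(4):
    state = step(state)
result = state
"""

def predict(source: str) -> int:
    """Simulate the target in a fresh namespace and report its final result."""
    namespace = {}
    exec(source, namespace)
    return namespace["result"]

print(predict(TARGET_SOURCE))  # → 445
```

The predictor relates two pieces of abstract behavior; nothing in it depends on where or when either program is physically embedded.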
If the claim is sufficiently true and becomes sufficiently legible to casual observers, this shifts the distribution of new users, and behavior of some existing users, in ways that seem bad overall.
A map is something that can answer queries; it doesn't need to be specifically a giant lookup table. If a map can perfectly describe any specific event when queried about it, it's already centrally a perfect map, even if it didn't write down all answers to all possible questions on stone tablets in advance. But even then, in a computational territory we could place a smaller map that is infinite in time, and it will be able to write down all that happens in that territory at all times, with explicit representations of events in the territory being located either in the past or in the future of the events themselves.
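The lookup-table contrast can be made concrete (my own framing, with hypothetical names): a "stone tablets" map stores every answer in advance, while a query map computes any answer on demand. Both answer the same questions; only the second stays small.

```python
# Stone tablets: every answer precomputed and written down in advance.
lookup_table = {t: t * t for t in range(1000)}

def query_map(t: int) -> int:
    """A small map that answers any query when asked, including queries
    the table never wrote down."""
    return t * t

# Agrees with the tablets wherever they exist...
assert query_map(42) == lookup_table[42]
# ...and keeps answering far beyond where the tablets run out.
assert query_map(10**9) == 10**18
```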
That's not the kind of thing that's good to legibly advertise.
Due to embedded agency, no embedded map can fully capture the entire territory.
s='s=%r;print(s%%s)';print(s%s)
Quines print the whole territory from a smaller template they hold as a map. There is a practical limitation, and a philosophical difficulty (the meaning of a map considered by itself is a different kind of thing from the territory). But it's not about embedded agency.
The relevance of extracting/formulating something "local" is that prediction by smaller maps within it remains possible, ignoring the "global" solar flares and such. So a situation could be set up where a smaller agent predicts everything eons into the future at galaxy scale. Perhaps a superintelligence predicts the human process of reflection; that is, it's capable of perfectly answering specific queries before the specific referenced event would take place in actuality, while the computer is used to run many independent possibilities in parallel. So the superintelligence couldn't enumerate them all in advance, but it could quickly chase down and overtake any given one of them.
Even a human would be capable of answering such questions if nothing at all is happening within this galaxy-scale computer, and the human is paused for eons after making the prediction that nothing will be happening. (I don't see what further "first sense" of locality, or upper bound distinct from this, could be relevant.)