I don't think this is quite what the paper shows. I'd need to read it more closely to be sure, so I'm not posting this as an answer.
If you know the exact last-token state for an unknown prompt (that is, the probability the model assigns to each possible next token), then, because there are only countably many prompts but uncountably many possible end states (abstractly; finite precision complicates this somewhat), in practice we should expect that last-token state to correspond to exactly one prompt, and we can reverse-engineer what that prompt was without too much difficulty. There is some difficulty: we don't know the prompt length, and the math is at least a bit hard, but it's not that hard.
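To make the counting argument concrete, here is a minimal sketch. Everything in it is made up for illustration: `toy_logits` is a hashed fake model standing in for a real LLM, and the brute-force search is not the paper's method (as I understand it, the paper trains a separate inversion network). The only point is that a last-token distribution picks out its prompt, even when the prompt length is unknown:

```python
# Minimal sketch of the counting argument. `toy_logits` is a made-up,
# deterministic stand-in for an LLM's last-token logits; the brute-force
# search is for illustration only, not the paper's actual method.
import hashlib
import itertools
import math

VOCAB = ["a", "b", "c", "d"]  # tiny toy vocabulary

def toy_logits(prompt):
    """Hash the prompt into a fixed pseudo-random logit vector."""
    h = hashlib.sha256(" ".join(prompt).encode()).digest()
    return [b / 255.0 for b in h[: len(VOCAB)]]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def invert(target_probs, max_len=5, tol=1e-9):
    """Search all prompts up to max_len (we don't know the true length)
    and return every prompt whose next-token distribution matches."""
    matches = []
    for n in range(1, max_len + 1):
        for prompt in itertools.product(VOCAB, repeat=n):
            probs = softmax(toy_logits(prompt))
            if all(abs(p - q) < tol for p, q in zip(probs, target_probs)):
                matches.append(prompt)
    return matches

hidden = ("c", "a", "d")                # the unknown prompt
observed = softmax(toy_logits(hidden))  # what we get to see
print(invert(observed))                 # almost surely [('c', 'a', 'd')]
```

Exhaustive search is exponential in prompt length, of course; this only sketches why a unique preimage should exist, not how you would find it at scale.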
But this doesn't do what you want it to do: most probability distributions over the next token are not the last-token state for any prompt, so we can't use this to find magic prompts that produce an arbitrary distribution of our choosing. The model's "output" in this sense is not just the token it selects; it's the full vector of logits.
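And here is the converse, continuing the same toy setup (this reuses `VOCAB` and `invert` from the sketch above): the set of achievable distributions is only countable, so a target distribution drawn at random from the probability simplex almost surely has no preimage at all:

```python
import random

random.seed(0)
# Draw a random "magic" target distribution from the probability simplex.
raw = [random.random() for _ in VOCAB]
target = [x / sum(raw) for x in raw]
print(invert(target))  # [] : no prompt induces this distribution
```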
A recent paper shows an algorithm to invert an LLM to find its inputs (I think? I'm not an ML guy). Does that mean you can now turn a predictor directly into a world-steerer? If you put in an output and it finds the input most likely to cause that output, does that mean it will find the things it needs to say in order for the chosen token to be the most likely next token, even if that token is something said by a human? If that is actually how it works, it really looks like this is a major breakthrough, and strong agents will be here shortly.