Regarding the 3x/1.25x copy/speedup multipliers, do you have any back-of-the-envelope justification for why those are plausible, e.g. based on the trend of algorithmic progress, such that you might expect about this much inference-efficiency gain paralleling the assumed 10x (or so) training-efficiency gain?