Reverse engineering the memory layout of GPU inference — LessWrong