Will Orion/Gemini 2/Llama-4 outperform o1 — LessWrong