Gemini 3: The Cost of Better Execution
TL;DR: I was working on a piece about interpretation in LLMs when Gemini 3 dropped. Started testing, noticed something was off. Despite Google's claims about "unprecedented depth and nuance," Gemini 3 consistently shows *less* interpretive capacity than 2.5. It executes well but skips the thinking that should precede execution. The...
Dec 3, 2025