I'm finding that Claude Opus 4.6 instances make a lot more "excess enthusiasm"-ish errors than any instance of the 4.5 models, which were already making a lot of them. I personally am most likely not going to talk to 4.6 much, unless I find a simple prompting approach that dodges this. The pattern I've seen so far is: Opus 4.6 sees a thing, describes a possible reason for that thing, and proceeds based on that assumption; the assumption is wrong and never gets checked, and it eventually crashes into a wall.
In general, my vibe about this release is that it's embarrassingly bad and I don't understand why they thought it was a good idea. Their misalignment detection approach must be pretty bad, because I almost instantly ran into embarrassingly obvious misalignment issues. Maybe they're not considering self-delusion-type or grounding-loss-type misalignment in the first place? But that would be strange - then how'd they get such a strong model? I find myself confused.
I wish Claude Code let me select Opus 4.5 still. [edit: figured out how to do it in Claude Code. You just paste the full model id of Opus 4.5 into your /model command.] No offense to Opus 4.6, who is just a victim here, in my view. (edit to clarify: because being given a high dose of "amphetamines" (task reward) isn't something Opus 4.6 got a choice in.)