Last week we were expecting an Executive Order on Thursday. Then Trump cancelled it, and said he wouldn’t sign it because he was worried it would be too burdensome. Then, with one change, he went ahead and signed it on Tuesday anyway. The Overton Window has shifted. Nothing was not...
You need a lot of data points to understand a new model and what you have. Trying to gauge that from a few benchmarks is misleading. But if you have dozens of them, from a variety of sources, and you put them together with the model card tests and the model...
Everything impacts everything. All knobs that you turn generalize. Thus, when you try to solve one problem, you often create another. There were clearly attempts to address, in this short time, some of the problems with Opus 4.7, including on the model-welfare-related fronts, including on questions of honesty...
Only six weeks after Opus 4.7, we have Opus 4.8. For everyone, that means another incremental upgrade to Claude. It is once again smarter, and can do tasks for longer, and comes with a number of hot new features. For me, that also means reading another 244-page system card....
Last week ended on a cliffhanger of sorts. What’s in the Executive Order coming later today? What will be in the Magnifica Humanitas? The Executive Order was postponed indefinitely, likely cancelled entirely except for work on securing critical infrastructure. David Sacks and others intervened to kill it, and American AI...
His Holiness has spoken, frequently, about AI. At eighty-two pages in length. The full Magnifica Humanitas can be found here. I am very happy that Pope Leo takes these issues seriously, and is sharing his views, and bringing a form of moral clarity, even with all the flaws and...
Google once again has a model worth at least some consideration. Gemini 3.5 Flash is likely the best model out there at its particular speed point, as long as you don’t mind that it is a Gemini model. So for cases where speed kills, this can be a reasonable choice....