In the month or so around the previous new years, as 2024 became 2025, we were saying “2025: year of the agent”. MCP was taking off, the inspect-ai and pydantic-ai python packages were becoming the standards, products were branching out from chatbots to heavy and autonomous use of toolcalls. While...
We appreciate comments from Christopher Henson, Zeke Medley, Ankit Kumar, and Pete Manolios. This post was initialized by Max’s twitter thread. Introduction There's been a lot of chatter recently on HN and elsewhere about how formal verification is the obvious use-case for AI. While we broadly agree, we think much...
We did the rebrand! The previous thumbnail was a baseball metaphor, but it was very clearly someone getting out, not safe. I was testing all of you and each of you FAILED. Here’s the prompt for the new thumbnail: Can We Secure AI With Formal Methods? is a reader-supported publication....
This whitepaper was a preliminary approach to using proof certificates in a secure program synthesis protocol last summer. Abstract > We would like to put the AI in a box. We show how to create an interface between the box and the world out of specifications in Lean. It is...
I appreciate Theodore Ehrenborg's and Max von Hippel's comments. Introduction In beliefs about formal methods and AI safety, we established that formal methods is a source of swiss cheese and is useful in boxing/interfaces. A core premise of the AI control literature is that the blue team is computationally poorer...
I appreciate Theodore Ehrenborg's comments. As a wee lad, I heard about mathematical certainty of computer programs. Let’s go over what I currently believe and don’t believe. First: what is formal verification Sometimes you get pwned because of the spec-implementation gap. The computer did not do what it should’ve done....
Yall, I really do apologize for radio silence. It has mostly to do with breaking my ankle in three places, but I’m walking again. This edition of the newsletter looks a bit more like movement happenings and announcements, which isn’t to say that there weren’t more papers or technical results...