Technical staff at Anthropic (views my own), previously #3ainstitute; interdisciplinary, interested in everything, ongoing PhD in CS, bets tax bullshit, open sourcerer, more at zhd.dev
To quote from Anthropic's letter to Governor Newsom,
As you may be aware, several weeks ago Anthropic submitted a Support if Amended letter regarding SB 1047, in which we suggested a series of amendments to the bill. ... In our assessment the new SB 1047 is substantially improved, to the point where we believe its benefits likely outweigh its costs.
...
We see the primary benefits of the bill as follows:
- Developing SSPs and being honest with the public about them. The bill mandates the adoption of safety and security protocols (SSPs), flexible policies for managing catastrophic risk that are similar to frameworks adopted by several of the most advanced developers of AI systems, including Anthropic, Google, and OpenAI. However, some companies have still not adopted these policies, and others have been vague about them. Furthermore, nothing prevents companies from making misleading statements about their SSPs or about the results of the tests they have conducted as part of their SSPs. It is a major improvement, with very little downside, that SB 1047 requires companies to adopt some SSP (whose details are up to them) and to be honest with the public about their SSP-related practices and findings.
...
We believe it is critical to have some framework for managing frontier AI systems that roughly meets [requirements discussed in the letter]. As AI systems become more powerful, it's crucial for us to ensure we have appropriate regulations in place to ensure their safety.
Here I am on record supporting SB-1047, along with many of my colleagues. I will continue to support specific proposed regulations if I think they would help, and oppose them if I think they would be harmful; asking "when" independent of "what" doesn't make much sense to me and doesn't seem to follow from anything I've said.
My claim is not "this is a bad time", but rather "given the current state of the art, I tend to support framework/liability/etc regulations, and tend to oppose more-specific/exact-evals/etc regulations". Obviously if the state of the art advanced enough that I thought the latter would be better for overall safety, I'd support them, and I'm glad that people are working on that.
There's a big difference between regulation which says roughly "you must have something like an RSP", and regulation which says "you must follow these specific RSP-like requirements", and I think Mikhail is talking about the latter.
I personally think the former is a good idea, and thus supported SB-1047 along with many other lab employees. It's also pretty clear to me that locking in circa-2023 thinking about RSPs would have been a serious mistake, and so I (along with many others) am generally against very specific regulations because we expect they would on net increase catastrophic risk.
Improving the sorry state of software security would be great, and with AI we might even see enough change to the economics of software development and maintenance that it happens, but it's not really an AI safety agenda.
(added for clarity: of course it can be part of a safety agenda, but see point #1 above)
I'm sorry that I don't have time to write up a detailed response to (critique of?) the response to critiques; hopefully this brief note is still useful.
I remain frustrated by GSAI advocacy. It's suited for well-understood closed domains, excluding e.g. natural language, when discussing feasibility; but 'we need rigorous guarantees for current or near-future AI' when arguing for importance. It's an extension to or complement of current practice; and current practice is irresponsible and inadequate. Often this is coming from different advocates, but that doesn't make it less frustrating for me.
Claiming that non-vacuous sound (over)approximations are feasible, or that we'll be able to specify and verify non-trivial safety properties, is risible. Planning for runtime monitoring and anomaly detection is IMO an excellent idea, but would be entirely pointless if you believed that we had a guarantee!
It's vaporware. I would love to see a demonstration project and perhaps lose my bet, but I don't find papers or posts full of details compelling, however long we could argue over them. Nullius in verba!
I like the idea of using formal tools to complement and extend current practice - I was at the workshop where Towards GSAI was drafted, and offered co-authorship - but as much as I admire the people involved, I just don't believe the core claims of the GSAI agenda as it stands.
I don't think Miles' or Richard's stated reasons for resigning included safety policies, for example.
But my broader point is that "fewer safety people should quit leading labs to protest poor safety policies" is basically a non sequitur from "people have quit leading labs because they think they'll be more effective elsewhere", whether because they want to do something different or independent, or because they no longer trust the lab to behave responsibly.
I agree with Rohin that there are approximately zero useful things that don't make anyone's workflow harder. The default state is "only just working means working, so I've moved on to the next thing" and if you want to change something there'd better be a benefit to balance the risk of breaking it.
Also 3% of compute is so much compute; probably more than the "20% to date over four years" that OpenAI promised and then yanked from superalignment. Take your preferred estimate of lab compute spending, multiply by 3%, and ask yourself whether a rushed unreasonable lab would grant that much money to people working on a topic it didn't care for, at the expense of those it did.
My impression is that few (one or two?) of the safety people who have quit a leading lab did so to protest poor safety policies, and of those few none saw staying as a viable option.
Relatedly, I think Buck far overestimates the influence and resources of safety-concerned staff in a 'rushed unreasonable developer'.
This seems like a very long list of complicated and in many cases new and untested changes to the way schools usually work... which is not in itself bad, but does make the plan very risky. How many students do you imagine attend this school? Have you spoken to people who have founded a similar-sized school?
The good news is that outcomes for exciting new opt-in educational things tend to be pretty good; the bad news is that this is usually for reasons other than "the new thing works" - e.g. the families are engaged and care about education, the teachers are passionate, the school is responsive to changing conditions, etc. If your goal is large-scale educational reform I would not hold out much hope; if you'd be happy running a small niche school with flourishing students (eg) for however long it lasts, that seems achievable with hard work.
Largely via opposition to nuclear weapons, and some cost-benefit analysis which assumes nuclear proponents are too optimistic about both costs and risks of nuclear power (further reading). Personally I think this was pretty reasonable in the 70s and 80s. At this point I'd personally prefer to keep existing nuclear running and build solar panels instead of new reactors, though if SMRs worked in a sane regulatory regime that'd be nice too.