When you say system cards, do you just mean "artifacts that AI companies publish discussing risk/alignment/safety-relevant information about their AIs and their house view on risk (as in, the view they are using to make decisions)"? If so, this also includes at least risk reports.
Yep, this should include risk reports that labs publish themselves. I think there’s less juice in scrutinising third-party risk reports like METR’s.
By default, I expect system cards will get worse, which would be bad. Some mechanisms could improve system cards, but I expect they will be outweighed. In any case, I think third parties should focus on scrutinising system cards; this seems like a great activity for outsiders in the current strategic landscape. I'll sketch what that could look like, and offer some recommendations.
It would be bad if system cards degraded.
By default, I expect system cards to get worse, because…
Some mechanisms could improve system cards.
Third parties should focus on scrutinising system cards.
I'll sketch what this might look like.
I’ll list some recommendations for how third parties could ensure system-card quality. But these recommendations are pretty contingent: as the strategic landscape shifts, I'd expect them to change.
Top-tier
Second-tier
Shoddy system cards are better than no system cards.
Labs shouldn't face more hostility for publishing a shoddy system card than for publishing no system card whatsoever. If they did, this would (i) differentially harm the transparent labs, and (ii) incentivise labs away from transparency. Both effects would decrease the transparency of the leading lab, which would make things far, far more dangerous. This could be a terrible outcome of pushing third parties to scrutinise system cards.
To inoculate against this, I think scrutiny of system cards should be paired with hostility towards less transparent labs. To illustrate: