New scorecard evaluating AI companies on safety

by Zach Stein-Perlman

26th May 2025

[-]Zac Hatfield-Dodds6mo124

If you don't feel great about the numbers, why are there so many of them on the website? The presentation seems much more focused on the scores than a collection of information.

[-]Zach Stein-Perlman6mo*84

I think the numbers are much better than nothing and much better than any substitute that currently exists, and I'm not aware of a better design or a great way to deemphasize them while preserving their value.

Edit: like, they convey a lot of real info, and more conservative alternatives would fail to do so.

[-]Caleb Biddulph6mo82

Off-topic: thanks for commenting in the same thread so I can see your names side-by-side. Until now, I thought you were the same person.

Now that I know Zach does not work at Anthropic, it suddenly makes more sense that he runs a website comparing AI labs and crossposts model announcements from various companies to LW.

[-]habryka6mo72

A quick take from me (I did some of the design on the site, though Ray did most of it): I think the numbers are helpful for organizing the content into meaningful categories, and help people figure out where the interesting content is. Otherwise you would be dealing with a huge amount of prose. I currently think the numbers/table is a pretty decent way to get a sense of what the content is and where it makes sense to pay attention (usually the places where there is the most variance in numbers across a category, or where scores are particularly low or high).

[-]Against Moloch6mo10

I definitely find the presentation useful. In particular, the ability to drill down on each block is great (though it took me a moment to figure out how that worked).

[-]Raemon6mo20

If you have any thoughts on what would be more intuitive while accomplishing the goal, let me know.

[-]Anders Lindström6mo1-14

Feedback: I really like the breakdown of each company's stance on safety, but please skip the percentage numbers. It's just silly to give a score based on guesswork.

[-]Anders Lindström6mo20

Since a lot of people disagree with this, please tell me what a score of 100% means, or, say, 50% or 37%? I am not writing this to provoke; I am genuinely interested to know.
