Cosin V — LessWrong

The case for more ambitious language model evals

The reason truesight works (more than one might naively expect) is probably mostly that there are mountains of evidence everywhere (compared to what one would naively expect).

Yes. Long before LLMs existed, there were some "detective" sites that were scarily good at inferring all sorts of things about Reddit accounts, from demographics and ethnicity to financial status, based on which subreddits they were active in and where and (more importantly) what they posted.

Humans are leaky.

The case for more ambitious language model evals

I googled and couldn't find any info.