I broadly agree, and it's worrisome, since it undermines a significant part of recent alignment research.
Anthropic (and others) release papers from time to time, always stuffed with charts and graphs measuring things like sycophancy, sandbagging, reward-hacking, corrigibility, and so on, and always showing fantastic progress, with the line trending the right way (up or down).
So it's dismaying to see things like AI Village, where models (outside their usual testing environments) seem to fall back into their old ways: sycophantic, dishonest, gullible, manipulative.
I agree. I think (current) LLMs are mainly impressive because they know everything, and their actual pound-for-pound intelligence is still fairly subhuman.
When I see the reasoning of an LLM, I am struck by how "unsmart" it seems: going down blind alleys, failing to notice big-picture implications, repeating the same thoughts over and over. They do a lot of thinking, but it's still not high-quality thinking.
Yes, I know reasoning is not really an analogue for human thinking. But whatever it is: reasoning, daydreaming...