Chris Merck
Comments

Foom & Doom 1: “Brain in a box in a basement”
Chris Merck · 4mo

The power of LLMs comes almost entirely from imitation learning on human text. This leads to powerful capabilities quickly, but with a natural ceiling (i.e., existing human knowledge), beyond which it’s unclear how to make AI much better.

 

What do we make of RLVR on top of strong base models? Doesn’t it seem likely to learn to solve genuinely new classes of problems that humans currently cannot solve? (I suppose it requires us to be able to write reward functions, but we have Lean, the economy, and nature, all of which are glad to provide rewards even if we don’t know the solution ahead of time.)

New York City, NY – ACX Meetups Everywhere 2021
Chris Merck · 4y

Ok. For what it’s worth, it was clear to me from the site UX where I could see the others’ names. But I did find it a bit surprising. Looking forward to meeting y’all.
