LLMs combined with Reinforcement Learning (RL) have unlocked impressive new capabilities. But is more scaling all we need to reach the next step: AI for science and research? If not, what are the limitations, and what else is required?
In this talk, Yongjin Yang will share research on three fundamental bottlenecks of reinforcement learning for LLMs: skewed queries, limited exploration, and sparse reward signals. We will also discuss potential solutions to these challenges, along with related safety concerns.
Event Schedule
6:00 to 6:30 - Food and introductions
6:30 to 7:30 - Presentation and Q&A
7:30 to 9:00 - Open discussion
If you can't attend in person, join our live stream starting at 6:30 pm via this link.
This is part of our weekly AI Safety Thursdays series. Join us in examining questions like: