Nadav Brandes

AGI with RL is Bad News for Safety

I haven’t found many credible reports on what algorithms and techniques have been used to train the latest generation of powerful AI models (including OpenAI’s o3). Some reports suggest that reinforcement learning (RL) has been a key part, which is also consistent with what OpenAI officially reported about o1 three...

Dec 21, 2024 · 19

Language Models are a Potentially Safe Path to Human-Level AGI

The core argument: language models are more transparent and less prone to develop agency and superintelligence. I argue that, compared to alternative approaches such as open-ended reinforcement learning, the recent paradigm of achieving human-level AGI with language models has the potential to be relatively safe. There are three main reasons...

Apr 20, 2023 · 28

LESSWRONG
