As we continue to develop increasingly capable AI systems, many of us have noted the striking contrast between two areas of research. On one hand, we see tremendous energy devoted to making AI systems more capable - better at following instructions, solving problems, and generally doing what we want. On the other hand, we see comparatively less attention paid to a question that seems increasingly important: how do we ensure these systems don't do what we don't want, especially if they're actively trying to subvert our safeguards?
This is the core challenge of AI control - developing techniques that remain robust even against an AI system that might be actively working to circumvent them.