[ Question ]

What AI safety problems need solving for safe AI research assistants?

by John_Maxwell · 5th Nov 2019 · 1 min read · 8 comments

In his AI Safety “Success Stories” post, Wei Dai writes:

[This] comparison table makes Research Assistant seem a particularly attractive scenario to aim for, as a stepping stone to a more definitive success story. Is this conclusion actually justified?

I share Wei Dai's intuition that the Research Assistant path is neglected, and I want to better understand the safety problems involved in this path.

Specifically, I'm envisioning AI research assistants, built without any kind of reinforcement learning, that help AI alignment researchers identify, understand, and solve AI alignment problems. Some concrete examples:

Possible with yesterday's technology: Document clustering that automatically organizes every blog post about AI alignment. Recommendation systems that find AI alignment posts similar to the one you're reading and identify connections between the thinking of various authors.
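
To make the "yesterday's technology" claim concrete, here is a minimal sketch of that kind of pipeline using off-the-shelf tools. The scikit-learn components, toy corpus, and cluster count are illustrative assumptions, not a specific proposal:

```python
# Minimal sketch, using off-the-shelf scikit-learn components, of the
# clustering + recommendation idea above. The toy corpus, cluster count,
# and TF-IDF representation are illustrative assumptions.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = [
    "Post on mesa-optimization and inner alignment ...",
    "Post on corrigibility and the off-switch problem ...",
    "Post on learning human values from feedback ...",
    # ... in practice, every blog post about AI alignment
]

# Represent each post as a TF-IDF vector.
vectors = TfidfVectorizer(stop_words="english").fit_transform(posts)

# Document clustering: organize posts into rough topic groups.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

# Recommendation: rank the other posts by cosine similarity to the one
# currently being read.
def similar_posts(post_index, top_k=2):
    scores = cosine_similarity(vectors[post_index], vectors).ravel()
    ranked = scores.argsort()[::-1]
    return [i for i in ranked if i != post_index][:top_k]

print(labels)              # one cluster label per post
print(similar_posts(0))    # indices of the posts most similar to post 0
```

Nothing in this sketch learns from interaction or takes actions; it only ranks existing text, which is part of why this tier plausibly requires no reinforcement learning at all.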

May be possible with current or near future technology: An AI chatbot, trained on every blog post about AI alignment, which makes the case for AI alignment to skeptics or attempts to shoot down FAI proposals. Text summarization software that compresses a long discussion between two forum users in a way that both users feel is accurate and fair. An NLP system that automatically organizes AI safety writings into a problem/solution table as I described in this post.
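
For the summarization example, the mechanical part is already sketchable with a pretrained model. The model name, length limits, and toy thread below are illustrative assumptions, and the hard requirement, that both participants endorse the summary as accurate and fair, is a human step the code does not address:

```python
# Minimal sketch of the discussion-summarization example using a pretrained
# model from Hugging Face transformers. The model choice and length limits
# are illustrative; checking that *both* participants endorse the summary
# is a human step this code does not perform.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

discussion = """
Alice: A research assistant trained purely by supervised prediction seems to
avoid most agency-related worries, because it never takes actions in the world.
Bob: Even a pure predictor can be manipulative if its outputs shape which
training data and which questions it sees later.
"""  # placeholder thread; a real tool would ingest the full comment tree

result = summarizer(discussion, max_length=60, min_length=15, do_sample=False)
print(result[0]["summary_text"])  # draft summary to show both users for sign-off
```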

May be possible with future breakthroughs in unsupervised learning, generative modeling, natural language understanding, etc.: An AI system that generates novel FAI proposals, or writes code for an FAI directly, and tries to break its own designs. An AI system that augments the problem/solution table from this post with new rows and columns generated based on original reasoning.

What safety problems are involved in creating research assistants of this sort? I'm especially interested in safety problems which haven't yet received much attention, and safety problems with advanced assistants based on future breakthroughs.
