Three possibly-relevant points here.

First, when I say "proof-level guarantees will be easy", I mean "team of experts can predictably and reliably do it in a year or two", not "hacker can do it over the weekend".

Second, suppose we want to prove that a sorting algorithm always returns sorted output. We don't do that by explicitly quantifying over all possible outputs. Rather, we do that using some insights into what it means for something to be sorted - e.g. expressing it in terms of a relatively small set of pairwise compa...
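To make the sorting point concrete, here is a minimal Python sketch (my own illustration, not something from the original comment). It assumes "sorted" means non-decreasing order. The function names and the choice of insertion sort are hypothetical, picked only for the example. The idea is that "every pair i < j is in order" collapses, by transitivity of <=, to just n-1 adjacent comparisons, which is the kind of small set of pairwise conditions a proof would work with.

```python
from itertools import combinations, permutations


def sorted_adjacent(xs):
    """Sortedness stated as n-1 comparisons between neighbours."""
    return all(xs[i] <= xs[i + 1] for i in range(len(xs) - 1))


def sorted_all_pairs(xs):
    """Sortedness stated as a claim about every pair of positions i < j."""
    return all(xs[i] <= xs[j] for i, j in combinations(range(len(xs)), 2))


def insertion_sort(xs):
    """Hypothetical algorithm under test; returns a new sorted list."""
    out = []
    for x in xs:
        i = len(out)
        # Walk left past every element strictly greater than x.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def check_small_inputs(max_len=5):
    """Exhaustively check short inputs. This is a sanity check, not a proof."""
    for n in range(max_len + 1):
        for perm in permutations(range(n)):
            result = insertion_sort(list(perm))
            # The two formulations of sortedness should agree on the output.
            if not (sorted_adjacent(result) and sorted_all_pairs(result)):
                return False
    return True
```

Note that `check_small_inputs` only tests finitely many cases. An actual proof would instead argue that the loop maintains the adjacent-pair condition as an invariant, which is why it never needs to enumerate outputs.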

AI Alignment Open Thread August 2019

by habryka, 4th Aug 2019 (1 min read, 96 comments)



Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is an experiment in having an Open Thread dedicated to AI Alignment discussion, hopefully enabling researchers and upcoming researchers to ask small questions they are confused about, share very early-stage ideas, and have lower-key discussions.