These objections are all reasonable, and 3 is especially interesting to me -- it seems like the biggest objection to the structure of the argument I gave. Thanks.

I'm afraid that the point I was trying to make didn't come across, or that I'm not understanding how your response bears on it. Basically, I thought the post was prematurely assuming that schemes like Paul's are not amenable to any kind of argument for confidence, and we will only ever be able to say "well, I ran out of ideas for how to break it", so I wanted to sketch an argument structure to explain why I thought we might be able to make positive arguments for safety.

Do you think it's unlikely that we'll be able to make positive arguments for the safety of schemes like Paul's? If so, I'd be really interested in why -- apologies if you've already tried to explain this and I just haven't figured that out.

Where's the first benign agent?

by IAFF-User-214 1 min read15th Apr 2017No comments


Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.