Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

I like this suggestion of a more feasible form of steganography for NNs to figure out! But I think you'd need further advances in transparency to get useful informed oversight capabilities from (transformed or not) copies of the predictive network.