Sandbagging: distinguishing detection of underperformance from incrimination, and the implications for downstream interventions. — LessWrong