Relevance of 'Harmful Intelligence' Data in Training Datasets (WebText vs. Pile) — LessWrong