I guarantee someone's thinking about this, but I haven't seen scammers selling it yet, so I don't know how transparent or discoverable the scraping and ingestion methods for LLM source data are.

Is there any indication that websites or publishers are modifying their pages or data to give themselves more weight in what future GPT models surface or predict for related prompts?
