
LESSWRONG
AI Governance, AI
Frontpage


[ Question ]

How promising are legal avenues to restrict AI training data?

by thehalliard
10th Dec 2022
AI Alignment Forum

1 Answer

ChristianKl

Dec 11, 2022


This is basically the discussion at https://www.lesswrong.com/posts/vsuMu98Rwde5krxSJ/should-we-push-for-requiring-ai-training-data-to-be-licensed

If you can successfully argue in court that using data without having acquired the rights to it is a copyright violation, that would make training on scraped data significantly harder.

Alternatively, European citizens whose names are known to an AI system could make GDPR requests asking what data is stored about them, and then ask for that data to be deleted.

1 comment
Dacyn (3y)

"For example, I could imagine laws requiring anyone scraping the internet to ensure that they are not collecting data from people who have denied consent to have their data scraped."

In practice this is already the case: anyone who doesn't want their data scraped can put up a robots.txt file saying so, and I imagine big companies like OpenAI respect robots.txt. I guess there could be advantages to making it a legal rule, but I don't think it matters too much.
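
To make the robots.txt mechanism concrete, here is a minimal sketch of how a scraper could check a site's opt-out before fetching pages, using Python's standard-library urllib.robotparser. The robots.txt contents, the "ExampleTrainingBot" user-agent name, and the example.com URLs are all hypothetical, made up for illustration; this is not a description of how any real crawler works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the site owner blocks one (made-up) training
# crawler entirely and keeps /private/ off limits for everyone else.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed_urls(robots_txt: str, user_agent: str, urls: list[str]) -> list[str]:
    """Return only the URLs that robots_txt permits user_agent to fetch."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical pages a scraper might want to collect.
urls = [
    "https://example.com/blog/post-1",
    "https://example.com/private/notes",
]

print(allowed_urls(ROBOTS_TXT, "ExampleTrainingBot", urls))  # []
print(allowed_urls(ROBOTS_TXT, "SomeOtherBot", urls))  # ['https://example.com/blog/post-1']
```

Note that the check runs on the scraper's side: robots.txt is a voluntary convention, so it only keeps data out if the scraper chooses to consult it, which is where a legal rule could make a difference.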

Hello - I'm EA-adjacent and have a cursory understanding of AI alignment issues. Thought I'd toss out a naive question!

AI systems rely on huge amounts of training data. Many people seem reluctant to share their data with these systems. How promising are efforts to limit or delay the power of AI systems by putting up legal barriers so that they can't scrape the internet for training data?

For example, I could imagine laws requiring anyone scraping the internet to ensure that they are not collecting data from people who have denied consent to have their data scraped. Even if few people deny consent in practice, the process of keeping their data out, or removing it later on, could be costly. This could at least buy time.