Simon T

I am entering the AI safety field. I want to contribute by solving the problem of unlearning.

How can we apply unlearning?

1) Make LLMs forget dangerous knowledge (e.g. CBRN)

2) Current LLMs can tell when they're being benchmarked, so I want to remove situational awareness from them...

Simon T has not written any posts yet.
