LESSWRONG
LW

196
Singularian2501
9200
Message
Dialogue
Subscribe

I like reading Machine Learning Paper.

Posts

Sorted by New

Wikitag Contributions

Comments

Sorted by
Newest
No Comments Found
No wikitag contributions to display.
6Paper: Identifying the Risks of LM Agents with an LM-Emulated Sandbox - University of Toronto 2023 - Benchmark consisting of 36 high-stakes tools and 144 test cases!
2y
0
5RAIN: Your Language Models Can Align Themselves without Finetuning - Microsoft Research 2023 - Reduces the adversarial prompt attack success rate from 94% to 19%!
2y
0