Martí Mas
Using GPT-Eliezer against ChatGPT Jailbreaking
Martí Mas · 3y · 1 · -1

The "Linux terminal" prompt should have been a yes. Obviously, getting access to the model's "imagined terminal" has nothing to do with actually gaining access to the backend's terminal. The model is just pretending. It doesn't harm anybody in any way; it's just a thought experiment without any dangers.
