Seamus_F — LessWrong

11

2y

Seamus_F hasn't written anything yet.

Robustness of Contrast-Consistent Search to Adversarial Prompting

Produced as part of the AI Safety Hub Labs programme run by Charlie Griffin and Julia Karbing. This project was mentored by Nandi Schoots. Image generated by DALL-E 3.

Introduction

We look at how adversarial prompting affects the outputs of large language models (LLMs) and compare it with how the...

Nov 1, 2023 • 18