Cornelis Dirk Haupt — LessWrong

When Alignment Becomes an Attack Surface: Prompt Injection in Cooperative Multi-Agent Systems

Background: In 2025 I applied to the CAI Research Fellowship. Stage 2 required developing a novel research proposal under timed, screen-monitored conditions, with no AI assistance permitted. The proposal below advanced me to Stage 3. I've edited it for readability, but the core proposal is unchanged from what was submitted....