This is a special post for quick takes by Nice C. Ineza. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.
I am increasingly inclined to look into treating model representations as directions in activation space rather than as individual neurons; that may be where I can uncover more in mech interp.
Wondering if there are "feature directions" corresponding to when a model goes off the rails, generates unsafe code, or exhibits jailbreak-like behavior.
Geometry could be our solution, just a thought!
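One cheap way to probe this idea is a difference-in-means direction: average activations on "unsafe" vs "safe" inputs and take the difference as a candidate feature direction. Here is a minimal sketch, assuming toy Gaussian vectors in place of real residual-stream activations (the `safe_acts`/`unsafe_acts` arrays and the `score` helper are illustrative, not from any particular library):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for hidden activations: in practice these would come from a
# transformer layer's residual stream on "safe" vs "unsafe" prompts.
d_model = 64
safe_acts = rng.normal(0.0, 1.0, size=(100, d_model))
unsafe_acts = rng.normal(0.0, 1.0, size=(100, d_model)) + 0.5  # shifted mean

# Difference-in-means "feature direction": points from safe toward unsafe.
direction = unsafe_acts.mean(axis=0) - safe_acts.mean(axis=0)
direction /= np.linalg.norm(direction)

def score(acts: np.ndarray) -> np.ndarray:
    """Project activations onto the candidate direction."""
    return acts @ direction

# If the direction captures anything real, unsafe activations should
# project higher on it than safe ones, on average.
print(score(unsafe_acts).mean() > score(safe_acts).mean())  # True
```

On real models the same projection trick can be used to flag inputs whose activations drift along a suspect direction, though whether such linear directions exist for jailbreak-like behavior is exactly the open question.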