Chamod Kalupahana's Shortform

5th Jun 2026

This is a special post for quick takes by Chamod Kalupahana. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

Great post on measuring the logit distribution for eval awareness. Surprisingly effective, even for unverbalised eval awareness! ✍️

Saw an interesting post for those also trying to transition into technical AI safety :)

Greenblatt's post tries out open-weight NLAs and shows their limitations for reconstructing activations.