Have weaker models check whether they can actually follow each step - if Claude Jr. can’t understand what Claude Sr. is saying, maybe Claude Sr. is hiding something: “MOOOOOM, Claude’s being WEIRD again!”
It seems this section addresses your idea.
My dog often takes things left within their reach (socks, napkins, once a passport) and runs away, usually destroying the object to some degree. I think it started with food items that were left around and somehow evolved into a broader habit. Ideally, this behavior would be disincentivized somehow until they stopped completely, but over time I have found that the best way to get them to give up the item is to trade a treat for it. This post made me realize that I'm basically training them to start this keep-away game.
Another thing that I hadn...
tl;dr: I commit to making at least one contribution to LessWrong every day from 2026-01-18 to 2027-01-18 inclusive.
A few years ago, I read HPMOR, mainly because I quite enjoyed the experience of reading it and not because I wanted to learn its ideas in order to apply them to my life. I read Unsong for similar reasons. Eventually, after a couple of rereads of both, I started to read LessWrong and SSC/ACX because I found the ideas interesting, and that has remained the case up to the present. I would like to move towards actually living differently because of what I read, and away from passive consumption.
It seems like writing would be good as an effort to engage more thoroughly, and is also a skill I would like...
"Hell is other cats." - Jean-Paw Sartre
Thanks, I will check it out!