Critiquing Risks From Learned Optimization, and Avoiding Cached Theories
What I'm doing with this post and why
I've been told that a major problem (this post, point 4) is alignment students just accepting the frames they're given without question. The usual advice (the intro of this) is to first do a bunch of background research and thinking on...
Jul 11, 2023