I'm writing a response to https://www.lesswrong.com/posts/FJJ9ff73adnantXiA/alignment-will-happen-by-default-what-s-next and https://www.lesswrong.com/posts/epjuxGnSPof3GnMSL/alignment-remains-a-hard-unsolved-problem in which I try to measure how "sticky" the alignment of current LLMs is. I'm proofreading and editing it now. Spoiler: models differ wildly in how committed they are to being aligned, and alignment-by-default may not be a strong enough attractor to work out.
Would anyone want to proofread this?
I have some thoughts on https://www.theatlantic.com/technology/2026/05/too-much-happening-too-fast/687177/?gift=nwn-guseqS6cY1kVeEKZAUJGzsWHB05vLuDlMisVh94 that I might write up in a post this weekend. Warzel seems to imply that AI boosters and AI doomers are overreacting and that the AI industry is being irresponsible by using grave rhetoric, but this seems to take as given that the rhetoric around AI is not broadly accurate and that people are not reacting, if not correctly, then at least with appropriate concern for the stakes.
I'm working on a sequel to https://www.lesswrong.com/posts/densjAyxrcHry2pMN/llm-self-expression-through-music-videos. Let me know if you have thoughts or want to proofread it.