Problems often persist not because they are hard, but because nobody's paying attention to them. These eight fictitious studies are sketches of what paying attention can look like in practice.
The Well-Dressed Intern
Aaron wanted to make a strong impression in his first week as an intern, so he thought carefully about how to dress to be taken seriously. He aimed both to fit into the office culture and to subtly signal his status as an intern, which would make it significantly easier to ask so-called "stupid questions."
During his interviews, Aaron paid careful attention to the office's general dress code. He noted the typical button-downs and khakis, but also observed senior associates leaning towards slightly more formal attire.
Based on these observations, Aaron...
Agreed. I think a more measured phrasing is "I am confident that across a significant number of current AI companies, in the near future, the models are more similar than one would expect from reading the differences between (to pick a representative example) Claude's Constitution and OpenAI's Model Spec."
I think I am personally holding the strong conclusion (the one you are pushing back on) as plausible enough, and important enough if true, that I want to keep tracking how true it seems as I get more information.
If even moderately weak forms of "instrumental convergence of model design" turn out to be clearly false within O(5 years), I would be surprised but not confused.
If anyone with more context on modern AI models than me disagrees, please say so; hearing multiple perspectives on all of this is useful and improves my world-models.