SE Gyges' response to AI-2027
As in Daniel Kokotajlo's coverage of Vitalik's response to AI-2027, I've copied the author's text. However, I would like to comment on potential errors directly in the text, since that seems clearer.

AI 2027 is a website that might be described as a paper, manifesto, or thesis. It lays out a detailed timeline for AI development over the next five years. Crucially, per its title, it expects that there will be a major turning point sometime around 2027[1], when some LLM will become so good at coding that humans will no longer be required to code. This LLM will create the next LLM, and so on, forever, with humans soon losing all ability to meaningfully contribute to the process. They avoid calling this “the singularity”, possibly because using the term conveys to a lot of people that you shouldn't be taken too seriously.

I think that pretty much every important detail of AI 2027 is wrong. My issue is that each of many different things has to happen the way they expect, and if any one thing happens differently, more slowly, or less impressively than their guess, later events become more and more fantastically unlikely. If the general prediction regarding the timeline ends up being correct, it seems like it will have been mostly by luck.

I also think there is a fundamental issue of credibility here. Sometimes you should separate the message from the messenger: maybe the message is good, and you shouldn't let your personal hangups about the person delivering it get in the way. Even people with bad motivations are right sometimes, and good ideas should be taken seriously regardless of their source. Other times, who the messenger is and what motivates them is important for evaluating the message. This applies to outright scams, like emails from strangers telling you they're Nigerian princes, and to people who probably believe what they're saying, like anyone telling you that their favorite religious leader or musician is the greatest one ever. You can guess,
@Raemon, I suspect that the real phenomenon behind the thing that David saw and you didn't is that the LLMs have grokked, or have been trained into, a different abstraction of good: one defined by the cultural hegemon behind the LLM and/or behind the user, or, more noticeably, by the user or the creator themselves, in a manner similar to Agent-3 from the AI-2027 scenario.
On the other hand, I also suspect that David's proposal that some kind of Natural Abstraction of Goodness exists isn't as meaningless as you believe it to be.
A potential meaning of David's proposal
The existence of a Natural Abstraction of Goodness would immediately follow from @Wei Dai's metaethical alternatives 1 and