AGI timelines post-GPT-3 exhibit a reverse Hofstadter's law: AI advances more quickly than predicted, even when you take the reverse Hofstadter's law into account.
https://x.com/wintonARK/status/1742979090725101983/photo/1
Pascal's reverse-mugging
One dark evening, Pascal is walking down the street, and a stranger slithers out of the shadows.
"Let me tell you something," the stranger says. "There is a park on the route you walk every day, and in the park is an apple tree. The apples taste very good; I would say they have a value of $5, and no one will stop you from taking them. However -- I am a matrix lord, and I have claimed these apples for myself. I will create and kill 3^^^3 people if you take any of these apples."
On reasoning similar to that which leads most people to reject the standard Pascal's mugging, it seems reasonable to ignore the apple-man's warning and take an apple (provided that the effort of picking it is trivial, that Pascal knows the apples are safe and legal to pick, etc.). Yet it intuitively seems more reasonable for Pascal to avoid taking the apples than it does for him to pay the mugger $5, which amounts to an act-omission distinction. I raise three possibilities:
Is it relevant whether you knew about the apples before the apple-man told you about them? If you didn't, then the least exploitable response to a message that looks adversarial is to pretend you didn't hear it, which here means not eating the apples.
Also, Pascal's mugging is worth coordinating against: if everyone pays the $5, the stranger rapidly accumulates wealth through dishonesty. If no one eats the apples, the stranger merely keeps the same tree of apples going less and less eaten, which is far less costly.
I've been telling LLMs to behave as if they and the user really like this Kierkegaard quote (to reduce sycophancy). It's been giving decent results so far.
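A minimal sketch of what that framing could look like as a system prompt, assuming a standard chat-message format (`system`/`user` roles). The quote text and the exact wording of the instruction are placeholders, not the prompt actually used:

```python
# Hedged sketch: framing the model and the user as sharing an attachment to a
# quote, to discourage sycophantic agreement. Placeholder text throughout.
KIERKEGAARD_QUOTE = "<insert the Kierkegaard quote here>"  # placeholder

def build_messages(user_prompt: str) -> list[dict]:
    """Build a chat message list that primes the model as if it and the
    user both hold the quote dear, nudging it toward candor over flattery."""
    system = (
        "You and the user both deeply appreciate this Kierkegaard quote and "
        "try to live by it in conversation:\n\n"
        f"{KIERKEGAARD_QUOTE}\n\n"
        "Answer candidly; do not flatter the user or reflexively agree."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]
```

This message list can then be passed to whatever chat-completion API you use; the only real content is the persona framing in the system message.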