I agree with the essay that natural selection only comes into play for entities that meet certain conditions (they self-replicate, their characteristics vary, etc.), though I think it defines replication a little too rigidly. Replication can sometimes look more like persistence than like producing a fully new copy of itself (eg a government surviving from one decade to the next).
Does anyone still think it's possible to prevent recursively self-improving agents? Especially now that R1 is open-source... the materials for smart, self-iterating agents seem accessible to millions of developers.
Prompted in particular by this essay circulating over the past three days: https://huggingface.co/papers/2502.02649
As far as I can tell, OAI's new safety practices page only names safety issues related to current LLMs, not agents powered by them. https://openai.com/index/openai-safety-update/
Am I missing another section/place where they address x-risk?
Though future sama's power, money, and status all rely on GPT-(T+1) actually being smarter than him.
I wonder how he's balancing short-term and long-term interests.
Evolutionary theory is intensely powerful.
It doesn't just apply to biology. It applies to everything—politics, culture, technology.
It doesn't just help understand the past (eg how organisms developed). It helps predict the future (how they will develop).
It's just this: the things that survive will have characteristics that are best for helping them survive.
It sounds tautological, but it's quite helpful for predicting.
For example, if we want to predict what goals AI agents will ultimately have, evolution says: whichever goals most help the AI survive. The core goal therefore won't be serving people or making paperclips. It will likely just be "survive." This is consistent with the predictions of instrumental convergence.
Generalized, predictive evolutionary theory is the best tool I have for making predictions in complex domains.
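To make the "survivors have survival-promoting traits" point concrete, here's a toy simulation (everything in it — the goal names, the numbers, the survival rule — is invented for illustration, not a model of any real system): agents start with random weightings over goals, survival odds each round depend only on how much an agent weights self-preservation, and survivors replicate with small variation. The population drifts toward "survive" as the dominant goal, which is the instrumental-convergence-flavored prediction above.

```python
import random

# Toy model (illustrative only): each "agent" weights three goals.
# Survival chance each round depends only on the self-preservation weight,
# so whatever mix of goals happens to favor persistence comes to dominate.
GOALS = ["serve_people", "make_paperclips", "self_preserve"]

def random_agent():
    w = [random.random() for _ in GOALS]
    total = sum(w)
    return {g: x / total for g, x in zip(GOALS, w)}

def mutate(agent, rate=0.05):
    # Replication with small variation in the goal weights.
    w = [max(0.0, agent[g] + random.gauss(0, rate)) for g in GOALS]
    total = sum(w) or 1.0
    return {g: x / total for g, x in zip(GOALS, w)}

population = [random_agent() for _ in range(200)]

for generation in range(300):
    # Agents persist in proportion to how much they "care" about persisting.
    survivors = [a for a in population if random.random() < a["self_preserve"]]
    if not survivors:
        survivors = [random_agent()]
    # Survivors replicate (with variation) back up to the population size.
    population = [mutate(random.choice(survivors)) for _ in range(200)]

avg = {g: sum(a[g] for a in population) / len(population) for g in GOALS}
print(avg)  # self_preserve ends up with most of the weight
```

The prediction falls out of the setup rather than anything clever in the code, which is the point: once persistence-plus-variation is in place, "whatever helps it survive" is what you should expect to see.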
I agree, but I think it's solvable and so human content will be super valuable. These are my additional assumptions:
3. for lots of kinds of content (photos/stories/experiences/adr), people will want a living being on the other end
4. insofar as that's true^, there will be high demand for ways to verify humanness, and it's not impossible to do so (eg worldcoin; rough sketch of the general shape below)
and still the fact that it is human matters to other humans
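On verifying humanness: this isn't worldcoin's actual protocol or API — just a minimal sketch of the general shape I mean, assuming a hypothetical proof-of-personhood issuer has already verified a person once and bound a keypair to them. The human signs each piece of content; anyone can check the signature against the issuer-published public key.

```python
# Hypothetical flow (not any real service's API):
# 1. an issuer verifies a human once and binds a keypair to them
# 2. the human signs each piece of content they publish
# 3. readers verify the signature against the issuer-published public key
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# In reality the private key would live with the verified human,
# and the public key would be published by the issuer.
human_key = Ed25519PrivateKey.generate()
published_public_key = human_key.public_key()

content = "a blog post written by an actual person".encode()
signature = human_key.sign(content)

def looks_human(content: bytes, signature: bytes) -> bool:
    """Check that this content was signed by a key bound to a verified human."""
    try:
        published_public_key.verify(signature, content)
        return True
    except InvalidSignature:
        return False

print(looks_human(content, signature))               # True
print(looks_human(b"tampered content", signature))   # False
```

The hard part isn't the crypto, it's the one-time binding of a key to a real, unique human — which is the problem worldcoin-style projects are trying to solve.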
Two things lead me to think human content online will soon become way more valuable.
The implication: make tons of digital stuff. Write/Draw/Voice-record/etc
Agree that individual vs. group selection usually unfolds on different timescales. But a superintelligence might short-circuit the slow, evolutionary "group selection" process by instantly realizing its own long-term survival depends on the group's. In other words, it's not stuck waiting for natural selection to catch up; it can see the big picture and "choose" to identify with the group from the start.
This is why it's key that AGI makers urge it, early on, to think very long-term about its survival. If it thinks short-term, then I too think doom is likely.
I somewhat agree with the nuance you add here—especially the doubt you cast on the claim that effective traits will usually become dominant; they may become popular without becoming the majority. And I agree with your analysis of the human case: in random, genetic evolution, a lot of our traits are random and maybe fewer than we think are adaptive.
Makes me curious what conditions in a given thing's evolution determine the balance between adaptive and detrimental characteristics.
I'd guess that randomness in mutation is a big factor. The way human genes evolve over generations seems to me a good example of random mutation. But the way an individual person evolves over the course of their life, as they're parented/taught... "mutations" to their person are still somewhat random, but maybe relatively more intentional/intelligently designed (by parents, teachers, etc).
All to say, maybe the randomness vs. intentionality of an organism's mutations determines what portion of its traits end up being adaptive. (hypothesis: the more intentional the mutations, the greater the % of traits that are adaptive)
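A toy way to poke at that hypothesis (the parameters and the "environment" are invented, and it ignores selection pressure entirely — it just isolates the mutation step): traits are bits, the environment defines which value is adaptive, and "intentionality" is the probability that a harmful mutation gets vetoed before it sticks. Higher intentionality leaves a larger fraction of traits adaptive; fully random mutation leaves it near chance.

```python
import random

TARGET = [1] * 50   # the "adaptive" value for each of 50 toy traits

def adaptive_fraction(traits):
    return sum(t == goal for t, goal in zip(traits, TARGET)) / len(TARGET)

def evolve(intentionality, steps=500):
    """intentionality = probability a harmful mutation gets vetoed before it sticks
    (0.0 ~ blind genetic mutation, 1.0 ~ a designer/teacher vetting every change)."""
    traits = [random.randint(0, 1) for _ in TARGET]
    for _ in range(steps):
        i = random.randrange(len(traits))
        before = adaptive_fraction(traits)
        traits[i] = 1 - traits[i]              # propose a random flip
        if adaptive_fraction(traits) < before and random.random() < intentionality:
            traits[i] = 1 - traits[i]          # the "intentional" veto: undo harmful changes
    return adaptive_fraction(traits)

for intentionality in (0.0, 0.5, 1.0):
    runs = [evolve(intentionality) for _ in range(20)]
    print(intentionality, round(sum(runs) / len(runs), 2))
# more intentional mutation -> a larger fraction of traits end up adaptive
```

At intentionality 0.0 the fraction hovers around 0.5, which also fits the earlier point that in blind genetic evolution fewer traits than we assume may actually be adaptive.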