Vladimir_Nesov

Membranes are filters: they let in admissible things and repel inadmissible things. When an agent manages a membrane, it both maintains the membrane's existence and configures its filtering. Manipulation or damage suffered by the agent can result in the membrane being configured to admit harmful things, or in a failure to maintain its existence. There are many membranes an agent may be involved in managing.

Any increase in scale carries some chance of AGI at this point, since unlike weaker models, GPT-4 is not stupid in any obvious way; it might be just below the threshold of scale that enables an LLM to get its act together. This gives some probability to 2024.

More likely, a larger model "merely" makes job-level agency feasible for relatively routine human jobs, but that alone would suddenly make $50-$500 billion runs financially reasonable. Given the premise of job-level agency at <$5 billion scale, the larger runs likely suffice for AGI. The Gemini report says training took place in multiple datacenters, which suggests that this sort of scaling might already be feasible, except for the risk that it produces something insufficiently commercially useful to justify the cost (and waiting improves the prospects). So this might all happen as early as 2025 or 2026.

I'd put more probability in the scenario where good $5 billion 1e27 FLOPs runs give mediocre results, so that more scaling remains feasible but lacks an expectation of success. With how expensive the larger experiments would be, it could take many years for someone to take another draw from the apocalypse deck. That alone adds maybe 2% for 10 years after 2026 or so, and there are other ways for AGI to start working.
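
For a rough sense of how 1e27 FLOPs and a few billion dollars line up, here's a back-of-envelope sketch; every constant in it (per-accelerator throughput, utilization, hourly cost) is an illustrative assumption, not a precise figure:

```python
# Back-of-envelope: GPU-time and cost of a 1e27 FLOPs training run.
# All constants are illustrative assumptions.

TARGET_FLOPS = 1e27          # assumed size of the training run
PEAK_FLOP_PER_SEC = 1e15     # assumed peak throughput of one accelerator
UTILIZATION = 0.4            # assumed fraction of peak actually achieved
COST_PER_GPU_HOUR = 2.0      # assumed all-in dollars per accelerator-hour

effective_flop_per_sec = PEAK_FLOP_PER_SEC * UTILIZATION
gpu_hours = TARGET_FLOPS / effective_flop_per_sec / 3600
cost_billion = gpu_hours * COST_PER_GPU_HOUR / 1e9

print(f"GPU-hours: {gpu_hours:.1e}")        # ~7e8 GPU-hours
print(f"Rough cost: ${cost_billion:.1f}B")  # ~$1.4B under these assumptions
```

With margins for experiments, failed runs, and infrastructure, this lands in the low single-digit billions of dollars.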

The question "Aligned to whom?" is sufficiently vague to admit many reasonable interpretations, but has some unfortunate connotations. It sounds like there's a premise that AIs are always aligned to someone, making the possibility that they are aligned to no one but themselves less salient. And it boosts the frame of competition, as opposed to distribution of radical abundance, of possibly there being someone who gets half of the universe.

The question I'd rather center is building a powerful AI such that doing so is a good thing rather than a bad thing. Perhaps even the question of there being survivors shouldn't insist on the definite article, on being the question, as there are many questions, of various levels of severity, that are not mutually exclusive.

When boundaries leak, it's important to distinguish commitment to rectify them from credence that they didn't leak.

These are all failures to acknowledge the natural boundaries that exist between individuals.

You shouldn't worry yet; the models need to be far more capable.

The right time to start worrying is too early; otherwise it will be too late.

(I agree in the sense that current models very likely can't be made existentially dangerous, and in that sense "worrying" is incorrect, but the proper use of worrying is planning for the uncertain future, a different sense of "worrying".)

It's not entirely clear how and why GPT-4 (possibly a 2e25 FLOPs model) or Gemini Ultra 1.0 (possibly a 1e26 FLOPs model) fail to work as autonomous agents, but it seems that they can't. So it's not clear that the next generation of LLMs built in a similar way will enable significant agency either. There are millions of AI GPUs currently being produced each year, and millions of GPUs can support at most a 1e28-1e30 FLOPs training run (one that doesn't individually take years to complete). There's (barely) enough text data for that.
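
A sketch of the arithmetic behind that range, where the accelerator counts, per-accelerator throughput, utilization, and run duration are all illustrative assumptions:

```python
# Rough bounds on training FLOPs from cluster size and run duration.
# All hardware numbers are illustrative assumptions.

SECONDS_PER_YEAR = 3.15e7

def run_flops(num_gpus, peak_flop_per_sec, utilization, duration_seconds):
    """Total FLOPs a cluster delivers over the run."""
    return num_gpus * peak_flop_per_sec * utilization * duration_seconds

# ~1 million current-generation accelerators (~1e15 FLOP/s peak each),
# 40% utilization, running for about 6 months:
low = run_flops(1e6, 1e15, 0.4, SECONDS_PER_YEAR / 2)   # ~6e27

# ~10 million next-generation accelerators (~4e15 FLOP/s peak each),
# 40% utilization, running for about a year:
high = run_flops(1e7, 4e15, 0.4, SECONDS_PER_YEAR)      # ~5e29

print(f"{low:.0e} to {high:.0e} FLOPs")
```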

GPT-2 would take about 1e20 FLOPs to train with modern methods; on the FLOPs log scale, it's already further from GPT-4 than GPT-4 is from whatever is feasible to build in the near future without significant breakthroughs. So there are only about two more generations of LLMs in the near future if most of what changes is scale. It's not clear that this is enough, and it's not clear that this is not enough.
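
Making that log-scale comparison explicit (the FLOPs figures are the same rough estimates as above, and the near-future figure is an assumed point within the 1e28-1e30 range):

```python
# Orders of magnitude between rough compute estimates.
from math import log10

gpt2_flops = 1e20          # GPT-2 retrained with modern methods (rough estimate)
gpt4_flops = 2e25          # GPT-4 (rough estimate)
near_future_flops = 1e29   # assumed point within the near-future 1e28-1e30 range

print(log10(gpt4_flops / gpt2_flops))         # ~5.3 orders of magnitude
print(log10(near_future_flops / gpt4_flops))  # ~3.7 orders of magnitude
```

At roughly 1.5-2 orders of magnitude per LLM generation, ~3.7 orders is about two more generations.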

With Sora, the underlying capability is not just video generation but also video perception: looking at the world instead of dreaming of it. A sufficiently capable video model might be able to act in the world by looking at it, in the same way a chatbot acts in a conversation by reading it. Models that can understand images are already giving new ways of specifying tasks and offering feedback on performance in robotics, and models that can understand video will only do this better.

The premise is autonomous agents at near-human level with the propensity and opportunity to establish global lines of communication with each other. Being served via API doesn't in itself control what agents do, especially if users can ask the agents to do all sorts of things, so that there are no predefined airtight guardrails on what they end up doing and why. Large context and possibly custom tuning also make the activities of instances very dissimilar, so being based on the same base model is not obviously crucial.

The agents only need to act autonomously the way humans do; they don't need to be the smartest agents available. The threat model is that autonomy at scale and at high speed snowballs into a large body of agent culture, including systems of social roles for agent instances to fill (which individually might be swapped out for alternative agent instances based on different models). This culture exists on the Internet, shaped by historical accidents of how the agents happen to build it up, not necessarily significantly steered by anyone (including individual agents). One of the things such a culture might build up is software for training and running open source agents outside the labs. That doesn't need to be cheap or done without human assistance. (Imagine the investment boom once there are working AGI agents; not being cheap is unlikely to be an issue.)

Superintelligence plausibly breaks this dynamic by bringing much more strategicness than is feasible at near-human level. But I'm not sure established labs can keep the edge and get (aligned) ASI first once the agent culture takes off. And someone will probably start serving autonomous near-human-level agents via API long before any lab builds superintelligence in-house, even if there is a significant delay between the development of the first such agents and anyone deploying them publicly.

For it to make sense to say that the math is wrong, there needs to be some sort of ground truth, making it possible for the math to also be right, in principle. Even doing the math poorly is an exercise that contributes to eventually making the math less wrong.
