Thanks! I think this is a useful post, and I also use these heuristics.
I recommend Andrew Gelman's blog as a source of other heuristics, for example the Piranha problem and some of the entries in his handy statistical lexicon.
Mostly I care about this because it makes a big difference if a small number of instances are trying to take over but a lot of equally powerful instances are trying to help you. My best guess is that we'll be in roughly this situation for "near-human-level" systems.
I don't think I've seen any research about cross-instance similarity.
I think mode-collapse (update) is sort of an example.
How would you say humanity does on this distinction? When we talk about planning and goals, how often are we talking about "all humans", vs "representative instances"?
It's not obvious how to make the analogy with humanity work in this case - maybe comparing the behavior of clones of the same person put in different situations?
I'm not even sure what it would mean for a non-instantiated model without input to do anything.
For goal-directedness, I'd interpret it as "all instances are goal-directed and share the same goal".
As an example, I wish "Without specific countermeasures" had made the distinction more explicit.
More generally, when discussing whether a model is scheming, I think it's useful to keep in mind worlds where some instances of the model scheme while others don't.
When talking about AI risk from LLM-like models and using the word "AI", please make it clear whether you are referring to the model itself or to particular instances of the model given particular inputs.
For example, there's a big difference between claiming that a model is goal-directed and claiming that a particular instance of a model given a prompt is goal-directed.
I think this distinction is obvious and important but too rarely made explicit.
Here are the Latest Posts I see on my front page and how I feel about them (if I read them: what I remember and what I liked or disliked; if I didn't: my expectations and prejudices):
I think a pattern is that there is a lot of content on LessWrong that:
The devil may be in "legibly" here; e.g., maybe I'm getting a lot out of reading LW in diffuse ways that I can't pin down concretely, but I doubt it. I think I should spend less time consuming LessWrong, and maybe more time commenting, posting, or dialoguing here.
I think dialogues are a great feature, because:
ETA: I like the new emojis.
According to SemiAnalysis in July:
OpenAI regularly hits a batch size of 4k+ on their inference clusters, which means even with optimal load balancing between experts, the experts only have batch sizes of ~500. This requires very large amounts of usage to achieve.
Our understanding is that OpenAI runs inference on a cluster of 128 GPUs. They have multiple of these clusters in multiple datacenters and geographies. The inference is done at 8-way tensor parallelism and 16-way pipeline parallelism. Each node of 8 GPUs has only ~130B parameters, or less than 30GB per GPU at FP16 and less than 15GB at FP8/int8. This enables inference to be run on 40GB A100’s as long as the KV cache size across all batches doesn’t balloon too large.
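To sanity-check these figures, here's a back-of-the-envelope sketch in Python. The expert count and top-2 routing are my assumptions from memory of the same report, not from the quoted passage:

```python
# Rough arithmetic behind the quoted SemiAnalysis numbers. The expert count
# and routing below are assumptions, not part of the quote above.

params_per_node = 130e9            # "~130B parameters" per 8-GPU pipeline stage
gpus_per_node = 8
params_per_gpu = params_per_node / gpus_per_node   # ~16.25B

for name, bytes_per_param in [("FP16", 2), ("FP8/int8", 1)]:
    gib = params_per_gpu * bytes_per_param / 2**30
    print(f"weights per GPU at {name}: ~{gib:.1f} GiB")
# Prints ~30.3 and ~15.1 GiB, so the quoted "less than 30GB / 15GB" suggests
# the true per-node count is a bit under the rounded ~130B figure.

# Per-expert batch size: each token is routed to only a few experts, so each
# expert sees a fraction of the global batch.
global_batch = 4096                # "4k+"
num_experts = 16                   # assumption
experts_per_token = 2              # assumption: top-2 routing
per_expert_batch = global_batch * experts_per_token / num_experts
print(f"per-expert batch: ~{per_expert_batch:.0f}")   # ~512, i.e. "~500"
```

The per-expert number is the interesting one: because tokens are routed to only a couple of experts, an MoE model needs a very large global batch before each expert's matmuls are efficiently sized, which is the "requires very large amounts of usage" point.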
I'm grateful for this post: it gives simple concrete advice that I intend to follow, and that I hadn't thought of. Thanks.
For onlookers, I strongly recommend Gabriel Peyré and Marco Cuturi's online book Computational Optimal Transport. I also think this is a case where considering discrete distributions helps build intuition.
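To illustrate the discrete case, here is a minimal sketch of optimal transport as a linear program; this is a toy example of my own, not from the book. Mass from histogram a moves to histogram b at cost C[i, j] per unit, using only NumPy and SciPy:

```python
import numpy as np
from scipy.optimize import linprog

# Two discrete distributions, on 3 and 4 points respectively.
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.25, 0.25, 0.25, 0.25])

# Cost matrix: squared distance between point locations (arbitrary choice).
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 0.5, 1.5, 2.5])
C = (x[:, None] - y[None, :]) ** 2

n, m = C.shape
# Flatten the transport plan P into n*m variables; the equality constraints
# fix the row sums (P @ 1 = a) and column sums (P.T @ 1 = b).
A_eq = np.zeros((n + m, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0      # row i of P sums to a[i]
for j in range(m):
    A_eq[n + j, j::m] = 1.0               # column j of P sums to b[j]
b_eq = np.concatenate([a, b])

res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
P = res.x.reshape(n, m)
print("OT cost:", res.fun)
print("Transport plan:\n", P.round(3))
```

For anything beyond toy sizes, the entropic regularization (Sinkhorn) approach the book develops scales much better than solving the LP directly.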
As previously discussed a couple of times on this website.
For context, Daniel wrote Is this a good way to bet on short timelines? three years ago (which I didn't know about when writing this comment).
HT Alex Lawsen for the link.
@Daniel Kokotajlo, what odds would you give me on global energy consumption growing 100x by the end of 2028? I'd be happy to bet low hundreds of USD on the "no" side.
ETA: to be more concrete, I'd put $100 on the "no" side at 10:1 odds, but I'm interested if you have a more aggressive offer.
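For anyone unpacking the odds, here's a quick sketch under my reading of the offer; the $10 counter-stake is inferred from "10:1", not stated above:

```python
# Implied break-even probability for the bet as I read it: I stake $100 on
# "no" against a $10 stake on "yes" (the $10 is inferred from the 10:1 odds).
stake_no, stake_yes = 100, 10

# I win $10 with probability P(no) and lose $100 with probability P(yes),
# so the bet breaks even for me when P(yes) = 10 / (100 + 10).
p_yes_breakeven = stake_yes / (stake_no + stake_yes)
print(f"break-even P(yes): {p_yes_breakeven:.3f}")  # ~0.091
```

So taking the "no" side at these odds only makes sense if you put the probability of 100x energy growth by 2028 well below ~9%.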