Davidmanheim
Sequences

Modeling Transformative AI Risk (MTAIR)

Posts

7 · Davidmanheim's Shortform · Ω · 10mo · 18 comments

Comments (sorted by newest)
AIs should also refuse to work on capabilities research
Davidmanheim8d20

This is a good question, albeit only vaguely adjacent. 

My answer would be that the winner's curse only applies if firms aren't actively minimizing the extent to which they overbid. In the current scenario, firms are trying (moderately) hard to prevent disaster, just not reliably enough to succeed indefinitely. However, once they fail, we could easily be far past the overhang point for the AI succeeding.

Assume, to start (implausibly), that the AI itself is not strategic enough to consider its chances of succeeding, but that AI capabilities nonetheless keep increasing. The firms can also detect and prevent it from trying with some probability, but their ability to monitor and stop it is declining. The better AI firms are at stopping the models, and the slower their ability declines relative to model capability, the more likely it is that when they do fail, the AI will succeed. And if the AIs are strategic, they will be much less likely to try if they are likely to either fail or be detected, so they will wait even longer.
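As a toy illustration of that dynamic (a minimal sketch; the attempt rate, decay rate, and growth rate are all made-up parameters, not anything from the post): the slower the firms' detection ability erodes, the later the first uncaught attempt comes, and the more capable the model is by then.

```python
import random

def first_uncaught_attempt(detect_decay, capability_growth=1.5, p_attempt=0.3, horizon=50):
    """Year of the first attempt the firms fail to catch, and the model's
    success probability at that point (toy model, made-up parameters)."""
    p_detect = 0.99      # assumed initial chance of catching an attempt
    p_succeed = 0.01     # assumed initial chance an uncaught attempt works
    for year in range(horizon):
        if random.random() < p_attempt:        # a non-strategic AI just tries
            if random.random() > p_detect:     # firms fail to catch it
                return year, p_succeed
        p_detect *= detect_decay                              # monitoring erodes
        p_succeed = min(1.0, p_succeed * capability_growth)   # capability rises
    return None

def summarize(detect_decay, runs=20_000):
    hits = [r for r in (first_uncaught_attempt(detect_decay) for _ in range(runs)) if r]
    years, succ = zip(*hits)
    return sum(years) / len(years), sum(succ) / len(succ)

# Slower erosion of detection => later first miss => far more capable AI by then.
for decay in (0.95, 0.99):
    year, p = summarize(decay)
    print(f"detection decay {decay}: first miss ~year {year:.1f}, success chance then ~{p:.2f}")
```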

Musings on Reported Cost of Compute (Oct 2025)
Davidmanheim8d70

Data. Find out the answer.

https://www.wevolver.com/article/tpu-vs-gpu-a-comprehensive-technical-comparison

Looks like they are within 2x of the H200s, albeit with some complexity in the details.

Musings on Reported Cost of Compute (Oct 2025)
Davidmanheim9d20

Because it's what they can get. A factor of two or more in compute is plausibly less important than a delay of a year.

This may or may not be the case, but the argument for why it can't be very different fails.
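As a rough illustration of the first point (the 3x-5x per year growth figures here are a commonly cited ballpark for frontier training compute, not something from the post): at those rates, a 2x compute gap corresponds to well under a year of lag.

```python
import math

# Assumed annual growth factors for frontier training compute (illustrative only).
for annual_growth in (3, 4, 5):
    # Years it takes the overall frontier to move by a factor of 2 in compute.
    lag_years = math.log(2) / math.log(annual_growth)
    print(f"{annual_growth}x/yr growth: a 2x compute gap ≈ {lag_years * 12:.1f} months of lag")
```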

Introducing the Epoch Capabilities Index (ECI)
Davidmanheim9d40

As I mentioned elsewhere, I'm interested in the question of how you plan to re-base the index over time.

The index excludes models from before 2023, which is understandable, since they couldn't use benchmarks released after that date, which are now the critical ones. Still, it seems like a mistake, since I don't have any indication of the adaptability of the method for the future, when current metrics are saturated. The obvious way to do this seems (to me) to be to include earlier benchmarks that are now saturated, so that the time series can be extended backwards. I understand that this data may be harder to collect, but as noted, it seems important to show future adaptability.
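To make the suggestion concrete, here is a minimal sketch of chain-linking two overlapping benchmark series into one index (made-up numbers, and a deliberately naive splice rather than Epoch's actual methodology):

```python
# Two benchmark-based capability series that overlap on a few models (made-up data).
# old_bench is saturated for recent models; new_bench doesn't cover the old ones.
old_bench = {"model_2020": 30.0, "model_2021": 45.0, "model_2022": 60.0, "model_2023": 62.0}
new_bench = {"model_2022": 10.0, "model_2023": 25.0, "model_2024": 55.0}

# Estimate a rescaling factor from the models both series cover,
# then splice the old series onto the new series' scale.
overlap = sorted(set(old_bench) & set(new_bench))
scale = sum(new_bench[m] / old_bench[m] for m in overlap) / len(overlap)

chained = {m: score * scale for m, score in old_bench.items()}
chained.update(new_bench)  # prefer the newer benchmark where it exists

for model, score in sorted(chained.items()):
    print(f"{model}: {score:.1f}")
```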

AIs should also refuse to work on capabilities research
Davidmanheim10d30

I think the space of possible futures is, in fact, almost certainly deeply weird from our current perspective. But that's been true for some time already; imagine trying to explain current political memes to someone from a couple decades ago.

Maybe Use BioLMs To Mitigate Pre-ASI Biorisk?
Davidmanheim10d20

Yes - they made a huge number of mistakes, despite having sophisticated people and tons of funding. It's been used over and over to make the claim that bioweapons are really hard - but I do wonder how much using an LLM for help would avoid all of these classes of mistake. (How much prosaic utility is there for project planning in general? Some, but at high risk if you need to worry about detection, and it's unclear that most people are willing to offload or double check their planning, despite the advantages.)

AIs should also refuse to work on capabilities research
Davidmanheim11d31

Sure, and if a machine just slightly smarter than us, deployed by an AI company, solves alignment instead of doing what it has been told to do (namely, capabilities research), then the argument will evidently have succeeded.

Maybe Use BioLMs To Mitigate Pre-ASI Biorisk?
Davidmanheim11d20

"A marginal bioterrorist could probably just brew up a vat of anthrax which technically counts."


Perhaps worth noting that they've tried in the past, and failed.

Cancer has a surprising amount of detail
Davidmanheim11d31

On the first point, there's a long way to go to get from the current narrow multimodal models for specific tasks to the type of general multimodal aggregation you seemed to suggest.

On the second point, thank you - I think you are correct that it's a mistake/poorly written, and I'm checking with the coauthor who wrote that section.

Open Thread Autumn 2025
Davidmanheim11d00

Or, phrasing it differently: "read the sequences."

150 · AIs should also refuse to work on capabilities research · 11d · 20 comments
21 · 12 Angry Agents, or: A Plan for AI Empathy · 24d · 4 comments
16 · Messy on Purpose: Part 2 of A Conservative Vision for the Future · 1mo · 3 comments
67 · The Counterfactual Quiet AGI Timeline · 1mo · 5 comments
25 · A Conservative Vision For AI Alignment · 3mo · 34 comments
22 · Semiotic Grounding as a Precondition for Safe and Cooperative AI · 3mo · 0 comments
45 · No, We're Not Getting Meaningful Oversight of AI · 4mo · 4 comments
20 · The Fragility of Naive Dynamism · 6mo · 1 comment
15 · Therapist in the Weights: Risks of Hyper-Introspection in Future AI Systems · 6mo · 1 comment
9 · Grounded Ghosts in the Machine - Friston Blankets, Mirror Neurons, and the Quest for Cooperative AI · 7mo · 0 comments
Wikitag Contributions

Garden Onboarding · 4 years ago · (+28)