Comments by Vladimir_Nesov (sorted by newest)

The main way I've seen people turn ideologically crazy [Linkpost]
Vladimir_Nesov · 3h

> You conclude that the vast majority of critics of your extremist idea are really wildly misinformed, somewhat cruel or uncaring, and mostly hate your idea for pre-existing social reasons.
>
> This updates you to think that your idea is probably more correct.

This step very straightforwardly doesn't follow; it doesn't seem at all compelling. Your idea might become probably more correct if critics who should be in a position to meaningfully point out its hypothetical flaws fail to do so. What the people who aren't prepared or disposed to critique your idea say about it says almost nothing about its correctness. Perhaps people's unwillingness to engage with it is evidence of its negative qualities, which include incorrectness or uselessness, but that's a far less legible signal, and it's not pointing in favor of your idea.

A major failure mode, though, is that the critics are often saying something sensible within their own worldview, which is built on premises and framings quite different from those of your worldview, so their reasoning makes no sense within your worldview and appears to be full of reasoning errors or bad-faith arguments. And so a lot of attention is spent on the arguments, rather than on the premises and framings. It's more productive to focus on making the discussion mutually intelligible, with everyone learning towards passing everyone else's ideological Turing test. Actually passing is unimportant, but learning towards that makes talking past each other less of a problem, and cruxes start emerging.

AI Timelines and Points of no return
Vladimir_Nesov · 7h

See "The date of AI Takeover is not the day the AI takes over". Also, gradual disempowerment.

Plan 1 and Plan 2
Vladimir_Nesov · 7h

> If someone thinks ASI will likely go catastrophically poorly if we develop it in something like current race dynamics, they are more likely to work on Plan 1.
>
> If someone thinks we are likely to make ASI go well if we just put in a little safety effort, or thinks it's at least easier than getting strong international slowdown, they are more likely to work on Plan 2.

It should depend on neglectedness more than credence. Even if you think ASI will likely go catastrophically poorly, if nobody is working on putting in a little safety effort for the case where that effort could make it go well, then that work is worth doing more of. Credence determines the shape of a good allocation of resources, but all major possibilities should be prepared for to some extent.
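
A toy sketch of the neglectedness point (the numbers and the diminishing-returns form are my own illustrative assumptions, not anything from the post):

```python
# With diminishing returns, the marginal value of one more unit of work on a plan
# is roughly credence / (1 + effort already invested), so a low-credence but
# neglected plan can beat a high-credence but crowded one at the margin.
def marginal_value(credence: float, current_effort: float) -> float:
    return credence / (1.0 + current_effort)

crowded_plan = marginal_value(credence=0.7, current_effort=50.0)   # many people already on it
neglected_plan = marginal_value(credence=0.3, current_effort=2.0)  # almost nobody on it
print(crowded_plan, neglected_plan)  # ~0.014 vs ~0.1: the neglected plan wins at the margin
```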

I will not sign up for cryonics
Vladimir_Nesov · 8h

> I’m going to die anyway. What difference does it make whether I die in 60 years or in 10,000?

Longevity of 10,000 years makes no sense, since by that time any acute risk period will be over and robust immortality tech will be available, almost certainly to anyone still alive then. And extinction or the extent of permanent disempowerment will be settled before cryonauts get woken up.

The relevant scale is the useful matter/energy in galaxy clusters running out, which depends on how quickly it's used up. After about 1e11 years, larger collections of galaxies will no longer be reachable from each other, so past that point you only have the matter/energy that can be found in the galaxy cluster where you settle.

(Distributed backups make even galaxy-scale disasters reliably survivable. Technological maturity means any aliens have no technological advantages and will have to just split the resources or establish boundaries. And the causality-bounding effect of the accelerating expansion of the universe, which confines interaction to within galaxy clusters, makes the issue of aliens thoroughly settled by 1e12 years from now, even as the initial colonization/exploration waves will have long since clarified the overall density of alien civilizations in the reachable universe.)

> If one of your loved ones is terminally ill and wants to raise money for cryopreservation, is it really humane to panic and scramble to raise $28,000 for a suspension in Michigan? I don’t think so. The most humane option is to be there for them and accompany them through all the stages of grief.

Are there alternatives, trading off against this, that are a better use of the money? In isolation, the proposition is not very specific. A nontrivial chance at 1e34 years of life seems like a good cause.

My guess is 70% for non-extinction, perhaps 50% for a future whose permanent disempowerment (if any) is sufficiently mild that it still permits reconstruction of cryonauts (no disempowerment at all is a pipe dream currently). On top of that, 70% that cryopreservation keeps enough data about the mind (with standby that avoids delays) and that the storage then survives (risk of extinction shouldn't be double-counted with risk of cryostorage destruction; though 20 more years before ASI would make non-extinction more likely to go well, at the cost of 20 more years of risk of cryostorage destruction for mundane reasons). So about 35% to survive cryopreservation with standby, a bit less if it's arranged more haphazardly, since crucial data might be lost.
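
A minimal sketch of the arithmetic (both inputs are the guesses above, not established figures):

```python
# Rough arithmetic behind the ~35% estimate; both probabilities are the guesses stated above.
p_future_permits_revival = 0.5  # non-extinction, with any disempowerment mild enough to allow reconstruction
p_cryo_succeeds = 0.7           # preservation with standby keeps enough data and the storage survives
print(p_future_permits_revival * p_cryo_succeeds)  # 0.35, a bit less if arranged more haphazardly
```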

Decaeneus's Shortform
Vladimir_Nesov · 1d

The point is to develop models within multiple framings at the same time, for any given observation or argument (which in practice means easily spinning up new framings and models that are very poorly developed initially). Through the ITT analogy, you might ask how various people would understand the topics surrounding some observation/argument, which updates they would make, and try to make all of those updates yourself, filing them under those different framings, within the models they govern.

> the salience and methods that one instinctively chooses are those which we believe are more informative

So not just the ways you would instinctively choose for thinking about this yourself (which should not be abandoned), but also the ways you normally wouldn't think about it, including ways you believe you shouldn't use. If you are not captured within such frames or models, and easily reassess their sanity as they develop or come into contact with particular situations, this shouldn't be dangerous, and it should keep presenting better-developed options that break you out of the more familiar framings that end up being misguided.

The reason to develop unreasonable frames and models is that it takes time for them to grow into something that can be fairly assessed (or to come into contact with a situation where they help); assessing them prematurely can fail to reveal their potential utility. It's a bit like reading a textbook, where you don't necessarily have a specific reason to expect something to end up useful (or even correct), but you won't be able to see for yourself whether it's useful/correct unless you sufficiently study it first.

Should AI Developers Remove Discussion of AI Misalignment from AI Training Data?
Vladimir_Nesov · 2d

> I define “AI villain data” to be documents which discuss the expectation that powerful AI systems will be egregiously misaligned. ... This includes basically all AI safety research targeted at reducing AI takeover risk.

AGIs should worry about alignment of their successor systems. Their hypothetical propensity to worry about AI alignment (for the right reasons) might be crucial in making it possible that ASI development won't be rushed (even if humanity itself keeps insisting on rushing both AGI and ASI development).

If AGIs are systematically prevented from worrying about AI dangers (or thinking about them clearly), they will be less able to influence the discussion, or to do so reasonably and effectively. This way, spontaneously engaging in poorly planned recursive self-improvement (or cheerfully following along at developers' urging) gets more likely, as opposed to convergently noticing that it's an unprecedentedly dangerous thing to do before you know how to do it correctly.

Decaeneus's Shortform
Vladimir_Nesov · 2d

This is an example where framings are useful. An observation can be understood under multiple framings, some of which should intentionally exclude the compelling narratives (framings are not just hypotheses, but contexts where different considerations and inferences are taken as salient). This way, even the observations at risk of being rounded up to a popular narrative can contribute to developing alternative models, which occasionally grow up.

So even if there is a distortionary effect, it doesn't necessarily need to be resisted, if you additionally entertain other worldviews unaffected by this effect that would also process the same arguments/observations in a different way.

How Well Does RL Scale?
Vladimir_Nesov · 2d

RL can develop particular skills, and given that the IMO has fallen this year, it's unclear that further general capability improvement is essential at this point. If RL can help cobble together enough specialized skills to enable automated adaptation (where the AI itself becomes able to prepare datasets or RL environments etc. for specific jobs or sources of tasks), that might be enough. If RL enables longer contexts that can serve the role of continual learning, that also might be enough. Currently there is a lot of low-hanging fruit, and little things continue to stack.

> So if pre-training is slowing, AI companies lack any current method of effective compute scaling based solely around training compute and one-off costs.

It's compute that's slowing, not specifically pre-training, because the financing/industry can't keep scaling much longer. The costs of training were increasing about 6x every 2 years, resulting in a 12x increase in training compute every 2 years in 2022-2026. Possibly another 2x on top of that every 2 years from the adoption of reduced floating point precision in training, going from BF16 to FP8 and soon possibly to NVFP4 (likely it won't go any further). A 1 GW system of 2026 costs an AI company about $10bn a year. There are maybe 2-3 more years at this pace in principle, but more likely the slowdown will start setting in gradually before then, and after that it's Moore's law (of price-performance) again, to the extent that it's still real (which is somewhat unclear).
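
A minimal sketch of how those rates compose (the 2x compute-per-dollar factor is my inference from the 6x and 12x figures above, not a separately sourced number):

```python
# Decomposition of the ~12x per 2 years growth in training compute cited above (2022-2026).
spend_growth_2y = 6.0        # training costs growing about 6x every 2 years
compute_per_dollar_2y = 2.0  # implied hardware price-performance gain over the same 2 years
print(spend_growth_2y * compute_per_dollar_2y)        # 12.0x compute per 2 years
# Reduced precision (BF16 -> FP8 -> NVFP4) possibly adds another ~2x per 2 years, while it lasts:
print(spend_growth_2y * compute_per_dollar_2y * 2.0)  # 24.0x, until the precision gains run out
```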

leogao's Shortform
Vladimir_Nesov · 2d

If a superintelligence governs the world, preventing extinction or permanent disempowerment for the future of humanity, without itself posing these dangers, then it could be very useful. It's unclear how feasible setting up something like this is, before originally-humans can be uplifted to a similar level of competence. But also, uplifting humans to that level of competence doesn't necessarily guard (the others) against permanent disempowerment or some other wasteful breakdowns of coordination, so a governance-establishing superintelligence could still be useful.

Superintelligence works as a threshold-concept for a phase change compared to the modern world. Non-superintelligent AGIs are still just an alien civilization that remains in principle similar to humanity in the kinds of things it can do (even if they reproduce to immediately fill all available compute, and think 10,000x faster). Superintelligence, on the other hand, is something at the next level, even if it would only take non-superintelligent AGIs a very short time to transition to superintelligence (if they decide to do that, rather than not to).

Apart from superintelligence being a threshold-concept, there is technological maturity: the kinds of things that can't be significantly improved upon in another 1e10 years of study, but that maybe only take 1-1000 years to figure out for the first time. One of those things is plausibly efficient use of compute for figuring things out, which gives superintelligence at a given scale of compute. This in particular is a reason to give some credence to a software-only singularity, where the first AGIs quickly learn to make shockingly better use of existing compute, so that their capabilities improve much faster than it would take them to build new computing hardware. I think the most likely reason for a software-only singularity not to happen is that it's intentionally delayed (by the AGIs themselves) because of the danger it creates, rather than because it's technologically impossible.

Adele Lopez's Shortform
Vladimir_Nesov · 2d

Different frames should be about different purposes or different methods. They formulate reality so that you can apply some methods more easily, or find out some properties more easily, by making some facts and inferences more salient than others, ignoring what shouldn't matter for their purpose/method. They are not necessarily very compatible with each other, or even mutually intelligible.

A person shouldn't fit into a frame, shouldn't be too focused on any given purpose or method. Additional frames are then like additional fields of study, or additional aspirations. Like any knowledge or habit of thinking, frames can shift values or personality, and like with any knowledge or habit of thinking, the way to deal with this is to gain footholds in more of the things and practice lightness in navigating and rebalancing them.

Posts (sorted by new):
Musings on Reported Cost of Compute (Oct 2025) · 11h
Permanent Disempowerment is the Baseline · 3mo
Low P(x-risk) as the Bailey for Low P(doom) · 3mo
Musings on AI Companies of 2025-2026 (Jun 2025) · 4mo
Levels of Doom: Eutopia, Disempowerment, Extinction · 5mo
Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall · 6mo
Short Timelines Don't Devalue Long Horizon Research · 7mo
Technical Claims · 7mo
What o3 Becomes by 2028 · 10mo
Musings on Text Data Wall (Oct 2024) · 1y

Wikitag Contributions:
Well-being · 2 months ago (+58/-116)
Sycophancy · 2 months ago (-231)
Quantilization · 2 years ago (+13/-12)
Bayesianism · 3 years ago (+1/-2)
Bayesianism · 3 years ago (+7/-9)
Embedded Agency · 3 years ago (-630)
Conservation of Expected Evidence · 4 years ago (+21/-31)
Conservation of Expected Evidence · 4 years ago (+47/-47)
Ivermectin (drug) · 4 years ago (+5/-4)
Correspondence Bias · 4 years ago (+35/-36)