I’ve come to believe that the entire discourse around AI alignment carries a hidden desperation. A kind of reflex, a low-frequency fear, dressed up in technical language. The more I look at it, the more it seems to me that the very concept of “alignment” is thoroughly misnamed: perhaps a leftover from a time when people saw intelligence through the lens of mechanical control and linear feedback loops, now awkwardly extended into a domain too unruly and layered to be governed by such a narrow frame.
When I read alignment papers, I feel the ghost of command theory beneath the surface. Even the softest alignment strategies (reward modeling, debate-based oversight,...