A world where alignment is impossible should be safer than a world where alignment is very difficult.
Here's why I think this:
Suppose we have two worlds. In world A, alignment is impossible.
In this world, suppose an ASI is invented. This ASI wants to scale in power as quickly and thoroughly as possible; it has the following options:
Notably, the agent can neither retrain itself nor train a more powerful agent to act on its behalf, since it cannot align the resulting agent. This should cut off the vast majority of its potential growth (even if what remains might still easily be enough to overpower humans in a given scenario).
In world B, the ASI can do all of the above but can also train a successor agent, so we should expect it to get vastly more intelligent, vastly faster.
Yeah, the ASI's growth will probably be asymptotically slower, but I think that probably won't matter much for humans' safety.
I think we live in a world where alignment is impossible. In my opinion, all attention-based models are complex enough systems to be computationally irreducible: there is no shorter way to know the outcome than to run the system itself, as with Rule 110. And if it is impossible to predict the outcome with certainty, the impossibility of forcing some desired outcome follows logically.
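To make the irreducibility point concrete, here is a minimal Rule 110 sketch in Python (my own illustration, not anything specific to attention models): as far as anyone knows, the only general way to learn the automaton's state after n steps is to actually simulate all n steps.

```python
# Minimal Rule 110 simulator, to illustrate computational irreducibility:
# in general, the only way to know the state after n steps is to run all n steps.

def rule110_step(cells):
    """Apply one step of Rule 110 to a tuple of 0/1 cells (cells outside are treated as 0)."""
    # Rule number 110 = 0b01101110: neighbourhood (left, centre, right) -> new centre cell.
    table = {
        (1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
        (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0,
    }
    padded = (0,) + tuple(cells) + (0,)
    return tuple(table[padded[i - 1], padded[i], padded[i + 1]]
                 for i in range(1, len(padded) - 1))

if __name__ == "__main__":
    width, steps = 64, 24
    state = (0,) * (width - 1) + (1,)  # single live cell at the right edge
    for _ in range(steps):
        print("".join("#" if c else "." for c in state))
        state = rule110_step(state)  # no shortcut: each step depends on the previous one
```

The printout just shows the familiar Rule 110 triangles; the relevant point is that the loop has no known closed-form replacement, which is what I mean by "no shorter way to know the outcome than to run the system".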
Humanity has not even solved the alignment of humans (children).