How does urine testing for iron deficiency work? I was under the impression that one needs to measure blood ferritin levels to get a meaningful result... but I might be wrong?

Absolutely agree that the diagram omits lots of good parts from your post!

Is "do big damage" above or below human level? It probably depends on some unspecified assumptions, like whether we count groups of humans or only individuals... In any case, the difficulty can't be much higher than human level, since we have examples like nuclear weapon states that reach it. It can't be much lower, either, since it is lower-bounded by the abilities of "callously-power-seeking humans". So it lies somewhere in that range.

What I like about the diagram:

  • It visualizes the dangerous zone between "do big damage" and "prevent big damage". If this zone indeed exists, some levels of power are risky no matter what.
  • It highlights broad classes of approaches: those that prevent the "AGI" arrow from entering the danger zone (by limiting the power of AGI, making AGI corrigible, attempting pivotal acts, ...) and those that make the danger zone narrower (making society more resilient, improving defense, nanobots against gray goo?). Of course, that is a gross simplification and is not intended to dismiss the other insights in your post.

I tried to summarize how I understood your post and systematize it a bit. I hope this is helpful for others, but I also want to say that I really liked the many examples in the post.

The central points that I understood from what you wrote:

  • It takes more power to prevent big damage than to do big damage. You give several examples where preventing extinction would require solutions that are illegal or far outside the Overton window, alongside examples of doing damage, like engineered pandemics, that seem much easier.
    I believe this point is basically correct for a variety of reasons, from my own experience all the way to thermodynamics.
  • AGI might become powerful enough to do big damage, but not powerful enough to prevent big damage from other AGIs (for a variety of reasons, including humans imposing limits on the good AGI and not trusting it).
  • (Less central but relevant): Doing big damage requires superhuman power, since no humans have done it yet.

Edited to add: The diagram above is not to scale, and the distances between the arrows might be very big or very small... What matters more is the order of the arrows, and the sense of how various actions could move each of them.

There are several solution avenues:

  • Prevent AGIs from being powerful enough to do big damage.
    Difficult for reasons such as instrumental convergence, among others; in your post, you mention the desire to do red-teaming or to implement powerful protection mechanisms.
    I have some hope in things like legislation, treaties, nations and the like... at least they have so far been successful in preventing humans from seizing too much power.
  • Increase the difficulty of doing big damage.
    In your post, you describe a possible world where critical infrastructure is resilient and society is less interdependent. I share your skepticism.
  • Move AGI power all the way to the "prevent big damage" level.
    This is, as you write, risky. It might also be impossible, since closing off all avenues to big damage might be arbitrarily hard.

While all avenues seem difficult, there is some time left while the "AGI" arrow moves rightwards (currently it is not even to the right of "humans"). Of course, no one knows how long, and it might surpass "do big damage" surprisingly early... but my hope is that humanity will come up with many good solution ideas during this time.

There is also probably going to be an interesting window where AGI is superhuman but not yet able to do big damage. During that period, there might be particularly many opportunities to find and evaluate solutions. Could we start preparing now to make that period as long and fruitful as possible?

Someone once told me that the odds of winning the lottery are smaller than the odds of dying before the draw. Thanks to this insight, lottery tickets have become a kind of "memento mori" for me, and I'm no longer tempted to buy any.