Scott Alexander and Daniel Kokotajlo's article making the rational case for "why it's OK to talk about misaligned AI"
aka
"painting dark scenarios may increase the chance of them coming true but the benefits outweigh this possibility"
the original blog post:
https://blog.ai-futures.org/p/against-misalignment-as-self-fulfilling
the video I made about that article:
Great article,
it asks important questions, but not the MOST important:
what if you dedicate huge amounts of compute to superhuman hacking?
The gap between the best group of human hackers and the best group of human hackers using giant swarms of cybersecurity-finetuned agents will widen.
This will inevitably drive more and more AI adoption in cyberdefense and cyberoffense.
How much does the best group of human hackers need to "trust" that swarm of 10k superhuman hackers?
I define the Singularity and AI takeover as the point where the answer to this latter question is "a lot".
Malign AI agent substitution
This is so Covid.
Let's make misaligned AIs and have a bunch of people use them to see if they can do the misaligned thing.
So that we can develop a vaccine before it actually escapes.
Make it go viral.
I'd love to play the wargame in Munich with our local LW community.
Do you have a link to the rules?
PS: huge fan, love the AI 2027 website, keep being a force for good
In a world where mechinterp is not 100% reliable, the answer is logically: input/output is what matters.
We won't be able to read the thoughts anyway, so why base our judgment on them?
But see my comment on why survival fitness in cyberspace is the one axis where most of the relevant input/output will be generated.
What it says: irrelevant
How it thinks: irrelevant
It has always been about what it can do in the real world.
If it can generate substantial amounts of money and buy server capacity, or
hack into computer systems,
then we get cyberlife: autonomous, rogue, self-sufficient AI, subject to Darwinian forces on the internet that select for more of those qualities, improving its online fitness, all the way to a full-blown takeover.
What do you mean by corrigibility?
Also, what do you mean by "alignment win"?
*probably.
Maybe it'll start looking for people who are pre-aligned.
Religion is also a useful single word, one that carries the most meaning per bit for a normie. Maybe just enough to make them take it seriously. I believe there is something about it worth taking seriously.
I super agree. I also think the value is in debating the models of intelligence explosion.
Which is why I made my website: ai-2028.com or intexp.xyz