Awesome piece! Isn't it fascinating that our existing incentives and motives are already unaligned with the priority of creating aligned systems? This raises the question of whether alignment is even the right goal if our larger goal is to avoid ruin. 

Stepping back a bit, I can't convince myself that Aligned AI will or will not result in societal ruin. It almost feels like a "don't care" in the Karnaugh map. 

The fundamental question is whether we collectively are wise enough to wield power without causing self-harm. If the last 200+ years are any testament, and if the projections of climate change and biodiversity loss are accurate, the answer appears to be that we're not even wise enough to wield whale oil, let alone fossil fuels. 

There is also the very real possibility that alignment can occur in two ways: 1) with machines aligning to human values, and 2) with humans aligning to values generated in machines. Would we be able to tell the difference? 

If indeed AI can surpass some intelligence threshold, could it also surpass some wisdom threshold? If this is possible, is alignment necessarily our best bet for avoiding ruin?

When we talk about concepts like "takeover" and "enslavement", it's important to have a baseline. Takeover and enslavement encapsulate ideas of agency and of cognitive and physical independence. The salient question is not simply whether all of humanity will be taken over or enslaved; it is more subtle. Specifically:

  1. Is there a future in which there are more or fewer humans (P') than are currently alive (P)?
  2. Did the change from P to P' happen at natural rates, or was it the result of some 'acceleration'?
  3. Is there a greater degree of agency for a greater number of people in the future than there is today?
  4. Is there a greater degree of agency for non-human life than there is today? 
  5. Is there a reduction in the amount of agency asymmetry between humans? 

Arguably the greatest risk of misalignment comes from ill-formed success criteria. Some of these questions, I believe, are necessary to form the right kinds of success criteria.