owencb

Comments
Embedded Altruism [slides]
owencb · 1d

Of course I'm into trying to understand things better (and that's a good slice of what I recommend!), but: 

  • You need to make decisions in the interim
  • There is a bunch of detail that won't be captured by whatever your high-level models are (like what the impacts will be of wording an email this way versus that)
  • I think that for complete decisions you'd need a model of the whole future unfolding of civilization, and that's hard enough that we're not going to manage it with "a few years of study"

Not all capabilities will be created equal: focus on strategically superhuman agents
owencb · 3mo

It seems fine to me to have the goalposts moving, but then I think it's important to trace through the implications of that. 

Like, if the goalposts can move, then this seems like perhaps the most obvious way out of the predicament: to keep the goalposts ever ahead of AI capabilities. But when I read your post I get the vibe that you're not imagining this as a possibility?

Not all capabilities will be created equal: focus on strategically superhuman agents
owencb · 3mo

"If we are going to build these agents without 'losing the game', either (a) they must have goals that are compatible with human interests, or (b) we must (increasingly accurately) model and enforce limitations on their capabilities. If there's a day when an AI agent is created without either of these conditions, that's the day I'd consider humanity to have lost."

Something seems funny to me here.

It might be to do with the boundaries of your definition. If human agents are getting empowered by strategically superhuman (in an everyday sense) AI systems (agentic or otherwise), perhaps that raises the bar for what counts as superhuman for the purposes of this post? If so, I think the argument would make sense to me, but it feels a bit funny to have a definition which is such a moving goalpost, and which might never get crossed even as AI gets arbitrarily powerful.

Alternatively, it might be that your definition is kind of an everyday one, but in that case your conclusion seems pretty surprising. Like, it seems easy to me to imagine worlds where there are some agents without either of those conditions, but where they're not better than the empowered humans.

Or perhaps something else is going on. Just trying to voice my confusions. 

I do appreciate the attempt to analyse which kinds of capabilities are actually crucial.

The Choice Transition
owencb · 7mo

It's been a long time since I read those books, but if I'm remembering roughly right: Asimov seems to describe a world where choice is in a finely balanced equilibrium with other forces (implausibly so, I'm inclined to think: if it could manage this level of control at great distances in time, one would think that it could exert more effective control over things at somewhat less distance).

Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
owencb · 9mo

I've now sent emails contacting all of the prize-winners.

AI, centralization, and the One Ring
owencb · 10mo

Actually, on (1): I think that these consequentialist reasons are properly just covered by the later sections. That section is about reasons it's maybe bad to make the One Ring, ~regardless of the later consequences. So it makes sense to emphasise the non-consequentialist reasons.

I think there could still be some consequentialist analogues of those reasons, but they would be more esoteric: maybe something decision-theoretic, or an appeal to how we might want to be treated by future AI systems that gain ascendancy.

AI, centralization, and the One Ring
owencb · 10mo
  1. Yeah. As well as another consequentialist argument, which is just that it will be bad for other people to be dominated. Somehow the arguments feel less natively consequentialist, so it seems easier to hold them in these other frames and then translate them into consequentialist ontology if that's relevant; but it would also be very reasonable to mention them in the footnote.
  2. My first reaction was that I do mention the downsides. But I realise that was a bit buried in the text, and I can see it could be misleading about my overall view. I've now edited the second paragraph of the post to be more explicit about this. I appreciate the pushback.

AI, centralization, and the One Ring
owencb · 10mo

Ha, thanks!

(It was part of the reason. Normally I'd have made the effort to import, but here I felt like maybe it was just slightly funny to post the one-sided thing, which nudged me towards linking rather than posting; and also I thought I'd take the opportunity to see experimentally whether linking seemed to lead to less engagement. But those reasons were not overwhelming, and now that you've put the full text here I don't find myself very tempted to remove it. :) )

Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
owencb · 10mo

The judging process should be complete in the next few days. I expect we'll write to winners at the end of next week, although it's possible that will be delayed. A public announcement of the winners is likely to be a few more weeks away.

A computational complexity argument for many worlds
owencb · 11mo

I don't see why (1) says you should be very early. Isn't the decrease in measure for each individual observer precisely offset by their increasing multitudes?
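
A minimal sketch of the counting argument (a toy model of my own, not something spelled out in the thread): suppose each branching event splits a world into $k$ equal-measure branches. After $n$ branchings each observer carries measure $k^{-n}$, but there are $k^n$ such observers, so the total measure is

$$k^n \cdot k^{-n} = 1$$

at every time, which would leave no anthropic pressure towards being early.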

Posts

  • Embedded Altruism [slides] · 20 karma · 2d · 3 comments
  • The crucible — how I think about the situation with AI · 25 karma · 2mo · 1 comment
  • Disempowerment spirals as a likely mechanism for existential catastrophe · 74 karma · 3mo · 7 comments
  • Knowledge, Reasoning, and Superintelligence · 21 karma · 3mo · 1 comment
  • AI Tools for Existential Security · 22 karma · 4mo · 4 comments
  • The Choice Transition · 50 karma · 7mo · 4 comments
  • A brief history of the automated corporation · 26 karma · 8mo · 1 comment
  • Winners of the Essay competition on the Automation of Wisdom and Philosophy · 40 karma · 8mo · 3 comments
  • AI safety tax dynamics · 22 karma · 8mo · 0 comments
  • Safety tax functions · 31 karma · 8mo · 0 comments
  • owencb's Shortform · 6 karma · 1y · 1 comment