William D'Alessandro

William D'Alessandro — LessWrong

Another academic philosopher, directed here by @Simon Goldstein. Hello Wei!

It's not common to switch entirely to metaphilosophy, but I think lots of us get more interested in the foundations and methodology of at least our chosen subfields as we gain experience, see where progress is(n't) being made, start noticing deep disagreements about the quality of different kinds of work, and so on. It seems fair to describe this as awakening to a need for better tools and a greater understanding of methods. I recently wrote a paper about the methodology of one of my research areas, philosophy of mathematical practice, for pretty much these reasons.
Current LLMs are pretty awful at discussing the

Is Deontological AI Safe? [Feedback Draft]

Glad to have this flagged here, thanks. As I've said to @Chipmonk privately, I think this sort of boundaries-based deontology shares lots of DNA with the libertarian deontology tradition, which I gestured at in the last footnote. (See https://plato.stanford.edu/entries/ethics-deontological/#PatCenDeoThe for an overview.) Philosophers have been discussing this stuff at least since Nozick in the 1970s, so there's lots of sophisticated material to draw on -- I'd encourage boundaries/membranes fans to look at this literature before trying to reinvent everything from scratch.

The SEP article on republicanism also has some nice discussion of conceptual questions about non-interference and non-domination (https://plato.stanford.edu/entries/republicanism), which I think any approach along these lines will have to grapple with.

@Andrew_Critch and @davidad, I'd be interested in hearing more about your respective boundaritarian versions of deontology, especially with respect to AI safety applications!

Safe AI and moral AI

William D'Alessandro

[Note: This post is an excerpt from a longer paper, written during the first half of the Philosophy Fellowship at the Center for AI Safety. This post is something of a companion piece to my deontological AI post; both were originally written as parts of a single paper. (There's a small amount of overlap between the two.)]

1. Introduction

Two goals for the future development of AI stand out as desirable:

First, advanced AI should behave morally, in the sense that its decisions are governed by appropriately chosen ethical norms.
Second, advanced AI should behave safely, in the sense that its decisions shouldn’t unduly harm or endanger humans.

These two goals are often viewed as closely related...

Replying to Is Deontological AI Safe? [Feedback Draft]

William D'Alessandro, 3y

A little clunky, but not bad! It's a good representation of the overall structure if a little fuzzy on certain details. Thanks for trying this out. I should have included a summary at the start -- maybe I can adapt this one?

Replying to Is Deontological AI Safe? [Feedback Draft]

William D'Alessandro, 3y

Lots of good stuff here, thanks. I think most of this is right.

Agreed about powerful AI being prone to unpredictable rules-lawyering behavior. I touch on this a little in the post, but I think it's really important that it's not just the statements of the rules that determine how a deontological agent acts, but also how the relevant (moral and non-moral) concepts are operationalized, how different shapes and sizes of rule violation are weighted against each other, how risk and probability are taken into account, and so on. With all those parameters in play, we should have a high prior on getting weird and unforeseen behavior.
Also agreed that you can mitigate many

William D'Alessandro, 3y

Is Deontological AI Safe? [Feedback Draft]

Excellent, thanks! I was pretty confident that some other iterations of something like these ideas must be out there. Will read and incorporate this (and get back to you in a couple days).

Is Deontological AI Safe? [Feedback Draft]

Dan H, William D'Alessandro

[Note: This post is an excerpt from a longer paper, written during the first half of the Philosophy Fellowship at the Center for AI Safety. I (William D'Alessandro) am a Postdoctoral Fellow at the Munich Center for Mathematical Philosophy. Along with the other Philosophy Fellowship midterm projects, this draft is posted here for feedback.

The full version of the paper includes a discussion of the conceptual relationship between safety and moral alignment, and an argument that we should choose a reliably safe powerful AGI over one that's (apparently) successfully morally aligned. I've omitted this material for length but can share it on request.

The deontology literature is big, and lots of angles here could...