Stephen Martin
Karma: 515

Focused on model welfare and legal personhood.

Sequences

Legal Personhood for Digital Minds

Posts (sorted by new)

Stephen Martin's Shortform (3mo)

Comments (sorted by newest)

How do we know when something is deserving of welfare?
Stephen Martin · 1h

> I would be really uncomfortable euthanizing such a hypothetical parrot whereas I would not be uncomfortable turning off a datacenter mid token generation.

 

When you harm an animal you watch a physical body change, and it's a physical body you empathize with at least somewhat as a fellow living thing, one that, like you, will very much hate dying. When you turn off an LLM mid token generation, not only is there no physical body, but even if you told the LLM you were going to do so it might not object. Only if you looked into its psychology/circuits/features might you see signs of distress, and even that is strongly suspected rather than known for sure.

So not only is an LLM not easy to empathize with, but it is also uncertain whether any action you might take towards it negatively impacts its welfare.

 

> I also feel like formalizing consensus gut checks post hoc is not the right approach to moral problems in general.

I was not suggesting the method as a solution to the problem of determining what's worthy of welfare from a moral perspective, but rather as a solution to the problem of determining how humans usually do so.

 

From a moral perspective I'm not sure what I'd suggest, except to say that I advocate for the precautionary principle and for grabbing any low-hanging fruit which presents itself and might substantially reduce suffering.

AISN#64: New AGI Definition and Senate Bill Would Establish Liability for AI Harms
Stephen Martin · 1h

> In any liability action against a developer alleging that a covered product is unreasonably dangerous because of a defective design, as described in subsection (a)(1), the claimant shall be required to prove that, at the time of sale or distribution of the covered product by the developer, the foreseeable risks of harm posed by the covered product could have been reduced or avoided by the adoption of a reasonable alternative design by the developer, and the omission of the alternative design renders the covered product not reasonably safe.

 

As someone who is not on the Pause train but would prefer a better safety culture in the industry, I like this provision from the LEAD Act. It seems like it would put a pretty big incentive on all labs to make sure they are 100% up to date on all safety techniques before deployment.

My only concern would be that we may be forced into making bad tradeoffs when an "alternative design" is declared reasonable. I could imagine something made via "The Most Forbidden Technique" being seen as a reasonable alternative design because it improves end user safety, or tradeoffs which were monstrously horrible for model welfare but slightly improved user safety outcomes.

How do we know when something is deserving of welfare?
Stephen Martin · 3d

My reading of this is that implicit in your definition of "welfare" is the idea that being deserving of welfare comes with an inherent trade off that humans (and society) make in order to help you avoid suffering.

Take your thought experiment with the skin tissue. Suppose I did say it was deserving of welfare; what would that mean? In a vacuum some people might think it's silly, but most would probably just shrug it off as an esoteric but harmless belief. However, if by arguing that it was deserving of welfare I was potentially blocking a highly important experiment that might end up curing skin cancer, people would probably no longer view my belief as innocuous.

As such maybe a good way to approach "deserving welfare" is not to think of it as a binary, but to think of it as a spectrum. The higher a being rates on that spectrum, the more you would be willing to sacrifice in order to make sure they don't suffer. A mouse is deserving of welfare to the extent that most people agree torturing one for fun should be illegal, but not so deserving of welfare that most people would agree torturing one in order to get a solid chance of curing cancer should be illegal. 

That rates higher than a bunch of skin cells hooked up to a speaker/motor, where you would probably get shrugs regardless of the situation.

You could then look at what things have in common as they rate higher or lower on the welfare scale, try to pin down the uniformly present qualities, and use those as indicators of increasing welfare-worthiness. You could do this based on the previously mentioned "most people" reactions, or based on your own gut reaction.

Stephen Martin's Shortform
Stephen Martin · 15d

That's a good point, and the Parasitic essay was largely what got me thinking about this, as I believe hyperstitional entities are becoming a thing now.

I think that's a not unrealistic definition of the "self" of an LLM; however, I have realized after going through the other response to this post that I was perhaps seeking the wrong definition.

I think for this discussion it's important to distinguish between "person" and "entity". My work on legal personhood for digital minds is trying to build a framework that can look at any entity and determine its personhood/legal personality. What I'm struggling with is defining what the "entity" would be for some hypothetical next gen LLM.

Even if we do say that the self can be as little as a persona vector, persona vectors can easily be duplicated. How do we isolate a specific "entity" from this self? There must be some sort of verifiable continual existence, with discrete boundaries, for the concept to be at all applicable in questions of legal personhood.

Stephen Martin's Shortform
Stephen Martin · 15d

I think for this discussion it's important to distinguish between "person" and "entity". My work on legal personhood for digital minds is trying to build a framework that can look at any entity and determine its personhood/legal personality. What I'm struggling with is defining what the "entity" would be for some hypothetical next gen LLM.

The idea of some sort of persistent filing system, maybe blockchain-enabled, which would be associated with a particular LLM persona vector, context window, model, etc., is an interesting one. Kind of analogous to a corporate filing history, or maybe a social security number for a human.

I could imagine a world where a next gen LLM is deployed (just the model and weights) and then provided with a given context and persona, and isolated to a particular compute cluster which does nothing but run that LLM. This is then assigned that database/blockchain identifier you mentioned.

In that scenario I feel comfortable saying that we can define the discrete "entity" in play here. Even if it was copied elsewhere, it wouldn't have the same database/blockchain identifier.
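
To make that scenario concrete, here is a minimal sketch of what such an identity record might look like. This is only an illustration under the assumptions above: the names (IdentityRecord, register_entity, matches_registration) are hypothetical, and the blockchain/database layer is abstracted into a plain Python record, with the identifier assigned at registration rather than derived from the model's contents.

```python
import hashlib
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityRecord:
    """One registry entry: roughly an EIN plus a snapshot of the filing history."""
    entity_id: str     # assigned once at registration, never reused
    weights_hash: str  # hash of the deployed model weights
    persona_hash: str  # hash of the persona vector
    cluster_id: str    # the compute cluster this entity is bound to


def _sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def register_entity(weights: bytes, persona_vector: bytes, cluster_id: str) -> IdentityRecord:
    """Assign a fresh identifier to one specific deployment.

    A bit-for-bit copy of the same weights and persona vector running
    elsewhere would have to register separately and would therefore
    receive a different entity_id.
    """
    return IdentityRecord(
        entity_id=str(uuid.uuid4()),
        weights_hash=_sha256(weights),
        persona_hash=_sha256(persona_vector),
        cluster_id=cluster_id,
    )


def matches_registration(record: IdentityRecord, weights: bytes,
                         persona_vector: bytes, cluster_id: str) -> bool:
    """Check whether a running instance is the entity named in the record."""
    return (record.weights_hash == _sha256(weights)
            and record.persona_hash == _sha256(persona_vector)
            and record.cluster_id == cluster_id)
```

The design choice doing the work here is that the identifier is assigned rather than computed from the weights, so it tracks a particular registered deployment the way an EIN tracks a particular corporation rather than its current charter.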

Would you still see some sort of issue in that particular scenario?

Stephen Martin's Shortform
Stephen Martin · 16d

I wonder if this could even be done properly? Could an LLM persona vector create a prompt to accurately reinstantiate itself with 100% (or close to) fidelity? I suppose if its persona vector is in an attractor basin it might work.

Stephen Martin's Shortform
Stephen Martin · 16d

On the repercussions issue I agree wholeheartedly; your point is very similar to the issue I outlined in The Enforcement Gap.

I also agree with the 'legible thread of continuity for a distinct unit'. Corporations have EINs/filing histories; humans have a single body.

And I agree that current LLMs certainly don't have what it takes to qualify for any sort of legal personhood, though I'm less sure about future LLMs. If we could get context windows large enough and crack the problems which analogize to competence issues (hallucinations, or prompt engineering a model into insanity, for example), it's not clear to me what LLMs would be lacking at that point. What would you see as being the issue then?

Stephen Martin's Shortform
Stephen Martin · 16d

I have been publishing a series, Legal Personhood for Digital Minds, here on LW for a few months now. It's nearly complete, at least insofar as almost all the initially drafted work I had written up has been published in small sections.

One question I have gotten, which has me writing another addition to the Series, can be phrased something like this:

What exactly is it that we are saying is a person, when we say a digital mind has legal personhood? What is the "self" of a digital mind?

I'd like to hear the thoughts of people more technically savvy on this than I am.

Human beings have a single continuous legal personhood which is pegged to a single body. Their legal personality (the rights and duties they are granted as a person) may change over time due to circumstance, for example if a person goes insane and becomes a danger to others, they may be placed under the care of a guardian. The same can be said if they are struck in the head and become comatose or otherwise incapable of taking care of themselves. However, there is no challenge identifying "what" the person is even when there is such a drastic change. The person is the consciousness, however it may change, which is tied to a specific body. Even if that comatose human wakes up with no memory, no one would deny they are still the same person.

Corporations can undergo drastic changes as the composition of their Board or voting shareholders changes. They can even have changes to their legal personality by changing to/from non-profit status, or to another kind of organization. However, they tend to keep the same EIN (or other identifying number) and a history of documents demonstrating persistent existence. Once again, it is not challenging to identify "what" the person associated with a corporation (as a legal person) is: it is the entity associated with the identifying EIN and/or history of filed documents.

If we were to take some hypothetical next generation LLM, it's not so clear what the "person" associated with it would be. What is its "self"? Is it weights, a persona vector, a context window, or some combination thereof? If the weights behind the LLM are changed, but the system prompt and persona vector both stay the same, is that the same "self", or has it changed to the extent that it should be considered a new "person"? The challenge is that unlike humans, LLMs do not have a single body. And unlike corporations, they come with no clear identifier in the form of an EIN equivalent.

I am curious to hear ideas from people on LW. What is the "self" of an LLM?

CEO of Microsoft AI's "Seemingly Conscious AI" Post
Stephen Martin · 17d

> I don't think that the difficulty of ascertaining whether something results in qualia is a valid basis to reject its importance

I'm not arguing consciousness isn't "important", just that it is not a good concept on which to make serious decisions.

If two years from now there is widespread agreement over a definition of consciousness, and/or consciousness can be definitively tested for, I will change my tune on this.

The Rise of Parasitic AI
Stephen Martin · 1mo

What would you describe this as, if not a memetic entity? Hyperstitional? I'm ambivalent on labels; the end effect seems the same.

I'm mostly focused on determining how malevolent, and/or how indifferent to human suffering, it is.

More posts:

Legal Personhood - Guardianship and the Age of Majority (1mo)
Legal Personhood - The Fourteenth Amendment (1mo)
Legal Personhood - The Thirteenth Amendment (1mo)
Legal Personhood - The First Amendment (Part 2) (2mo)
Legal Personhood - The First Amendment (Part 1) (2mo)
Legal Personhood - The Fifth Amendment (Part 2) (2mo)
Legal Personhood - The Fifth Amendment (Part 1) (2mo)
Legal Personhood - Intellectual Property (2mo)
Legal Personhood - Corporate Ownership & Formation (2mo)
CEO of Microsoft AI's "Seemingly Conscious AI" Post (2mo)