
Changing from training to test data (CTT; I may have made this up) isn't exactly the same as going out of distribution (OOD), but I currently think that change is the proto-version of going OOD.
The evidence about CTT says that bigger models eventually do better:
https://www.lesswrong.com/posts/FRv7ryoqtvSuqBxuT/understanding-deep-double-descent
but someone could probably usefully summarise the newer results on "double descent".
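Here is a minimal sketch of the CTT/OOD distinction (my illustration; the helper, the data, and the model choice are mine, not the linked post's): a tree ensemble trained on one input range scores well on fresh test data from the same distribution, but collapses on inputs shifted outside that range, because trees cannot extrapolate.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def make_data(n, x_mean):
    # Ground truth is y = 3x + small noise; x_mean controls where
    # the inputs are centred (hypothetical toy setup).
    x = rng.normal(loc=x_mean, scale=1.0, size=(n, 1))
    y = 3.0 * x[:, 0] + rng.normal(scale=0.1, size=n)
    return x, y

x_train, y_train = make_data(500, x_mean=0.0)  # training distribution
x_test, y_test = make_data(500, x_mean=0.0)    # fresh test data, same distribution (CTT)
x_ood, y_ood = make_data(500, x_mean=5.0)      # shifted distribution (OOD)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(x_train, y_train)

# The in-distribution test score stays high; the OOD score goes
# strongly negative, because the forest's predictions saturate at
# the edge of the training range.
print("in-distribution test R^2:", model.score(x_test, y_test))
print("out-of-distribution  R^2:", model.score(x_ood, y_ood))
```

A tree ensemble is used deliberately: it cannot extrapolate, so the gap between "new samples from the same distribution" and "samples from a shifted distribution" is visible even with a linear target.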
a particular technique doesn’t immediately solve a problem
I remember a story that got coverage on the state radio in New Zealand years ago. It described how several people can each hold part of the solution to a problem, and progress happens when an accident introduces them to each other. There was a book about it, but I'm failing to find the details.
implement a relatively limited policy
I read this as Libertarian: the hope that there could be a very stiff, strong government that was also small, and did only a subset of the things in the short-term interest of its supporters.
Alignment isn’t like that; it was chosen to be an important problem
Like medicine.
This was specifically commented on in a book whose preface I read as a child. It was called something like "Medicine: from science to magic", and I have not found a clear link back to it.
Further to Matt,
I like this distinction. At the cost of generalising from fiction, in "A Civil Campaign", Lois McMaster Bujold phrased it as: "Reputation is what other people know about you. Honour is what you know about yourself." Quoted here:
https://tvtropes.org/pmwiki/pmwiki.php/WhatYouAreInTheDark/Literature
Further to Kaj and Eric,
a fear of the career consequences of being in the line of fire
This sounds like people who are in the middle of an Immoral Maze. That's probably statistically true, because any corporation large enough to be worth attacking is probably large enough to have three layers of middle management.
Assuming that, acting with 'honour' requires having goals other than power-seeking. According to that sequence, that makes one both untrustworthy in the eyes of the modal middle-manager, who has sacrificed everything else to power-seeking, and professionally doomed.
I don't know whether I'm an optimist or a doomer. I have two very specific responses to different parts of the situation:
Ok. So, if:
then we have procedures (however imperfect) for allocating criminal liability among humans. What does it mean to allocate criminal liability to LLMs?
My proposed answer is: it means restricting or prohibiting the use of that collection of weights. If that collection of weights is particularly bad, then this directly minimises the harm. Given that it cost so much to create them, it also...
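As a concrete sketch of what "restricting the use of that collection of weights" could mean in practice (entirely my illustration; the comment proposes the policy, not this mechanism): a loader that refuses any checkpoint whose hash appears on a sanctions list. The PROHIBITED set, the file handling, and the function names are all hypothetical.

```python
import hashlib
from pathlib import Path

# Hashes of weight files to which liability has been allocated
# (hypothetical placeholder values).
PROHIBITED = {
    "9f2c...example-digest...",
}

def sha256_of(path: Path) -> str:
    # Hash the checkpoint file, so the restriction attaches to this
    # exact collection of weights and nothing else.
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def load_weights(path: Path) -> bytes:
    # Refuse to load a prohibited collection of weights.
    digest = sha256_of(path)
    if digest in PROHIBITED:
        raise PermissionError(f"use of weights {digest[:12]}... is prohibited")
    return path.read_bytes()
```

Hashing the exact bytes is one natural way to make "that collection of weights" legally addressable: the ban survives renaming or copying, though not retraining or fine-tuning, which would produce a new digest.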