rvnnt - LessWrong

Upvoted and disagreed. ^[1]

One thing in particular that stands out to me: The whole framing seems useless unless Premise 1 is modified to include a condition like

[...] we can select a curriculum and reinforcement signal which [...] and which makes the model highly "useful/capable".

Otherwise, Premise 1 is trivially true: We could (e.g.) set all the model's weights to 0.0; thereby guaranteeing the non-entrainment of any ("bad") circuits.

I'm curious: what do you think would be a good (...useful?) operationalization of "useful/capable"?

Another issue: K and epsilon might need to be unrealistically small: Once the model starts modifying itself (or constructing successor models) (and possibly earlier), a single strategically-placed sign-flip in the model's outputs might cause catastrophe. ^[2]

I think writing one's thoughts/intuitions out like this is valuable --- for sharing frames/ideas, getting feedback, etc. Thus: thanks for writing it up. Separately, I think the presented frame/case is probably confused, and almost useless (at best). ↩︎
Although that might require the control structures (be they Shards or a utility function or w/e) of the model to be highly "localized/concentrated" in some sense. (OTOH, that seems likely to at least eventually be the case?) ↩︎

Forecasting: the way I think about it

rvnnt4mo10

In Fig 1, is the vertical axis P(world) ?

AI Clarity: An Initial Research Agenda

rvnnt4mo10

Possibly a nitpick, but:

The development and deployment of AGI, or similarly advanced systems, could constitute a transformation rivaling those of the agricultural and industrial revolutions.

seems like a very strong understatement. Maybe replace "rivaling" with e.g. "(vastly) exceeding"?

AI #56: Blackwell That Ends Well

rvnnt5mo50

Referring to the quote-picture from the Nvidia GTC keynote talk: I searched the talk's transcript, and could not find anything like the quote.

Could someone point out time-stamps of where Huang says (or implies) anything like the quote? Or is the quote entirely made up?

7. Evolution and Ethics

rvnnt6mo10

That clarifies a bunch of thing. Thanks!

7. Evolution and Ethics

rvnnt7mo21

I'm not sure I understand what the post's central claim/conclusion is. I'm curious to understand it better. To focus on the Summary:

So overall, evolution is the source of ethics,

Do you mean: Evolution is the process that produced humans, and strongly influenced humans' ethics? Or are you claiming that (humans') evolution-induced ethics are what any reasonable agent ought to adhere to? Or something else?

and sapient evolved agents inherently have a dramatically different ethical status than any well-designed created agents [...]

...according to some hypothetical evolved agents' ethical framework, under the assumption that those evolved agents managed to construct the created agents in the right ways (to not want moral patienthood etc.)? Or was the quoted sentence making some stronger claim?

evolution and evolved beings having a special role in Ethics is not just entirely justified, but inevitable

Is that sentence saying that

evolution and evolved beings are of special importance in any theory of ethics (what ethics are, how they arise, etc.), due to Evolution being one of the primary processes that produce agents with moral/ethical preferences ^[1]

or is it saying something like

evolution and evolved beings ought to have a special role; or we ought to regard the preferences of evolved beings as the True Morality?

I roughly agree with the first version; I strongly disagree with the second: I agree that {what oughts humans have} is (partially) explained by Evolutionary theory. I don't see how that crosses the is-ought gap. If you're saying that that somehow does cross the is-ought gap, could you explain why/how?

I.e., similar to how one might say "amino acids having a special role in Biochemistry is not just entirely justified, but inevitable"? ↩︎

Conversation Visualizer

rvnnt8mo30

I wonder how much work it'd take to implement a system that incrementally generates a graph of the entire conversation. (Vertices would be sub-topics, represented as e.g. a thumbnail image + a short text summary.) Would require the GPT to be able to (i.a.) understand the logical content of the discussion, and detect when a topic is revisited, etc. Could be useful for improving clarity/productivity of conversations.

Vote on Interesting Disagreements

rvnnt10mo10

One of the main questions on which I'd like to understand others' views is something like: Conditional on sentient/conscious humans^[1] continuing to exist in an x-risk scenario^[2], with what probability do you think they will be in an inescapable dystopia^[3]?

(My own current guess is that dystopia is very likely.)

or non-human minds, other than the machines/Minds that are in control ↩︎
as defined by Bostrom, i.e. "the permanent and drastic destruction of [humanity's] potential for desirable future development" ↩︎
Versus e.g. just limited to a small disempowered population, but living in pleasant conditions? Or a large population living in unpleasant conditions, but where everyone at least has the option of suicide? ↩︎

Vote on Interesting Disagreements

rvnnt10mo30

That makes sense; but:

so far outside the realm of human reckoning that I'm not sure it's reasonable to call them dystopian.

setting aside the question of what to call such scenarios, with what probability do you think the humans^[1] in those scenarios would (strongly) prefer to not exist?

or non-human minds, other than the machines/Minds that are in control ↩︎

Vote on Interesting Disagreements

rvnnt10mo20

non-extinction AI x-risk scenarios are unlikely

Many people disagreed with that. So, apparently many people believe that inescapable dystopias are not-unlikely? (If you're one of the people who disagreed with the quote, I'm curious to hear your thoughts on this.)

LESSWRONG
LW

Posts

Wiki Contributions

Comments