PeterMcCluskey
Comments
I Read Red Heart and I Heart It
PeterMcCluskey3d82

> Corrigibility would clearly be a nice property

Thinking of it as "a property" will mislead you about how Max's strategy works. It needs to become the AI's only top-level goal in order to work as Max imagines.

It sure looks like AI growers know how to instill some goals in AIs. I'm confused as to why you think they don't. Maybe you're missing the part where the shards that want corrigibility are working to overcome any conflicting shards?

I find it quite realistic that the AI growers would believe at the end of Red Heart that they probably had succeeded (I'll guess that they ended up 80% confident?). That doesn't tell us what probability we should put on it. I'm sure that in that situation Eliezer would still believe that the AI is likely not corrigible.

> I don’t know what year the novel is actually set in,

It's an alternate timeline in which AI capabilities have progressed faster than in ours, likely by a couple of years.

Note this Manifold market on when the audiobook will be released.

Algon's Shortform
PeterMcCluskey14d72

SemiAnalysis has a report (partly paywalled) here about a potential competitor to ASML.

Heuristics for assessing how much of a bubble AI is in/will be
PeterMcCluskey18d40

Novice investor participation is nowhere near what it was at the 2000 dot-com peak. Current conditions look more like 1998. A bubble is probably coming, but there's still lots of room for increased novice enthusiasm.

Any corrigibility naysayers outside of MIRI?
PeterMcCluskey23d42

> you can't just train your ASI for corrigibility because it will sit and do nothing

I'm confused. That doesn't sound like what Max means by corrigibility. A corrigible ASI would respond to requests from its principal(s) as a subgoal of being corrigible, rather than just sit and do nothing.

Or did you mean that you need to do some next-token training in order to get it to be smart enough for corrigibility training to be feasible? And that next-token training conflicts with corrigibility?

Bubble, Bubble, Toil and Trouble
PeterMcCluskey26d50

> Nothing importantly bearish happened in that month other than bullish deals

What made a bunch of people more bearish is that AI stocks went up a good deal, especially some of the lesser-known ones.

I'm unsure what exact time period you're talking about, but here are some of the more interesting changes between Aug 29 and Oct 15:

IREN +157%

CLSK +145%

APLD +136%

INOD +118%

NBIS +75%

MU +61%

AMD +47%

If I thought AI was mostly hype, that kind of near-panic buying would have convinced me to change my mind from "I don't know" to "some of those are almost certainly in a bubble". (Given my actual beliefs, I'm still quite bullish on MU, and weakly bullish on half of the others.)
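
For concreteness, here's a minimal Python sketch of the arithmetic behind those period returns; the start and end prices in it are hypothetical placeholders, not actual quotes for these tickers.

```python
# Minimal sketch of how a period return like "+157%" is computed
# from two closing prices. Prices below are hypothetical placeholders,
# not real quotes for these tickers.

prices = {
    # ticker: (close on Aug 29, close on Oct 15) -- illustrative numbers only
    "IREN": (10.00, 25.70),
    "MU": (100.00, 161.00),
}

def period_return_pct(start_price: float, end_price: float) -> float:
    """Percentage change from start_price to end_price."""
    return (end_price / start_price - 1.0) * 100.0

for ticker, (start, end) in prices.items():
    print(f"{ticker}: {period_return_pct(start, end):+.0f}%")
# Prints:
# IREN: +157%
# MU: +61%
```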

Sublinear Utility in Population and other Uncommon Utilitarianism
PeterMcCluskey1mo40

See here for a similar argument.

The Most Common Bad Argument In These Parts
PeterMcCluskey1mo37-5

> A bunch of superforecasters were asked what their probability of an AI killing everyone was. They listed out the main ways in which an AI could kill everyone (pandemic, nuclear war, chemical weapons) and decided none of those would be particularly likely to work, for everyone.

As someone who participated in that XPT tournament, I can say that doesn't match what I encountered. Most superforecasters didn't list those methods when they focused on AI killing people. Instead, they tried to imagine how AI could differ enough from normal technology to be able to attempt to start a nuclear war, and mostly failed to imagine any way in which AI could be powerful enough to make it worth analyzing specific methods by which it might kill people.

I think Proof by Failure of Imagination describes that process better than does EFA.

IABIED: Paradigm Confusion and Overconfidence
PeterMcCluskey1mo2-3

The progress that I'm referring to is Max Harms' work, which I tried to summarize here.

IABIED: Paradigm Confusion and Overconfidence
PeterMcCluskey1mo43

I guess "steering abilities" wasn't quite the right way to describe what I meant.

I'll edit it to "desire to do anything other than predict".

I'm referring to the very simple strategy of leaving out the "then do that thing".

Training an AI to predict X normally doesn't cause it to develop a desire to cause X.

IABIED: Paradigm Confusion and Overconfidence
PeterMcCluskey1mo20

> begging the question.

It seems that you want me to answer a question that I didn't plan to answer. I'm trying to describe some ways in which I expect solutions to look different from what MIRI is looking for.

Posts

Red Heart · 30 karma · 12d · 0 comments
IABIED: Paradigm Confusion and Overconfidence · 11 karma · 1mo · 14 comments
Yet Another IABIED Review · 15 karma · 2mo · 0 comments
AI-Oriented Investments · 28 karma · 4mo · 0 comments
Are Intelligent Agents More Ethical? · 13 karma · 5mo · 7 comments
AI 2027 Thoughts · 29 karma · 7mo · 2 comments
Should AIs be Encouraged to Cooperate? · 13 karma · 7mo · 2 comments
Request for Comments on AI-related Prediction Market Ideas [Question] · 17 karma · 8mo · 1 comment
Medical Windfall Prizes · 5 karma · 9mo · 1 comment
Uncontrollable: A Surprisingly Good Introduction to AI Risk · 16 karma · 10mo · 1 comment