owencb

Comments
Embedded Altruism [slides]
owencb · 1d

Of course I'm into trying to understand things better (and that's a good slice of what I recommend!), but: 

  • You need to make decisions in the interim
  • There is a bunch of detail that won't be captured by whatever your high-level models are (like what the impacts will be of wording an email this way versus that)
  • I think that for complete decisions you'd need a model of the whole future unfolding of civilization, and that's hard enough that we're not going to manage it with "a few years of study"

Not all capabilities will be created equal: focus on strategically superhuman agents
owencb · 3mo

It seems fine to me to have the goalposts moving, but then I think it's important to trace through the implications of that. 

Like, if the goalposts can move, then this seems like perhaps the most obvious way out of the predicament: to keep the goalposts ever ahead of AI capabilities. But when I read your post I get the vibe that you're not imagining this as a possibility?

Not all capabilities will be created equal: focus on strategically superhuman agents
owencb · 3mo

"If we are going to build these agents without 'losing the game', either (a) they must have goals that are compatible with human interests, or (b) we must (increasingly accurately) model and enforce limitations on their capabilities. If there's a day when an AI agent is created without either of these conditions, that's the day I'd consider humanity to have lost."

Something seems funny to me here.

It might be to do with the boundaries of your definition. If human agents are getting empowered by strategically superhuman (in an everyday sense) AI systems (agentic or otherwise), perhaps that raises the bar for what counts as superhuman for the purposes of this post? If so, I think the argument would make sense to me, but it feels a bit funny to have a definition which is such a moving goalpost, and which might never get crossed even as AI gets arbitrarily powerful.

Alternatively, it might be that your definition is kind of an everyday one, but in that case your conclusion seems pretty surprising. Like, it seems easy to me to imagine worlds where there are some agents without either of those conditions, but where they're not better than the empowered humans.

Or perhaps something else is going on. Just trying to voice my confusions. 

I do appreciate the attempt to analyse which kinds of capabilities are actually crucial.

The Choice Transition
owencb · 7mo

It's been a long time since I read those books, but if I'm remembering roughly right: Asimov seems to describe a world where choice is in a finely balanced equilibrium with other forces (implausibly so, I'm inclined to think: if it could manage this level of control at great distances in time, one would think that it could exert more effective control over things at somewhat less distance).

Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
owencb · 9mo

I've now sent emails contacting all of the prize-winners.

AI, centralization, and the One Ring
owencb · 10mo

Actually, on (1): I think that these consequentialist reasons are properly just covered by the later sections. That section is about reasons it's maybe bad to make the One Ring, ~regardless of the later consequences. So it makes sense to emphasise the non-consequentialist reasons.

I think there could still be some consequentialist analogues of those reasons, but they would be more esoteric: maybe something decision-theoretic, or an appeal to how we might want to be treated by future AI systems that gain ascendancy.

AI, centralization, and the One Ring
owencb · 10mo
  1. Yeah. As well as another consequentialist argument, which is just that it will be bad for other people to be dominated. Somehow the arguments feel less natively consequentialist, so it seems easier to hold them in these other frames and then translate them into consequentialist ontology if that's relevant; but it would also be very reasonable to mention them in the footnote.
  2. My first reaction was that I do mention the downsides. But I realise that was a bit buried in the text, and I can see it could be misleading about my overall view. I've now edited the second paragraph of the post to be more explicit about this. I appreciate the pushback.

AI, centralization, and the One Ring
owencb · 10mo

Ha, thanks!

(It was part of the reason. Normally I'd have made the effort to import, but here I felt like maybe it was just slightly funny to post the one-sided thing, which nudged me towards linking rather than posting; and also I thought I'd take the opportunity to see experimentally whether linking seemed to lead to less engagement. But those reasons were not overwhelming, and now that you've put the full text here I don't find myself very tempted to remove it. :) )

Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
owencb · 10mo

The judging process should be complete in the next few days. I expect we'll write to winners at the end of next week, although it's possible that will be delayed. A public announcement of the winners is likely to be a few more weeks away.

A computational complexity argument for many worlds
owencb · 11mo

I don't see why (1) says you should be very early. Isn't the decrease in measure for each individual observer precisely offset by their increasing multitudes?
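
A minimal sketch of the counting argument (a toy model of my own, not something spelled out in the thread): suppose each branching event splits a world into $k$ equal-measure branches. After $n$ branchings each observer carries measure $k^{-n}$, but there are $k^n$ such observers, so the total measure is

$$k^n \cdot k^{-n} = 1$$

at every time, which would leave no anthropic pressure towards being early.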

Posts

  • Embedded Altruism [slides] · 20 karma · 2d · 3 comments
  • The crucible — how I think about the situation with AI · 25 karma · 2mo · 1 comment
  • Disempowerment spirals as a likely mechanism for existential catastrophe · 74 karma · 3mo · 7 comments
  • Knowledge, Reasoning, and Superintelligence · 21 karma · 3mo · 1 comment
  • AI Tools for Existential Security · 22 karma · 4mo · 4 comments
  • The Choice Transition · 50 karma · 7mo · 4 comments
  • A brief history of the automated corporation · 26 karma · 8mo · 1 comment
  • Winners of the Essay competition on the Automation of Wisdom and Philosophy · 40 karma · 8mo · 3 comments
  • AI safety tax dynamics · 22 karma · 8mo · 0 comments
  • Safety tax functions · 31 karma · 8mo · 0 comments
  • owencb's Shortform · 6 karma · 1y · 1 comment