dawnstrata

Comments
Underdog bias rules everything around me
dawnstrata · 1mo · 10

I would disagree fairly strongly: "lobbyists are absolutely dependent on democratic institutions to leverage their wealth into political power, while 50,000 angry people with pitchforks are not" 

They are, I think. If people are angry that democracy is ignoring them, their pitchforks will likely not manage to enact the complicated legislative change needed to fix the problem, as you point out. If we care about the power to actually change the things people want changed, that power is vested almost entirely in the hands of the elite, not in pitchforks. Pitchforks could maybe scare elites into acting, but more likely they just generate chaos, because pitchforks are not the tool for the job. The tools for the job are held by the elites, and they refuse to use them accordingly.

I'm living through this day by day here in Britain. People protest all over the country every day, and the government, despite knowing which positions have majority support, continually does the opposite and uses every mechanism available to delay or obfuscate meaningful change.

Emergent morality in AI weakens the Orthogonality Thesis
dawnstrata · 1mo · 10

I would love to know why the people downvoting disagree :)

dawnstrata's Shortform
dawnstrata · 1mo · 30

Logically, I agree. Intuitively, I suspect that it just won't happen. But intuition about such alien things should not be a guide, so I fully support some attempt to slow down the takeoff.

dawnstrata's Shortform
dawnstrata · 1mo · 10

Thanks for this!

TBH, I am struggling with the idea that an AI intent on maximising a thing doesn't have that thing as a goal. Whether or not the goal was intended seems irrelevant to whether or not the goal exists in the thought experiment.

"Goal stability is almost certainly attained in some sense given sufficient competence"

I am really not sure about this, actually. Goal flexibility is a universal feature of successful thinking organisms. I would expect natural selection to kick in at least over sufficient scales (light delay making co-ordination progressively harder at galactic distances), causing drift. But even on small scales, if an AI has, say, 1000 competing goals, I would find it surprising if those goals were totally fixed in practice, even for a superintelligence. Any number of things could change over time, such that locking yourself into fixed goals could be seen as a long-term risk to optimisation for any goal.

"Alignment is not just absence of value drift, it's also setting the right target, which is a very confused endeavor because there is currently no legible way of saying what that should be for humanity" - totally agree with that! 

"AIs themselves might realize that (even more robustly than humans do), ending up leaning in favor of slowing down AI progress until they know what to do about that" - god I hope so haha

dawnstrata's Shortform
dawnstrata · 1mo · 10

I like the point here about how stability of goals might be an instrumentally convergent feature of superintelligence; it's an interesting idea.

On the other hand, intuitive human reasoning would suggest that this is overly inflexible if you ever ask yourself 'could I ever come up with a better goal than this goal?'. What 'better' would mean for a superintelligence seems hard to define, but it also seems hard to imagine that it would never ask the question.

Separately, your opening statements seem to be at least nearly synonymous to me:

"First off the paperclip maximizer isn't about how easy it is to give a hypothetical super intelligent a goal that you might regret later and not be able to change.

It is about the fact that almsot every easily specified goal you can give an AI would result in misalignment"

every easily specified goal you can give an AI would result in misalignment ~ = give a hypothetical super intelligence a goal that you might regret later (i.e., misalignment)

dawnstrata's Shortform
dawnstrata · 1mo · 2-2

The worry that AI will have overly fixed goals (paperclip maximiser) seems to contradict the erstwhile mainline doom scenario from AI (misalignment). If AI is easy to lock onto a specific path (paperclips), then it follows that locking it into alignment is also easy - provided you know what alignment looks like (which could be very hard). On the other hand, a more realistic scenario would seem to be that keeping fixed goals for AI is actually hard, and that the resulting drift is where the misalignment risk really comes in big time?

dawnstrata's Shortform
dawnstrata · 1mo · 30

I can agree that qualitatively there is a lot left to do. Quantitatively, though, I am still not quite seeing the smoking gun that human-level AI will be able to smash through 15 OOMs like this. But I'm happy to change my mind. I'll check out the Anthropic link! Cheers.

dawnstrata's Shortform
dawnstrata · 1mo · 20

I don't really fully understand the research speed-up concept in intelligence explosion scenarios, e.g., https://situational-awareness.ai/from-agi-to-superintelligence/

This concept seems fundamental to the entire recursive self-improvement idea, but what if it turns out that you can't just do:

Faster research x many agents = loads of stuff

What if you quickly burn through everything that an agent can really tell you without doing new research in the lab? You'd then hit a brick wall of progress where throwing 1000000000 agents at 5x human speed amounts to little more than what you get out of 1 agent (being hyperbolic lol).

Presumably this is just me as a non-computer-scientist missing something big and implicit in how AI research is expected to proceed? But ultimately I guess this Q boils down to:

Is there actually 15 orders of magnitude of algorithmic progress to make (at all), and/or can that truly be made without doing something complementary in the real world to get new data / design new types of computers, and so on?
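
To make the brick-wall worry above concrete, here is a minimal toy sketch (my own illustration, not anything from the linked essay): assume some fraction of research progress is serial, lab-bound work that extra or faster agents cannot parallelise, in the spirit of Amdahl's law. The function name, the 20% serial fraction, and the 5x agent speed are all illustrative assumptions, not claims about real AI research.

```python
# Toy model (illustrative assumption, not a claim about real AI research):
# overall research speed-up when a fraction of the work is gated on
# real-world experiments that more/faster agents cannot parallelise,
# in the spirit of Amdahl's law.

def research_speedup(n_agents: int, agent_speed: float, serial_fraction: float) -> float:
    """Speed-up relative to one human-speed researcher.

    serial_fraction: share of progress bottlenecked on the lab / real world,
    which gains nothing from adding agents (an assumption of this sketch).
    """
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / (n_agents * agent_speed))

if __name__ == "__main__":
    for n in (1, 1_000, 1_000_000_000):
        print(f"{n} agents: {research_speedup(n, agent_speed=5.0, serial_fraction=0.2):.2f}x")
    # With a 20% lab-bound bottleneck, a billion 5x agents cap out near 5x
    # overall, not billions of x: the "brick wall" described above.
```

Under that assumption, the interesting empirical question is exactly the one asked above: how small the serial, real-world fraction of AI research actually is.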

Rare AI and the Fermi Paradox
dawnstrata · 1mo · 10

Haha, I have no idea! I agree the possibility space is huge. All I do know is that we don't see any evidence of alien AIs around us, so they are a poor explanation for a great filter on other alien races (unless they kill those races and then for some reason kill themselves too, or decide to be non-expansionist every single time).

Underdog bias rules everything around me
dawnstrata · 1mo · 148

Isn't there a bit of a false equivalence tucked into the logic here? Two sides could be equally scared of one another and both feel like underdogs, but that says nothing about who is correct to think that way. Sometimes people just are the underdog. People unable to use democracy to enact change versus elites who consider them dangerous is a good example. The masses in that case are definitely the underdog, as they threaten the status quo of every major power centre (often state, corporations, politicians, and elite institutions all at once). In many European countries, certainly, it is unclear that the masses can do very much to influence policy at all right now. They feel like underdogs because they are. I am sure the elites also feel that they are underdogs... they're just wrong.

Posts

-1 · Emergent morality in AI weakens the Orthogonality Thesis · 1mo · 3
1 · dawnstrata's Shortform · 1mo · 12
11 · Rare AI and the Fermi Paradox · 2mo · 6