LESSWRONG

[ Question ]

What other problems would a successful AI safety algorithm solve?

by DirectedEvolution · 13th Jun 2021 · 1 min read · AI · Frontpage

Corporations and governments are in some ways like superintelligences, and in other ways not. Much of economics, political science, and sociology seems to tackle the problem of why institutions fail to align with human interests. Yet the difference in architecture and capabilities between brains and computer programs suggests to me that aligning collective bio-superintelligence is a quite different problem from aligning AI superintelligence. It might be that because we can see and control the building blocks of AI, and have no hard ethical limits on shaping it as we please, aligning AI with human values is an easier problem to solve than aligning human institutions with human values. A technical solution to AI safety might even be a necessary precursor to understanding the brain and human relationships well enough to provide a technical solution to aligning human institutions.

If we had a solid technical solution to AI safety, would it also give us technical solutions to the problem of human collective organization and governance? Would it give us solutions to other age-old problems? If so, is that reason to doubt that a technical solution to AI safety is feasible? If not, is that reason for some optimism? Finally, is our lack of a technical account of what human value is a hindrance to developing safe AI?

1 Answer, sorted by top scoring

MSRayne · 14th Jun 2021

I think you've got it the wrong way around. In fact, that's probably my biggest issue with the whole field of alignment. I think that it's probably easier to solve the problem of human institution alignment than AI alignment, and that to do so would help solve the AI alignment problem as well.

The reason I say this is that individual humans are already aligned with human values, and it should be possible, by some means, to preserve this fact even while scaling up to entire organizations. There is no a priori reason that doing so would be more difficult than literally reverse engineering the entire human mind from scratch! That is, it doesn't actually matter what human values are, provided you believe humans can actually be trusted to have them; all that matters is that you can be certain those values are preserved by the individual-to-organization transition. So my position can be summarized as: designing an organization that preserves the human values already present is easier than figuring out what those values are to begin with and injecting them into something with no humanity already in it.
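
To make "preserved by the individual-to-organization transition" concrete, here is a minimal toy sketch. It is entirely illustrative, nothing in this thread specifies a model: members are utility functions over a small outcome set, an organization is an aggregation rule, and "value preservation" is cashed out as one weak candidate property (never choosing an outcome that every member disprefers to some alternative). The outcomes, utilities, and both rules are made up for the example.

```python
# Toy model of the individual-to-organization transition (all names,
# outcomes, utilities, and rules here are illustrative assumptions).

OUTCOMES = ["safe_product", "risky_product", "shutdown"]

# Each member's values, expressed as a utility over outcomes.
MEMBERS = {
    "alice": {"safe_product": 1.0, "risky_product": 0.2, "shutdown": 0.5},
    "bob":   {"safe_product": 0.9, "risky_product": 0.4, "shutdown": 0.6},
    "carol": {"safe_product": 0.8, "risky_product": 0.1, "shutdown": 0.7},
}

def majority_vote(members):
    """Aggregate by counting each member's top-ranked outcome."""
    votes = {}
    for utils in members.values():
        top = max(utils, key=utils.get)
        votes[top] = votes.get(top, 0) + 1
    return max(votes, key=votes.get)

def profit_proxy(members):
    """A misaligned rule: optimize a proxy (revenue) that ignores
    member values entirely."""
    revenue = {"safe_product": 1.0, "risky_product": 3.0, "shutdown": 0.0}
    return max(revenue, key=revenue.get)

def preserves_values(rule, members, outcomes):
    """One candidate formalization of value preservation: the rule never
    picks an outcome that some alternative Pareto-dominates (i.e. an
    alternative every single member strictly prefers)."""
    chosen = rule(members)
    for alt in outcomes:
        if all(u[alt] > u[chosen] for u in members.values()):
            return False  # unanimously dispreferred to alt: values lost
    return True

for rule in (majority_vote, profit_proxy):
    print(rule.__name__, "->", rule(MEMBERS),
          "| preserves values:", preserves_values(rule, MEMBERS, OUTCOMES))
```

Under this deliberately weak criterion the majority-vote rule passes and the profit proxy fails, which is the distinction the paragraph above gestures at: what needs to be certified is the transition, not the content of the values themselves.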

As a matter of fact, this firm belief is the basis of my whole theory of what we ought to be doing as a species. I think AI should not be allowed to gain general intelligence; instead, we should focus on creating an aligned superintelligence out of humans (with narrow AIs as "mortar", mere extensions of human capacities): first in the form of an organization, and later as a "hive mind" using brain-computer interfaces to achieve varying degrees of voluntary mind-to-mind communication.

Pattern · 4y

"reverse engineering the entire human mind from scratch!"

That might not necessarily be required for AGI, though it does seem to be what figuring out how to program values amounts to.

MSRayne · 4y

The latter is more what I was pointing to.

1 comment, sorted by top scoring

Charlie Steiner · 4y

The best technical solution might just be "use the FAI to find the solution." Friendly AI is already, at its core, just a formal method for evaluating which actions are good for humans.

It's plausible we could use AI alignment research to "align" corporations, but only in a weakened sense where there's some process that returns good answers in everyday contexts. But for "real" alignment, where the corporation somehow does what's best for humans with high generality... well, that means using some process to evaluate actions, so this reduces to the case of using an FAI.
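
As a hedged sketch of that reduction (every interface and name below is an assumption made up for illustration; the comment specifies no such API), the institution's decision rule can be written as plumbing around a trusted action evaluator, with the entire open problem hidden inside the evaluator stub:

```python
# Sketch: "aligning the corporation" as routing candidate actions through
# a trusted evaluator of what is good for humans. The evaluator itself is
# the unsolved part and is only stubbed out here.

from dataclasses import dataclass
from typing import Callable, Iterable, Optional

@dataclass
class Action:
    description: str

# The whole difficulty of FAI lives in this type: a function scoring
# actions by how good they are for humans, with high generality.
HumanGoodEvaluator = Callable[[Action], float]

def placeholder_evaluator(action: Action) -> float:
    """Stand-in for a solved-FAI evaluator. This keyword check is exactly
    the 'weakened sense' above: it returns good answers only in easy,
    everyday contexts."""
    return -1.0 if "deceive" in action.description else 1.0

def institutional_policy(
    candidates: Iterable[Action],
    evaluate: HumanGoodEvaluator,
    threshold: float = 0.0,
) -> Optional[Action]:
    """Pick the best-scoring candidate, but only act at all if the
    evaluator judges it good for humans (a conservative default)."""
    best = max(candidates, key=evaluate, default=None)
    if best is not None and evaluate(best) > threshold:
        return best
    return None

chosen = institutional_policy(
    [Action("deceive regulators"), Action("publish safety audit")],
    placeholder_evaluator,
)
print(chosen)  # Action(description='publish safety audit')
```

The design point is that all the hard content lives in placeholder_evaluator; if a trustworthy HumanGoodEvaluator existed, institutional alignment would reduce to this kind of plumbing, which is one way to read the reduction above.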
