Thanks for sharing the link to ARC. It seems to me the kinds of things they are testing for and worried about are analogous to the risks of self-driving cars: when you incorporate ML systems into a range of human activities, their behaviour is unpredictable and can be dangerous. I am glad ARC is doing the work they are doing. People are using unpredictable tools and ARC is investigating the risks. That's great.

I don't think these capabilities ARC is looking at are "similar" to runaway intelligence, as you suggest. They clearly do not require it. They are far more mundane (but dangerous nonetheless, as you rightly point out).

At one point in the ARC post, they hint vaguely at being motivated by Yudkowsky-like worries: "As AI systems improve, it is becoming increasingly difficult to rule out that models might be able to autonomously gain resources and evade human oversight – so rigorous evaluation is essential." They seem to be imagining a system giving itself goals, such that it is motivated to engage in tactical deception to carry out its goals--a behaviour we find in a range of problem-solving non-human animals. It strikes me as a worry that is extraneous to the good work ARC is doing. And the end of the quote is odd, since rigorous evaluation is clearly essential regardless of autonomous resource gains or oversight evasion.

I'm skeptical that a hard takeoff is something to worry about anytime soon, but setting that aside, I think it's extremely valuable to think through these questions of organizational culture. There are many harms that can come from mere AI (as opposed to AGI), and all of these reflections pertain to less exotic but still pressing concerns about trustworthy AI.

These reflections very nicely cover what is hard about self-regulation, particularly in for-profit organizations. What is missing, though, is the constitutive role of the external regulatory environment in the risk management structures and practices an organization adopts. Legislation and regulations create regulatory risks--financial penalties and public embarrassment for breaking the rules--that force companies, from the outside, to create the cultures and organs of responsibility this post describes. It is external force--and probably only external force--that creates these internal shapes.

To put this point in the form of a prediction: show me a company with highly developed risk management practices and a culture of responsibility, and I will show you the government regulations that organization is answerable to. (This won't be true all of the time, but will be true for most HROs, for banking (in non-US G20 countries, at least), for biomedical research, and other areas.)

In fairness, I have to acknowledge that specific regulations for AI are not here yet, but they are coming soon. Pure self-regulation of AI companies is probably a futile goal. By contrast, operating under a sane, stable, coherent regulatory environment would actually bring a lot of advantages to every company working on AI.

I agree that AI absolutely needs to be regulated ASAP to mitigate the many potential harms that could arise from its use. So, even though the FLI letter is flimsy and vague, I appreciate its performance of concern.

Yudkowsky's worry about runaway intelligence is, I think, an ungrounded distraction. It is ungrounded because Yudkowsky does not have a coherent theory of intentionality that makes sense of the idea of an algorithm gaining the capacity to engage in its own goal-directed activity. It is a distraction from the public discourse about the very real, immediate, tangible risks of harms caused by the AI systems we have today.