Un-optimised vs anti-optimised

by Stuart_Armstrong
14th Apr 2015
1 min read

A putative new idea for AI control; index here.

This post contains no new insights; it just puts together some old insights in a format I hope is clearer.

Most satisficers are unoptimised (above the satisficing level): they have a limited drive to optimise and transform the universe. They may still end up optimising the universe anyway: they have no penalty for doing so (and sometimes it's a good idea for them). But if they can lazily achieve their goal, then they're OK with that too. So they simply have low optimisation pressure.

A safe "satisficer" design (or a reduced impact AI design) needs to be not only un-optimised, but specifically anti-optimised. It has to be set up so that "go out and optimise the universe" scores worse than "be lazy and achieve your goal". The problem is that these terms are undefined (as usual), that there are many minor actions that can optimise the universe (such as creating a subagent), and that the approach has to be safe against all possible ways of optimising the universe - not just "maximise u" for a specific and known u.

That's why the reduced impact/safe satisficer/anti-optimised designs are so hard: you have to add a very precise yet general (anti-)optimising pressure, rather than simply removing the current optimising pressure.
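
To make the distinction concrete, here is a minimal toy sketch (an editorial illustration, not from the post). It assumes a numeric utility u, a satisficing level, and a crude impact penalty as one possible stand-in for anti-optimising pressure: a plain satisficer is indifferent between a lazy plan and a universe-optimising plan, while the penalised version strictly prefers the lazy one. The hard part the post points to - defining such a penalty precisely and generally - is of course not solved here.

```python
# Toy sketch (illustration only): a plain satisficer's scoring versus an
# "anti-optimised" scoring that adds a hypothetical impact penalty, so that
# "be lazy and achieve your goal" strictly beats "go out and optimise the
# universe". The numbers and the penalty term are assumptions.

SATISFICING_LEVEL = 10.0
IMPACT_PENALTY = 0.5  # hypothetical weight on an impact measure

# Two candidate plans: a lazy one that just meets the goal, and an aggressive
# one that optimises the universe (huge utility, huge impact).
plans = {
    "lazy":     {"u": 12.0,    "impact": 1.0},
    "optimise": {"u": 1000.0,  "impact": 1000.0},
}

def satisficer_score(plan):
    # A plain satisficer only cares about reaching the satisficing level:
    # everything above it scores the same, so it is merely un-optimised.
    return min(plan["u"], SATISFICING_LEVEL)

def anti_optimised_score(plan):
    # An anti-optimised design additionally penalises large-scale optimisation,
    # so the universe-optimising plan scores strictly worse than the lazy one.
    return min(plan["u"], SATISFICING_LEVEL) - IMPACT_PENALTY * plan["impact"]

for name, plan in plans.items():
    print(name, satisficer_score(plan), anti_optimised_score(plan))

# Output:
#   lazy 10.0 9.5
#   optimise 10.0 -490.0
# Under the plain satisficer both plans tie (so nothing stops optimising);
# under the anti-optimised scoring the lazy plan strictly wins.
```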

3 comments

[anonymous] 10y

Would minimising the number of CPU cycles work as a lazy incentive?

This assumes that fewer CPU cycles will produce an outcome that satisfices rather than optimises, though in our current state of understanding any optimisation routine takes a lot more computing effort than a 'rough enough' solution.

Perhaps getting AGIs to go green would kill two birds with one stone.

Reply

Stuart_Armstrong 10y

This has problems with the creation of subagents: http://lesswrong.com/lw/lur/detecting_agents_and_subagents/

You can use just a few CPU cycles to create subagents that aren't under that restriction.

Reply

drethelin 10y

It can be difficult or impossible to know how many CPU cycles a problem will take to solve before you solve it.

Reply

Mentioned in
New(ish) AI control ideas