AGI Chaining

Chaining God is Stuart Armstrong's term for his proposed method of maintaining control over a superhuman AGI. It involves a chain of AGIs, each more advanced than the last. The idea is that even though humans might not be able to understand the most sophisticated AGI well enough to trust it, they can understand and trust the first AGI in the chain, which will in turn verify the trustworthiness of the next AGI, and so on.

Armstrong mentions a number of considerations for the chain design; a toy sketch of the verify-and-restart loop they imply follows the list:

  • If an AGI at any level ever claims or is claimed to be untrustworthy, the chain should be instructed to gather diagnostic information, then start from scratch.
  • If the AGI chain passes integrity checks yet behaves untrustworthily, restart from scratch.
  • If the AGI chain refuses to be shut down and can prevent us from shutting it down, we are in trouble and can only attempt to negotiate with it and hope for the best.
  • If after repeated attempts the chain continues to fail, or a level of intelligence is reached that claims the chain slows it down too much for further progress, at least some safe research has been conducted. We may choose to accept that limitation, or to simply accept an untrustworthy AGI.
  • If the AGI chain breaks invisibly, we are probably doomed.
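
Taken together, these considerations amount to a simple control loop: each level builds and verifies the next, any failed verification triggers diagnostics and a restart from the human-comprehensible seed, and after too many restarts we accept the limitation. The toy Python sketch below only illustrates that loop under those assumptions; the class, the method names, and above all the verify step are hypothetical placeholders rather than anything Armstrong specifies.

    # Toy model of the chain-building and restart policy described above.
    # Every name here is a hypothetical placeholder; in particular, verify()
    # stands in for the hard, unspecified step of the actual proposal.

    class ChainedAGI:
        def __init__(self, level):
            self.level = level

        def build_successor(self):
            # Each level builds a slightly more capable successor it can still inspect.
            return ChainedAGI(self.level + 1)

        def verify(self, successor):
            # Placeholder: does this level understand and trust its successor?
            return True

        def gather_diagnostics(self):
            return {"level": self.level}

    def build_chain(seed, target_depth, max_restarts=3):
        """Grow the chain; on a failed verification, gather diagnostics and restart."""
        for attempt in range(max_restarts):
            chain = [seed]
            while len(chain) < target_depth:
                candidate = chain[-1].build_successor()
                if not chain[-1].verify(candidate):
                    diagnostics = [agi.gather_diagnostics() for agi in chain]
                    print(f"attempt {attempt}: verification failed, diagnostics: {diagnostics}")
                    break  # start over from the human-comprehensible seed
                chain.append(candidate)
            else:
                return chain  # every level verified the level above it
        raise RuntimeError("chain kept failing; accept the limit or an untrustworthy AGI")

    chain = build_chain(ChainedAGI(level=0), target_depth=5)
    print([agi.level for agi in chain])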

This is a very conservative approach to AGI design, and it presents a large opportunity cost. Armstrong believes the chain approach would be unlikely to produce anywhere near the best possible future, since the AGI chain would learn only from present human values. Each new layer of AGI would be limited in how far it could improve, so that its creator could still understand it. With supervision happening at every level, an AGI would take longer to develop, and whenever the process restarted from scratch the seed AI would always have to be humanly comprehensible. Armstrong believes an AGI chain is a simple way to create Friendly Artificial Intelligence, but enumerates a number of ways the concept might never work.

