AGI Chaining

Chaining God is Stuart Armstrong's term for his proposed method of maintaining control over a superhuman AGI. It involves a chain of AGIs, each more advanced than the last. The idea is that even though humans might not be able to understand the most sophisticated AGI well enough to trust it, they can understand and trust the first AGI in the chain, which will in turn verify the trustworthiness of the next AGI, and so on.

Armstrong mentions a number of considerations for the chain design; a toy sketch of the verify-and-restart loop they imply follows the list:

  • If an AGI at any level ever claims or is claimed to be untrustworthy, the chain should be instructed to gather diagnostic information, then start from scratch.
  • If the AGI chain passes integrity checks yet behaves untrustworthily, restart from scratch.
  • If the AGI chain refuses to be shut down and can prevent us from shutting it down, we are in trouble and can only attempt to negotiate with it and hope for the best.
  • If after repeated attempts the chain continues to fail, or a level of intelligence is reached that claims the chain slows it down too much for further progress, at least some safe research has been conducted. We may choose to accept that limitation, or to simply accept an untrustworthy AGI.
  • If the AGI chain breaks invisibly, we are probably doomed.
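
Taken together, these considerations amount to a simple control loop: each level builds and verifies the next, any failed verification triggers diagnostics and a restart from the human-comprehensible seed, and after too many restarts we accept the limitation. The toy Python sketch below only illustrates that loop under those assumptions; the class, the method names, and above all the verify step are hypothetical placeholders rather than anything Armstrong specifies.

    # Toy model of the chain-building and restart policy described above.
    # Every name here is a hypothetical placeholder; in particular, verify()
    # stands in for the hard, unspecified step of the actual proposal.

    class ChainedAGI:
        def __init__(self, level):
            self.level = level

        def build_successor(self):
            # Each level builds a slightly more capable successor it can still inspect.
            return ChainedAGI(self.level + 1)

        def verify(self, successor):
            # Placeholder: does this level understand and trust its successor?
            return True

        def gather_diagnostics(self):
            return {"level": self.level}

    def build_chain(seed, target_depth, max_restarts=3):
        """Grow the chain; on a failed verification, gather diagnostics and restart."""
        for attempt in range(max_restarts):
            chain = [seed]
            while len(chain) < target_depth:
                candidate = chain[-1].build_successor()
                if not chain[-1].verify(candidate):
                    diagnostics = [agi.gather_diagnostics() for agi in chain]
                    print(f"attempt {attempt}: verification failed, diagnostics: {diagnostics}")
                    break  # start over from the human-comprehensible seed
                chain.append(candidate)
            else:
                return chain  # every level verified the level above it
        raise RuntimeError("chain kept failing; accept the limit or an untrustworthy AGI")

    chain = build_chain(ChainedAGI(level=0), target_depth=5)
    print([agi.level for agi in chain])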

This is a very conservative approach to AGI design, and it presents a large opportunity cost. Armstrong believes the chain approach would be unlikely to produce anywhere near the best possible future, since the AGI chain would learn only from present human values. Each new layer of AGI would be limited in how far it could improve, so that its creator could still understand it. With supervision happening at every level, an AGI would take longer to develop, and whenever the process restarted from scratch the seed AI would always have to be humanly comprehensible. Armstrong believes an AGI chain is a simple way to create Friendly Artificial Intelligence, but enumerates a number of ways the concept might never work.

