The research field 'AGI Limits of Engineerable Control & Safety Impossibility Theorems' (AGILECSIT) aims to verify both the empirical soundness of the premises and the formal validity of the reasoning behind:
1. Theoretical limits of engineerable control of AGI.
2. Threat models of non-controllable convergent dynamics of AGI.
3. Impossibility theorems, derived by contradiction of the 'long-term AGI safety' proposition with results 1 and 2.
~ ~ ~
Definitions and Distinctions
'Non-controllable convergent dynamics of AGI':
Iterated interactions between AGI internals and their connected surroundings in the environment that converge on unsafe conditions, where the space of those interactions falls outside at least one theoretical limit of control.
'Control':
- In theory, modelling control of system A over system B means that A can influence system B to achieve A’s desired subset of state space (Source: https://arxiv.org/pdf/2109.00484.pdf).
- In practice, engineering control of AGI requires simulating or detecting any unsafe effects internally, and then preventing or correcting those effects externally.
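The two readings of 'control' above can be illustrated with a toy sketch. Everything in it is a hypothetical assumption made for illustration and does not come from the cited sources: the integer state space 0..10, the `drift` term standing in for system B's own dynamics, and the one-unit limit on A's influence. It shows A detecting that B is outside the desired subset and correcting it, and how the same controller fails once B's dynamics exceed A's capacity to correct.

```python
from typing import List, Set

State = int
Action = int


def step_b(state: State, action: Action, drift: int) -> State:
    """Hypothetical dynamics of system B: its own drift plus A's influence,
    clamped to the toy state space 0..10."""
    return max(0, min(10, state + drift + action))


def controller_a(state: State, desired: Set[State]) -> Action:
    """System A: detect whether B is outside the desired subset, then
    correct by at most one unit toward the nearest desired state."""
    if state in desired:
        return 0
    target = min(desired, key=lambda s: abs(s - state))
    return 1 if target > state else -1


def run(initial: State, desired: Set[State], drift: int, steps: int) -> List[State]:
    """Iterate the detect-then-correct loop and record B's trajectory."""
    state = initial
    trajectory = [state]
    for _ in range(steps):
        action = controller_a(state, desired)
        state = step_b(state, action, drift)
        trajectory.append(state)
    return trajectory


desired = {4, 5, 6}

# No drift: A steers B into the desired subset and keeps it there.
print(run(9, desired, drift=0, steps=5))   # [9, 8, 7, 6, 6, 6]

# Drift of +2 exceeds A's one-unit correction: B converges on the
# boundary state 10, outside the desired subset, despite A's efforts.
print(run(5, desired, drift=2, steps=5))   # [5, 7, 8, 9, 10, 10]
```

In the second run the iterated interaction converges on a state outside the desired subset even though the controller acts at every step, which is a deliberately simplified picture of dynamics that fall outside a limit of control.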
'Long-term':
- In theory: into perpetuity.
- In practice: over a thousand years.
'AGI safety':
Ambient conditions/contexts around planet Earth that are caused by the operation of AGI fall within the environmental range that humans need to survive (a minimum-threshold definition).
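Read together with the 'long-term' horizon, the minimum-threshold definition amounts to a check that must hold at every point in time, not just at the start. A minimal sketch follows. It assumes conditions can be summarized as named numeric variables, and the variable name `x` and its range are placeholders, not real survivability data.

```python
from typing import Dict, Iterable, Tuple

Conditions = Dict[str, float]
Ranges = Dict[str, Tuple[float, float]]


def within_range(conditions: Conditions, survivable: Ranges) -> bool:
    """True if every tracked variable lies inside its survivable range."""
    return all(lo <= conditions[name] <= hi
               for name, (lo, hi) in survivable.items())


def safe_over_horizon(trajectory: Iterable[Conditions], survivable: Ranges) -> bool:
    """True only if the conditions stay in range at every step of the horizon."""
    return all(within_range(c, survivable) for c in trajectory)


# Placeholder variable and range, chosen only to exercise the check.
survivable = {"x": (0.0, 1.0)}

print(safe_over_horizon([{"x": 0.5}, {"x": 0.9}], survivable))  # True
print(safe_over_horizon([{"x": 0.5}, {"x": 1.2}], survivable))  # False
```

A single out-of-range step makes the whole trajectory count as unsafe, which is what makes the definition a minimum threshold rather than an average.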
'AGI':
The notion of 'artificial intelligence' (AI) can be either 'narrow' or 'general'.
The notion of 'narrow AI' specifically implies:
- a single domain of sense and action.
- no possibility for self base-code modification.
- a single well-defined meta-algorithm.
- that all aspects of its own self agency/intention are fully defined by its builders/developers/creators.
The notion of 'general AI' specifically implies:
- multiple domains of sense/action.
- an intrinsic, non-reducible possibility for self-modification.
- therefore, that the meta-algorithm is effectively arbitrary.
- hence, that it is inherently undecidable whether all aspects of its own self agency/intention are fully defined by only its builders/developers/creators.
(Source: https://mflb.com/ai_alignment_1/si_safety_qanda_out.html#p3)