AGI Limits of Engineerable Control & Safety Impossibility Theorems

The research field AGI Limits of Engineerable Control & Safety Impossibility Theorems (AGILECSIT) has the purpose of verifying (both the empirical soundness of premises and formal validity of formal reasoning of):

  1. Theoretical limits to controlling any AGI using any method of causation.
  2. Threat models of AGI convergent dynamics that are impossible to control (by 1.).
  3. Impossibility theorems, by contradiction of 'long-term AGI safety' with convergence result (2.).

'AGI convergent dynamic that is impossible to control':

Iterated interactions of AGI internals (with connected surroundings of environment) that converge on (unsafe) conditions, where the space of interactions falls outside even one theoretical limit of control.

'Control':

  • In theory, the control of system A over system B means that A can influence system B to achieve A’s desired subset of state space [Source: https://arxiv.org/pdf/2109.00484.pdf].
  • In practice, engineering control of AGI requires simulating or detecting any unsafe effects internally, and then preventing or correcting those effects externally.

[Source: https://mflb.com/ai_alignment_1/si_safety_qanda_out.html#p3]

  1. multiple domains of sense/action;
  2. intrinsic non-reducible possibility for self-modification;
  3. and that/therefore; that the meta-algorithm is effectively arbitrary; hence;
  4. that it is inherently undecidable as to whether all aspects of its own self agency/intention are fully defined by only its builders/developers/creators.

Ambient conditions/contexts around planet Earth that are changed by the operation of AGI fall within the environmental range that humans need to survive (a minimum-threshold definition).
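
As a purely illustrative aid to the in-theory reading of 'Control' above, the sketch below treats system B as a small finite state machine and asks whether some sequence of inputs from system A can bring B into A's desired subset of state space. Everything in it is an assumption made for illustration and does not come from the cited sources: the ten-state counter, the input set, and the names `can_steer`, `step` and `step_with_drift` are all hypothetical. The second transition function adds a region where B's own dynamics override every input, as a loose toy picture of states from which the desired subset can no longer be reached.

```python
from collections import deque

def can_steer(start, desired, step, inputs, max_steps=50):
    """Breadth-first search: True if some input sequence of at most
    `max_steps` moves takes system B from `start` into `desired`."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        state, depth = frontier.popleft()
        if state in desired:
            return True
        if depth == max_steps:
            continue
        for u in inputs:
            nxt = step(state, u)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return False

# Hypothetical toy system B: a counter on 0..9 that A can nudge by -1, 0 or +1.
def step(state, u):
    return max(0, min(9, state + u))

# Hypothetical variant: at 8 or above, B drifts upward whatever A does.
def step_with_drift(state, u):
    if state >= 8:
        return min(9, state + 1)
    return max(0, min(9, state + u))

INPUTS = (-1, 0, 1)
DESIRED = frozenset({4, 5, 6})  # A's desired subset of B's state space

# A can reach the desired subset from every state of the plain counter.
controllable_everywhere = all(
    can_steer(s, DESIRED, step, INPUTS) for s in range(10)
)

# With drift, states 8 and 9 can never be brought back into the desired subset.
lost_states = [
    s for s in range(10) if not can_steer(s, DESIRED, step_with_drift, INPUTS)
]
```

The sketch only checks reachability in a fully known, finite model. It makes no claim about how control limits for AGI would be established, which is the subject of the results listed above.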
