A behaviorist genie is an AI that has been averted from modeling minds in more detail than some whitelisted class of models. (An older name for this concept, 'Butlerian AI', comes from Frank Herbert's Dune, in which the Orange Catholic Bible contains the injunction, "Thou shalt not make a machine in the likeness of a human mind.")
This is possibly a good idea because many possible difficulties seem to be associated with the AI having a sufficiently advanced model of human minds or AI minds, including:
...and yet an AI that is extremely good at understanding material objects and technology (just not other minds) would still be capable of some important classes of pivotal achievement.
A behaviorist genie would still require most of genie theory and corrigibility to be solved. But it's plausible that the restrictions away from modeling humans, programmers, and some types of reflectivity would collectively make it significantly easier to make a safe form of this genie.
Thus, a behaviorist genie is one of fairly few open candidates for "AI that is restricted in a way that actually makes it safer to build, without it being so restricted as to be incapable of game-changing achievements".
Nonetheless, limiting the degree to which the AI can understand cognitive science, other minds, its own programmers, and itself is a very severe restriction that would prevent a number of obvious ways to make progress on the AGI subproblem and the value identification problem, even for commands given to Task AGIs (Genies). Furthermore, there could perhaps be easier types of genies to build, or there might be grave difficulties in restricting the model class to some space that is useful without being dangerous. Behaviorist genies are being considered tentatively, rather than as a settled research path.
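The phrase "restricting the model class" can be caricatured as a filter over the hypotheses a planner is ever allowed to consider. The toy Python sketch below is purely illustrative: the class names, the boolean mind-model flag, and the numeric "detail level" proxy are all invented for this example, and nothing here is a proposal for how such a restriction could actually be implemented (which is exactly the hard open problem noted above). It only shows the abstract shape of a whitelist over model classes:

```python
# Purely illustrative toy: a hypothesis space filtered by a whitelist
# before any planning occurs. All names and the detail-level proxy are
# hypothetical stand-ins, not a real design.
from dataclasses import dataclass

@dataclass(frozen=True)
class Hypothesis:
    name: str
    models_minds: bool   # does this hypothesis model another agent's cognition?
    detail_level: int    # crude stand-in for modeling fidelity

DETAIL_CAP = 2  # mind-models above this fidelity are disallowed

def in_whitelist(h: Hypothesis) -> bool:
    """Admissible if it models no minds, or models them only coarsely
    (e.g. as behavioral input-output black boxes)."""
    return (not h.models_minds) or (h.detail_level <= DETAIL_CAP)

def admissible_hypotheses(hypotheses):
    """Filter the hypothesis space before the planner ever sees it."""
    return [h for h in hypotheses if in_whitelist(h)]

candidates = [
    Hypothesis("orbital mechanics", models_minds=False, detail_level=9),
    Hypothesis("coarse operator model", models_minds=True, detail_level=1),
    Hypothesis("detailed operator psychology", models_minds=True, detail_level=8),
]

print([h.name for h in admissible_hypotheses(candidates)])
# → ['orbital mechanics', 'coarse operator model']
```

Note that material-object hypotheses pass at any fidelity, while mind-models pass only below the cap; a real version would need the whitelist itself to be robust against the AI routing detailed mind-modeling through nominally "material" hypotheses, which this toy does nothing to address.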
Broadly speaking, two possible clusters of behaviorist-genie design are:
Breaking the first case down into more detail, the potential desiderata for a behavioristic design are:
These are different goals, but with some overlap between them. Some of the things we might need: