Design-space traps: mapping the utility-design trajectory space
This is a short section of a paper I'm writing on moral enhancement. I'm trying to briefly summarize some points that have already been made about local optima in evolutionary processes and about the safety of taking humanity out of those local optima. You might find the text helpful in that it summarizes a very important concept. I don't think there's anything new here, but I hope the way I tried to phrase the utility-design trajectory space topology more properly at the end can be fruitful. I would appreciate any insights you might have about that final formulation: how to develop it more rigorously, and what some of its consequences are. I do have some ideas, but I would like to hear what you have to say first. Any other general feedback on the text is also welcome. But keep in mind this is just a section of a larger paper, and I'm mainly interested in how to develop the framework at the end and what its consequences are, rather than in properly developing any of the points in the middle.
Local optima are points where every nearby reachable position is worse, yet there is at least one faraway position that is vastly better. A strong case has been made that evolution often gets stuck on such local optima. In evolutionary processes, fitness behaves monotonically, i.e., it will necessarily increase or be maintained, because any decrease in fitness will always be selected against. If there are vastly better solutions (e.g., for solving cooperation problems) but organisms would have to pass through a less fit step in order to reach them, evolution will never reach that vastly better configuration. Evolutionary processes are limited by the topology of the fitness-design trajectory space: evolution can only go from design x to design y if there is at least one trajectory from x to y that is flat or ascending at every step; any trajectory that descends even momentarily cannot be taken. Say one is on the cyan ring ridge of the colored graphic. Although there is a vastly better configuration on the red peak, one would have to travel through the blue moat in order to get there. Unless one is a process that can pass through a sharp decrease in fitness, there is no way of improving towards the red peak. Evolution is particularly prone to local optima because of fitness monotonicity. Enhancing human beings through technology is not bound by fitness monotonicity, or by any sort of utility monotonicity in general: we could initially make changes that are harmful in order to later achieve a vastly better configuration. Therefore, it seems plausible that there is a technological path out of evolution’s local optima, whereby we could rescue our species from these evolutionary imprisonments. Moreover, it has been argued that evolutionary local optima can be easily identified, provided a careful, evolutionarily and technically informed analysis is made.
Hence these would be low-hanging fruit for the task of improving evolutionary products such as humans: easily accessible and able to produce great advances for humanity with little effort.
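The contrast between a monotonic process and one that tolerates temporary losses can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the one-dimensional landscape and its values, the function names `reachable_designs` and `best_reachable`, and the `max_step_loss` parameter are hypothetical and not part of the argument above. A process with `max_step_loss = 0` plays the role of evolution; a larger value stands in for a designer willing to accept a temporary decrease.

```python
from collections import deque

# Toy 1-D fitness landscape (values are made up for illustration):
# index 2 is the "ring ridge" (local optimum), indices 3-4 are the "moat",
# and index 6 is the "peak" (global optimum).
LANDSCAPE = [1, 3, 5, 2, 1, 6, 10, 7]


def reachable_designs(landscape, start, max_step_loss):
    """Return the set of indices reachable from `start` by steps to adjacent
    indices, where a single step may lose at most `max_step_loss` fitness.
    With max_step_loss == 0 every step must be flat or ascending."""
    seen = {start}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        for j in (i - 1, i + 1):
            if 0 <= j < len(landscape) and j not in seen:
                # The step i -> j is allowed only if the loss is tolerable.
                if landscape[j] >= landscape[i] - max_step_loss:
                    seen.add(j)
                    queue.append(j)
    return seen


def best_reachable(landscape, start, max_step_loss):
    """Highest fitness among all designs reachable from `start`."""
    designs = reachable_designs(landscape, start, max_step_loss)
    return max(landscape[i] for i in designs)


# A monotonic process starting at index 0 climbs to the ridge and stops.
print(best_reachable(LANDSCAPE, 0, 0))  # 5
# A process that tolerates a loss of 3 per step can cross the moat.
print(best_reachable(LANDSCAPE, 0, 3))  # 10
```

In this toy landscape the tolerance has to be at least as large as the first drop into the moat (from 5 to 2); a tolerance of 2 leaves the process on the ridge just as monotonicity does.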
Nevertheless, it should be noted that getting out of evolutionary local optima might not always be easy or even possible. Fitness does have a relatively strong correlation with overall human utility. And although human intelligence is not as dull as evolutionary processes and does accept a decrease in utility in order to achieve a better design in the end, if the downward moat is deep enough, the risk of catastrophe (or, much worse, extinction) might not be worth taking. At least by being monotonic on a dimension correlated with utility, evolution was able to avoid extreme losses. Perform widespread, willy-nilly human enhancement, and we might fall into the moat guarding the delicious low-hanging fruit of the utility-design space garden and not come back up. This is particularly so in the case of moral enhancement, since there is a self-reinforcing aspect to changing morality, motivations, values and desires. It might be the case that tampering with deep and fundamental human morality is irreversible, because once we fundamentally value something else, we would not have any compelling reason to want to return to our old values, desires or aspirations. Thus, it seems there are indeed cases where a small step past the edge of the moat will lead us down an irreversible path. Correctly mapping how each technology shapes the topology of utility-design trajectory space is a much-needed task if we are to avoid falling into moats, or into even more dangerous existential holes, while reaching for the low-hanging fruit beyond local optima. We had better get stuck at local optima than at absolute minima.
Utility-design trajectory space could be more properly defined as a space on R^{n+u}: a point uses n coordinates to locate a physically possible design along all n relevant dimensions, and the space is defined by the laws of physics and by a utility function on u. A point will correspond to a design a iff all its neighbouring points x correspond to designs one physical step away from design a. Emergent designer processes such as evolution, human enhancement and AIs draw shapes on R^{n+u} by connecting points that are linked by one possible step under that process. Evolution’s hand is monotonic on dimension f, fitness, which makes for a pretty clumsy drawing. Biochemical human enhancement can vary more freely on f, but might face other constraints elsewhere that, e.g., uploaded minds would not. Extinctions correspond to singularities on u: once one is reached, no other point is reachable, and it designates a lack of design. These points, which can be reached but from which nothing else can be reached, need to be correctly mapped. It would also be relevant to investigate how each technology draws its specific shape on design space. Using u as a height analogue, some technologies might be inherently prone to shaping moats with peaks in the middle, extinction holes, effortless utility-maximizing curves and so on. I believe moral enhancement draws a particularly bumpy, hole-prone shape, and FAI an ever utility-ascending shape, with all mishaps being existential holes.
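One possible way to start making this formulation more rigorous is to discretize it: treat designs as nodes of a directed graph, physically possible single steps as edges, and each designer process as a filter that keeps only the edges it is able to take. The sketch below is only a toy under stated assumptions. The five designs, their utility values, and the names `PHYSICAL_STEPS`, `process_graph`, `reachable` and `dead_ends` are all hypothetical, and for brevity a single height value stands in for both fitness f and utility u, which the text treats as correlated but distinct.

```python
# Hypothetical designs with made-up heights (standing in for both f and u).
UTILITY = {"a": 5, "b": 2, "c": 9, "d": 4, "extinct": 0}

# Physically possible single steps between designs (directed edges).
# "extinct" has no outgoing steps: it can be reached but cannot reach.
PHYSICAL_STEPS = {
    "a": ["b", "d"],
    "b": ["a", "c", "extinct"],
    "c": ["b"],
    "d": ["a"],
    "extinct": [],
}


def monotonic(h_from, h_to):
    """Evolution-like process: only flat or ascending steps are allowed."""
    return h_to >= h_from


def unconstrained(h_from, h_to):
    """Technology-like process: any physically possible step is allowed."""
    return True


def process_graph(allows):
    """The shape a process 'draws': physical steps filtered by its constraint."""
    return {
        x: [y for y in ys if allows(UTILITY[x], UTILITY[y])]
        for x, ys in PHYSICAL_STEPS.items()
    }


def reachable(graph, start):
    """All designs reachable from `start` by following edges of `graph`."""
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in graph[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def dead_ends(graph):
    """Designs with no outgoing step under the given process."""
    return {x for x, ys in graph.items() if not ys}


evolution = process_graph(monotonic)
technology = process_graph(unconstrained)

print(sorted(reachable(evolution, "d")))   # ['a', 'd']
print(sorted(reachable(technology, "d")))  # ['a', 'b', 'c', 'd', 'extinct']
print(sorted(dead_ends(evolution)))        # ['a', 'c', 'extinct']
print(sorted(dead_ends(technology)))       # ['extinct']
```

In this toy, the monotonic process has three dead ends (the local optimum a, the global optimum c and extinction), while the unconstrained process has only one, extinction. That suggests one distinction the framework might draw: dead ends that are relative to a process versus points that no process can leave. The unconstrained process can reach c from d, but only by passing through b, the one design from which extinction is a single step away.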