It turns out that examining mentation with mathematical attention to detail and form gives you the formal answer to the single formal question. I'm trying to gain notoriety here in order to find people who can help execute my plan for aligning AGI. 






The votes on this lead me to suspect buyers waiting for sellers, so I'll offer: If someone were willing to offer a per-example bounty, I could contribute a practically unlimited amount of powerful, well-detailed examples. A general theory of conceptual structure I've been developing makes it stupidly easy to notice and dissect instances of this phenomenon (it will be posted here eventually, but my draft sequences need several more months of perfection to not simply be passed over). I'd do this for free, since I believe it'd be extremely useful to the community, but – since I've neglected to build a reputation here – something like a prize is necessary to prevent it from being ignored (and to make the required days/weeks of writing worth it even if it is). 

In statistical mechanics, one calculates the number (or hypervolume) Ω of possible states of a system, and defines the system's entropy as S = k_B ln Ω, where k_B is Boltzmann's constant. It's interesting to note the similarity between maximizing one's possibility space and maximizing entropy, though the equivalence between statmech entropy and information-theoretic entropy relies on physical principles that don't have any 'obvious' parallels in moral reasoning. 
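A minimal numerical sketch of that correspondence: for a uniform distribution over Ω microstates, the Shannon entropy (in nats) is exactly ln Ω, so the stat-mech entropy S = k_B ln Ω is just Boltzmann's constant times the information-theoretic entropy. The function names here are mine, purely for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, J/K

def boltzmann_entropy(omega: int) -> float:
    """Statistical-mechanics entropy S = k_B * ln(omega) for omega microstates."""
    return K_B * math.log(omega)

def shannon_entropy_nats(probs: list[float]) -> float:
    """Information-theoretic entropy in nats: -sum p * ln(p)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

omega = 1_000_000
uniform = [1 / omega] * omega

# For a uniform distribution, Shannon entropy equals ln(omega)...
assert math.isclose(shannon_entropy_nats(uniform), math.log(omega))
# ...so the two entropies differ only by the factor k_B.
assert math.isclose(boltzmann_entropy(omega), K_B * shannon_entropy_nats(uniform))
```

The equivalence only holds for the uniform (maximum-entropy) case like this; for non-uniform distributions one needs the Gibbs form, which is where the physical assumptions without obvious moral-reasoning parallels come in.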

I'd like to posit a slightly more general approach: deontology and virtue ethics are specific cases of a more general framework by which irrational agents with fixed cognitive substrates running lower-level utility functions (in humans: food, sex, social status, etc.) may nevertheless modify their heuristics so as to optimize for their top-level utility function in a more rational manner. 

For instance, an agent with a horrifyingly large time preference, one that would (irrationally) choose one utilon right now over two in a couple of hours, would do well to add heuristics counteracting that preference. An agent who is aware that their lower-level utility function completely changes every so often would do well not only to find out what causes the switch and prevent it (or, failing that, to learn how it switches), but also to prevent themselves from taking especially harmful actions while under the effect of a switch. 
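The time-preference case can be sketched concretely. Below, a hypothetical agent with steep hyperbolic discounting (my choice of discount model, for illustration) prefers one utilon now to two in a couple of hours, while a self-imposed rule layered on top of it recovers the better long-run choice:

```python
def discounted_value(utilons: float, delay_hours: float, k: float) -> float:
    """Hyperbolic discounting: value = u / (1 + k * delay).
    A large k is the 'horrifyingly large time preference' above."""
    return utilons / (1 + k * delay_hours)

def impulsive_choice(options, k):
    """The raw agent: pick the option with the highest discounted value."""
    return max(options, key=lambda o: discounted_value(o[0], o[1], k))

def rule_following_choice(options):
    """The counteracting heuristic: ignore delay and maximize raw utilons."""
    return max(options, key=lambda o: o[0])

# Options as (utilons, delay in hours).
options = [(1.0, 0.0), (2.0, 2.0)]

# With k = 1, two utilons in two hours are discounted to 2/3, so the
# raw agent grabs the single immediate utilon...
assert impulsive_choice(options, k=1.0) == (1.0, 0.0)
# ...while the self-imposed rule takes the larger delayed payoff.
assert rule_following_choice(options) == (2.0, 2.0)
```

Of course a real rule would be more nuanced than "ignore delay entirely" (some discounting is rational), but it illustrates how a fixed heuristic can outperform the substrate's native evaluation.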

Hence, such a flawed consequentialist would strive to cultivate in themselves both heuristics and inviolable rules, so as to ensure reliable future behavior under a variety of unpredictable modifications to their various utility functions. 

(As noted in your post, the agents don't even have to be irrational: limited computational power is enough for an agent to want to craft intelligent heuristics to follow when they need to act faster than they can think! So clearly there's a more general way to look at this, but I can't see it yet.)