Besides Yudkowsky and Goertzel, the only person I know of doing serious computational meta-ethics is Dutch philosopher and computer scientist Gert-Jan Lokhorst. He has a paper forthcoming in *Minds and Machines* called "Computational Meta-Ethics: Towards the Meta-Ethical Robot." I suspect it will be of interest to some.

His paper also mentions some work in formal epistemology on computational metaphysics and computational meta-modal logic. Ah, the pleasures of scholarship! (You're all tired of me harping on about scholarship, right?)

Harry Gensler's book "Formal Ethics" deals with a few meta-ethical principles using deontological modal logic together with a linguistic gimmick (imperative sentences) due to H-N Castañeda. It is computational to the extent that it provides pencil-and-paper algorithms for reasoning. Gensler is a theist, but that doesn't harm the book as long as you can tolerate a few exercises in which that unneeded hypothesis (Laplace) is assumed.

"Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations" by Yoav Shoham and Kevin Leyton-Brown is an outstanding resource for those like myself who think that ethics should be based on Game Theory.

A good free online textbook on game theory is Osborne and Rubinstein's "A Course in Game Theory". I haven't done much more than sample it, but the coverage seems complete and rigorous. Rubinstein also has a book named "Modeling Bounded Rationality". The importance of that subject in the context of computational ethics should be obvious. But if you are still dubious, check out the anonymous reader comment at the link.

Some researchers working in modal logics relevant to mechanized ethical reasoning are Wiebe van der Hoek, Peter Vranas, Johan van Benthem, Joseph Halpern, and Krister Segerberg.

Awesome! Thanks!

I admit to not really understanding how different deontic logics work. I would hope it's like propositional logic in that you can't generate false statements out of true premises and all you have to do is set up a theorem-prover and let it do its thing.

But that doesn't seem to be true since the article wants to judge each new system the AI develops against some criteria to make sure it doesn't come up with weird moral laws. And if that's the case, and you really don't know what kind of things it's going to come up with, just using similarity or difference from four statements to judge whether they're okay or not doesn't seem very reassuring.

Especially since Wikipedia discusses how innocent-seeming deontic logics can sometimes accidentally prove "it is obligatory to murder" and it doesn't seem like any of the four statements listed as sanity checks would flag that one as a problem.

It may be that I just haven't internalized Moravec's Paradox enough to view a system that at its best can correctly generate statements like "If A->X and X is forbidden, A is forbidden" as an interesting accomplishment. But couldn't you get the same thing by just using a utility-maximizing AI and normal propositional logic?

Well, it is like propositional logic in that the pure logical system is not particularly useful. What you really want is an applied logical system in which you supply additional axioms as a kind of "domain knowledge". But, whenever you add axioms, you run the risk of making the system inconsistent.

Generally speaking, proofs of consistency are difficult, but proofs of relative consistency are somewhat straightforward. One way of describing what Lokhorst did is that he automated proofs and disproofs of relative consistency. It is an easy thing to do in simple propositional logic, moderately difficult in most modal logics, and more difficult in full first-order logic.

I would describe Lokhorst's accomplishment as simple mechanized meta-logic, rather than simple mechanized meta-ethics. Though it does bear a close resemblance to the traditional ethical reasoning technique of evaluating an ethical system against a set of ethical 'facts'.
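To make the "easy in simple propositional logic" case concrete, here is a minimal sketch of checking whether a new axiom keeps a system consistent, by brute-force truth-table enumeration. This is not Lokhorst's method and it handles no modal operators; the atoms `p` and `q`, the base axioms, and the function name `is_consistent` are all made up for illustration.

```python
from itertools import product

def is_consistent(axioms, variables):
    """Return True if some truth assignment satisfies every axiom."""
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(axiom(world) for axiom in axioms):
            return True
    return False

# Hypothetical atoms, chosen only for this example.
VARIABLES = ["p", "q"]

# Hypothetical base system: p holds, and p implies q.
base = [
    lambda w: w["p"],
    lambda w: (not w["p"]) or w["q"],
]

# Two candidate axioms we might add as "domain knowledge".
harmless = lambda w: w["q"]      # already follows from the base
harmful = lambda w: not w["q"]   # contradicts the base

print(is_consistent(base, VARIABLES))               # True
print(is_consistent(base + [harmless], VARIABLES))  # True
print(is_consistent(base + [harmful], VARIABLES))   # False
```

Given that the base system is consistent, the last two lines answer the relative question: adding `harmless` is safe, adding `harmful` is not. The loop visits every assignment, so the work doubles with each extra atom.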

ETA: As for "just setting up a theorem prover and letting it do its thing", I'm pretty sure that when (admissible) new axioms are added to a proof system, the theorem prover needs to be "re-tuned" for efficiency. Particularly so in modal logics. So, a certain amount of meta-thinking is going to need to be done by any self-improving system.

Thanks for posting this. This approach certainly seems like the most straightforward way to test systems. "Here are unacceptable ideas and some desirable ideas. Does this system generate the desirable ones and not the unacceptable ones?" The problem is that this isn't computationally reasonable, at least not generally. We could do it with tricks, though someone would have to develop those tricks first :)
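A toy version of that test, as a sketch only: it uses plain propositional logic with exhaustive enumeration rather than the deontic logics in the paper, and the atoms (`helps`, `harms`, `ok`), the two candidate systems, and the names `entails` and `audit` are hypothetical.

```python
from itertools import product

def entails(axioms, claim, variables):
    """True if claim holds in every truth assignment that satisfies all axioms."""
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(a(world) for a in axioms) and not claim(world):
            return False
    return True

def audit(axioms, desirable, unacceptable, variables):
    """Return (desirable claims not derived, unacceptable claims derived)."""
    missing = [n for n, c in desirable.items() if not entails(axioms, c, variables)]
    proved_bad = [n for n, c in unacceptable.items() if entails(axioms, c, variables)]
    return missing, proved_bad

# Hypothetical atoms and sanity-check lists, for illustration only.
VARIABLES = ["helps", "harms", "ok"]
DESIRABLE = {
    "helping without harming is ok":
        lambda w: not (w["helps"] and not w["harms"]) or w["ok"],
}
UNACCEPTABLE = {
    "whatever harms is ok": lambda w: (not w["harms"]) or w["ok"],
}

good_system = [
    lambda w: (not w["helps"]) or w["ok"],      # helps -> ok
    lambda w: (not w["harms"]) or not w["ok"],  # harms -> not ok
]
bad_system = [
    lambda w: (not w["helps"]) or w["ok"],      # helps -> ok
    lambda w: (not w["harms"]) or w["ok"],      # harms -> ok
]

print(audit(good_system, DESIRABLE, UNACCEPTABLE, VARIABLES))  # ([], [])
print(audit(bad_system, DESIRABLE, UNACCEPTABLE, VARIABLES))   # ([], ['whatever harms is ok'])
```

Two things worth noticing. An inconsistent system entails everything, so it fails the audit by "proving" every unacceptable claim. And the enumeration covers 2^n assignments for n atoms, which is one concrete sense in which the general approach stops being computationally reasonable without better tricks.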

I wonder what principles would lead a robot to create that list of unacceptable theorems?

Thanks for this comment - it helped clarify to me what exactly the paper was doing.

Maybe I'm missing something, but I'm not too impressed by this. It seems like exactly the sort of thing that Eliezer was talking about in "Against Modal Logics" — putting all our confusion into irreducible modal operators and using deductive reasoning to move those boxes of confusion around, instead of actually reducing any of the things we're trying to talk about. What metaethical claims are even being made in this paper, other than the rather obvious desiderata on page 5?

It's not making meta-ethical claims. It's presenting a system for how you could implement meta-ethics in a machine agent.