And here I was wondering if this was a paper from the esteemed Brazilian jiu jitsu coach (who does in fact have a master's degree in philosophy).

Rather than doing pretty much anything, it seems more likely to me that a genuinely nihilistic agent would default to doing nothing.

I think that's an interesting point. I suppose I was thinking that nihilism, at least in the way it's typically discussed, holds not that doing nothing is rational but, rather, that no goals are rational (a subtle difference, perhaps). This, in my opinion, might equate with all goals being equally possible. But, as you point out, if all goals are equally possible the agent might default to doing nothing.

One might put it like this: the agent would be landed in the equivalent of a Buridan's Ass dilemma. As far as I recall, the possibility that a CPU would be landed in such a dilemma was a genuine problem in the early days of computer science. I believe there was some protocol introduced to sidestep the problem.

John Danaher on 'The Superintelligent Will'

by lukeprog, 3rd Apr 2012, 12 comments


Philosopher John Danaher has written an explication and critique of Bostrom's "orthogonality thesis" from "The Superintelligent Will." To quote the conclusion:


Summing up, in this post I’ve considered Bostrom’s discussion of the orthogonality thesis. According to this thesis, any level of intelligence is, within certain weak constraints, compatible with any type of final goal. If true, the thesis might provide support for those who think it possible to create a benign superintelligence. But, as I have pointed out, Bostrom’s defence of the orthogonality thesis is lacking in certain respects, particularly in his somewhat opaque and cavalier dismissal of normatively thick theories of rationality.

As it happens, none of this may affect what Bostrom has to say about unfriendly superintelligences. His defence of that argument relies on the convergence thesis, not the orthogonality thesis. If the orthogonality thesis turns out to be false, then all that happens is that the kind of convergence Bostrom alludes to simply occurs at a higher level in the AI’s goal architecture. 

What might, however, be significant is whether the higher-level convergence is a convergence towards certain moral beliefs or a convergence toward nihilistic beliefs. If it is the former, then friendliness might be necessitated, not simply possible. If it is the latter, then all bets are off. A nihilistic agent could do pretty much anything, since no goals would be rationally entailed.