anonymous12

*I've spent about equal amounts of time on programming and mathematics, but ... I'm confident that I can solve most typical programming problems, while even basic math problems are far more intimidating and error-prone ... I believe this asymmetry is due to the fact that one can "interact" with computer programs.*

Quite true. This is one of the reasons there's so much interest in developing interactive proof assistants (HOL, Coq, Isabelle/Isar...) so that they can be used for "ordinary" mathematics. Not everyone likes both programming and math, but for those who do, developing formalized mathematics in a proof assistant is a very engaging and even addictive experience.

*If a rocket launch is what it takes to give me a feeling of aesthetic transcendence, I do not see this as a substitute for religion. That is theomorphism - the viewpoint from gloating religionists who assume that everyone who isn't religious has a hole in their mind that wants filling.*

Eliezer, there is evidence that people *do* have a God-shaped hole in their minds. Razib @ gnxp.com has documented this extensively. For instance, Buddhism is a nominally non-theistic religion, yet it has independently evolved into worship of the "Lord Buddha", or some Bodhisattva, etc. [1] [2]

"I thought the whole point of probabilistic methods is that it doesn't matter too much what the prior is, it will always eventually converge on the right answer..."

AIUI this is somewhat misleading. Bayesian methods are most valuable precisely when the amount of available data is limited and prior probability is important. Whenever "it doesn't matter too much what the prior is", it makes more sense to use frequentist methods, which rely on large amounts of data to converge to the right solution.
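To make the point about limited data concrete, here is a minimal sketch using a Beta-Binomial model. The two priors and the observation counts are made up for illustration; only the qualitative behavior matters. With 10 observations the choice of prior moves the estimate a lot, while with 10,000 observations at the same success rate it barely matters.

```python
from fractions import Fraction

def posterior_mean(prior_a, prior_b, successes, failures):
    """Posterior mean of a Beta(prior_a, prior_b) prior after binomial data."""
    a = prior_a + successes
    b = prior_b + failures
    return Fraction(a, a + b)

# Two hypothetical priors: one uniform, one strongly favoring low success rates.
priors = {"uniform": (1, 1), "skeptical": (1, 19)}

def prior_gap(successes, failures):
    """Absolute difference between the posterior means under the two priors."""
    means = [posterior_mean(a, b, successes, failures) for a, b in priors.values()]
    return abs(means[0] - means[1])

small = prior_gap(7, 3)        # 10 observations, 70% successes
large = prior_gap(7000, 3000)  # 10,000 observations, same 70% rate
print(float(small), float(large))
```

In this toy setup the two posteriors disagree by 0.4 after 10 observations and by roughly 0.001 after 10,000, which is the regime where "it doesn't matter too much what the prior is".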

Of course frequentist tools *also* make assumptions about the data, and some of these assumptions may be disguised and poorly understood (making sense of these is arguably part of the "searching for Bayes structure" program), but some interpretations are straightforward: for instance, maximum-likelihood estimation gives the same answer as Bayesian MAP estimation under a uniform prior distribution.
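The uniform-prior correspondence can be written out in one line. This is a sketch for point estimates only, where $\theta$ is the parameter, $x$ the data, and $c$ a constant prior density over the allowed range of $\theta$ (these symbols are my own notation, not from the original discussion):

```latex
\[
p(\theta \mid x) \;=\; \frac{p(x \mid \theta)\, p(\theta)}{\int p(x \mid \theta')\, p(\theta')\, d\theta'}
\]
\[
p(\theta) = c \ \text{(constant)} \quad\Longrightarrow\quad
p(\theta \mid x) \propto p(x \mid \theta)
\quad\Longrightarrow\quad
\arg\max_{\theta} p(\theta \mid x) = \arg\max_{\theta} p(x \mid \theta)
\]
```

One caveat: "uniform" depends on the parameterization. A prior that is flat in $\theta$ is not flat in $\log \theta$, so the correspondence holds relative to whichever parameterization the likelihood is maximized in.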

(As an aside, it's ironic that Bayesian interpretation of such statistical tools is being pursued for the sake of rigor, given that frequentist statistics itself was developed as a reaction to widespread ad-hoc use of the "principle of inverse probability".)

Joseph Knecht:

The problem with your argument is that justification is cheap, while accuracy is expensive. The canonical examples of "unjustified" beliefs involve mis-calibration, but calibration is easy to correct just by making one's beliefs vaguer and less precise. Taken to the extreme, a maximum-entropy probability distribution is perfectly calibrated, but it adds zero bits of mutual information with the environment.
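A toy illustration of "calibrated but uninformative", assuming a fair binary event so that the maximum-entropy forecast of 50% happens to match the base rate. The two forecasters and their joint distributions below are invented for the example. Both are perfectly calibrated, but only the sharp one shares any information with the outcome.

```python
import math

def mutual_information(joint):
    """Mutual information in bits for a joint distribution {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def is_calibrated(joint, tol=1e-9):
    """True if, whenever forecast f is stated, the event occurs with frequency f."""
    for f in {x for x, _ in joint}:
        p_f = sum(p for (x, _), p in joint.items() if x == f)
        p_event = joint.get((f, 1), 0.0)
        if abs(p_event / p_f - f) > tol:
            return False
    return True

# Keys are (stated probability of the event, actual outcome); values are joint probabilities.
vague = {(0.5, 1): 0.5, (0.5, 0): 0.5}
sharp = {(0.9, 1): 0.45, (0.9, 0): 0.05, (0.1, 1): 0.05, (0.1, 0): 0.45}

print(is_calibrated(vague), mutual_information(vague))
print(is_calibrated(sharp), mutual_information(sharp))
```

The vague forecaster passes the calibration check with zero bits of mutual information; the sharp one passes the same check while carrying about half a bit per forecast.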

*Evolutions would tend to give humans brains with beliefs that largely matched the world, else they would be weeded out. So, after conception, as the mind grows, it would build itself up (as per its genetic code, proteome, bacteria, etc.) with beliefs that match the world, even if it didn't perform any Bayesian inferences.*

Note that natural selection can be seen as a process of Bayesian inference: gene frequencies represent prior and posterior probabilities, while the fitness landscape is equivalent to a likelihood function. However, evolution can only provide the mind with *prior* beliefs; presumably, these beliefs would have to match the ancestral evolutionary environment.
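The analogy can be checked mechanically in the simplest case. The sketch below assumes a haploid, asexual population with discrete generations and no mutation or drift; the genotype names and fitness values are hypothetical. Under those assumptions, one generation of selection is the same arithmetic as one Bayesian update.

```python
from fractions import Fraction

def bayes_update(prior, likelihood):
    """Posterior over hypotheses: prior times likelihood, renormalized."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnorm.values())
    return {h: w / total for h, w in unnorm.items()}

def selection_step(freqs, fitness):
    """One generation of discrete replicator dynamics:
    new frequency = old frequency * fitness / mean fitness."""
    mean_fitness = sum(freqs[g] * fitness[g] for g in freqs)
    return {g: freqs[g] * fitness[g] / mean_fitness for g in freqs}

# Hypothetical two-genotype population.
freqs = {"A": Fraction(1, 2), "a": Fraction(1, 2)}
fitness = {"A": Fraction(3), "a": Fraction(1)}

print(selection_step(freqs, fitness))
print(bayes_update(freqs, fitness))
```

Both calls return the same frequencies, with mean fitness playing the role of the normalizing constant (the marginal likelihood of the "data").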

*Here the illusion of inference comes from the labels, which conceal the premises, and pretend to novelty in the conclusion.*

Surely you aren't suggesting that Aristotelian categorization is useless? Assigning arbitrary labels to premises is the *only* way that humans can make sense of large formal systems - such as software programs or axiomatic deductive systems. OTOH, trying to reason about *real-world* things and properties in a formally rigorous way will run into trouble whether or not you use Aristotelian labels.

Silas, see Naive Bayes classifier for how an "observable characteristics graph" similar to Network 2 should work in theory. It's not clear whether Hopfield or Hebbian learning can implement this, though.

To put it simply, Network 2 assumes that *the only influence on features such as color or shape* is whether the object is a rube or a blegg. This is an extremely strong assumption which is often inaccurate; despite this, naive Bayes classifiers work extremely well in practice.
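For readers who haven't seen one, here is a minimal naive Bayes sketch over two features. All the probabilities are invented for illustration and are not taken from the original post. The independence assumption shows up as a plain product of per-feature likelihoods inside `classify`.

```python
from fractions import Fraction as F

# Hypothetical numbers, invented for illustration only.
prior = {"blegg": F(1, 2), "rube": F(1, 2)}
likelihood = {
    "blegg": {"color": {"blue": F(9, 10), "red": F(1, 10)},
              "shape": {"egg": F(9, 10), "cube": F(1, 10)}},
    "rube":  {"color": {"blue": F(1, 10), "red": F(9, 10)},
              "shape": {"egg": F(1, 10), "cube": F(9, 10)}},
}

def classify(observed):
    """Posterior over categories, treating features as independent given the category."""
    scores = {}
    for category, p in prior.items():
        # Naive Bayes step: multiply in each feature's likelihood separately.
        for feature, value in observed.items():
            p *= likelihood[category][feature][value]
        scores[category] = p
    total = sum(scores.values())
    return {category: s / total for category, s in scores.items()}

print(classify({"color": "blue", "shape": "egg"}))
print(classify({"color": "blue", "shape": "cube"}))
```

With these numbers, a blue egg comes out as a blegg with probability 81/82, while a blue cube is a 50/50 call, since the two features pull in opposite directions with equal strength.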

The relationship between "expected evidence" and hindsight bias ought to be carefully considered. Since the set of plausible "just-so" explanations is effectively infinite, it's quite difficult to anticipate them in a formal decision model and assess their probability. You are focusing on a tiny fraction of explanations for which we *can* come up with testable evidence: this is not a convincing argument.

There was once a scorpion who begged a frog to carry him across the river because he could not swim.

The frog hesitated for fear of being stung by the scorpion. The scorpion said: "Don't worry, you know I won't sting you, since we will both drown if I do that." So the frog carried the scorpion across the river. But in the middle of the river, the scorpion stung the frog. The frog asked the scorpion in disbelief: "Why did you do this? Now we will both drown!" - "Because you are a game theorist and I am not!", replied the scorpion.