Pseudolikelihood is a method for approximating joint probability distributions. I'm bringing this up because I think something like it might be used in human cognition. If so, it would tend to produce overconfident estimates.

Say we have some joint distribution over X, Y, and Z, and we want to know about the probability of some particular vector (x, y, z). The pseudolikelihood estimate involves asking yourself how likely each piece of information is, given all of the other pieces of information. Then you multiply these together. So the pseudolikelihood of (x, y, z) is P(x|yz) P(y|xz) P(z|xy).
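To make the definition concrete, here is a minimal sketch in Python. The joint table below is made up for illustration (it is not from any real data), and the helper names are my own. It computes P(x|yz) P(y|xz) P(z|xy) for one vector and prints it next to the true joint probability.

```python
# Hypothetical joint distribution over three binary variables (X, Y, Z).
# The numbers are invented for illustration; they sum to 1.
JOINT = {
    (0, 0, 0): 0.30, (0, 0, 1): 0.05,
    (0, 1, 0): 0.05, (0, 1, 1): 0.10,
    (1, 0, 0): 0.10, (1, 0, 1): 0.05,
    (1, 1, 0): 0.05, (1, 1, 1): 0.30,
}


def conditional(joint, v, i):
    """P(v[i] | all the other coordinates of v)."""
    # Sum the joint over both values of coordinate i, holding the rest fixed.
    total = sum(joint[v[:i] + (a,) + v[i + 1:]] for a in (0, 1))
    return joint[v] / total


def pseudolikelihood(joint, v):
    """Product over coordinates of P(coordinate | everything else)."""
    result = 1.0
    for i in range(len(v)):
        result *= conditional(joint, v, i)
    return result


v = (1, 1, 1)
print("joint probability:", JOINT[v])
print("pseudolikelihood: ", pseudolikelihood(JOINT, v))
```

For this particular table the pseudolikelihood of (1, 1, 1) comes out larger than its joint probability, because each factor gets to condition on the other two agreeing coordinates.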

Not only is this wrong, but it gets more wrong as your system gets bigger. By that I mean that the ratio of two pseudolikelihoods will tend towards 0 or infinity for big problems, even if the two likelihoods are nearly equal.
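Here is one contrived construction, sketched in Python, where this happens. Everything in it is an assumption chosen to make the effect easy to see: the distribution is uniform over a hand-picked set of n-bit vectors, consisting of one isolated point (all ones) plus a "plateau" (every vector whose last two bits are 0). The isolated point and a plateau point have exactly the same probability, but their pseudolikelihoods differ by a factor that doubles with each added variable. It shows the effect can occur, not how typical it is.

```python
def weight(v):
    """Unnormalized probability: 1 on the hand-picked support, 0 elsewhere."""
    isolated_point = all(b == 1 for b in v)
    in_plateau = v[-1] == 0 and v[-2] == 0
    return 1.0 if (isolated_point or in_plateau) else 0.0


def flip(v, i):
    """Copy of v with bit i flipped."""
    return v[:i] + (1 - v[i],) + v[i + 1:]


def pseudolikelihood(v):
    """Product of P(v[i] | rest); the normalizing constant cancels out."""
    result = 1.0
    for i in range(len(v)):
        result *= weight(v) / (weight(v) + weight(flip(v, i)))
    return result


def pl_ratio(n):
    """Pseudolikelihood ratio of two equally probable n-bit vectors (n >= 3)."""
    a = (1,) * n  # the isolated point: every single-bit flip has probability 0
    b = (0,) * n  # a plateau point: flipping any of the first n-2 bits stays in the plateau
    assert weight(a) == weight(b)  # the true likelihood ratio is exactly 1
    return pseudolikelihood(a) / pseudolikelihood(b)


for n in (4, 8, 16):
    print(n, pl_ratio(n))
```

In this toy the ratio is 2^(n-2): each free plateau coordinate contributes a factor of 1/2 to one pseudolikelihood and 1 to the other.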

So how can we avoid this? A correct way to calculate a joint probability P(x,y,z) looks like P(x) P(y|x) P(z|xy). At each step we only condition on information "prior" to the thing we are asking about. My guess about how to do this involves making your beliefs look more like a directed acyclic graph. Given two adjacent beliefs, you need to be clear on which is the "cause" and which is the "effect." The cause talks to the effect in terms of prior probabilities and the effect talks to the cause in terms of likelihoods.

Failure to do this could take the form of an undirected relationship (two beliefs are "related" without either belief being the cause or the effect), or loops in a directed graph. I don't actually think we want to get rid of undirected relationships entirely -- people do use them in machine learning -- but I can't see any good reason for keeping the latter.

An example of a causal loop would be if you thought of math as an abstraction from everyday reality, and then turned around and calculated prior probabilities of fundamental physical theories in terms of mathematical elegance. One way out is to declare yourself a mathematical Platonist. I'm not sure what the other way would look like.

Do you have any grounds for thinking human cognition uses pseudolikelihood? One of the reasons Eliezer's contributions to the site are so strong is that he regularly had research to back up his articles, instead of relying solely on his intuition. I am assuming you are relying on intuition, since you don't state what grounds you have for privileging this hypothesis.

OK, you're right that I have way too little evidence to single out this hypothesis. I think it jumped into my head because I had read about pseudolikelihood estimates recently.

In particular, even if we use some form of approximate inference, there are so many options out there (and probably none of them are good enough to be what humans actually use) that pseudolikelihood is not itself that likely.

Other versions of approximate inference: Markov chain Monte Carlo, variational inference, loopy belief propagation.

That said, merely citing research to back up your claims doesn't, in my opinion, make your arguments significantly stronger unless the research itself has been established fairly rigorously; for instance, the affect heuristic, while a popular theme on LessWrong, lacks experimental evidence.

This post has a big problem in that "cause" and "effect" are totally the wrong words, because when calculating joint probabilities you don't need to know causal information at all.

P(x,y,z) = P(x) P(y|x) P(z|xy) = P(x) P(z|x) P(y|xz) = P(y) P(x|y) P(z|xy) = P(y) P(z|y) P(x|zy) = P(z) P(y|z) P(x|zy) = P(z) P(x|z) P(y|xz).
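A quick numerical check of these identities, as a sketch: the joint table is invented for illustration and the helper names are mine. It applies the chain rule in all six orderings and confirms each one recovers the same joint probability, with no causal information involved.

```python
import math
from itertools import permutations

# Hypothetical joint distribution over three binary variables (X, Y, Z).
# The numbers are invented for illustration; they sum to 1.
JOINT = {
    (0, 0, 0): 0.30, (0, 0, 1): 0.05,
    (0, 1, 0): 0.05, (0, 1, 1): 0.10,
    (1, 0, 0): 0.10, (1, 0, 1): 0.05,
    (1, 1, 0): 0.05, (1, 1, 1): 0.30,
}


def marginal(joint, fixed):
    """Probability that the coordinates in `fixed` (index -> value) hold."""
    return sum(p for v, p in joint.items()
               if all(v[i] == a for i, a in fixed.items()))


def chain_rule(joint, v, order):
    """Multiply P(v[i] | coordinates earlier in `order`) along the ordering."""
    result = 1.0
    seen = {}
    for i in order:
        before = marginal(joint, seen)  # equals 1 when nothing is fixed yet
        seen[i] = v[i]
        result *= marginal(joint, seen) / before
    return result


v = (1, 1, 1)
for order in permutations(range(3)):
    print(order, chain_rule(JOINT, v, order))
```

Each conditional here is computed as a ratio of marginals, so the product telescopes to the joint probability whatever the ordering.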

OK, you're right. You only need to think about causality if you don't want your graph to be fully connected.

Causality, or whatever way of making it directed and acyclic feels natural. If you have statistical observations but no causal information, you're best off, e.g., just going from left to right.

Feedback loops, where X affects Y and Y affects X, exist in real life, and you want to be able to model them.

An object's state at time 0 and at time 1 are not the same variable.
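A small sketch of that point in Python, using the standard library's graphlib (Python 3.9+). The node names and the choice that each variable depends on both variables at the previous time step are assumptions for illustration. Treating X and Y as single nodes gives a cycle; indexing them by time gives a directed acyclic graph.

```python
from graphlib import CycleError, TopologicalSorter


def is_acyclic(parents):
    """`parents` maps each node to the set of nodes with an edge into it."""
    try:
        # static_order() raises CycleError if the graph contains a cycle.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False


# Feedback loop with time collapsed: X affects Y and Y affects X.
collapsed = {"X": {"Y"}, "Y": {"X"}}


def unrolled(steps):
    """Same loop with one node per variable per time step."""
    parents = {}
    for t in range(1, steps + 1):
        # Assumed structure: each variable depends on both variables at t - 1.
        parents[("X", t)] = {("X", t - 1), ("Y", t - 1)}
        parents[("Y", t)] = {("X", t - 1), ("Y", t - 1)}
    return parents


print("collapsed graph acyclic:", is_acyclic(collapsed))
print("unrolled graph acyclic: ", is_acyclic(unrolled(5)))
```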

While, as noted by others, pseudolikelihood is unlikely to be what humans actually use, I think it is interesting to ask whether some cognitive biases come from some sort of approximate inference. Designing an experiment to test this conclusively would be quite challenging, but very useful.