Inherited Improbabilities: Transferring the Burden of Proof

I'm not sure if I understand the post completely. Is the following a fair translation?

"If our prior against Knox's guilt is 1:1000000, and a staged burglary would imply with 99% certainty that Knox is guilty, and we have 1000:1 evidence that the burglary was staged, then mathematically this isn't enough to convict Knox. You need more evidence."

(For some reason the post is much longer than that, and makes all those arguments whose purpose I don't understand...)

[-]komponisto15y100

(For some reason the post is much longer than that, and makes all those arguments whose purpose I don't understand...)

Such as....?

Yes, the point is a mathematical triviality. For that matter, so is Bayes' theorem itself. That doesn't mean that everybody grasps its implications at once, so that it isn't worth writing detailed posts on.

[-]nshepperd15y90

Pretty much, I think.

If the prior P(guilty) is 1:1000000 and P(guilty|staged) is really high, a consistent prior requires that P(staged) is around 1:1000000 as well. Therefore 1000:1 evidence isn't enough.
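As a rough numerical check of this bound, here is a minimal sketch using the figures from the comments above (a prior of about 1e-6, P(guilty|staged) = 0.99, and 1000:1 evidence). The variable names are hypothetical, and treating 1:1000000 odds as a probability of 1e-6 is an approximation.

```python
# Illustrative numbers taken from the comments above; variable names are hypothetical.
p_guilty = 1e-6                 # prior probability of guilt (roughly 1:1000000 odds)
p_guilty_given_staged = 0.99    # a staged burglary implies guilt with 99% certainty
likelihood_ratio = 1000.0       # strength of the evidence that the burglary was staged

# Consistency bound: P(staged) <= P(guilty) / P(guilty | staged)
p_staged_max = p_guilty / p_guilty_given_staged

# Bayesian update in odds form, starting from the largest prior the bound allows
prior_odds = p_staged_max / (1.0 - p_staged_max)
posterior_odds = prior_odds * likelihood_ratio
p_staged_posterior = posterior_odds / (1.0 + posterior_odds)

print(f"max prior P(staged):       {p_staged_max:.3e}")
print(f"posterior P(staged) bound: {p_staged_posterior:.3e}")
```

Under these assumptions, even 1000:1 evidence leaves the probability of a staged burglary at roughly one in a thousand.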

[-]gwern15y40

Prediction that defense appeal will succeed: http://predictionbook.com/predictions/1804

[-]komponisto15y00

I've updated my probability upward after last Saturday's news about the DNA evidence.

[-]Daniel_Burfoot15y40

Great analysis. I am a little bit worried about adopting the idea of Bayesian logic in the criminal justice system, though, since it seems like it will just give people an incentive to commit a priori improbable crimes!

[-]jimmy15y50

This amounts to saying that we should crank up the innocent/guilty conviction ratio for things that are improbable, which doesn't make much sense. The only way we'd catch more low a priori criminals is by lowering our standards of evidence, which necessarily means convicting more innocents.

That's no more helpful than saying "I'm worried that there's going to be an incentive not to get caught because we only punish the criminals we think are guilty". We still can't punish someone we don't think is guilty- but as an aside, it does mean we should punish effort spent on not getting caught.

[-]DanielLC15y50

We could also catch more low a priori criminals by improving our methods of dealing with the evidence, like using Bayesian logic.

[-]Jack15y30

Good post though I'm not sure the whole discussion about MP and MT was necessary- isn't the relationship between the break in and the murder biconditional (which I guess probabilistically speaking means that the specificity and sensitivity are about equally high)?

I think part of the problem we see is people either unfamiliar with or incapable of reasoning probabilistically. Yes, the fundamental error the prosecutors are making applies to a deductive interpretation of their position, but my sense is the prosecutors (and probably most people involved) don't realize that a convincing argument for a guilty verdict could begin by showing that the probability of a staged break-in was, say, 1/3, independent of all the other evidence. Saying that doesn't even sound like a point in favor of guilt. But of course 1/3 is well above a reasonable prior and requires a fair amount of evidence. And given enough additional evidence against Knox and Sollecito, that estimate can go up to the .999 you want it at. But traditional rationality doesn't give people a good way of thinking about how low-probability sub-hypotheses can provide evidence for high-probability hypotheses.

[-]JGWeissman15y30

If A is strong evidence of B

I would word this as "If A is sufficiently strong evidence of B to overcome B's prior improbability". Simply saying "A is strong evidence of B" feels to me like a statement about the likelihood ratio, not the posterior probability.

[-]komponisto15y00

This concern occurred to me; but consider: just how large does the likelihood ratio have to be for the evidence to be considered "strong"? Arguably, this depends on the prior probability (and thus the desired posterior probability) in the first place.

In any event, my hope is that the meanings of these vague verbal mnemonics are sufficiently clarified by the formulas.

ETA: Word "sufficiently" added to post.

[-]gwern15y20

One person's modus ponens is another's modus tollens.

I found a citation for this in "Here is a hand", to 'Dretske, Fred (1995), Naturalizing the Mind, Cambridge, Mass.: The MIT Press. ISBN 0-262-04149-9'. Let's give credit where credit is due; it's a truth worth remembering about logically valid arguments.

[-]Richard_Kennaway15y30

It's much older than that. A quick Google found a mention from 1973, but I would not be surprised to find it is at least a century older.

[-]komponisto15y10

Summary added:

Rules of logic have counterparts in probability theory. This post discusses the probabilistic analogue of modus tollens (the rule that if A=>B is true and B is false, then A is false), which is the inequality P(A) ≤ P(B)/P(B|A). What this says, in ordinary language, is that if A strongly implies B, then proving A is approximately as difficult as proving B.
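For readers who want the step spelled out, the inequality follows in one line from the definition of conditional probability. A short derivation, assuming P(B|A) > 0:

```latex
\begin{aligned}
P(B) &\ge P(A \wedge B) = P(B \mid A)\,P(A) \\
\Longrightarrow\quad P(A) &\le \frac{P(B)}{P(B \mid A)}
\end{aligned}
```

So when P(B|A) is close to 1, P(A) can exceed P(B) only slightly.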

[-]Nisan15y10

I'd like to add that the high "burden of proof" in this case comes from both

(1) the low prior probability of guilt in this case; and

(2) the high probability threshold that the court generally demands ("beyond a reasonable doubt") before it will condemn the defendants. If we wanted to bring in decision theory, we would assign a lot of disutility to a wrongful conviction. This determines what "likely" and "unlikely" mean in this context.

[-]jimmy15y50

The former dwarfs the latter.

The prior for any given person being guilty is on the 'one in a million' order of magnitude, but the courts are probably closer to 1 in 10 on the margin (wild ass guess). If you translate "beyond reasonable doubt" to 99% or 99.9%, that still might translate down to 90% once you take into account overconfidence.

From looking at this example, it certainly doesn't look like the algorithm used by the court system has an innocent to guilty ratio of anywhere near as low as 1 in a million on the marginal cases.

It's a bit of an 'Einstein's arrogance' thing.

[-]TheOtherDave15y00

Yes, if you want to argue P1 from P2, you must show P2. And whatever standard of proof you demand for acting on P1, you should also demand for P2 if you're using it this way: lowering the bar for proving P2 and then arguing P1 from P2 implicitly lowers the bar for proving P1.

And that's no less true if P1 = Knox and Sollecito are guilty and P2 = the burglary was fake. This much ought to be uncontroversial.

Of course, what evidence the prosecution actually has for P1 and P2 in this case is a different question.

I must sheepishly admit that the elaborate explanation actually made it more difficult for me to understand the argument; I had to reconstruct it myself in order to see it.

[-]komponisto15y00

I must sheepishly admit that the elaborate explanation actually made it more difficult for me to understand the argument; I had to reconstruct it myself in order to see it.

I'm sorry to hear that. Unfortunately, adopting a general policy of eschewing elaborate explanations, and just stating the distilled main point and expecting everyone to understand its significance, won't work either (it's been tried).

I'm happy you at least did end up understanding the argument.

It might be more helpful if you identified particular passages that seemed to cloud your understanding.

Of course, what evidence the prosecution actually has for P1 and P2 in this case is a different question.

Do you suspect them of having strong evidence for either?

[-]TheOtherDave15y10

Agreed that just distilling the main point would do no better, and quite possibly worse.

It might be more helpful if you identified particular passages that seemed to cloud your understanding.

I didn't mean to suggest it was particularly cloudy; it was more meant as an admission of personal failing.

That said, most of my confusion stemmed from thinking you were introducing a new claim when you were actually introducing an alternate framing of the same claim. Were I editing this for publication I would recommend clearly labeling and separating those framings -- e.g., the probability-mathematical discussion vs. the real-world background -- and adding a brief introduction summarizing the basic argument.

Aka "tell 'em what you're going to tell 'em, tell 'em, and then tell 'em what you told 'em."

Do you suspect them of having strong evidence for either?

Nope. I know absolutely nothing about this case other than what you present here, and what you present doesn't suggest any such evidence.

[-]The Dao of Bayes15y00

(2') If A is (sufficiently) strong evidence of B, then the prior probability of A can't be much higher than the prior probability of B.

The logic and math of this post seem very confused. It feels like you are saying "If the sun rises tomorrow, I will kill you. The probability of me being a murderer is 1:10^8, therefore the probability of the sun rising tomorrow cannot be much higher than 1:10^8"

First off, there's some very crucial evidence you are forgetting in evaluating this case. The key element here is that numerous small bits of evidence are cumulative. This is a very important point, and one which jsteinhardt touched on already.

First, we have a very major piece of evidence: A murder did in fact occur, and the murderer must have been in Perugia at the time they committed this murder. At this point, we have approximately 10^5 possible suspects (Perugia has a population of 166,253), and we know, factually, that one of them is the guilty party. If we had no other evidence, we could reasonably assign a probability of 1:10^5 that each one is guilty. You'll notice that this is vastly higher than the normal probability of someone being a murderer, because we already have quite a few bits of evidence.

If the burglary was faked with odds of 10^4:1, then we can assume that everyone that had a motive to do so now has a guilt probability of 10^4:10^5, or approximately 1:10. A 10% chance of Amanda Knox being guilty is certainly poor evidence, and I don't see any reason to favor her over other people who have been demonstrated to have equal motive, but I'm also basing this entirely on this specific post.
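The arithmetic in the paragraph above can be reproduced as an odds update. This is only a sketch of the calculation as stated: it assumes the 10^4:1 figure acts as a likelihood ratio on the 1:10^5 prior odds, which is an interpretation and not something the comment spells out.

```python
from fractions import Fraction

# Figures as stated in the comment above; this reproduces its arithmetic only.
prior_odds = Fraction(1, 10**5)      # 1:10^5 prior odds for any one Perugia resident
likelihood_ratio = Fraction(10**4)   # "faked with odds of 10^4:1", treated as a likelihood ratio

posterior_odds = prior_odds * likelihood_ratio          # 1:10
posterior_prob = posterior_odds / (1 + posterior_odds)  # odds -> probability

print(posterior_odds)          # 1/10
print(float(posterior_prob))   # about 0.0909
```

Odds of 1:10 correspond to a probability of 1/11, about 9%, which the comment rounds to 10%.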

The consequences of the burglary being faked do not change based on the probability that it occurred, any more than my threat to kill you tomorrow will prevent the sun from rising. If we're dealing with probability, then there is some factual probability that the burglary was faked, based on its own evidence, and this probability is entirely independent of the consequences. Further, this probability, and the probability that (Burglary Faked => Amanda is Guilty) cannot be 100%, despite your post assuming such. You cannot include impossible numbers and then expect a firm conclusion to arise.

P.S. If your point was simply "The judge is assuming impossible numbers", then I'd feel you are probably wrong on this point. I'd be happy to elaborate if that is in fact the case.

P.P.S. You can argue that a "higher standard of evidence" for proving that may be required, based on legal and moral principles, but that has nothing at all to do with probabilities.

[-]komponisto15y40

First of all, Welcome to Less Wrong!

The logic and math of this post seem very confused. It feels like you are saying "If the sun rises tomorrow, I will kill you. The probability of me being a murderer is 1:10^8, therefore the probability of the sun rising tomorrow cannot be much higher than 1:10^8"

Well, if you knew that

(1) if the sun will rise tomorrow, then I am a murderer,

and you also knew that

(2) I am not a murderer,

then you would indeed know that

(3) the sun will not rise tomorrow.

First off, there's some very crucial evidence you are forgetting in evaluating this case.

There is very little -- certainly very little of importance -- that I have forgotten about this case. And I have pretty much all of the publicly available information that exists about it at my fingertips, in case I do forget anything. So, no.

What I am aware of and what I explicitly mention in a particular post are not the same thing.

The key element here is that numerous small bits of evidence are cumulative.

While this is mathematically as beyond dispute as (say) the formulas I presented in the post, it's worth noting that approaching something like a murder case in this way is highly dangerous, due to various cognitive biases (which of course are our subject matter here on LW). There is a serious risk of misjudging the strength of such small pieces of evidence, and compounding the error by missing dependence relations, so that you end up double-counting evidence.

But anyway, this doesn't have much to do with this post.

The consequences of the burglary being faked do not change based on the probability that it occurred, any more than my threat to kill you tomorrow will prevent the sun from rising. If we're dealing with probability, then there is some factual probability that the burglary was faked, based on its own evidence, and this probability is entirely independent of the consequences

The intuition you're describing here is exactly the one that my post aims to refute.

It might seem, as it no doubt did to Massei and Cristiani, that you should be able to establish whether the burglary was fake independently of whether Knox and Sollecito killed Kercher. After all, there isn't much physical connection between the events in Romanelli's room and the events in Kercher's, is there? But this is a mistake -- or at least, it is so long as you believe that establishing the burglary was fake would imply that Knox and Sollecito killed Kercher.

In principle, you certainly could establish that the burglary was fake without making any tacit assumption that Knox and Sollecito have a substantial probability of being guilty of murder; but the type of evidence you would need to do that would have to be very strong -- around as strong as the evidence needed to show their guilt independently of the burglary question.

P.S. If your point was simply "The judge is assuming impossible numbers", then I'd feel you are probably wrong on this point. I'd be happy to elaborate if that is in fact the case.

I'm not sure what you mean here, but it sounds like you perhaps think that Massei and Cristiani's reasoning is sound. (Do you think that Knox and Sollecito are likely guilty? If so, I'd be happy to discuss that, but this post wouldn't be the place to do it.)

P.P.S. You can argue that a "higher standard of evidence" for proving that may be required, based on legal and moral principles, but that has nothing at all to do with probabilities.

If you read the post, you'll see that it's pretty much entirely about probabilities.

[-]The Dao of Bayes15y00

I feel you can demonstrate quite amply that A is not sufficient proof of B, and that A=>B has not been sufficiently proven either.

However, neither of these assertions seems to be your point. You seem to be insisting that you can't prove A, and I see absolutely no evidence of that, unless you take as given the assumption A=>B. I would certainly challenge that assumption.

Am I mistaken in this understanding of your point?

P.S. I feel the evidence suggests Knox is guilty at around a 10% chance, based solely on the evidence in this post. I do not feel a 10% chance of guilt is sufficient. I have not considered any evidence outside this post, as my interest is in the probability math, and not in the actual case itself.

P.P.S. A discussion of the dangers of cognitive biases is, I feel, entirely orthogonal to a discussion on probabilities and mathematics. Given my interest is in the math, not the case, I am going to skip over discussion of such biases.

[-]komponisto15y20

So you don't agree that if Knox and Sollecito faked the burglary, then they are likely guilty of murder?

I feel the evidence suggests Knox is guilty at around a 10% chance, based solely on the evidence in this post

There isn't much evidence presented in this post -- hardly any at all. (Plenty of information is linked to, of course...)

A discussion of the dangers of cognitive biases is, I feel, entirely orthogonal to a discussion on probabilities and mathematics.

Well, then I must say you're on the wrong website!

But if your interest is more in the math than in the case, I'm not sure what you're disagreeing with me about. It's kind of hard to dispute the inequality

$P(A) \leq \frac{P(B)}{P(B|A)}$

isn't it?

[-]The Dao of Bayes15y00

Your post is entangling three separate issues, and I think that's making it confusing to discuss (it was certainly confusing to read!)

Mathematics: "P(A) <= P(B) / P(B|A)."

No argument here.

Probability: How does the evidence A impact the probability of conclusion B?

I feel you are using entirely incorrect math for the situation, as stated in my previous posts. Just because the formula is correct, does not mean it is applicable to the problem you are trying to solve.

If A is proven, and A=>B is proven, then B is proven. The prior probability of B cannot negate the proof of A, nor the proof of A=>B, and thus has absolutely no bearing on the situation. Prior probability matters if, and only if, we are discussing p(A) and p(A=>B), at which point we still have new evidence (A, A=>B) that requires us to update to a new probability of B.

You cannot continue to assert the prior probability of B, despite new evidence that suggests a higher or lower chance of B.

Cognitive Bias: Is the judge properly evaluating p(A) and p(A=>B)?

I feel that there is insufficient information to draw a firm conclusion here. However, based on what you have said, I feel rather strongly that you have misinterpreted his evaluations, because you are assuming that common language and logical language are the same.

[-]komponisto15y00

If A is proven, and A=>B is proven, then B is proven

Agreed.

The prior probability of B cannot negate the proof of A, nor the proof of A=>B, and thus has absolutely no bearing on the situation

This sentence doesn't make sense as written. I don't know what it means for a probability to "negate" a proof, and so I don't know what you're trying to say when you assert that this can't happen.

My best guess is that you're trying to say that "even if P(A) is small on account of P(B) being small, some finite amount of evidence will still suffice to prove A, and therefore B." Which is obviously true, and nothing I have written says otherwise.

You cannot continue to assert the prior probability of B, despite new evidence that suggests a higher or lower chance of B.

This sounds like our previous discussion, where you said, and I agreed, that other evidence that Knox and Sollecito killed Kercher could raise the probability of their having faked the burglary. I've never disputed this, but have pointed out that this isn't Massei and Cristiani's reasoning. They attempted to prove the fake burglary without invoking the other murder evidence.

However, based on what you have said, I feel rather strongly that you have misinterpreted his evaluations, because you are assuming that common language and logical language are the same.

You'll have to be more specific here.

[-]The Dao of Bayes15y-20

Ahhh, you make so much more sense when you phrase it this way!

"other evidence that Knox and Sollecito killed Kercher could raise the probability of their having faked the burglary"

But my point is, this is backwards. It only works if you assume with near-100% certainty that faking the burglary and being the murderer are correlated. Otherwise "faked the burglary" IS simply evidence that Knox is the murderer.

If we prove that Knox killed Kercher, it proves that any 100% correlation is true. It does NOT prove any less-than-100% correlation. It's even entirely possible for a correlation to be one-directional (A implies B, but B does not imply A).

Thus, Knox killed Kercher is only proof of a faked burglary if you already assume the correlation is proven and two-directional.

[-]nshepperd15y50

In probability, "correlations" are always bidirectional. Bayes theorem:

$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$

If P(B|A) > P(B), then P(A|B) > P(A). By the same factor even:

$\frac{P(A|B)}{P(A)} = \frac{P(B|A)}{P(B)}$

[-]Jack15y40

The analogy to biconditionality in deductive logic would be P(A|B)= P(B|A) which obviously isn't always true.

[-]The Dao of Bayes15y00

I'm just trying to understand your point a bit better. Hopefully you don't mind the late reply (I've been on vacation for a while)

"In probability, "correlations" are always bidirectional."

Can't there be three separate, equally valid points which, if proven, would prove she was the murderer? Even if those three equally valid proofs of her guilt are contradictory? Once we know she is guilty, they can't all three be true, can they?

I'm not sure how one would accurately express this, given what you're saying. The probability that A implies Guilt, B implies Guilt, and C implies Guilt can all be 100%, yes? Obviously, the probability that guilt implies all of A+B+C is 0%, since they are contradictory. Therefore, how can it be correct to assume the opposite correlation, that Guilt implies A at 100% certainty?

[-]nshepperd15y30

It isn't!

In general it is not true that P(A|B) = P(B|A). P(A|Guilt) depends on the prior probabilities of A and Guilt, as well as P(Guilt|A). For example, say we have four possible proofs A, B, C, D, and P(Guilt|A or B or C) = 1, and P(Guilt|D) = 0. Our prior is all four are equally likely: P(A) = P(B) = P(C) = P(D) = 0.25. P(Guilt) is then 0.75 = P(Guilt|A)P(A) + P(Guilt|B)P(B)...

Given this, we have:

$\begin{aligned}P(A|Guilt) &= \frac{P(Guilt|A)P(A)}{P(Guilt)}\\ &= \frac{1.0 \cdot 0.25}{0.75}\\ &= \frac{1}{3}\end{aligned}$

P(A|Guilt) isn't 1. But it's 33%, which is still higher than the prior 25%: that is, Guilt is evidence for A.
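The four-proof example can also be checked mechanically. The sketch below uses exact fractions and assumes, as the example seems to intend, that A, B, C and D are mutually exclusive and exhaustive; the dictionary names are hypothetical.

```python
from fractions import Fraction

# Hypothetical encoding of the four-proof example: equal priors, A/B/C imply guilt, D rules it out.
prior = {name: Fraction(1, 4) for name in "ABCD"}
p_guilt_given = {"A": Fraction(1), "B": Fraction(1), "C": Fraction(1), "D": Fraction(0)}

# Law of total probability: P(Guilt) = sum over proofs of P(Guilt | proof) * P(proof)
p_guilt = sum(p_guilt_given[n] * prior[n] for n in prior)

# Bayes' theorem: P(A | Guilt) = P(Guilt | A) * P(A) / P(Guilt)
p_a_given_guilt = p_guilt_given["A"] * prior["A"] / p_guilt

print(p_guilt)          # 3/4
print(p_a_given_guilt)  # 1/3
```

Learning Guilt raises the probability of A from 1/4 to 1/3 without bringing it anywhere near 1.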

By the way I think it might help if you avoid talking in proofs and implication and 100% certainty. In hypothetical examples it's useful to set things to P(X) = 1, but in the real world evidence is always probabilistic; nothing's ever 100%.

[-]The Dao of Bayes15y00

Ahhh, that helps clear things up. For some reason I'd been understanding you as saying that, given P(Guilt|A) = 1, P(A|Guilt) was also 1. It looks like what you meant was just that Guilt is evidence for, but not necessarily 100% proof of, A. Am I getting that all correct?

[-]nshepperd15y10

Yes.

P(Guilt|A) = P(A|Guilt) only when P(A) = P(Guilt). In which case it would be 100% proof. But that is a rare situation.

[-]Psy-Kosh15y10

Nitpick: the two conditionals would also be equal if A and Guilt were mutually exclusive. (In that case, of course, the two conditionals would both be zero.)

[-]komponisto15y50

Theorem: If A is evidence of B, then B is also evidence of A.

Proof: To say that A is evidence of B means that P(A|B) > P(A|~B), or in other words that P(A&B)/P(B) > P(A&~B)/P(~B), which we may write as P(A&B)/P(B) > (P(A)-P(A&B))/(1-P(B)). Algebraic manipulation turns this into P(A&B) > P(A)P(B), which is symmetric in A and B; hence we can undo the manipulations with the roles of A and B reversed to arrive back at P(B|A) > P(B|~A). QED.

Hence, if A implies B, then B also implies A!

Now of course, the strengths of these implications might be vastly different. But that's a separate matter.
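To see the symmetry on concrete numbers, here is a small sketch that tests both directions and the symmetric form P(A&B) > P(A)P(B). The joint distribution is made up for illustration, and `is_evidence` is a hypothetical helper, not anything from the post.

```python
from fractions import Fraction

def is_evidence(p_joint, p_x, p_y):
    """True if X is evidence of Y, i.e. P(X | Y) > P(X | not Y)."""
    p_x_given_y = p_joint / p_y
    p_x_given_not_y = (p_x - p_joint) / (1 - p_y)
    return p_x_given_y > p_x_given_not_y

# Hypothetical joint distribution over two events A and B.
p_a, p_b, p_ab = Fraction(1, 10), Fraction(3, 10), Fraction(1, 20)

a_is_evidence_of_b = is_evidence(p_ab, p_a, p_b)   # P(A|B) > P(A|~B)?
b_is_evidence_of_a = is_evidence(p_ab, p_b, p_a)   # P(B|A) > P(B|~A)?
symmetric_form = p_ab > p_a * p_b                  # P(A&B) > P(A)P(B)?

print(a_is_evidence_of_b, b_is_evidence_of_a, symmetric_form)
```

All three checks agree for these numbers, as the theorem says they must. The strengths differ, though: here P(B|A) is 1/2 while P(A|B) is only 1/6.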

Here, the point is that A implies B with near certainty (where A is "K&S faked burglary" and B is "K&S killed Kercher"); I'm not terribly concerned with how strongly B implies A. I don't need for B to imply A very strongly to make my point, but Massei and Cristiani would definitely need that in order to enable any charitable reading of their burglary section at all.

[-]jsteinhardt15y00

But, of course, the mathematics of probability theory don't work that way. A hypothesis, such as that the apparent burglary in Filomena Romanelli's room was staged -- doesn't get points for its ability to explain the data unless it does so better than its negation. And, in the absence of the assumption that Knox and Sollecito are guilty -- if we're presuming them to be innocent, as the law requires, or assigning a tiny prior probability to their guilt, as epistemic rationality requires -- this contest is rigged. The standards for "explaining well" that the fake-burglary hypothesis has to meet in order to be taken seriously are much higher than those that its negation has to meet, because of the dependence relation that exists between the fake-burglary question and the murder question.

This isn't quite true. If the prior probability of being a murderer is 1 in 10^6, and I can find 30 things that are explained twice as well by the murder hypothesis as the non-murder hypothesis, then the posterior probability of being a murderer is 99.9%, in the absence of mitigating factors (since 2^30/10^6 is about 1000.) So, many pieces of weak evidence for an unlikely proposition can still establish that proposition.
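The arithmetic here is easy to verify. A minimal sketch, assuming the 30 items are independent so that their likelihood ratios simply multiply (the variable names are illustrative):

```python
prior_odds = 1 / 10**6        # prior odds of guilt, as in the comment above
likelihood_ratio = 2          # each item is explained twice as well by guilt
n_items = 30                  # number of items, assumed mutually independent

posterior_odds = prior_odds * likelihood_ratio ** n_items
posterior_prob = posterior_odds / (1 + posterior_odds)

print(f"posterior odds: {posterior_odds:.1f} : 1")
print(f"posterior probability: {posterior_prob:.4f}")
```

If the items are correlated, the effective likelihood ratio per item is smaller and the product overstates the case.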

[-]Nisan15y60

You'd also need those 30 things to be independent.

[-]NihilCredo15y10

Probably superfluous nitpicking: you can build a strong case even with partially-interdependent pieces of evidence, you will just need more of them since you have to work off their conditional probabilities (which is mathematically equivalent to "splitting" them into independent pieces of evidence).

[-]komponisto15y00

So, many pieces of weak evidence for an unlikely proposition can still establish that proposition.

This doesn't contradict anything in the paragraph you quoted. (If you don't mind, tell me where you thought the contradiction was, so that I can explain further.)

[-]jsteinhardt15y10

The sentence in particular that I objected to was

The standards for "explaining well" that the fake-burglary hypothesis has to meet in order to be taken seriously are much higher than those that its negation has to meet, because of the dependence relation that exists between the fake-burglary question and the murder question.

My impression was that you were claiming that, since the fake burglary hypothesis would imply murder, evidence must be extremely strong to be counted in favor of fake burglary. But I may have misunderstood you. At any rate, you elsewhere state that

Of their 427-page report, Massei and Cristiani devote approximately 20 pages (mainly pp. 27-49) to their argument that the burglary was staged by Knox and Sollecito [...] if they were really able to demonstrate this, they would scarcely have needed to bother writing the remaining 400-odd pages of the report!

If you agree with my point, then I don't see how you can find it odd that they would feel obliged to include more in their report than just the claim that the burglary was faked. Like you said, even if the evidence is fairly strong in favor of this assertion, far more evidence would be needed to convict those two of murder, which is presumably the point of the remaining 400 pages.

[-]komponisto15y50

My impression was that you were claiming that, since the fake burglary hypothesis would imply murder, evidence must be extremely strong to be counted in favor of fake burglary

My claim is that to reach a desired level of certainty about the burglary being faked, you would need evidence of approximately the same strength required to reach the same level of certainty about murder. (In other words, that the prior probability of fake burglary is roughly the prior probability of murder.)

Like you said, even if the evidence is fairly strong in favor of [fake burglary], far more evidence would be needed to convict those two of murder

This is the very opposite of what I said! What I said was that if you knew with high confidence that the burglary was fake, then you would need almost no additional evidence to convict of murder.

[-]jsteinhardt15y30

Okay, so we seem to be in complete agreement about how the math works out. If so, then I'm confused as to why you object so strongly to the prosecution's argument on purely mathematical grounds; I haven't read their argument myself, so it's entirely possible that the argument itself is weak in some way, but I think that right now we're just talking about the math.

If we ignore their specific language, the plan of coming up with ~20 pieces of moderate evidence is a perfectly reasonable strategy for correctly establishing guilt, assuming that there is absolutely no mitigating evidence. Your complaint seems to be that they use different language/notation than you and I do to talk about evidence, which seems hardly fair.

Although I would also note that since humans are bad at intuitively distinguishing between moderate evidence for and moderate evidence against a hypothesis, trying to find many pieces of weak evidence is probably not a good strategy if the goal is to get humans to correctly decide the accuracy of an assertion.

ETA: By the way, I've been working under the assumption, based on the tone of the original post, that you think there are serious mathematical flaws in the prosecution's argument. If that's not the case, and you just wanted to use this case as a point of illustration, then I apologize for the confusion.

[-]Jack15y40

What I gather is that the prosecution concludes, after the first twenty pages of the brief (which discuss the break-in exclusively), that the break-in was almost certainly staged by Knox and Sollecito. But if they really thought that, they would have already more or less made the case that Knox and Sollecito are guilty, and the remaining 380 pages would be unnecessary. So the prosecution can't be weighing the evidence correctly.

[-]komponisto15y20

Okay, so we seem to be in complete agreement about how the math works out. If so, then I'm confused as to why you object so strongly to the prosecution's argument on purely mathematical grounds; I haven't read their argument myself, so it's entirely possible that the argument itself is weak in some way, but I think that right now we're just talking about the math.

If I may presume to diagnose your confusion, it seems that you're compartmentalizing between "mathematical" aspects of an argument and "other" aspects. But I'm not. I'm taking it for granted that "the math" is the argument. Probability theory is a mathematical formalization of the process of argument and inference. It isn't just a cool gadget that one throws in on special occasions.

So, I don't object to Massei and Cristiani's argument on "purely mathematical grounds". I simply object to it, period -- and in this post I have used mathematical language to describe, in precise terms, what my objection is.

(And I expected readers to assume, given my previous writing on the case, that this particular point was far from my only objection to Massei and Cristiani's 427-page argument that Knox and Sollecito killed Kercher; hence I was not expecting replies of the form "well, but they might have other good evidence that Knox and Sollecito are guilty". They don't; we've already covered that.)

[-]komponisto15y10

If we ignore their specific language, the plan of coming up with ~20 pieces of moderate evidence is a perfectly reasonable strategy for correctly establishing guilt, assuming that there is absolutely no mitigating evidence. Your complaint seems to be that they use different language/notation than you and I do to talk about evidence, which seems hardly fair.

I honestly have no idea where you're getting this from. I don't know of any passage in the post where I complained about Massei and Cristiani's choice of language; and nor did I attempt to argue (as several people seem to have thought I did) against a strategy of proving one's case by adducing a large amount of weak evidence in one's favor (although as a matter of fact I do believe that is the wrong type of argument to expect for a proposition of this sort, and that people have probably been misled by detective stories and the like into thinking it a reasonable strategy, when it would actually be very difficult to make work in practice -- that however would be the topic of a separate post, and isn't addressed in this one).

My criticism of Massei and Cristiani in this post is really quite simple, or so I thought: the type of evidence that they cite to prove that the burglary was faked suggests that they did not realize how high the burden of proof for this proposition was -- that, just to prove the burglary was faked, they needed evidence of the same level of strength as would be required to directly prove Knox and Sollecito guilty of murder.

Quite frankly, I'm baffled at how this point seems to have gotten lost, because I thought I was emphatic and indeed repetitious about it in the post.
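For concreteness, here is a minimal Python sketch of that burden-of-proof point. The numbers are the illustrative ones from the paraphrase at the top of the thread (a prior of about one in a million for guilt, a 99% chance of guilt given a staged burglary, 1000:1 evidence for staging); they are assumptions for illustration, not estimates from the case, and the function names are made up for this example.

```python
def prob_to_odds(p):
    """Convert a probability into odds in favor (p : 1-p)."""
    return p / (1.0 - p)


def odds_to_prob(odds):
    """Convert odds in favor back into a probability."""
    return odds / (1.0 + odds)


def max_prior_staged(p_guilty, p_guilty_given_staged):
    """Upper bound on P(staged) implied by the prior on guilt.

    Since P(guilty) >= P(guilty and staged) = P(guilty | staged) * P(staged),
    it follows that P(staged) <= P(guilty) / P(guilty | staged).
    """
    return min(1.0, p_guilty / p_guilty_given_staged)


def posterior_staged(prior_staged, likelihood_ratio):
    """Update P(staged) by evidence with the given likelihood ratio."""
    return odds_to_prob(prob_to_odds(prior_staged) * likelihood_ratio)


# Illustrative (assumed) inputs, not estimates from the actual case.
p_guilty = 1e-6                # prior probability of guilt
p_guilty_given_staged = 0.99   # staging would almost certainly imply guilt
evidence_for_staging = 1000.0  # likelihood ratio of the glass-pattern arguments

prior_cap = max_prior_staged(p_guilty, p_guilty_given_staged)
posterior_cap = posterior_staged(prior_cap, evidence_for_staging)

print(f"P(staged) can be at most about {prior_cap:.2e} before the evidence")
print(f"and at most about {posterior_cap:.2e} after 1000:1 evidence")
```

Under these assumptions the staging hypothesis starts out about as improbable as guilt itself, so even 1000:1 evidence leaves it at roughly one in a thousand.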

[-]shokwave15y00

If we ignore their specific language, the plan of coming up with ~20 pieces of moderate evidence is a perfectly reasonable strategy for correctly establishing guilt, assuming that there is absolutely no mitigating evidence. Your complaint seems to be that they use different language/notation than you and I do to talk about evidence, which seems hardly fair.

I think the assertion is that they appear to be coming up with ~20 pieces of evidence and then trying to say that each piece is very strong - or at least, they have done so for the burglary hypothesis, so they might be doing so for the other pieces of evidence too. Naturally, their methods of making each piece look very strong are flawed.

You almost pinpointed the reason why this is happening here:

trying to find many pieces of weak evidence is probably not a good strategy if the goal is to get humans to correctly decide the accuracy of an assertion.

Humans are bad at intuitively handling evidence in general. There is a possibility that this case suffers from a serious malady: presiding judge Massei has decided the correct, accurate decision in this matter is that Knox and Sollecito are guilty, and has strategically prepared the judge's report to get people to decide this way. This hypothesis explains why the judge has produced such a weighty document when 20 pages of it would have sufficed.

[-]The Dao of Bayes15y00

"my claim is that to reach a desired level of certainty about the burglary being faked, you would need evidence of approximately the same strength required to reach the same level of certainty about murder."

This assumes that the burglary being faked is the only piece of evidence. If we have three sets of evidence, and each one suggests a 90% chance of guilt, and each is independent of the other, then we have probability (10:1) x (10:1) x (10:1) = (1000:1). No one set of evidence needs to have a (1000:1) probability of guilt in order to reach a final conclusion that the odds are (1000:1). Arguing via modus tollens about a single piece of evidence tells us only that that evidence, in and of itself, is insufficient proof. It tells us nothing about how that evidence may act cumulatively with other pieces of evidence.
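The arithmetic above can be sketched in a few lines of Python. This is a rough illustration under two assumptions that the comment leaves implicit: each "10:1" is treated as a likelihood ratio rather than a posterior, and the pieces of evidence are conditionally independent. Note that a 90% chance corresponds to odds of 9:1, so 10:1 is a rounding. The function names and the prior values are hypothetical.

```python
from math import prod


def combine_independent(prior_odds, likelihood_ratios):
    """Posterior odds after conditionally independent pieces of evidence.

    Each likelihood ratio multiplies the running odds; this is only valid
    when the pieces of evidence are independent given guilt and given
    innocence.
    """
    return prior_odds * prod(likelihood_ratios)


def odds_to_prob(odds):
    """Convert odds in favor into a probability."""
    return odds / (1.0 + odds)


three_pieces = [10, 10, 10]

# Starting from even odds (an assumption), three 10:1 pieces give 1000:1.
from_even_prior = combine_independent(1.0, three_pieces)

# Starting from a one-in-a-million prior (also an assumption), the same
# evidence only brings the odds up to about 1:1000.
from_low_prior = combine_independent(1e-6, three_pieces)

print(f"even prior: odds {from_even_prior:.0f}:1, "
      f"probability {odds_to_prob(from_even_prior):.4f}")
print(f"low prior:  odds {from_low_prior:.1e}:1, "
      f"probability {odds_to_prob(from_low_prior):.4f}")
```

The final odds depend on the prior as much as on the evidence: the 1000:1 figure is reached only if the starting odds are even.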

[-]komponisto15y10

This assumes that the burglary being faked is the only piece of evidence

No; I fully grant that other evidence that Knox and Sollecito are guilty, if it exists, would be evidence of the burglary being fake, which would lower the burden of proof on that hypothesis.

However, that isn't how Massei and Cristiani reason. They don't say, in the section on the burglary (which is at the beginning of the report), "and since we know from all the other evidence that Knox and Sollecito are guilty, we can therefore easily use these arguments about glass patterns to confirm that they did in fact stage the burglary, in case you were wondering about that". And it's easy to see why they don't say that: there wouldn't be much point, because if they've already shown that Knox and Sollecito are guilty, their work is done! (*)

Instead, what they say is "these arguments about glass patterns etc. prove that the burglary was staged. Now, having established that piece of evidence against them (i.e. the staging of the burglary), let us now consider the other evidence, which, in combination with the burglary, will show how really guilty they are."

(*) Technically, staging a burglary is itself an offense, so there may actually have been reason for them to proceed this way. But in that case the burglary issue would have come at the end of the report, not the beginning.

[-]The Dao of Bayes15y00

"these arguments about glass patterns etc. prove that the burglary was staged. Now, having established that piece of evidence against them (i.e. the staging of the burglary), let us now consider the other evidence, which, in combination with the burglary, will **give us an accurate probability on whether they are guilty**"

I've bolded a single change to your quote. With that change made, do you feel this is a reasonable assertion?

[-]komponisto15y00

No. The error is in the first sentence:

these arguments about glass patterns etc. prove that the burglary was staged.

They only (conceivably) prove the burglary was staged if you're already taking into account the rest of the evidence of murder.

[-]The Dao of Bayes15y00

That's only true if you assume p(A=>B) is 1

[-]komponisto15y00

...or approximately 1.

(And by P(A=>B), I think you meant P(B|A), didn't you?)

[-]The Dao of Bayes15y-10

P(Someone faked the burglary) != P(Amanda Knox faked the burglary). The report asserts the first, not the second, from my reading.

Given that "someone faked" is true, I think assigning an approximately 100% chance that Amanda Knox is guilty is rather seriously unfounded. What am I missing?

[-]komponisto15y00

What am I missing?

That "burglary was faked" is shorthand for "burglary was faked by Knox and Sollecito" throughout this post and discussion. The latter is what Massei and Cristiani argue, and is what would most strongly imply that Knox and Sollecito are guilty of murder.

[-]The Dao of Bayes15y10

The evidence you quoted merely suggests the burglary was faked. I'd assume there are more people with a motive to do that than just Knox and Sollecito? Why would we assume, with high enough certainty to convict, that it was certainly them and not a roommate, or someone who knew them?

[-]komponisto15y00

Look, I'm not saying Massei and Cristiani's argument that Knox and Sollecito staged the burglary is convincing, by any means!

That said, their argument that if the burglary was staged, the staging was done by Knox and Sollecito is probably the most convincing part of it. At the very least, they would have a highish prior, since they had access to the house and were "available" that night to do the staging if they wanted to.

[-]Jack15y00

I figured this out but it threw me when I got to this part of the post. I'm not sure the convenience of the shorthand justifies throwing your readers off.

[-]Fuji12y-30

Exactly. The problem for Knox and Sollecito is that there is so much evidence that even if it was all weak (and it isn't), just the number of items is sufficient to arrive at a high certainty of guilt, because they are all independent events.

http://themurderofmeredithkercher.com/The_Evidence

That is a lot of evidence. Some items are so strong that they in isolation would be sufficient to reach the level of certainty required to convict, and others are strong evidence but not enough. I count 24 items.

[-]Jack12y60

Only works if each piece of evidence is independent. They're clearly not.
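A toy Python sketch of why independence matters. The model is an assumption made purely for illustration: two pieces of evidence with the same conditional probabilities, where the second is a copy of the first with probability `copy_prob` and is otherwise drawn independently. None of the numbers or names come from the case.

```python
def joint_likelihood_ratio(p_e_given_guilty, p_e_given_innocent, copy_prob):
    """Likelihood ratio of observing BOTH pieces of evidence.

    With probability copy_prob the second piece simply repeats the first
    (full dependence); otherwise it is drawn independently with the same
    conditional probability.
    """
    def p_both(p):
        # P(E1 and E2 | hypothesis) under the copy-or-independent model.
        return p * (copy_prob + (1.0 - copy_prob) * p)

    return p_both(p_e_given_guilty) / p_both(p_e_given_innocent)


# Each piece alone has a likelihood ratio of 0.9 / 0.09 = 10 (assumed values).
independent = joint_likelihood_ratio(0.9, 0.09, copy_prob=0.0)
half_copied = joint_likelihood_ratio(0.9, 0.09, copy_prob=0.5)
fully_copied = joint_likelihood_ratio(0.9, 0.09, copy_prob=1.0)

print(f"independent:  {independent:.1f}")
print(f"half copied:  {half_copied:.1f}")
print(f"fully copied: {fully_copied:.1f}")
```

In this toy model the combined strength falls from the full product of the two ratios toward the strength of a single piece as the dependence grows, so counting items overstates the total when they share a common source.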
