That critique might sound good in theory, but I think it falls flat in practice. Hearsay is a rule with more than 30 exceptions, many of which seem quite technical and arbitrary. But I have seen no evidence that the public views legal systems that employ this sort of convoluted hearsay regime as less legitimate than legal systems that take a more naturalistic, Benthamite approach.
In practice, even laypeople who are participating in trials don't really see the doctrine that lies beneath the surface of evidentiary rulings, so I doubt they form their judgments of the system's legitimacy based on such details.
A few comments:
It is somewhat confusing (at least to legal readers) that you use legal terms in non-standard ways. Conflating confrontation with hearsay issues is confusing because making people available for cross-examination solves the confrontation problem but not always the hearsay one.
I like your emphasis on the filtering function of evidentiary rules. Keep in mind, however, that these rules have little effect in bench trials (which are more common than jury trials in state courts of general jurisdiction). And relatively few cases reach trial at all; more are disposed of by pretrial motions or by settlements. (For some data, you could check out this paper by Marc Galanter.) So this filtering process is only rarely applied in real-world cases!
Before suggesting that we should exclude evidence of low reliability, you should probably take more time to think about substitution effects. If lawyers cannot use multiply embedded hearsay, what will juries hear instead? Also, you would want to establish that juries would systematically err in their use of such evidence. It is not a problem to have unreliable evidence come in if juries in fact recognize its unreliability.
I've recently spent some time thinking about how we might apply the scientific method towards designing better rules of legal procedure and evidence. It turns out to be trickier than you might think, largely because it is hard to measure the impact of legal rules on the accuracy of case resolutions. If you are curious about such things (and with apologies for blatant self promotion), you might want to read some of what I wrote here, particularly parts 2-4.
This may be why very smart folks often find themselves unable to commit to an actual view on disputed topics, despite being better informed than most of those who do take sides. When attending to informed debates, we hear a chorus of disagreement, but very little overt agreement. And we are wired to conduct a head count of proponents and opponents before deciding whether an idea is credible. Someone who can see the flaws in the popular arguments, and who sees lots of unpopular expert ideas but few ideas that informed people agree on, may give up looking for the right answer.
The problem is that smart people don't give much credit to informed expressions of agreement when parceling out status. The heroic falsfier, or the proposer of the great new idea, get all the glory.
Internal credibility is of little use when we want to compare the credentials of experts in widely differing fields. But is is useful if we want to know whether someone is trusted in their own field. Now suppose that we have enough information about a field to decide that good work in that field generally deserves some of our trust (even if the field's practices fall short of the ideal). By tracking internal credibility, we have picked out useful sources of information.
Note too that this method could be useful if we think a field is epistemically rotten. If someone is especially trusted by literary theorists, we might want to downgrade our trust in them, solely on that basis.
So the two inquiries complement each other: We want to be able to grade different institutions and fields on the basis of overall trustworthiness, and then pick out particularly good experts from within those fields we trust in general.
p.s. Peer review and citation counting are probably incestuous, but I don't think the charge makes sense in the expert witness evaluation context.
True. But it is still easier in many cases to pick good experts than to independently assess the validity of expert conclusions. So we might make more overall epistemic advances by a twin focus: (1) Disseminate the techniques for selecting reliable experts, and (2) Design, implement and operate institutions that are better at finding the truth.
Note also that your concern can also be addressed as one subset of institutional design questions: How should we reform fields such as medicine or economics so that influence will better track true expertise?
Experts don't just tell us facts; they also offer recommendations as to how to solve individual or social problems. We can often rely on the recommendations even if we don't understand the underlying analysis, so long as we have picked good experts to rely on.
One can think that individuals can profit from being more rational, while also thinking that improving our social epistemic systems or participating in them actively will do more to increase our welfare than focusing on increasing individual rationality.
Care to explain the basis for your skepticism?
Interestingly, there may be a way to test this question, at least partially. Most legal systems have procedures in place to allow judgments to be revisited upon the discovery of new evidence that was not previously available. There are many procedural complications in making cross-national comparisons, but it would be interesting to compare the rate at which such motions are granted in systems that are more adversarially driven versus more inquisitorial systems (in which a neutral magistrate has more control over the collection of evidence).
Obviously it helps if the experts are required to make predictions that are scoreable. Over time, we could examine both the track records of individual experts and entire disciplines in correctly predicting outcomes. Ideally, we would want to test these predictions against those made by non-experts, to see how much value the expertise is actually adding.
Another proposal, which I raised on a previous comment thread, is to collect third-party credibility assessments in centralized databases. We could collect the rates at which expert witnesses are permitted to testify at trial and the rate at which their conclusions are accepted or rejected by courts, for instance. We could similarly track the frequency with which authors have their articles accepted or rejected by journals engaged in blind peer-review (although if the review is less than truly blind, the data might be a better indication of status than of expertise, to the degree the two are not correlated). Finally, citation counts could serve as a weak proxy for trustworthiness, to the degree the citations are from recognized experts and indicate approval.
Another good example is the legal system. Individually it serves many participants poorly on a truth-seeking level; it encourages them to commit strongly to an initial position and make only those arguments that advance their cases, while doing everything they can to conceal their cases' flaws short of explicit misrepresentation. They are rewarded for winning, whether or not their position is correct. On the other hand, this set-up (in combined with modern liberalized disclosure rules) works fairly well as a way of aggregating all the relevant evidence and arguments before a decisionmaker. And that decisionmaker is subject to strong social pressures not to seek to affiliate with the biased parties. Finally, in many instances the decisionmaker must provide specific reasons for rejecting the parties' evidence and arguments, and make this reasoning available for public scrutiny.
The system, in short, works by encouraging individual bias in service of greater systemic rationality.