whales — LessWrong

How do you assess the quality / reliability of a scientific study?

Answer by whales · Oct 30, 2019 · 90

Recapitulating something I've written about before:

You should first make a serious effort to formulate both the specific question you want answered, and why you want an answer. It may turn out surprisingly often that you don't need to do all this work to evaluate the study.

Short of becoming an expert yourself, your best bet is then to learn how to talk to people in the field until you can understand what they think about the paper and why—and also how they think and talk about these things. This is roughly what Harry Collins calls "interactional" expertise. (He takes gravitational-wave scientist Joe Weber's late work as an especially vivid example: "I can promise such lay readers that if they teach themselves a bit of elementary statistics and persevere with reading the paper, they will find it utterly convincing. Scientific papers are written to be utterly convincing; over the centuries their special language and style has been developed to make them read convincingly.... The only way to know that Weber’s paper is not to be read in the way it is written is to be a member of the ‘oral culture’ of the relevant specialist community." The full passage is very good.)

If you only learn from papers (or even textbooks and papers), you won't have any idea what you're missing. A lot of expertise is bound up in individual tacit knowledge and group dynamics that never get written down. This isn't to say that the 'oral culture' is always right, but if you don't have a good grasp of it, you will make at best slow progress as an outsider.

This is the main thing holding me back from running the course I've half-written on layperson evaluation of science. Most of the time, the best thing is just to talk to people. (Cold emails are OK; be polite, concise, and ask a specific question. Grad students tend to be generous with their time if you have an interesting question or pizza and beer. And I'm glad to answer physics questions by LW message.)

Short of talking to people, you can often find blogs in the field of interest. More rarely, you can also find good journalism doing the above kind of work for you. (Quanta is typically good in physics, enough so that I more or less trust them on other subjects.)

There's plenty to be said about primary source evaluation, which varies with field and which the other answers so far get at, but I think this lesson needs to come first.

Rationality Exercises Prize of September 2019 ($1,000)

whales · 6y* · 30

Hm, not sure what happened to the Washington Post comments. Sorry about that. Here's my guess as to what I was thinking:

The axes are comparing an average (median income) to a total (student loan debt). This is generally a recipe for uninformative comparisons. Worse, the average is both per person and per year. So by itself this tells you little about the debt burden shouldered by a typical member of a generation. For example, you could easily see growth in total debt while individual debt burden fell, depending on the growth in degrees awarded and the typical time to pay off debt. If you wanted to make claims about how debt burdens individuals, as the blurb does, you'd have to look at what's happening with the typical debt of recent graduates.
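To make the average-versus-total point concrete, here is a toy calculation. Every number in it is made up purely for illustration (none comes from the graph or from real data), and `total_debt` is a hypothetical helper. It shows that total debt can more than double while the typical borrower's balance falls, simply because the number of borrowers grows.

```python
# Hypothetical numbers, chosen only to illustrate the arithmetic.
def total_debt(borrowers, avg_debt_per_borrower):
    """Total outstanding debt = number of borrowers x average balance."""
    return borrowers * avg_debt_per_borrower

# Two made-up snapshots of a generation's borrowers.
earlier = {"borrowers": 10_000_000, "avg_debt": 20_000}
later = {"borrowers": 30_000_000, "avg_debt": 15_000}

total_earlier = total_debt(earlier["borrowers"], earlier["avg_debt"])
total_later = total_debt(later["borrowers"], later["avg_debt"])

# The total grows even though each borrower owes less on average.
total_growth = total_later / total_earlier
individual_growth = later["avg_debt"] / earlier["avg_debt"]

print(f"total debt changed by a factor of {total_growth:.2f}")
print(f"average individual debt changed by a factor of {individual_growth:.2f}")
```

With these invented inputs the total rises by a factor of 2.25 while the per-borrower figure falls to 0.75 of its earlier value, which is why a graph of the total says little by itself about individual burden.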

But of course you can't stop there and say, "Ah, Peter Thiel is trying to mislead me, I'm going to disbelieve what I see as his point." Recent-graduate debt has been increasing, just not as much as the graph suggests. And maybe total student loan debt is a significant number in its own right?

(I don't know if I had intended the above as "the answer"; more likely, I just wanted people thinking about it more thoroughly than some of the commentary I had seen at the time. You also make good points.)

Thanks for trying these out. I don't think I ever heard in detail from anyone who did (beyond "this was neat"). If I were writing them today I'd be less coy about it.

Rationality Exercises Prize of September 2019 ($1,000)

whales · 6y · 70

My past occasional blogging included a few exercises that might be of interest. I'm pretty sure #4 is basically an expanded version of something from the Sequences, although I don't recall which post exactly. Others are more open ended. (Along the lines of #5 I've been casually collecting examples of scientific controversy and speculation with relatively clear-cut resolutions for the purposes of giving interested laypeople practice evaluating these things, to the extent that's possible. I don't know if I'll ever get around to writing something up, but if anyone has their own examples, I'd love to hear about them.)

How Popper killed Particle Physics

whales · 8y · 30

Maybe we're talking about different things, but from the page I'm on now where I'm looking at and replying to the discussion of the link (https://www.lesserwrong.com/posts/vhAJ4DBXZukE7SNtq/how-popper-killed-particle-physics/) the only link to the actual article is still gjm's. In particular, the title of the blog post is not a link, although I would have expected it to be. To get to the actual article I have to click on the linkpost title in one of the other post listings (Featured/Frontpage/All). This happens to me for all link posts and for different browsers on both mobile and desktop.

Non-market failures: inefficient networks

whales · 8y · 10

Content note: This is a collection/expansion of stuff I've previously posted about elsewhere. I've gathered it here because it's semi-related to Eliezer's recent posts. It's not meant to be a response to the "inadequacy" toolbox or a claim to ownership of any particular idea, but only one more perspective people may find useful as they're thinking about these things.

Continuing the discussion thread from the MTG post

whales · 8y · 70

For what it's worth, I was another (the other?) person who downvoted the comment in question early (having upvoted the post, mostly for explaining an unfamiliar interesting thing clearly).

Catching up on all this has been a little odd to me. I'm obviously not a culture lord, but my vote also wasn't about this question of "the bar". The closest it came (not that I would naturally frame it this way) is that I read CoolShirtMcPants as doing something similar to what you said you were doing: "here is my considered position on this, I encourage people to try it on and attend to specifically how it might come out as I imply". I read you as creating an impasse instead of recognizing that and trying to draw out more concrete arguments/scenarios/evidence. Or, even if CSMP wasn't intentionally doing that, a "bar" should ask that you treat the comment that way.

On one hand, sure, the situation wasn't quite symmetric. And it was an obvious, generic-seeming objection, surely already considered at least by the author and better-expressed in other comments. But on the other hand, it can still be worth saying for the sake of readers or for starting a more substantive conversation; CSMP at least tried to dig a little deeper. And in this kind of blogging I don't usually see one person's (pseudonymously or otherwise) staking out some position as stronger evidence than another's doing so. Neither should really get you further than deciding it's worth thinking about for yourself. This case wasn't an exception.

(I waffled on saying anything at all here because your referendum, if there is one, appears to have grown beyond this, and all this stuff about status seems to me to be a poor framing. But reading votes is a tricky business, so I can at least provide more information.)

LDL 2: Nonconvex Optimization

whales · 8y · 10

Two more thoughts: the above is probably more common in [what I intuitively think of as] "physical" problems where the parameters have some sort of geometric or causal relationship, which is maybe less meaningful for neural networks?

Also, for optimization more broadly, your constraints will give you a way to wind up with many parameters that can't be changed to decrease your function, without requiring a massive coincidence. (The boundary of the feasible region is lower-dimensional.) Again, I guess not something deep learning has to worry about in full generality.
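A minimal sketch of the constraint point, under assumptions of my own choosing: a linear objective with made-up coefficients, box constraints between 0 and 1, and a plain projected gradient descent (`projected_gradient_descent` is an illustrative helper, not anything from the post). The gradient is never zero, yet the iterate ends up at a point where no parameter can move feasibly in a direction that decreases the function, because every one of them is pinned on the boundary.

```python
def projected_gradient_descent(grad, x, lower, upper, step=0.1, iters=200):
    """Gradient descent that clips each coordinate back into [lower, upper]."""
    for _ in range(iters):
        g = grad(x)
        x = [min(max(xi - step * gi, lo), hi)
             for xi, gi, lo, hi in zip(x, g, lower, upper)]
    return x

# Hypothetical objective: f(x) = 3*x0 - 2*x1 + 0.5*x2.
coeffs = [3.0, -2.0, 0.5]

def grad(x):
    # The gradient of a linear function is constant and nonzero here.
    return coeffs

lower, upper = [0.0] * 3, [1.0] * 3
x_final = projected_gradient_descent(grad, [0.5, 0.5, 0.5], lower, upper)

# A parameter is "pinned" if it sits exactly on one of its bounds.
pinned = [xi == lo or xi == hi for xi, lo, hi in zip(x_final, lower, upper)]
print(x_final, pinned)
```

Here all three parameters finish on a bound, so the point is a constrained minimum without any coincidence in the curvature. Real problems would have more structure than a linear objective over a box, so treat this only as an illustration of the mechanism.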

LDL 2: Nonconvex Optimization

whales · 8y · 40

Hm. Thinking of this in terms of the few relevant projects I've worked on, problems with (nominally) 10,000 parameters definitely had plenty of local minima. In retrospect it's easy to see how. Saddles could be arbitrarily long, where many parameters basically become irrelevant depending on where you're standing, and the only way out is effectively restarting. More generally, the parameters were very far from independent. Besides the saddles, for example, you had rough clusters of parameters where you'd want all or none but not half to be (say) small in most situations. In other words, the problem wasn't "really" 10,000-dimensional; we just didn't know how or where to reduce dimensionality. I wonder how common that is.
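As a toy illustration of parameters becoming irrelevant depending on where you're standing, consider a made-up loss (not one of the problems described above) in which a single `gate` parameter multiplies all the others. The function names, weights, and targets below are all hypothetical. When the gate is zero, the gradient with respect to every weight vanishes, so the nominally five-dimensional problem looks one-dimensional from that spot.

```python
def loss(gate, weights, targets):
    """Toy loss: each weight only matters through its product with the gate."""
    return sum((gate * w - t) ** 2 for w, t in zip(weights, targets))

def grad_weights(gate, weights, targets):
    """Partial derivatives of the loss with respect to each weight."""
    return [2 * gate * (gate * w - t) for w, t in zip(weights, targets)]

def grad_gate(gate, weights, targets):
    """Partial derivative of the loss with respect to the gate."""
    return sum(2 * w * (gate * w - t) for w, t in zip(weights, targets))

# Made-up values for illustration.
weights = [0.3, -1.2, 2.0, 0.7]
targets = [1.0, -1.0, 0.5, 2.0]

flat = grad_weights(0.0, weights, targets)   # gate closed: weights are irrelevant
live = grad_weights(1.0, weights, targets)   # gate open: every weight matters
gate_slope_at_zero = grad_gate(0.0, weights, targets)

print("weight gradients at gate=0:", flat)
print("weight gradients at gate=1:", live)
print("gate gradient at gate=0:", gate_slope_at_zero)
```

In this sketch the gate still has a nonzero slope at zero, so an optimizer can escape. The harder situations described above are presumably ones where the analogue of the gate is itself nearly flat, which is where an effective restart would come in.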

Against naming things, and so on

whales · 8y · 20

I think the main thing I want to say [besides my response to Oliver below] is that this post was not framed in my head as starting a conversation in response to your post, but as gesturing in the direction of some under-emphasized considerations as one contribution in a long-running conversation about rationalist jargon. Of course, I ended up opening with and only taking quotes from you, and now it looks the way it does, i.e. targeting your "bid" but somewhat askew. So that was a mistake, for which I apologize.

Also, I know I basically asked for your "actually a defeater" response, but I really was non-rhetorically hoping people would think about what I was leaning upon and accomplishing (or not) by using the Names that I chose throughout that might not align with their prior ideas about what the Names are for.

Against naming things, and so on

whales · 8y · 10

Pretty much agreed. I might go beyond "provisional" to "disposable". I really do take maintaining fluidity and not fooling yourself to be more important/possible than creating common vocabulary or high-level unitary concepts or introspective handles [though I don't introspect verbally, so maybe I would say that]; I really do think the way the community treats words is a good lever for that.

(Of course, this is all very abstract, isn't a full elaboration of what I believe, and certainly has no force of argument. At best, I'm pointing towards a few considerations I could readily abstract out of the sum of my observations, in the hopes that people can recontextualize some of their reading with concerns along these lines.)

I'd also like to see someone try your last suggestion. (If nothing else, I might use it in a fiction project.)
