Criticism of some popular LW articles

[-]Bucky5y260

I think when assessing lesswrong it is important to think of posts and their comments as a single entity. Many of the objections to the posts that you mention are also brought up in the comment sections, often themselves highly upvoted (in the first example the comment has 1 more karma than the post).

If you take upvotes to mean you are glad something was posted then I don’t think it is inconsistent to upvote something you think contains an error. Therefore high karma alone shouldn’t be enough to consider something to have been considered correct by the LW community, just that it has some value. If there are also no/only minor critical comments then I think that is much stronger evidence.

I don’t think this completely exonerates LW but I think it means the picture is not quite as bleak as it would first appear.

(Edited to add: As an example I upvoted this post despite this comment as I think this is an important kind of thing to look at and agree that my frame of mind can be important in how I read content)

[-]DirectedEvolution5y60

This fits with my depiction of LW as a sort of serious-minded scholarly fanfic community.

I think it can be valuable to read the post alone, write down your reaction to it, and only then examine the comments. If somebody else made the same critique as you, it's a bit like pre-registering an experiment. You can have more confidence that you're doing your own thinking, and that you're converging on the truth. Perhaps this is a way to get out of the "I can only produce rationalizations" dilemma.

[-]Bucky5y80

I think I didn’t get the fanfic analogy at first. Could I summarise it as “Lesswrong is to scholarship as serious fanfic is to original novels”?

I know the LW team have spoken about a level above curated which would be intended to be more on the level of scholarship. I think the 2018 review was designed to serve this purpose so we should hope that these posts in particular don’t contain any glaring errors!

I think it’s super valuable to be able to put imperfect ideas out there (I see one of the 2018 review top posts was Babble!) but thinking about this has really emphasised to me how useful epistemic statuses are.

To your second paragraph - yes, definitely this! When I do this it definitely gets me out of the habit of passive reading.

[-]Sunny from QAD5y30

As another example, when I get into a debate with someone in the comments section, I tend to upvote the other person's comments as long as they're reasonably well-thought-out and well-written.

[-]Zvi5y210

I don't think the criticism of post 2 here is on point at all. Elizabeth is making the claim that if everyone shifted from thinking dollars are the most reliable currency to thinking euros are, this would be self-fulfilling and have big impacts. This seems right, regardless of why the shift happened. The response seems wrong in four ways.

One, I don't think that there needs to be an overarching reason. It's not crazy that a propaganda campaign (someone 'talking their book' on a large enough level) combined with large bets could cause a cascade effect in worlds where that wouldn't have otherwise happened.

Two, it's the currency and not the stock market that matters here. Stock market is a different thing. Not central but worth noting.

Three, the currency markets are exactly the thing Elizabeth is talking about - anticipated future prices, which are a function of supply and demand. A lot of the demand for dollars is the expectation that people will demand dollars because business with others is done in dollars, etc. Hitting the tipping point would cause a cascade. The idea that markets are about some 'fundamentals' and causation is one-directional simply isn't right. Expectations are huge.

Four, even if in this particular example we are not close to a switch, that would only mean it's a bad example. The principle certainly holds. E.g. it is easy to imagine worlds in which there was a 'flippening' and ETH took over the BTC role as primary method of cryptocurrency payment some time in 2018, without anything fundamental changing. There's no reason to think that would have become undone - likely the opposite, and if ETH had passed BTC it would have pulled further away over time.

[-]Daniel V5y70

To the contrary, I think the criticism of post 2 is very on point. But Zvi and I are looking at two different parts: Zvi's looking at the logic/begging the question part, and I'm looking at the critique. In thought experiments, we can take imagined exogenous changes to be exogenous even though in the real world they'd be endogenous (i.e., we can take them as events rather than outcomes). Later, we can relax that assumption; the endogeneity problem is important for understanding whether the conclusions extend to the real world, but it is not important for understanding what the conclusions are within the thought experiment. So I agree with Zvi that the logic isn't really an issue here.

However, I do believe this is a bad example (/weak post, Sorry Elizabeth) precisely for the reason AllAmericanBreakfast pointed out- it frames basic economics knowledge as a new insight. Admittedly, the EconLog post that was linked to doesn't discuss comparative advantage either, but that's because it's really just about the "flight to safety" in 2008 where capital has to go somewhere, so it goes to the safest haven- even if that place is on fire, at least it's not on fire next to a ticking time bomb. But, if you really want to talk about the "benefit not from absolute skill or value at a thing, but by being better at it than anyone else" then you can just consult microeconomics 101 (literally) and read up on absolute vs. comparative advantage. And then a better example of it is what you would find in the textbook (ha, probably Mankiw's) of English cloth vs. Portuguese wine, which clearly illustrates the concepts.

Or, maybe Elizabeth really wasn't referring to comparative advantage and more specifically to "when a superlative is applied in a context and the context is later lost." This might seemingly apply better to the USD (we think of it as a safe haven because we used to think of it as a safe haven), but again the USD is not an apt example here because the context isn't lost, it just changed (e.g., suppose the USD scores a 10/10 at being a currency and things change and now it's a terrible 3/10 but it's still better than all the rest). The Tallest Pygmy derives its tension from the fact that you think you've found someone "tall" but it's just among the pygmies you're sampling. The Tallest Pygmy, then, is best understood as getting stuck in a valley at a local, but not global, minimum (gradient descent). Or peaking at a local, but not global, maximum. Sometimes you are fine with local maxima, but if you are optimizing for global maxima, then obviously this creates a problem. May as well go with a classic example instead, which clearly illustrates sampling bias (statistics).
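
To make the local-versus-global point concrete, here is a minimal sketch of gradient descent settling into whichever valley is nearest its starting point. The function `f` is an arbitrary curve chosen for illustration (it is not from the post); it has a shallow valley near x ≈ 1.13 and a deeper one near x ≈ -1.30.

```python
def f(x):
    # Illustrative curve with two valleys (an assumption, not from the post).
    return x**4 - 3 * x**2 + x


def grad_f(x):
    # Derivative of f.
    return 4 * x**3 - 6 * x + 1


def descend(x, lr=0.01, steps=2000):
    # Plain gradient descent: repeatedly step downhill from the start point.
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


shallow = descend(2.0)   # starts on the right, ends in the shallow valley
deep = descend(-2.0)     # starts on the left, ends in the deeper valley

print(f"from x=2.0:  x={shallow:.3f}, f(x)={f(shallow):.3f}")
print(f"from x=-2.0: x={deep:.3f}, f(x)={f(deep):.3f}")
```

Started from the right, the search stops at the "tallest pygmy": the best point in its neighbourhood, but not the best point overall.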

You see this in the academic literature as well where people refer to concepts as "effects." I think it is a good idea to be skeptical of those findings- not that they are fake, just that more clarity could be gained from understanding the core concept that generates the effect. Elizabeth's example is not great for comparative advantage, nor for gradient descent/sampling bias. The USD in 2008 is a "lesser of two evils effect," or really not an effect at all- if you have a choice between 10%, 9%, and 8% returns at equal risk, you choose 10%; if a regime change occurs that makes you choose between 5%, 4.5%, and 4%, you choose 5%. It's worse than before, but it's the best around.
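
The arithmetic in that last example can be spelled out in a few lines. This is only a sketch: the option labels `A`, `B`, `C` are hypothetical, and it assumes the regime change lowers every return without reordering the options.

```python
def pick_best(returns):
    # At equal risk, the choice rule is simply "take the highest return".
    name = max(returns, key=returns.get)
    return name, returns[name]


# Hypothetical options; only the return figures come from the example above.
before = {"A": 0.10, "B": 0.09, "C": 0.08}
after = {"A": 0.05, "B": 0.045, "C": 0.04}

print(pick_best(before))
print(pick_best(after))
```

The rule never changes and, under the stated assumption, neither does the winner; only the absolute payoff of the best available option falls.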

LessWrong is a great community to be in, but AllAmericanBreakfast is correct that many posts stumble upon "new" insights that are really just symptomatic of not having done enough research, particularly when it comes to economics. And that's okay in this forum, we're all trying to figure this stuff out!

[-]DirectedEvolution5y30

The TPE as defined is

“Tallest Pygmy Effect” is when you benefit not from absolute skill or value at a thing, but by being better at it than anyone else.

This has two conditions:

1. The benefit must accrue only to the best option.

2. The difference between the best and second best option must be small.

You and Elizabeth are adding two further conditions, which produce mutability:

3. It is possible for the second-best option to overtake the best option (a 'flippening').

And fragility:

4. It is easy for a flippening to occur.

The general mutability of a TPE seems uncontroversial, as does the existence of fragile TPEs. But mutability does not imply fragility, and Elizabeth specifically says that it does.

Tallest pygmy effects are fragile, especially when they are reliant on self-fulfilling prophecies or network effects.

My mistake was in engaging with the concrete example of the US vs. EU currency, in an attempt to demonstrate that TPEs aren't necessarily fragile. Decomposing the term and argument into atomic propositions to show where the transition from definition->empirical assertion occurs would have been a safer way to do it.

I think my underlying motivation for attacking the example is that I'm frequently exposed to people with excessive belief in economic fragility. For example, the idea that "if we all just stopped believing in the value of money, it would be just paper." I saw this point of view behind terms like "fragile" and "self-fulfilling prophecy."

I'm sure that for others, there's an excessive perception of economic stability. Things change all the time, and sometimes it comes down to a pure change in public perception.

In the future, I'd like to employ the practice of decomposition before I start writing a critical response.

[-]philh5y30

And fragility:

It is easy for a flippening to occur.

Elizabeth uses the word "fragile", but doesn't say that this is what it means. I'm not sure exactly what she means by it - and I'm not sure if the thing she means is true - but I don't think this is a likely guess.

(My guess would be something like "unstable", in the sense that once it goes away, there's no particular reason to expect it to come back.)

But mutability does not imply fragility, and Elizabeth specifically says that it does.

Not specifically. She merely says that TPEs are fragile. This is somewhat a nitpick, but... I feel like you're trying to take something informal and formalize it, and some of the features of your formalization don't seem motivated by the informal version.

I’m frequently exposed to people with excessive belief in economic fragility. For example, the idea that “if we all just stopped believing in the value of money, it would be just paper.”

This seems basically true to me? I wouldn't call the economy fragile, because I don't expect this to happen. Sometimes people say things like this and I get the sense that they do think it means the economy is fragile in some way that I don't think it is. I think they're making a mistake, but not about this.

[-]DirectedEvolution5y40

I feel like you're trying to take something informal and formalize it, and some of the features of your formalization don't seem motivated by the informal version.

If an informal claim about economics doesn't translate readily to a formal claim, then it's not even wrong. Informal language games are fine in many contexts, but Elizabeth's posts tend to be about careful fact-gathering and scrutinizing sources for accuracy, so I think it's OK to apply that standard to her work.

My guess would be something like "unstable", in the sense that once it goes away, there's no particular reason to expect it to come back.

This would only be true if

a) The TPE outweighs more fundamental factors, meaning that it's hard for a lower-ranked choice to become the top choice, merely due to the TPE.

For example, Facebook is useful primarily because so many people are on it. It would be hard for a direct competitor to attract a user base no matter how much better the underlying software is. There is a clear way to rank all the alternatives in terms of quality, but there's a huge cliff between 1st and 2nd place that exists merely due to the TPE.

b) Random noise outweighs more fundamental factors (or there are no fundamental factors at all), meaning that the differences between lower-ranked choices are obscured by chance. There is no clear way to sort lower-ranked alternatives in terms of quality.

For example, you have no knowledge of horse racing. But you happen to hear that the Mafia has given Black Beauty a drug that makes her 5% faster (TPE), making her slightly more likely to win the Kentucky Derby, but not guaranteeing her victory. If the Mafia changes its mind and decides to drug Seabiscuit instead, there's no clear reason to expect that they'll change their mind a third time. Even if they do, there's no reason to think they'll change their mind back and drug Black Beauty, rather than doping Secretariat or Man o' War. The Mafia is unlikely to change its mind, so you assume that Black Beauty has a small but durable edge.

These two examples illustrate that the TPE has several factors.

It can have a large or small importance relative to fundamental factors (large in the case of Facebook, small in the case of Black Beauty). A small difference implies fragility, large differences imply durability.

It can be endogenous (Facebook's TPE is due to it having the most users, so the fact that it's in 1st place helps anchor it in 1st place) or exogenous (Black Beauty's TPE is due to outside intervention, and it's not clear how capricious the Mafia will be in changing this choice). Exogenous factors are fragile, endogenous factors are durable.

And the non-TPE fundamentals of the options can be well-ordered (as in the case of Facebook's software quality relative to its competitors) or unordered (as in the case of the horse racing neophyte at the Kentucky Derby). Unordered fundamentals mean that the choice selected for the top spot is arbitrary, so that having held the top spot confers no special advantage in regaining it if it is lost. Unordered fundamentals imply fragility, well-ordered fundamentals imply durability.

All of these factors vary empirically. No matter whether "fragility" referred to one, two or all three of these factors, it's clear that fragility and durability exist on a spectrum and may vary widely on a case-by-case basis.

Indeed, Elizabeth says

Tallest pygmy effects are fragile, especially when they are reliant on self-fulfilling prophecies or network effects.

It's not clear to me that "self-fulfilling prophecies or network effects" are inherently fragile. The advantage that Facebook gains due to the size of its user base is an example of a network effect that is durable in all three senses of the term as I've defined it.

This matters. If we cultivate "TPEs tend to be/are inherently fragile" as a heuristic, it will encourage people to look at problems like "how to make a social media company that's bigger than Facebook" as a technical problem. "If you build it, they will come." Well, no, not necessarily. Facebook's TPE is a big moat. In fact, I'd offer an alternative heuristic:

The more obvious the TPE, the more likely it is to be durable.

[-]habryka5y*160

Thanks for doing this!

Just as a clarification: None of the three articles you reviewed are curated articles, in the sense that they are not displayed as part of the curated section of the site, and that they have not been sent out to everyone who is subscribed to curated articles.

They are all still reasonably popular posts, so critiquing them still seems good. I am generally happy to see critiques like this, and think you broadly make some decent points (though I have to read what you said in more detail before I can make a better call).

Presumably you found those articles via the "From the Archives" section of the site (which is part of the "Recommended" section, which shows you posts randomly sampled from LessWrong history, proportional to a simple function of their karma (it's karma raised to some power)). We recently changed the UI around the recommendation section which makes it less obvious where the "From the archives" section ends and the list of curated posts begins, which is something I've been meaning to fix, so maybe that confused you? Or maybe you were just using "curated" in some broader sense, though in that case it still seemed good to clarify for other readers who might be confused.
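
For readers curious what "proportional to a simple function of their karma" might look like, here is a rough sketch of karma-weighted sampling. It is not the site's actual code: the function names, the exponent `power=2.0`, and the treatment of negative karma are all assumptions made for illustration.

```python
import random


def selection_probabilities(karmas, power):
    # Weight each post by karma ** power, then normalise to probabilities.
    # Negative karma is clamped to zero (an assumption for this sketch).
    # Requires at least one post with positive karma.
    weights = [max(k, 0) ** power for k in karmas]
    total = sum(weights)
    return [w / total for w in weights]


def sample_posts(posts, n, power=2.0, seed=None):
    # posts: list of (title, karma) pairs.
    # Draws n posts with replacement, so repeats are possible.
    rng = random.Random(seed)
    weights = [max(karma, 0) ** power for _, karma in posts]
    return rng.choices(posts, weights=weights, k=n)


# Hypothetical archive, for illustration only.
archive = [("Post A", 10), ("Post B", 30), ("Post C", 60)]
print(selection_probabilities([k for _, k in archive], power=2.0))
print(sample_posts(archive, n=3, seed=0))
```

With an exponent above 1, high-karma posts are favoured more than proportionally, so a section built this way would mostly surface popular posts whether or not they were ever curated.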

[-]Zvi5y70

General inquiry as to level of appetite for this type of criticism, and whether doing such for recent posts would be a positive or negative for those writing.

(Not as a 'should this have been written?' but more as a 'should I/others consider writing more similar posts?')

[-]Ben Pace5y120

On the current margin I'd be interested in more of this.

[-]DirectedEvolution5y40

Oh I was just confused. Thanks for the clarification. SMASH THIS POT.

[-]habryka5y40

Sorry for the confusion then! Our current UI sure seems like it would make that confusion likely, so I think of this as mostly my responsibility. I will think about how to preserve the simplicity of the section while also making it clearer that there are two types of posts in there (from the archives and curated).

[-]Dustin5y60

A couple of initial thoughts I had whilst reading this. Take these as more of pondering on my state of mind rather than critiques or corrections.

Without some more formal structure in place, the nature of which I'm unaware, I am not able to "assess" content for correctness or usefulness.

I find this curiously foreign to my default mode of thinking when reading on LW and elsewhere. It is not uncommon for me to find myself thinking "that seems wrong" and "that seems right" within a single paragraph of content from writers I think are the "rightest". On the other hand, I usually do not feel as confident about my assessment in either direction as you seem to be in your post.

That being said...

My reaction to rationalist content is governed by my frame of mind.

I assume this to be the case with all content and I've always assumed it holds for everyone and it hasn't occurred to me to think of rationalist content as different in this way, but seeing you state it "out loud" makes me think maybe I should have.

[-]Bunthut5y30

Tallest pygmy effects are fragile, especially when they are reliant on self-fulfilling prophecies or network effects. If everyone suddenly thought the Euro was the most stable currency, the resulting switch would destabilize the dollar and hurt both its value and the US economy as a whole.

This is begging the question. If everyone suddenly thought the Euro was the most stable currency, something dramatic would have had to have happened to shift the stock market's assessment of the fundamentals of the US vs. EU economies and governments. Economies are neither fragile nor passive, and these kinds of mass shifts in opinion on economic matters don't blow with the wind. Furthermore, people are likely to hedge their bets. If the US and EU currencies are similar in perceived stability, serious investors are likely to diversify.

Which question? That of whether the stability of currencies is in part caused by self-fulfilling prophecies? You seem to be saying that self-fulfilling prophecies don't happen with competent predictors. Do you assert this as a possibility not disproven, or as a fact?

[-]Mathisco5y30

May I ask why you think you "passively consume" LW content? I notice the same behavior in myself, so I'm curious.

P.S. I hope it's still better than passively consuming most other media.

[-]DirectedEvolution5y40

In one sentence, active reading produces a higher number of reactions per sentence read.

In reading the posts for this exercise, I noticed myself having a far higher number of reactions to the content than normal.

[-]romeostevensit5y20

Objection to 'a': we observe whether our ex ante heuristics converge on the same ex post predictions that known very powerful predictions do to check whether we are using good meta heuristic selection.
