Are we full of bullshit?
If we wish to really spur the destruction of bullshit, perhaps there should be an anti-review: a selection process aimed at posts that received many upvotes and seem widely loved, but in retrospect were either false or so confused as to be as bad as or worse than false. The worst of LW, rather than the best; the things that seemed most shiny and were most useless.
I note that for purposes of evaluating whether we are full of bullshit, the current review process will very likely fail because of how it is constructed; it isn't an attempt to falsify, it's making the wrong move on the Wason Selection Task. Such a negative process, by contrast, might do the opposite.
(Of course, the questionable social dynamics around this would be even worse)
Huh, I feel like it's pretty good for that purpose? If you want a list of posts that were popular but not endorsed, just take the difference between the highly upvoted posts and the review results.
The only requirement for a post to enter the review phase is that anyone thinks it still has anything going for it. As such, if a post is obviously a fad, it won't end up thoroughly reviewed, but if it's really that common-knowledge that it was a fad, that seems fine. And even then, we still occasionally have people nominate posts for review just because they think the posts are bad and want people to do a retrospective on them.
One thing I want to remind people: if something looks like it's going to end up winning the review and you disagree with it, then writing a critical review that gets upvoted (10+ karma) means your review will show up whenever we spotlight the post. This may not be fully satisfying if you were really hoping to change everyone's mind, but it does mean our infrastructure will at least make sure everyone knows about your disagreement.
(I recommend optimizing your first sentence to convey the most important argument of your disagreement, so the one-line version of the comment gets the core idea across)
For example, AI Control was one of the leading candidates from the last review, but John's countertake is highlighted for people who are skimming through the /bestoflesswrong page.
Here's a feature proposal.
The problem: At present, when a post has 0 reviews, there is an incentive against writing critical reviews. Writing such a review enables the post to enter the voting phase, which you don't especially want to happen if you think the post is undeserving. This seems perverse: critical reviews are valuable, especially so if someone would write a positive review later, enabling the post to enter voting anyway. (In principle, you can "lie in ambush" until someone writes a positive review and only then write your negative review, but that requires annoying logistics.)
My suggestion: Allow flagging reviews as "critical" in the UI. (One option is to consider a review "critical" whenever your own vote for the post is negative, another is to have a separate checkbox.) Such reviews would not count for enabling the post to enter voting.
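To make the proposal concrete, here's a purely hypothetical sketch in Python (not LessWrong's actual data model or code) of the eligibility rule being suggested: a post would advance to voting only if it has at least one review not flagged as critical.

```python
from dataclasses import dataclass


@dataclass
class Review:
    author: str
    # Set via an explicit checkbox, or derived from whether the
    # reviewer's own vote on the post is negative (the two options above).
    is_critical: bool


def eligible_for_voting(reviews: list[Review]) -> bool:
    # Under the proposal, only non-critical reviews count toward
    # letting the post enter the voting phase.
    return any(not r.is_critical for r in reviews)
```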
Mmm.
Somewhat related problem: a lot of the impact of writing a review is that it bumps the post into awareness on the frontpage, which makes it more likely that people who liked it will see it and vote positively on it. (Whether this is good or bad from the perspective of a critical reviewer depends on whether you think you're writing a takedown of something popular, or just explaining the flaws in something people already roughly agreed wasn't that good.) I don't know that that problem needs "solving", but wanted to acknowledge it and see if anyone had thoughts.
How does crossposting something to nominate work? I tried with Thresholding and the system is tracking its date as the date I crossposted, not the date of the original. Reasonable but not great for my purposes. Is there something I'm supposed to do?
Just to check, did you use the "Submit Linkposts" functionality on the nomination page for that, or did you crosspost it some other way?
ETA: Ok, looks like the library responsible for extracting external article data/metadata didn't successfully extract the date the article was published. I've manually set it to the correct date.
We have a ritual around these parts.
Every year, we have ourselves a little argument about the annual LessWrong Review, and whether it's a good use of our time or not.
Every year, we decide it passes the cost-benefit analysis[1].
Oh, also, every[2] year, you do the following:
Maybe you can tell that I'm one of the more skeptical members of the team, when it comes to the Review.
Nonetheless, I think the Review is probably worth your time, even (or maybe especially) if your time is otherwise highly valuable. I will explain why I think this, then I will tell you which stretch of ditch you're responsible for digging this year.
Are we full of bullshit?
Every serious field of inquiry has some mechanism(s) by which it discourages its participants from huffing their own farts. Fields which have fewer of these mechanisms tend to be correspondingly less attached to reality. The best fields are those where formal validation is possible (math) or where you can get consistent, easily-replicable experiment results which cleanly refute large swathes of hypothesis-space (much but not all of physics). The worst fields are those where there is no ground truth, or where the "ground truth" is a pointer to a rapidly changing[3] social reality.
In this respect, LessWrong is playing on hard mode. Most of the intellectual inquiry that "we" (broadly construed) are conducting is not the kind where you can trivially run experiments and get really huge odds ratios to update on based on the results. In most of the cases where we can relatively easily run replicable experiments, like all the ML stuff, it's not clear how much evidence any of that is providing with respect to the underlying questions that are motivating that research (how/when/if/why AI is going to kill everyone).
We need some mechanism by which we look at the posts we were so excited about when they were first published, and check whether they still make any sense now that the NRE[4] has worn off. This is doubly-important if those posts have spread their memes far and wide - if those memes turned out to be wrong, we should try to figure out whether there were any mistakes that could have been caught at the time, with heuristics or reasoning procedures that wouldn't also throw out all true and useful updates too (and maybe attempt to propagate corrections, though that can be pretty hopeless).
Is there gold in them thar hills?
Separate from the question of whether we're unwittingly committing epistemic crimes and stuffing everyone's heads full of misinformation is the question of whether all of the blood, sweat, tears, and doomscrolling is producing anything of positive value.
I wish we could point to the slightly unusual number of people who went from reading and writing on LessWrong to getting very rich as proof positive that there's something good here. But I fear those dwarves are digging too deep...
So we must turn to somewhat less legible, but hopefully also less cursed, evidence. I've found it interesting to consider questions like:
Imagine that we've struck the motherlode and the answers to some of those questions are "yes". The Review is a chance to form a more holistic, common-knowledge understanding of how you and other people in your intellectual sphere are relating to these questions. It'd be a little sad to go around with some random mental construction in your head, constantly using it to understand and relate to the world, assuming that everyone else also had the same gadget, and to later learn that you were the only one. By the law of the excluded middle, that gadget is either good, in which case you need to make sure that everyone else also installs it into their heads, or it's bad, which means you should get rid of it ASAP. No other options exist!
If your time and attention is valuable, and you spend a lot of it on LessWrong, it's even more important for you to make sure that it's being well-spent. And so...
The Ask
Similar to last year, actually. Quoting Ray:
Except, uh, s/2023/2024. This year, you'll be nominating posts from 2024!
How To Dig
Copied verbatim from last year's announcement post.
Instructions Here
Nuts and Bolts: How does the review work?
Phase 1: Preliminary Voting
To nominate a post, cast a preliminary vote for it. Eligible voters will see this UI:
If you think a post was an important intellectual contribution, you can cast a vote indicating roughly how important it was. For some rough guidance:
Votes cost quadratic points – a vote strength of "1" costs 1 point. A vote of strength 4 costs 10 points. A vote of strength 9 costs 45. If you spend more than 500 points, your votes will be scaled down proportionately.
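For the curious, here is a minimal sketch in Python (not the actual LessWrong implementation) of the cost rule implied by those examples. The formula cost(n) = n(n+1)/2 is inferred from the stated numbers, and exactly how the over-budget scaling is applied is an assumption; this just computes the proportional factor.

```python
def vote_cost(strength: int) -> int:
    # Inferred from the examples above: 1 -> 1, 4 -> 10, 9 -> 45,
    # i.e. the n-th triangular number, which grows quadratically in n.
    return strength * (strength + 1) // 2


def scaling_factor(vote_strengths: list[int], budget: int = 500) -> float:
    # If total spend exceeds the budget, votes are scaled down proportionately.
    # (What exactly gets scaled is an assumption; this only computes the factor.)
    total = sum(vote_cost(abs(s)) for s in vote_strengths)
    return 1.0 if total <= budget else budget / total


assert vote_cost(1) == 1 and vote_cost(4) == 10 and vote_cost(9) == 45
```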
Use the Nominate Posts page to find posts to vote on.
Posts that get at least one positive vote go to the Voting Dashboard, where other users can vote on them. You’re encouraged to give at least a rough vote based on what you remember from last year. It's okay (encouraged!) to change your mind later.
Posts with at least 2 positive votes will move on to the Discussion Phase.
Writing a short review
If you feel a post was important, you’re also encouraged to write up at least a short review of it saying what stands out about the post and why it matters. (You’re welcome to write multiple reviews of a post, if you want to start by jotting down your quick impressions, and later review it in more detail)
Posts with at least one review get sorted to the top of the list of posts to vote on, so if you'd like a post to get more attention it's helpful to review it.
Why preliminary voting? Why two voting phases?
Each year, more posts get written on LessWrong. The first Review, which covered 2018, considered 1,500 posts. By 2021, there were 4,250. Processing that many posts is a lot of work.
Preliminary voting is designed to help handle the increased number of posts. Instead of simply nominating posts, we start directly with a vote. Those preliminary votes will then be published, and only posts that at least two people voted on go to the next round.
In the review phase, this allows individual site members to notice if something seems particularly misplaced. If you think a post was inaccurately ranked low, you can write a positive review arguing it should be higher, which other people can take into account for the final vote. Posts which received lots of middling votes can get deprioritized in the review phase, allowing us to focus on the conversations that are most likely to matter for the final result.
Phase 2: Discussion
The second phase is a month long, and focuses entirely on writing reviews. Reviews are special comments that evaluate a post. Good questions to answer in a review include:
In the discussion phase, aim for reviews that somehow give a voter more information. It's not that useful to say "this post is great/overrated." It's more useful to say "I link people to this post a lot" or "this post seemed to cause a lot of misunderstandings."
But it's even more useful to say "I've linked this to ~7 people and it helped them understand X", or "This post helped me understand Y, which changed my plans in Z fashion" or "this post seems to cause specific misunderstanding W."
Phase 3: Final Voting
Posts that receive at least one review move on to the Final Voting Phase.
The UI will require voters to at least briefly skim reviews before finalizing their vote for each post, so arguments about each post can be considered.
As in previous years, we'll publish the voting results for users with 1000+ karma, as well as all users. The LessWrong moderation team will take the voting results as a strong indicator of which posts to include in the Best of 2024, although we reserve some right to make editorial judgments.
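For readers who like the process spelled out mechanically, here is a tiny illustrative sketch in Python (not actual site code) of the advancement thresholds described above:

```python
def advances_to_discussion(positive_preliminary_votes: int) -> bool:
    # Phase 1 -> Phase 2: a post needs at least two positive preliminary votes.
    return positive_preliminary_votes >= 2


def advances_to_final_voting(review_count: int) -> bool:
    # Phase 2 -> Phase 3: a post needs at least one review.
    return review_count >= 1
```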
Your mind is your lantern. Your keyboard, your shovel. Go forth and dig!
Or at least get tired enough of arguing about it that sheer momentum forces our hands.
Historical procedures have varied. This year is the same as last year.
And sometimes anti-inductive!
New relationship energy.
Ray: "Maybe also literal but I haven't done the UI design yet."
Ray: "In previous years, we had a distinction between "nomination" comments and "review" comments. I streamlined them into a single type for the 2020 Review, although I'm not sure if that was the right call. Next year I may revert to distinguishing them more."
Ray: "These don't have to be long, but aim to either a) highlight pieces within the post you think a cursory voter would most benefit from being reminded of, b) note the specific ways it has helped you, c) share things you've learned since writing the post, or d) note your biggest disagreement with the post."