Voting Phase of 2018 LW Review

Ben Pace

For the past 1.5 months on LessWrong, we've been doing a major review of 2018 — looking back at old posts and asking which of them have stood the test of time.

The LessWrong 2018 Review has three major goals.

First, it is an experiment in improving the LessWrong community's longterm feedback and reward cycle.
Second, it is an attempt to build common knowledge about the best ideas we've discovered on LessWrong.
Third, after the vote, the LessWrong Team will compile the top posts into a physical book.

We spent about 2 weeks nominating posts, and 75 posts received the 2 required nominations to pass through that round. (See all nominations on the nominations page.) Then we spent a month reviewing them, and users wrote 72 reviews, a number of them by the post-authors themselves. (See all reviews on the reviews page.)

And finally, as the conclusion of all of this work, we are now voting on all the nominated posts. Voting is open for 12 days, and will close on Sunday, January 19th. (We'll turn it off on Monday during the day, ensuring all timezones get it throughout Sunday.)

The vote has a simple first section, and a detailed-yet-optional second section based on quadratic voting. If you are one of the 430 users with 1000+ karma, you are eligible to vote, and now is the time to participate by following this link.

For all users and lurkers, regardless of karma, the next 12 days are your last opportunity to write reviews for any of the nominated 2018 posts, which I expect will have a significant impact on how people vote. As you can see, all reviews are highlighted when a user is voting on a post. (To review a post, go to the post and click "Write A Review" at the top of the post.)

This is the end of the main post. The rest of the text below contains detailed instructions for how to use the voting system.

How To Vote

Sorting Posts Into Buckets

The first part of voting is sorting the nominated posts into buckets.

The five buckets are: No, Neutral, Good, Important, Crucial. Sort the posts however you think is best.

The key part is the relative weighting of different posts. For example, it won't make a difference to your final vote if you put every post in 'crucial' or every post in 'good'.

Fine-Tuning Your Votes

The system we're using is quadratic voting (as I discussed a few weeks ago).

Once you're happy with your buckets, click 'Convert to Quadratic'. At this point the system converts your buckets roughly into their quadratic equivalents.

The system will only assign integer numbers of votes, which means that it will likely only allocate around 80-90% of the total votes available to you. If you vote on a smaller number of posts (<10), the automatic system may not use your entire quadratic voting budget.

If you're happy with how things look, you can just leave at this point, and your votes will be saved (you can come back any time before the vote closes to update them). But if you want to allocate 100% of your available votes, you'll likely need to do fine-tuning.

There are two key parts of quadratic voting you need to know:

First, you have a limited budget of votes.
Second, votes on a post have increasing marginal cost.

This means that your first vote costs 1 point, your second vote on that post costs 2 points, your third costs 3 points. Your nth vote costs n points.

You have 500 points to spend. You can see how many points you've spent at the top of the posts.
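The cost rule above can be written out as a short sketch. The function names are illustrative and are not taken from the LessWrong codebase; only the per-vote cost and the 500-point budget come from the post.

```python
BUDGET = 500  # points available to each voter, per the post


def marginal_cost(n: int) -> int:
    """Cost in points of the nth vote on a single post."""
    return n


def total_cost(votes: int) -> int:
    """Total points for `votes` votes on one post: 1 + 2 + ... + votes."""
    return votes * (votes + 1) // 2


def max_votes_on_one_post(budget: int = BUDGET) -> int:
    """Largest number of votes on a single post that fits within the budget."""
    votes = 0
    while total_cost(votes + 1) <= budget:
        votes += 1
    return votes


print(total_cost(4))            # 1 + 2 + 3 + 4 = 10
print(max_votes_on_one_post())  # 31, since 31 * 32 / 2 = 496 and 32 * 33 / 2 = 528
```

Because the total grows roughly with the square of the vote count, spreading points across several posts buys more votes overall than concentrating them on one.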

The system will automatically weight the buckets differently. For example, I just did this, and I got the following weightings:

Good: 2 votes.
Important: 4 votes.
Crucial: 9 votes.
Neutral: 0 votes.
No: -4 votes.

(Note that negative votes cost the same as positive votes. The first negative vote costs 1 point, the second negative vote costs 2 points, etc.)
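As a rough illustration of how those example weightings translate into points, the sketch below prices each bucket and sums a ballot. The bucket weights are the ones quoted above; the number of posts per bucket is invented for the example, and the helper names are hypothetical rather than part of the site's code.

```python
# Votes per bucket, as quoted in the example weightings above.
BUCKET_VOTES = {"No": -4, "Neutral": 0, "Good": 2, "Important": 4, "Crucial": 9}


def vote_cost(votes: int) -> int:
    """Points for `votes` votes on one post; negative votes cost the same as positive."""
    n = abs(votes)
    return n * (n + 1) // 2


def ballot_cost(posts_per_bucket: dict) -> int:
    """Total points spent, given how many posts sit in each bucket."""
    return sum(
        count * vote_cost(BUCKET_VOTES[bucket])
        for bucket, count in posts_per_bucket.items()
    )


# Hypothetical ballot covering 75 posts: these counts are made up for illustration.
example = {"No": 3, "Neutral": 40, "Good": 20, "Important": 8, "Crucial": 4}

for bucket, votes in BUCKET_VOTES.items():
    print(bucket, votes, vote_cost(votes))  # e.g. Crucial 9 45
print(ballot_cost(example))  # 30 + 0 + 60 + 80 + 180 = 350
```

With these weights a single Crucial post costs 45 points, as much as fifteen Good posts at 3 points each.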

You'll see your score at the top of the page. (When I arrived on the fine-tuning page, the system had spent about 416 points, which meant I had a significant number of votes left to buy.)

Once you're happy with the balance, just close the page; your votes will be saved.

You can return to this page anytime until voting is over, to reconfigure your weights.

Leaving Comments (Anonymous)

There's a field to leave anonymous thoughts on a post. All comments written here will be put into a public Google doc, and be linked to from the post that announces the result of the vote. If you want to share your thoughts, however briefly, this is a space to do that.

I will likely be making a book of 2018 posts, and if I do I will use the votes as my guide to what to include, so I'll definitely be interested in reading through people's anonymous thoughts and feelings about the 2018 LW posts.

Extra Notes

Additional info on voting

If you'd like to go back to the buckets stage, hit “Return to basic voting”. If you do this, all of your fine-tuning will be thrown out the window, and the system will re-calculate your weights entirely based on the new buckets you assign.

I find it really valuable to be able to see the posts in the order I've currently ranked them, so there's a button at the top to re-order the posts, which I expect to be clicked dozens of times by each user.

If you click on a post, all nominations and reviews for that post will appear in a box on the side of the page. You may want to read these to make a more informed decision when voting.

The results

The voting will have many outputs. Once we've had the time to analyse the data, we'll include a bunch of data/graphs, all anonymised, such as:

For each winning post and each bucket, how many times people picked that bucket
The individual votes each post got
The results if the karma cutoff was 10,000 karma rather than 1,000
The output of people's buckets compared with the output of people's quadratic fine-tuning
The mean and standard deviation of the votes
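A couple of these summaries could be computed along the following lines. This is only a sketch: the ballots are invented data, the function name is hypothetical, and treating the karma cutoff as inclusive is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical ballots: (voter karma, {post title: quadratic votes}). Invented data.
ballots = [
    (1200, {"Post A": 4, "Post B": -1}),
    (15000, {"Post A": 2, "Post B": 3}),
    (3000, {"Post A": 0, "Post B": 1}),
]


def summarize(ballots, karma_cutoff=1000):
    """Per-post total, mean, and standard deviation among voters at or above the cutoff."""
    per_post = {}
    for karma, votes in ballots:
        if karma < karma_cutoff:  # assumption: the cutoff is inclusive
            continue
        for post, v in votes.items():
            per_post.setdefault(post, []).append(v)
    return {
        post: {"total": sum(vs), "mean": mean(vs), "stdev": pstdev(vs)}
        for post, vs in per_post.items()
    }


print(summarize(ballots))                       # all eligible voters
print(summarize(ballots, karma_cutoff=10_000))  # stricter cutoff, for comparison
```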

Vote Here (if you have more than 1000 karma)

The posts available for review are presented in (what I guess is) a consistent order that is (so far as I know) the same for everyone. I expect this to mean that posts presented earlier will get more votes. If, as seems plausible, most of the things in the list aren't bad enough to get a "no" vote from most voters, this means that there is a bias in favour of the earlier posts in the list.

Related: keeping track of which posts I've looked at is a bit of a pain. Obviously I can see which ones I've voted non-neutral for, but there's no obvious way to distinguish "decided to stick with neutral" from "haven't looked yet". So long as the order of presentation is consistent, I can just remember how far through the list I am, but (see above) it's not obviously a good thing for the order of presentation to be consistent. And this phenomenon incentivizes me to process posts in order, rather than deliberately counteracting the bias mentioned above by trying to look at them in a random order.

The list is now shuffled (as a tiebreak after sorting by your own vote). The shuffle is done once per user, so each user should see the posts in a random order, but it'll be the same order each time you revisit it. This change went live around the 13th.
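A minimal sketch of a per-user shuffle used as a tiebreak, assuming the shuffle is seeded with the user's ID. The seeding scheme, the function name, and the example data are assumptions for illustration and do not describe the actual LessWrong implementation.

```python
import random


def display_order(posts, user_id, my_votes):
    """Order posts by the user's own vote (highest first), shuffling ties per user.

    Seeding the shuffle with the user ID is an assumption made for this sketch;
    it gives each user a different order that stays fixed across visits.
    """
    shuffled = list(posts)
    random.Random(user_id).shuffle(shuffled)
    # sorted() is stable, so posts with equal votes keep their shuffled order.
    return sorted(shuffled, key=lambda post: -my_votes.get(post, 0))


posts = ["Post A", "Post B", "Post C", "Post D"]
votes = {"Post C": 4}  # posts without a vote default to 0 (neutral)

first_visit = display_order(posts, user_id="user-123", my_votes=votes)
second_visit = display_order(posts, user_id="user-123", my_votes=votes)
print(first_visit)
print(first_visit == second_visit)  # True: same user, same order
```

Sorting after the shuffle means the random order only decides between posts with the same vote, which matches the "tiebreak" description.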

[EDIT: When I say that posts earlier in the list got 25-50% more votes, I mean simply the number of non-neutral votes cast on those items, regardless of direction or magnitude. It would perhaps be more accurate to say these posts had 25-50% more people vote on them.]

The posts available for review are presented in (what I guess is) a consistent order that is (so far as I know) the same for everyone. I expect this to mean that posts presented earlier will get more votes.

Good call. I looked into this and found that posts displayed earlier in the list were getting somewhere between 25% and 50% more votes. The team rolled out a fix to randomize loading this morning.

Interestingly, the default sort order was by number of nominations in ascending order, so the most heavily nominated (approximately, the most popular) posts were being displayed last. These posts were getting as many votes as those at the beginning of the list (though possibly not as many as they might have otherwise), and it's the posts in the middle that were getting fewer.

This was an oversight which we're glad to have caught. We're around halfway through the voting, plus the second half will have the deadline rush, so hopefully this bias will get countered in the coming week.

Unfortunately you just make mistakes the first time you're doing things. :/

I voted in category mode, and am some way through fine-tuning in quadratic mode.

Some of these posts have almost no content in themselves but link to other places where the actual meat is. (The two examples I've just run into: "Will AI see sudden progress?" and "Specification gaming examples in AI".)

Are we supposed to vote on these as if they include that external content? If the highest-voted posts are made into some sort of collection, will the external content be included there?

One wish I have related to the voting is that either far fewer posts had reached the voting stage, or I could choose to vote on just a subset of posts rather than all of them. As it is, I'm somewhat intimidated by the prospect of reading 75 posts to decide how to vote on each one.

You don't have to vote on all the posts, and you don't have to have read them all. I think it's fine for the vote to correlate with what posts people actually read. The default vote on every post is ‘neutral’, which equals zero quadratic votes. I voted on about half of the posts, and I think I read way more than most people.

Edit: Rewritten.

Is it pro-social or anti-social to vote on posts I have skimmed but not read?

I think it would make sense to vote weakly on them, spending relatively few points of your quadratic budget. Voting very strongly on them feels wrong to me. My guess is that you should vote with strength proportional to your confidence times your assessment of the post's goodness or badness.

+1 I have voted on a number of posts that I've mostly skimmed, but not voted with much weight.

(Quadratic voting makes the first few votes just very cheap, which was one part of my reasoning.)

I also note that I think there's signal in your decision to only skim a post, as opposed to reading it, but as noted in habryka's response, it's probably a weak signal.

I think it's a particularly weak signal when you're trying to evaluate 75 posts at once.

Have finished fine-tuning my votes.

Woop! I did the same yesterday.

Maybe a bit late to ask about this, but I realized how a vote really depends on the information you guys have from the votes (seeing as the votes are really used just to give a signal, and aren't actually deciding anything).

Do you see all the votes, along with the user names?

Do you see each user's votes, but anonymized?

Do you see only the final count, and not the votes?

We see information about how much individuals vote (i.e. total amount of points you’ve assigned and how many posts you’ve voted on), but not the details of which votes.

The current plan is something like: carve up vote totals from a couple different cross sections (i.e. check if there’s a difference between how the highest karma users voted vs the overall vote score, and probably checking what AF users thought about AF posts). I expect us to make those totals public.

I’m not sure about all the ways we will look at data but I expect the general principle will be “look for ways to get insights about how people used it, while keeping everyone anonymized”.

We see information about how much individuals vote

For accuracy's sake, I'll add that we have all the data about who voted on what. Our internal policy is not to look at votes by specific users on specific posts unless we have really good reason to, such as suspecting foul play.

Ray is correct about what we in fact look at, but it feels important to say that we in principle could see it all if we chose to, and that we're requesting some trust from the community.

I've found the review process a good insight into what goes on on LW. From the posts nominated and the comments and feedback they have generated I have more of an understanding of general topics but I have a lot of questions about definitions...

A couple of thoughts:

improving the LessWrong community's longterm feedback and reward cycle.

Feedback risks punishment.

"Reward cycle" is an interesting phrase - More karma sharing between the collective? Or rewarding interactions and the development of ideas/processes/beliefs/understanding through debate and learning?

What is the aim/dream for the book? On a scale of globally-acclaimed to something solid for the collective to hold.

A nomination/review cycle for 2017 (and back) would be worth doing if creating a physical book for wider release.

It seems to me like you extended the voting period and spent more effort to get people to vote by sending out emails because you believe the voting is important.

This post, though, doesn't make a case for why you believe it's valuable to vote here. I'm curious what your reasoning is.

When it comes to the interface I think it would be great if the interface would show me my past karma votes on the post. It's useful to have the information of how I found the post after reading it the first time at hand when trying to evaluate 75 posts at once.

I realize we didn't justify the Voting very hard. Here's my offhand attempt, which maybe we'll roll into the actual post after chatting about it more on Monday.

LessWrong runs, for good or for ill, off the same forces much of the rest of the internet runs on: people who are slightly bored at work. Naturally, posts get rewarded mostly by upvotes and comments, which disproportionately reward things for being exciting and for being controversial (respectively). These are quite easy to goodhart on.

The Review (in general), and Voting (in particular) are an attempt to do a more nuanced thing – to take the accumulated taste of the LessWrong community, and use it to reflect hard on what was actually good, and then backpropagate that signal through people's more general sense of "what sort of posts are good to write and why?"

Without the Vote, the signal would basically be entirely "what the Mod Team Thinks Was Best", or, if we weren't doing this at all "what posts were memorable, and/or high karma". And this isn't ideal for a few reasons:

The Mod Team doesn't have domain expertise in all the areas that posts explore
Even though we're putting a lot of work into it, it's still a really daunting project to form opinions on all 75 posts. Having a mixture of people who've looked harder at different posts helps give more coverage of nuanced opinions.
Something something wisdom of crowds – each person is biased in some way, or has different knowledge. Getting many people to participate helps counterbalance various knowledge and biases that individuals have.

I meanwhile expect the voting here to be better than usual karma-voting, because it's more comparative. You're not just voting on "this post seems good!" but "this post seems better than this other post". What I found useful for my own voting was being forced to stop and think and build a model of what-sorts-of-posts-are-good-and-why.

When it comes to the interface I think it would be great if the interface would show me my past karma votes on the post. It's useful to have the information of how I found the post after reading it the first time at hand when trying to evaluate 75 posts at once.

Yeah, I definitely agree with this. I think we've put about as much work into the UI as we're going to this year (I originally budgeted a couple of weeks for the Review and ended up spending 1.5 months on it). But assuming it stays in roughly the same form next year, this is an obvious thing to include.

The right hand side of the interface isn't working for me - the scroll bar looks like it's scrolling down but the text doesn't move. I'm using Firefox.

Oh, I think a) that may get fixed (or at least replaced by newer and more interesting bugs) in another 30 minutes when our current deploy finishes, and b) in the meantime, try scrolling on alternate parts of the page? What I just found was that the scroll depended on whether my cursor was within a particular sub-window.

Working perfectly now :)

Alas, will look into it.

The link is 404 for me.

Should be fixed now.
