History of Rationality · Probability & Statistics · Site Meta
Personal Blog

History of LessWrong: Some Data Graphics

by Said Achmiz
16th Nov 2018

[-]habryka7y130

Huh, this looks somewhat different from the analytics graphs we've compiled recently, which look like this (the karma graph in the first image below is mostly just inflation caused by the new karma system, so it doesn't really mean much):

LW Analytics

In particular, the post over time graph looks different for Said's stuff, which is somewhat confusing, though I guess it kinda lines up. I notice that you do have some stuff in 2008, and it's aggregated weekly instead of monthly. We filtered out a bunch of stuff marked as spam on our side, but that shouldn't have even reached your side.

Here are also some graphs for number of unique commenters and posters (the vote spike is caused by some data import shenanigans, ignore that one):

LW Analytics 2

[-]Chris_Leong7y130

I actually appear twice on the list of most prolific authors - Casebash and Chris Leong are both me. Hopefully, I can get my two accounts combined at some point.

[-]cousin_it7y80

Said, thanks for doing this, and also for running GW!

It looks like #posts/day is propped up by attention seeking, so it stays stable; while #comments/post is a more genuine indicator of health, and shows a slow decline. Also it's curious that the relaunch of LW didn't affect metrics much. Wonder what the mods would say about this.

[-]Wei Dai7y250

I miss the days when there were hundreds of comments per day and I had to write my own JavaScript code to scrape and display all the recent comments (which I called LessWrong Power Reader) to keep up. It seems really strange that even the AI alignment posts these days are getting so few comments compared to, for example, the decision theory posts back then, even though there must be a lot more people working on or interested in AI alignment today than decision theory back then?

[-]Said Achmiz7y150

The thanks for running GW ought properly to go to clone of saturn, who both wrote the server code and administers the server. I’m just the front-end guy! :)

As for the graphs/data—you’re quite welcome! (And, by the way, if you, or anyone, have suggestions on what other relationships or patterns might be interesting to extract from this data and make a chart of, let me know and I’ll add it to the post.)

[-]habryka7y30

The median number of comments per post seems to have gone up a bit since the relaunch, at least according to the graphs I posted below. The average has probably stayed roughly the same, which I think is mostly caused by Scott moving to his own blog, and his posts continuing to create infinite amounts of comments.

Overall, I think the launch shows up relatively clearly on a bunch of the metrics I posted below, and it's too hard to eyeball scatter plots to judge from the plots above, though I agree that the overall effect of the relaunch on number of comments is relatively minor.
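
To illustrate why the median and the average number of comments per post can move differently, here is a minimal sketch. The comment counts are made up for illustration and are not taken from the LessWrong data; the point is only that a handful of very heavily discussed posts can pull the mean far away while barely moving the median.

```python
from statistics import mean, median

# Made-up comment counts, for illustration only (not real LessWrong data).
typical_posts = [2, 3, 4, 5, 6]

# The same posts plus one hypothetical post with a very long comment thread.
with_outlier = typical_posts + [600]

# Without the outlier, mean and median agree.
print("typical:", mean(typical_posts), median(typical_posts))

# With the outlier, the mean jumps while the median barely changes.
print("with outlier:", mean(with_outlier), median(with_outlier))
```

A site-wide average is therefore quite sensitive to whether a few prolific, heavily commented authors are present, which is one reason to look at both statistics.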

[-]Charlie Steiner7y20

Maybe? I think the user habits are pretty different on the site now compared to then. But I agree that more comments would be better :)

[-]Said Achmiz7y30

Thanks to Betawolf on IRC for this graph, which plots ratio of karma to comment count vs. time, and shows some interesting statistics:

[-]gjm7y30

That fitted curve looks pretty dubious in its earlier parts. (Maybe I'm misunderstanding and it isn't a fitted curve at all?)

[-]Said Achmiz7y30

It isn’t a fitted curve, no. It’s just a trend-line, which looks curved due to the log y-axis.

[-]gjm7y30

If it's a computed trend-line rather than something someone eyeballed then in my book that is a fitted curve. Anyway, that makes sense; presumably it goes below zero somewhere a little to the left of where it stops. Given the obvious discontinuity, it might have made more sense to plot separate lines for before and after...
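
A rough sketch of the point about the line stopping: a straight least-squares trend-line can only be drawn on a log y-axis where its value is positive, so it ends where it would cross zero. The data points below are invented for illustration, and `fit_line` is a hypothetical helper, not whatever was used to make the graph.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Invented data: low values early on, rising later.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, 2, 6, 9, 11]

slope, intercept = fit_line(xs, ys)

# The x value where the fitted line crosses zero; to the left of it the
# line is negative and has no position on a log axis.
zero_crossing = -intercept / slope
print("line crosses zero at x =", zero_crossing)

for x in xs:
    y = slope * x + intercept
    # log10 is only defined for positive values, so skip the rest.
    shown = f"{math.log10(y):.2f}" if y > 0 else "not drawable"
    print(x, shown)
```

On a log axis the drawable part of the straight line bends downward sharply as it approaches the zero crossing, which is why it can look like a curve.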

[-]Said Achmiz7y50

It’s computed, yes. And I defer to your grasp of the terminology!

And yes, separate trend-lines for before and after the red vertical line might have been good; if the graph’s creator makes such a version, I will post it.

[-]Said Achmiz7y30

Update: Here is another graph (also from Betawolf on IRC) that may address your questions/concerns:

And the same thing on a log scale:

[-]habryka7y30

Hmm, the shift there seems mostly caused by the changes to self-upvoting, as well as the increase in average karma weight. I think you might be able to adjust for that a bit by just using the vote count field (though obviously that ignores the difference between upvotes and downvotes). You could also resolve every individual anonymized vote and then count the upvotes to properly adjust for this, but that might be a bit of a pain.

[-]ChristianKl7y30

What's the unit of measurement for "most prolific author"? Karma?

[-]Said Achmiz7y30

https://www.dictionary.com/browse/prolific

It’s just number of posts written.

[-]ChristianKl7y50

And I take it comments aren't posts for that count.

[-]Said Achmiz7y30

No, posts are posts, and comments are comments.


Some graphs showing posting activity on LessWrong through the years.

NOTE: If you’re reading this post on GreaterWrong, you can click on the images to enlarge, zoom in, and click through them all as a slideshow.

Comments per post:

The same thing, on a log scale:

Posts per month:

The 100 most prolific authors over LessWrong’s lifespan:

The same thing, on a log scale:

Whose posts have generated the most total discussion?

As above, but on a log scale:

Data available in a Google Docs spreadsheet. (Or download in CSV format.)

You can also download an Excel spreadsheet, which contains the above graphs and some intermediate processed data.
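
For anyone who wants to recompute the aggregates behind these graphs from the CSV, here is a minimal sketch using only the Python standard library. The column names (`author`, `date`, `comment_count`), the ISO date format, and the sample rows are all assumptions made for illustration; check the header row of the real export and adjust accordingly.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in an assumed layout; the real export's column
# names and date format may differ.
SAMPLE_CSV = """author,date,comment_count
alice,2009-03-01,12
bob,2009-03-15,40
alice,2009-04-02,3
"""

def load_posts(text):
    """Parse CSV text into a list of dict rows, one per post."""
    return list(csv.DictReader(io.StringIO(text)))

def posts_per_month(posts):
    """Count posts by the 'YYYY-MM' prefix of the assumed ISO date column."""
    return Counter(p["date"][:7] for p in posts)

def most_prolific(posts, n=100):
    """Rank authors by number of posts written."""
    return Counter(p["author"] for p in posts).most_common(n)

def most_discussed(posts, n=100):
    """Rank authors by total comments across all of their posts."""
    totals = Counter()
    for p in posts:
        totals[p["author"]] += int(p["comment_count"])
    return totals.most_common(n)

posts = load_posts(SAMPLE_CSV)
print(posts_per_month(posts))
print(most_prolific(posts))
print(most_discussed(posts))
```

To run this against the downloaded file, read it with `open(path, newline="")` and pass the rows from `csv.DictReader` to the same functions.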

Edit 2018-11-16: Updated data; corrected some minor abnormalities caused by data retrieval issue. (If you’ve downloaded the data already, please re-download the corrected versions—the links are the same.)