Also known as Max Harms. (I post AI alignment content under my other account.)

Not the same person as MaxH!



Just wanted to remind folks that this is coming up on Saturday! I'm looking forward to seeing y'all at the park. It should be sunny and warm. Feel free to send me requests for snacks or whatever.


Is there a minimal thing that Claude could do which would change your mind about whether it’s conscious?

Edit: My question was originally aimed at Richard, but I like Mikhail’s answer.

Thanks! The creators also apparently have a Substack: https://forecasting.substack.com/

If you have multiple quality metrics then you need a way to aggregate them (barring more radical proposals). Let’s say you sum them (the specifics of how they combine are irrelevant here). What has been created is essentially a 25-star system with a more explicit breakdown. This is essentially what I was suggesting. Rate each post on 5 dimensions from 0 to 2, add the values together, and divide by two (min 0.5), and you have my proposed system. Perhaps you think the interface should clarify the distinct dimensions of quality, but I think UI simplicity is pretty important, and am wary of suggesting having to click 5+ times to rate a post.
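To make the arithmetic concrete, here is a minimal sketch of that aggregation. The function name `star_rating` and the restriction to integer scores are assumptions for illustration only; the rule itself (five dimensions scored 0 to 2, summed, halved, floored at 0.5) is the one described above.

```python
def star_rating(dimension_scores):
    """Collapse five 0-2 dimension scores into a single half-star rating.

    Hypothetical helper: sum the scores, divide by two, and clamp the
    result to a minimum of 0.5 stars.
    """
    if len(dimension_scores) != 5:
        raise ValueError("expected exactly 5 dimension scores")
    if any(score not in (0, 1, 2) for score in dimension_scores):
        raise ValueError("each dimension score must be 0, 1, or 2")
    return max(0.5, sum(dimension_scores) / 2)


print(star_rating([2, 2, 2, 2, 2]))  # top marks on every dimension
print(star_rating([1, 2, 0, 1, 1]))  # a middling post
print(star_rating([0, 0, 0, 0, 0]))  # clamped to the 0.5-star floor
```

The output always lands on one of the ten half-star values from 0.5 to 5, so the five-dimension breakdown and the single star rating carry the same range of outcomes.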

I addressed the issue of overcompensating in an edit: if the community rating is a weighted median rather than a weighted mean, then users are incentivized to select their true rating. Good thought. ☺️

Thanks for your support and feedback!

I agree that there are benefits to hiding karma, but it seems like there are two major costs. The first is in reducing transparency; I claim that people like knowing why something is selected for them, and if karma becomes invisible the information becomes hidden in a way that people won’t like. (One could argue it should be hidden despite people’s desires, but that seems less obvious.) The other major cost is one cited by Habryka: losing common knowledge. Visible karma scores help people gain a shared understanding of what’s valued across the site. Rankings aren’t sufficient for this, because they can’t distinguish relative quality from absolute quality (e.g., I’m much more likely to read a post with 200 karma than one with 50, even if the 200-karma post is ranked lower due to staleness).

I suggested the 5-star interface because it's the most common way of giving things scores on a fixed scale. We could easily use a slider, or a number between 0 and 100 from my perspective. I think we want to err towards intuitive/easy interfaces even if it means porting over some bad intuitions from Amazon or whatever, but I'm not confident on this point.

I toyed with the idea of having a strong-bet option, which lets a user put down a stronger QJR bet than normal, and thus influence the community rating more than they would by default (albeit exposing them to higher risk). I mainly avoided it in the above post because it seemed like unnecessary complexity, although I appreciate the point about people overcompensating in order to have more influence.

One idea that I just had is that instead of having the community rating set by the weighted mean, perhaps it should be the weighted median. The effect of this would be such that voting 5-stars on a 2-star post would have exactly the same amount of sway as voting 3.5, right up until the 3.5 line is crossed. I really like this idea, and will edit the post body to mention it. Thanks!
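As a rough illustration of the difference, here is a minimal sketch comparing the two aggregation rules. Everything in it is an assumption for the example: the function names, the equal weights, and the sample votes are made up, and the median shown is the simple "lower" weighted median rather than any particular site's implementation.

```python
def weighted_mean(ratings, weights):
    """Weighted arithmetic mean of the ratings."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(r * w for r, w in zip(ratings, weights)) / total


def weighted_median(ratings, weights):
    """Lower weighted median: the smallest rating at which the
    cumulative weight reaches half of the total weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    cumulative = 0.0
    for rating, weight in sorted(zip(ratings, weights)):
        cumulative += weight
        if cumulative >= total / 2:
            return rating
    raise ValueError("no ratings supplied")


# Hypothetical existing votes on a roughly 2-star post, all equally weighted.
existing = [1.5, 2.0, 2.0, 2.0, 2.5]

for new_vote in (3.5, 5.0):
    ratings = existing + [new_vote]
    weights = [1.0] * len(ratings)
    print(new_vote, weighted_mean(ratings, weights), weighted_median(ratings, weights))
```

With these sample votes, the mean moves to 2.25 after a 3.5-star vote and to 2.5 after a 5-star vote, while the median stays at 2.0 in both cases. Exaggerating a vote buys extra sway under the mean but not under the median.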

I agree with the expectation that many posts/comments would be nearly indistinguishable on a five-star scale. I'm not sure there's a way around this while keeping most of the desirable properties of having a range of options, though perhaps increasing it from 10 options (half-stars) to 14 or 18 options would help.

My basic thought is that if I can see a bunch of 4.5 star posts, I don't really need the signal as to whether one is 4.3 stars vs 4.7 stars, even if 4.7 is much harder to achieve. I, as a reader, mostly just want a filter for bad/mediocre posts, and the high-end of the scale is just "stuff I want to read". If I really want to measure the difference, I can still see which posts are more uncontroversially good, and also which have received more gratitude.

I'm not sure how a power-law system would work. It seems like if there's still a fixed scale, you're marking down a number of zeroes instead of a number of stars. ...Unless you're just suggesting linear voting (ie karma)?

Ah! This looks good! I'm excited to try it out.

Yep. I'm aware of that. Our karma system is better in that regard, and I should have mentioned that.

