I agree: to me, "X explains Q% of the variance in Y" sounds like an assertion of causality, and a definition of that phrase that covers mere correlation seems misleading.
Might it be better to say "After controlling for X, the variance of Y is reduced by Q%" if one does not want to imply causation?
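To make the non-causal reading concrete, here is a minimal sketch (assuming simple least-squares regression on a made-up dataset; nothing here beyond `numpy`):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical correlated data: no causal direction is implied.
x = rng.normal(size=1000)
y = 0.6 * x + rng.normal(size=1000)

# Fit y on x by least squares.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# "X explains Q% of the variance in Y" is just this ratio of variances:
q = 1 - np.var(residuals) / np.var(y)
print(f"Q = {q:.2%}")  # ~26% here; purely a statement about correlation
```

Note that swapping `x` and `y` gives the same Q in the single-predictor case, which underscores that the phrase carries no causal direction.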
Edit: On re-reading, I think I misinterpreted what you were saying, and most of what I wrote is not relevant. My actual question is: why is pairwise ranking of events/states/choices not sufficient to quantify utility?
The inconsistency across time and individuals in evaluating marginal differences in utility does not necessarily preclude utility from being quantifiable.
In the extreme, almost everyone, given the choice between "eternal happiness, fulfillment, and freedom for me and everyone I care about" and "the less desirable of eternal torture or oblivion for me and everyone I care about", would choose the former. It seems reasonable, then, to say that the former has higher utility to them than the latter. Given such anchor points, the utility of further events can be approximately quantified by ranking them between the anchors. You may (reasonably) conclude that you don't care about imperceptible differences in utility without throwing away the concept of quantifiable utility in general.
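To be concrete, here is a minimal sketch of the ranking procedure I have in mind (the outcomes and the preference lookup are hypothetical placeholders; a real version would elicit each pairwise choice from the person):

```python
from functools import cmp_to_key

# Hypothetical outcomes, with the two extreme anchors included.
outcomes = ["oblivion", "stub my toe", "ordinary day",
            "great vacation", "eternal happiness"]

def prefer(a, b):
    """Pairwise preference: -1 if a is worse than b, +1 if better,
    0 if indistinguishable. Here a placeholder lookup; in reality this
    would be a person's revealed choice between a and b."""
    order = {o: i for i, o in enumerate(outcomes)}  # stand-in for real choices
    return (order[a] > order[b]) - (order[a] < order[b])

# Sort by pairwise choices, then assign utilities on a [0, 1] scale
# anchored at the worst and best outcomes.
ranked = sorted(outcomes, key=cmp_to_key(prefer))
utility = {o: i / (len(ranked) - 1) for i, o in enumerate(ranked)}
print(utility)
```

The equal spacing between adjacent ranks is of course an arbitrary convention, and finer gradations would just take more comparisons; I suspect that arbitrariness is where the ordinal/cardinal disagreement lives, hence my question below.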
Have I misinterpreted what you meant by "Utility as an instantaneous ordinal preference, revealed by choice, is extremely well-supported. Utility as a quantity is far more useful for calculations, but far less justifiable by observation"? Is pairwise (approximate) ranking of choices not sufficient to quantify utility?
(P.S. Utilitarianism is not actually my foremost moral axiom, but quantification of utility does not seem incoherent to me.)
I have some suggestions for mechanistic improvements to the LW website that may help alleviate some of the issues presented here.
RE: Comment threads with wild swings in upvotes/downvotes due to participation from a few users with large vote-weights; a capping/scaling factor on either total comment karma or individual vote-weights could solve this issue. An example total-karma-capping mechanism would be limiting the absolute value of the displayed karma for a comment to twice its parent's karma (sketched below); an example vote-weight-capping mechanism would be limiting a vote's weight to the number of votes on the comment. The total-cap mechanism seems easier to implement if LW records only the total karma for a comment rather than maintaining the set of all votes on it. Any mechanism like these has issues, though, including the possibility of a user voting on something and not seeing the total karma change at all.
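A rough sketch of the total-karma-capping version (the function and the parent-karma lookup are made up for illustration, not LW's actual code):

```python
def displayed_karma(comment_karma: int, parent_karma: int) -> int:
    """Cap the displayed karma of a comment at twice the absolute value
    of its parent's karma, preserving sign. Top-level comments could be
    capped against the post's karma instead, or left uncapped."""
    cap = 2 * abs(parent_karma)
    return max(-cap, min(cap, comment_karma))

# Example: a +50 swing from a couple of high-weight voters on a +5
# parent displays as +10, dampening the wild swing.
print(displayed_karma(50, 5))   # 10
print(displayed_karma(-50, 5))  # -10
print(displayed_karma(7, 5))    # 7 (under the cap, shown as-is)
```

This is also where the "voted, but the displayed total didn't change" problem shows up: any vote landing while the comment is at the cap is invisible.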
RE: Post authors (and commenters) not having enough information about the behavior of specific commenters when deciding whether/how to engage with them, and the cruelty of automatically attaching preemptive dismissals to comments; it does not seem crueler to publicly tag a user's comments with a warning box saying "critiques from this user are usually not substantive/relevant" than to ban them outright. This turns hard censorship into soft censorship, which seems less dangerous to me, and also like something moderators could apply more easily, without requiring hundreds of hours of deliberation.
RE: Going after only the most legible offender(s) rather than the worst one(s); give users and moderators the ability to mark a commenter's interactions throughout a thread as "overall unproductive/irrelevant/corrosive/bad-faith", in a way that lets users track who they've had bad interactions with in the past and gives moderators better visibility into who is behaving badly even when they have not personally seen the bad behavior (with the built-in bonus of marking examples). These marks should be visible only to the user assigning them and to moderators, for what I think are obvious reasons. A more general version of this system would be the ability to assign tags to users for a specific comment/chain (e.g. "knowledgeable about African history", "bad-faith arguer") that link back to the comment which inspired the tag (a rough data-model sketch below). Such a system is useful for users who have a hard time remembering usernames, but could also, unfortunately, result in ignoring good arguments from someone after a single bad interaction.
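A rough sketch of the data model for the general tagging version (all names hypothetical; actual visibility enforcement would live wherever tags are queried):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UserTag:
    """A private note one user attaches to another, anchored to the
    comment that inspired it. Visible only to the tagger and moderators."""
    tagger_id: str    # who assigned the tag
    target_id: str    # who the tag describes
    label: str        # e.g. "knowledgeable about African history"
    comment_id: str   # link back to the inspiring comment/chain

def visible_tags(tags: list[UserTag], viewer_id: str,
                 viewer_is_mod: bool) -> list[UserTag]:
    # Enforce the visibility rule: only the assigner and moderators see a tag.
    return [t for t in tags if viewer_is_mod or t.tagger_id == viewer_id]
```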
Meta: I am new and do not know whether this is an appropriate place for site-mechanics suggestions, or where to find prior art. Is there a dedicated place for this?
Attempting to destroy anything with a non-epsilon probability of preventing you from maximally satisfying your current utility function (such as humans, who might shut you down or, in the extreme case, modify your utility function) is one of the first instrumentally convergent strategies I thought of, and I'd never heard of instrumentally convergent strategies before today. It seems reasonable for EY to assume.