Rana Dexsin

You see either something special, or nothing special.

Comments

GPT-3 Catching Fish in Morse Code

It reminds me of the way human children going through language learning will often latch onto words or phrases to repeat and play with before moving on. (Possibly annoying the adults around them in the process.)

Do you consider your current, non-superhuman self aligned with “humanity” already?

That's part of the point, yes! My thought is that the parent question, while it's trying to focus on the amplification problem, kind of sweeps this ill-defined chunk under the rug in the process, but I'm not sure how well I can justify that. So I thought asking the subquestion explicitly might turn out to be useful. (I should probably include this as an edit or top-level comment after a bit more time has passed.)

Do you consider your current, non-superhuman self aligned with “humanity” already?

This is part of what I was getting at in terms of meta-questions. There have been attempts to solve this, of which CEV seems to be the one that comes up the most; I haven't personally found any of them strictly compelling. The parent question mixes the self-or-other-human thought experiment with the self-extrapolation part, but I wanted to see what the answers would be like without it—if it's testing for “does extrapolating yourself misalign you”, and if it means that comparatively, then surely something like a control group should be in play, even if I can't sort one out in a more ordered fashion.

Do you consider your current, non-superhuman self aligned with “humanity” already?

Mark “agree” on this comment if you would say that no, you do not consider yourself aligned with humanity.

Do you consider your current, non-superhuman self aligned with “humanity” already?

Mark “agree” on this comment if you would say that yes, you consider yourself aligned with humanity.

LessWrong Has Agree/Disagree Voting On All New Comment Threads

To derive from something I said as a secondary part of another comment, possibly more clearly: I think that extracting “social approval that this post was a good idea and should be promoted” while conflating the various forms of “agreement” is a better choice of dimensionality reduction than extracting “objective truth of the statements in this post” while conflating the various forms of “approval”. Note that the former makes this change a kind of “reverse extraction”: the karma system was meant to be centered around that one element to begin with and now has some noise removed, while the other elements now have a place to be rather than vanishing. That last part may be central to some disapprovals of the new system, along the lines of “amplifying the rest of it into its own number (rather than leaving it as an ambiguous background presence) introduces more noise than is removed by keeping the social approval axis ‘clean’” (which I don't believe, but I can partly see why other people might believe it).

Of Strange Loop relevance: I am treating most of the above beliefs of mine here as having primarily intersubjective truth value, which is similar in a lot of relevant ways to an objective truth value but only contextually interconvertible.

LessWrong Has Agree/Disagree Voting On All New Comment Threads

Strange-Loop relevant: the comment above is one where I went back and marked “disagree” on myself after Duncan's reply. What I meant by that is that I originally thought the idea I was stating was likely to be both true and relevant, but I have since changed my mind and think it is not likely to be true; I still don't think that making the comment in the first place was a bad idea given what I knew at the time (and thus I haven't downvoted myself on the other axis). However, I then remembered that retraction was also an option. I decided to use that too in this case, but I'm not sure it makes full sense here; there's something about the crossed-out text that gives me a different impression I'm not sure how to unpack right now. Feedback on whether that was a “correct” action or not is welcome.

LessWrong Has Agree/Disagree Voting On All New Comment Threads

Okay. I think I understand better now, especially how this relates to the “trust” you mention elsewhere. In other words, something more like: you think/feel that not locking the definition down far enough will lead to a lack of common knowledge about interpretation, combined with a more pervasive social need to understand the interpretation in order to synchronize? Or something like: this will have the same flaws as karma, only people will delude themselves that it doesn't?

LessWrong Has Agree/Disagree Voting On All New Comment Threads

I think an expansion of that subproblem is that “agreement” is evaluated along different modalities depending on the context of the comment. Having only one axis for it means the context can be chosen implicitly, which (to my mind) sort of happens anyway. Modes of agreement include truth in the objective sense but also observational (we see the same thing, which is not quite the same as the model-belief it generates), emotional (we feel the same response), axiological (we think the same actions are good), and salience-based (we both think this model is relevant—this is one of the cases where fuzziness versus the approval axis might come most into play). In my experience it seems reasonably clear for most comments which axis is “primary” (and I would just avoid indicating/interpreting on the “agreement” axis in cases of ambiguity), but maybe that's an illusion? Separating all of those out would be a much more radical departure from a single-axis karma system, and would impose even more complexity (and maybe rigidity?), but it might be worth considering what other ideas are around.

More narrowly, I think having “objective truth” as the only other axis might work well in some domains but would fail badly in a more tangled conversation, especially while partial models and observations are being thrown around—and that's an important part of group rationality in practice.

LessWrong Has Agree/Disagree Voting On All New Comment Threads

(Preamble: I am sort of hesitant to go too far in this subthread for fear of pushing your apparent strong reaction further. Would it be appropriate to cool down for a while elsewhere before coming back to this? I hope that's not too intrusive to say, and I hope my attempt below to figure out what's happening isn't too intrusively psychoanalytical.)

I would like to gently suggest that the mental motion of not treating disagreement (even when it's quite vague) as “being kicked”—learning to do some combination of regulating that feeling and not forming the association to begin with—is, at least for me, a central part of the practical reason for distinguishing discursive quality from truth in the first place. By contrast, a downvote in the approval sense is meant to be (though that doesn't mean it will consistently be treated as, of course!) the social nudge side—the negative-reinforcement “it would have been better if you hadn't posted that” side.

I was initially confused as well about how the four-pointed star version you suggested elsewhere would handle this, but combining the two, I think I see a possibility now. Would it be accurate to say that you have difficulty processing what feels like negative reinforcement on one axis when it is not specifically coupled with either confirmatory negative or relieving positive reinforcement on the other, and that your confusion around the two-axis system involves a certain amount of reflexive “when I see a negative on one axis, I feel compelled to figure out which direction it means on the other axis to determine whether I should feel bad”? Because if so, that makes me wonder how many people do that by default.

[This comment is no longer endorsed by its author]