alkexr

Comments

Improving on the Karma System

My gut feeling is that attracting more attention to a metric, no matter how good, will inevitably Goodhart it.

That is a good gut feeling to have, and Goodhart certainly does need to be invoked in this discussion. But the proposal is about using a different metric with a (perhaps) higher level of attention directed towards it, not just about directing more attention to the same metric. Different metrics create different incentive landscapes for optimizers (LessWrongers, in this case), and not all incentive landscapes are equal relative to the goal of a Good LessWrong Community (whatever that means).

I am not sure what problem you are trying to solve, and whether your cure will not be worse than the disease.

This last sentence comes across as particularly low-effort, given that the post lists 10 dimensions along which it claims karma has problems, and then evaluates the proposed system relative to karma along those same dimensions.

The value of low-conscientiousness people on teams

The way this topic appears to me, there are different tasks and considerations that require different levels of conscientiousness for an optimal solution. In this frame, one should simply apply the appropriate level of conscientiousness in every context, and trait conscientiousness is just a bias in one direction or the other that one should try to eliminate.

This frame is useful, because it opens up the possibility of doing things like "assess required conscientiousness for task", "become aware of bias", "reduce bias", etc. But it may also be wrong in an important way. It's somewhere between difficult and impossible to tell how much conscientiousness is required in any particular case; what's more, even what constitutes an optimal solution may not be obvious. In that frame, trait conscientiousness is not bias, but diversity that nature threw against the problem to do selection with.

I have trouble understanding why, in this case, everyone would need to have a consistently high or consistently low level of it across a wide range of contexts, and why, for example, one can't just try a range of approaches of varying conscientiousness levels and learn to optimize from the experience. None of the examples above requires a person with consistently low levels of it, just a person who in that particular case takes the low-conscientiousness approach. This way we could still fall back on the interpretation as a bias and blame nature for being inefficient, doing the evolution biologically instead of memetically.

The reverse Goodhart problem

I think it's an empirical observation.

The world doesn't just happen to behave in a certain way. The probability that all examples point in a single direction without some actual mechanism causing it is negligible.
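To put a hedged number on "negligible" (the figures are mine, purely for illustration): if each of $n$ independent examples were a priori equally likely to point in either direction, the probability that all of them point the same way by chance is

$$2 \cdot \left(\frac{1}{2}\right)^n = 2^{1-n},$$

which for even $n = 20$ examples is already about $2 \times 10^{-6}$.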

The reverse Goodhart problem

I ended up using mathematical language because I found it really difficult to articulate my intuitions. My intuition told me that something like this had to be true mathematically, but the fact that you don't seem to know about it makes me consider this significantly less likely.

If we have a collection of variables $x_i$, and $V = \sum_i x_i$, then $V$ is positively correlated in practice with most $U$ expressed simply in terms of the variables.

Yes, but $V$ also happens to be very strongly correlated with most $U$ that are equal to $V$. That's where you do the cheating. Goodhart's law, as I understand it, isn't a claim about any single proxy-goal pair. That would be equivalent to claiming that "there are no statistical regularities, period". Rather, it's a claim about the nature of the set of all potential proxies.

In Bayesian language, Goodhart's law sets the prior probability of any seemingly good proxy being a good proxy, and that prior is virtually 0. If you have additional evidence, like knowing that your proxy can be expressed in a simple way using your goal, then obviously the probabilities are going to shift.
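A minimal sketch of that Bayesian framing, with event names that are mine rather than from the thread: write $G$ for "the proxy is actually good" and $S$ for "the proxy seems good in the observed domain". Then

$$P(G \mid S) = \frac{P(S \mid G)\,P(G)}{P(S)},$$

and since actually good proxies have virtually no prior mass among seemingly good ones, $P(G \mid S) \approx 0$ even though $P(S \mid G)$ is high. Conditioning on extra evidence $E$, such as the proxy being a simple function of the goal, replaces this with $P(G \mid S, E)$, which can be much larger.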

And that's how your $V$ and $V'$ are different. In the case of $V$, the selection of $U$ is arbitrary. In the case of $V'$, the selection of $U$ isn't arbitrary, because it was already fixed when you selected $V'$. But again, if you select a seemingly good proxy $U$ at random, it won't be an actually good proxy.
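For concreteness, a sketch of the construction under discussion, assuming the post's definition of the mirrored goal as $V' = 2U - V$:

$$V' = 2U - V \quad\Longleftrightarrow\quad U = \frac{V + V'}{2},$$

so $U$ sits exactly halfway between $V$ and $V'$, and any in-domain correlation of $U$ with one is mirrored in the other. The asymmetry is purely in the order of selection: given $V$, the proxy $U$ ranges freely, whereas fixing $V'$ pins down $U = (V + V')/2$.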

The reverse Goodhart problem

You have a true goal, $V$. Then you take the set of all potential proxies that have an observed correlation with $V$; let's call this set $C_V$. By Goodhart's law, this set has the property that any $U \in C_V$ will with probability 1 be uncorrelated with $V$ outside the observed domain.

Then you can take the set $C_{V'}$ of all potential proxies that have an observed correlation with $V'$. This set will have the property that any $U' \in C_{V'}$ will with probability 1 be uncorrelated with $V'$ outside the observed domain. This is Goodhart's law, and it still applies.

Your claim is that there is one element, $U$ in particular, which will be (positively) correlated with $V'$. But such proxies still have probability 0. So how is that anti-Goodhart?

Pairing up $V$ and $V'$ to show equivalence of cardinality seems to be irrelevant, and it's also weird. $V'$ is an element of $C_V$, and it depends on $U$.

[This comment is no longer endorsed by its author]
The reverse Goodhart problem

Your $V'$ is correlated with $V$, and that's cheating for all practical purposes. The premise of Goodhart's law is that you can't measure your true goal well. That's why you need a proxy in the first place.

If you select a proxy at random with the only condition that it's correlated with your true goal in the domain of your past experiences, Goodhart's law claims that it will almost certainly not be correlated near the optimum. Emphasis on "only condition". If you specify further conditions, like, say, that your proxy is your true goal, then, well, you will get a different probability distribution.
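As a hedged illustration of that claim, here is a toy simulation (my own construction, not from the thread; all names in it are made up). In this linear-Gaussian setup a randomly selected in-sample-correlated proxy doesn't fully decorrelate at the optimum, but the value it buys is attenuated by roughly the correlation coefficient, which shows the direction of the effect:

```python
# Toy model: the true goal V and candidate proxies U are random linear
# functions of d underlying features. A proxy is accepted only if it
# correlates with V on "ordinary" samples, mimicking selection based on
# past experience. We then check how much V is attained at extreme U.
import numpy as np

rng = np.random.default_rng(0)
d = 50
w_true = rng.normal(size=d)            # true goal: V(x) = x @ w_true

def sample_correlated_proxy(min_corr=0.3, n=10_000):
    """Rejection-sample a random linear proxy whose in-sample
    correlation with V exceeds min_corr on ordinary Gaussian data."""
    x = rng.normal(size=(n, d))
    v = x @ w_true
    while True:
        w = rng.normal(size=d)
        if np.corrcoef(x @ w, v)[0, 1] > min_corr:
            return w

w_proxy = sample_correlated_proxy()
x = rng.normal(size=(200_000, d))      # fresh data from the same distribution
u, v = x @ w_proxy, x @ w_true

top_u = u > np.quantile(u, 0.999)      # points selected for extreme proxy value
top_v = v > np.quantile(v, 0.999)      # what optimizing V directly would find
print(f"corr(U, V) on ordinary data: {np.corrcoef(u, v)[0, 1]:.2f}")
print(f"mean V when optimizing V:    {v[top_v].mean():.1f}")
print(f"mean V when optimizing U:    {v[top_u].mean():.1f}")
```

Hard selection on the proxy recovers only about corr(U, V) times the V that direct optimization achieves; with heavier-tailed features or nonlinear proxies the falloff at the extreme can be much more severe.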

Why don't long running conversations happen on LessWrong?

Some frames worth considering:

  • Strong Prune, weak Babble among LessWrongers
  • Conversation failing to evolve past the low-hanging fruit
  • People being reluctant to express thoughts that might make their account look stupid in a way that's visible to the entire internet
  • Everyone can participate, and as the number of people involved in a conversation increases it becomes more and more difficult to track all positions
  • Even lurkers like me can attempt to participate, and it's costly in terms of conversational effort to figure out what background knowledge someone is missing
  • Most topics that appear on LessWrong are suited for mental masturbation only: they offer no obvious course of action through which people can decide to care about said topics
  • There is way, way too much content (heck, I've only skimmed through the comments under this post)
  • Long-running conversations don't tend to happen; therefore there is little incentive to delve deep into one topic, so people (well, me at least) end up engaging with more topics, but in a shallow manner; which in turn creates conditions where long-running conversations are less likely
  • Due to the way the platform is designed, the only real way to maintain a long-running conversation between persons A and B is a response pattern of ABABABAB..., so either person losing confidence at any point is a single point of failure

I also have a suggestion. After the discussion here inevitably fades, you could write another post in which you summarize the main takeaways and steelman a position or two that look valuable at that time. That might generate further discussion. Repeat. This way you could attempt to keep the flame of the conversation alive. But if you end up doing this, make sure to give the process a name, so that people realize that this is a Thing and that they are able to participate in this-Thing-specific ways.

What's your visual experience when reading technical material?

The first layer of internal visual experience I have when reading is a degree of synesthesia (letters have colors). Most of the time I'm not aware that this is happening. It does make recalling writing easier (I sometimes deduce missing letters, words or numbers from the color).

Then there is the "internal blackboard", which I use for equations or formulas. I use conscious effort to make the equation appear as a visual experience (in its written form). I can then manipulate this image as if the individual symbols or symbol groups were physical objects that can move and react with each other. This apparently allows me to solve more complex equations in my head than most mathematicians can. (I believe this is a learnable skill.)

Finally, there are the visual experiences that I use to understand concepts. I'm not sure how to describe these, because they certainly aren't images of things that are actually possible. They are more like structures of shapes, spatial relations and other "sub-visual" experiences. It's not that I can actually visualize an n-dimensional subspace, but it isn't simply a lower-dimensional analogue either: it looks thin, but with a vast inside, in a way that would be contradictory in "normal" visual experience.

Whenever I read about a concept that seems interesting (e.g. Moloch), I pause. Then I take the verbal experience of what I've read, and use it as a guide for some internal thought process to follow. The nature of this process is the creation and manipulation of impossible visual experiences of this kind.

These days my visualization is a lot fainter than it used to be, so faint in fact that sometimes I barely see anything at all, in spite of knowing what I'm (not) seeing. This includes my dreams, and maybe even waking experience (how would I tell?), and I believe this is unnatural. This only seems to have a negative effect on the "internal blackboard", but not on any of the other mechanisms I mentioned.

Zvi's Law of No Evidence

Absence of evidence of X is evidence of absence of X.

A claim about the absence of evidence of X is evidence of:

  • the speaker's belief about the listeners' belief in X,
  • absence of evidence of NOT X,
  • the speaker's intention to change the listeners' belief in X.

No paradox to resolve here.
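The first claim is standard Bayes; a one-line check (notation mine): by the law of total probability,

$$P(X) = P(X \mid E)\,P(E) + P(X \mid \neg E)\,P(\neg E),$$

so if observing $E$ would raise the probability of $X$, i.e. $P(X \mid E) > P(X)$, then failing to observe it must lower it: $P(X \mid \neg E) < P(X)$.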
