LESSWRONG

shawnghu
shawnghu's Shortform (6mo)
Will Any Crap Cause Emergent Misalignment?
shawnghu, 2d

I wonder if you're referring to the "spurious rewards" paper. If so, I wonder if you're aware of [this critique](https://safe-lip-9a8.notion.site/Incorrect-Baseline-Evaluations-Call-into-Question-Recent-LLM-RL-Claims-2012f1fbf0ee8094ab8ded1953c15a37) of its methodology, which might be enough to void the result.

Will Any Crap Cause Emergent Misalignment?
shawnghu, 2d

I think the critique generalizes if it's a little more focused. If a huge number of papers arose that just demonstrated that EM arose in a bunch of settings that varied superficially without a clear theory of why, this post would be a good critique of that phenomenon.

Bring back the Colosseums
shawnghu, 2mo

How do you feel about mutual combat laws in Washington and Texas, where you can fight by agreement (edit: you can't grievously injure each other, apparently)?

Bring back the Colosseums
shawnghu, 2mo

I find it absurd on priors to think that soccer of any demographic could result in more concussions than any of those five full-contact sports, particularly the three where part of the objective is explicitly to hit your opponent in the head very hard if you can. (Even factoring in the fact that you do a bunch of headers in soccer.) (Maybe if you do some trickery like selecting certain subpopulations of the practitioners of these sports, but...)

Mech interp is not pre-paradigmatic
shawnghu, 2mo

I don't disagree in general with the claim that words can be useful for coordinating about natural ideas. What I'm missing is an understanding of which particular natural idea here isn't already captured by "mech interp lacks good paradigms".

Is anything which lacks a good+relevant paradigm by default "pre-good-relevant-paradigm", or is there more subtlety to the idea?

Mech interp is not pre-paradigmatic
shawnghu, 2mo

> So Parameter Decomposition in theory suggests solutions to the anomalies of Second-Wave Mech Interp. But a theory doesn’t make a paradigm.

Nitpicking a little bit here, I think this is a different use of the word "theory" than the use in the phrase "scientific theory". One could think you mean the latter in its second usage here, but it seems like you're making a claim more like "these things could make progress explaining some of these things, if the experiments go well".


> The requirement that the parameter components sum to the original parameters also means that there can be no ‘missing mechanisms’. At worst, there can only be ‘parameter components which aren’t optimally minimal or simple’.

Echoing a part of Adam Shai's comment, I don't see how this is different from the feature-based case. Won't there be a problem if you extract a bunch of parameter components you "can explain", and then you're left with a big one you "can't explain", which "isn't optimally minimal or simple"?

> Another attractive property of Parameter Decomposition is that it identifies Minimum Description Length as the optimization criterion for our explanations of neural networks

Why is this an attractive property? (Serious question.)

Mech interp is not pre-paradigmatic
shawnghu, 2mo

What's the distinction between what you're pointing at and the statement that mech interp lacks good paradigms? I think the latter statement is true and descriptive, but I presume you want to say something else.

When should you read a biography?
shawnghu, 2mo

Sorry, yeah, it was badly worded.

  • Being able to discern what makes someone an expert at X is a skill, Y.
  • People who are good at X aren't necessarily good at Y; Y is a separate skill. (Skill in Y generalizes somewhat across different values of X.)
  • One needs to look for authors that somehow are good at Y; I didn't specify how you could do this, and maybe there's not a very good way in general. (But I do like the Caro biographies. But also, maybe I like them for their entertainment value.)

Re: self-help books, I mostly share your position in thinking that ~80% of such books could be a paragraph to a page, ~18% of them could be blog posts of varying length, and only the remaining ~2% have something substantial to say from a pure informational standpoint. (Worse, in many cases, padding the length of a self-help book actively makes it worse/less coherent.) Moreover, I agree that of the good-ish 20%, there is a lot of overlap in the prescriptions given, implied or otherwise. I think that even when a book of this type is done "well", the purpose of most of the text isn't to be of maximum entropy (or something like it) in distinguishing world models, but to give a bunch of perspectives on a small set of ideas, in the hope that one of them sticks particularly well or that the cumulative exposure makes the idea stick with you better. Spaced repetition or other ritualistic behaviors might achieve the same thing, but they require more active agency on your part.

I happen to like The Inner Game of Tennis in particular, and feel that its overlap in useful advice with other books in the genre is relatively low, though I might have a hard time defending my taste explicitly.

When should you read a biography?
Answer by shawnghu, Jun 18, 2025

I like Viliam's comment and think that it largely depends on the biography; consider that one internet rule that says that 90% of everything is crap (more specifically, I don't think that people are by default skilled or diligent in discerning what the "true" factors in developing skill or success are, including experts, and this discernment is in itself a skill that you need to look for). You have to select for biographies that have the characteristics you want, which naturally takes more work to discriminate. More broadly, I don't think there is any systematic answer to your question of whether, for a given story, the named factors are true. For a lot of life wisdom, unfortunately, at the base level the applicability of various stories has to filter through a vibes/intuition layer because lives are so different and the world changes so fast.

That aside, I think there is another nice benefit to reading a biography rather than just taking away the list of advice, which is that the human brain likes stories and characters, and that makes the given advice much more vivid/salient and therefore likelier to make a difference in your end behavior.

One famous example in the category of biographies is the set written by Robert Caro. The author has undoubtedly gone to painstaking lengths to investigate causes extremely thoroughly and in a mostly epistemically virtuous way, but he himself would admit that his works present information within the framework of a narrative that he assembled (he would likely also say that this narrative is "true"). (The alternative is to present a bunch of facts in order, and those lack salience without some kind of overarching narrative.)

Finally, I wonder if you really feel that e.g. The Inner Game of Tennis doesn't have any substantive information (i.e., that it is fundamentally just willing the reader into believing in a self-fulfilling prophecy).

Kabir Kumar's Shortform
shawnghu, 3mo

I do think that $200-$400 seem like reasonable consulting rates.

I think the situations with family are complicated, because sure, there are social/cultural reasons one might be expected to do those things for family. Usually people hold those cultural norms alongside a stronger distinction between the ingroup (family) and the outgroup (all other people by default), though, so letting your impressions from that culture teach you things about how to behave in a culture with a weaker distinction might be maladaptive.

(I actually was suggesting you try asking for objectively completely unreasonable things just to look at the flinch. For example, you could ask a stranger for $100 for no reason. They would say no, but no harm would be done.)

One frame that might be useful to you is that, in a way, it is imperative to at least sufficiently assert your value to others (if not to overassert it by the socially expected amount). An overly modest estimate is still a miscalibrated one, and people will make suboptimal decisions as a result. (Putting aside the behavior of, and surpluses given to, other people, you are also a player in this game, and your being underallocated resources is globally suboptimal.)

Disentangling Perspectives On Strategy-Stealing in AI Safety (4y)