Oh, I wasn't claiming originality, just trying to give some background to people who might have stumbled here.
The way I think of the Lottery Ticket Hypothesis is in terms of the procedure for actually finding a lottery ticket: start with a randomly initialised network and train it. Then look at which weights are largest after training and keep only those (say the top 10% or 1%). Go back to the initial random network, drop all the weights that won't end up among the largest, and train that highly sparse network. You'll end up with close to the same final performance, even though you're using a much smaller network.
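The masking step above can be sketched in a few lines of numpy. This is a minimal illustration, not the original paper's implementation: the weight matrices, the stand-in for "training", and the 10% keep fraction are all placeholders.

```python
import numpy as np

def lottery_ticket_mask(trained_weights, keep_fraction=0.1):
    """Boolean mask keeping only the largest-magnitude weights after training."""
    flat = np.abs(trained_weights).ravel()
    k = max(1, int(flat.size * keep_fraction))
    threshold = np.partition(flat, -k)[-k]  # k-th largest magnitude
    return np.abs(trained_weights) >= threshold

rng = np.random.default_rng(0)
w_init = rng.normal(size=(100, 100))               # the random initialisation
w_trained = w_init + rng.normal(size=(100, 100))   # stand-in for weights after training

mask = lottery_ticket_mask(w_trained, keep_fraction=0.1)
# The "lottery ticket": surviving weights reset to their *initial* values,
# everything else zeroed out before retraining.
w_ticket = w_init * mask
```

The key move is that the mask comes from the trained weights, but the values that survive are the initial ones.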
This seems to mean that all those dropped weights never really had any effect on training. We might have imagined that weird nonlinear effects would make reaching a good final solution require the presence of weights that end up useless: those weights might have been "catalysts", helping the network arrive at its final solution without themselves being useful for prediction. But no, it seems that if weights are going to end up useless, they're useless from the beginning.
I largely agree that anger and emotional happiness (but not ultimate Happiness) are quite incompatible. Doing 1200 hours of meditation over two years has reduced both the frequency and the half-life of any anger I feel, and it made me realise just how unpleasant anger actually was. However, let me try to steelman anger:
First, very advanced meditators actually seem to report that even moments of anger are "perfect" in some way, just like all other moments. I've certainly seen Shinzen Young and other zen masters get somewhat angry, not at anyone, but to emphasise a particular point in an emotional way. So anger is certainly useful for communicating your values to an audience. Hearing someone say "I really care about this" in a neutral tone doesn't quite have the punch of someone getting mildly angry.
There also seems to be a relationship between anger and a sort of aggressive motivation. I notice that my bench press sets seem a lot easier if I get angry first, and they're also a lot more pleasant: being angry without doing anything is unpleasant, but expending that "angry energy" towards a goal does seem to be pleasant. Being angry about things like the existence of cancer can certainly help with motivation to solve those problems.
Does FTX let you do the Perpetual Future Arbitrage trade on margin? I was seeing 20x leverage offered without additional fees on their website; that would come out to something like a 50% monthly gain, which seems batshit insane to me.
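For concreteness, here is the back-of-the-envelope arithmetic behind that 50% figure. The hourly funding rate is an assumed number chosen to match the comment, not FTX's actual rate, and this ignores fees, liquidation risk, and funding-rate variation.

```python
# Leveraged funding-rate carry: collect funding on the short perp leg
# while holding spot, amplified by margin.
hourly_funding_rate = 0.000035   # assumed 0.0035%/hour paid to the short side
leverage = 20
hours_per_month = 24 * 30

monthly_unlevered = hourly_funding_rate * hours_per_month   # ~2.5%
monthly_levered = monthly_unlevered * leverage              # ~50%

print(f"unlevered: {monthly_unlevered:.2%} / month")
print(f"levered:   {monthly_levered:.2%} / month")
```

So even a modest-looking hourly funding rate compounds into an implausibly large number once leverage is applied, which is exactly why the figure looks too good to be true.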
I agree with the argument conditional on us not being fundamentally confused about anthropics, but as with most of these sorts of arguments, most of my uncertainty is coming from what is meant by "reference class" and "observer". I have the nagging feeling that all of anthropics stems from us not being able to properly define what an observer actually is. And I think it's possible that observers actually don't exist and we're completely confused about all of this.
Though any belief so extreme wouldn't really feel like a "belief" in the colloquial sense: I don't internally label my belief that there is a chair under my butt as a "belief". That label instinctually gets used for things I am much less certain about, so most normal people doing an internal search for "beliefs" will only think of things that they are not extremely certain of. Most beliefs worth having are extreme, but most beliefs internally labelled as "belief" worth having are not extreme.
In the specific scenario of "Mark Xu asks to bet me about the name on his driver's license", my confidence drops immediately because I start questioning his motives for offering such a weird bet.
Though even there, his lectures are famous for only being truly appreciated after you've first learned the material elsewhere. They are incredibly good at giving you the feeling of understanding but quite a bit less good at actually teaching problem-solving. When reading them, it was a common occurrence for me to read a chapter and believe the subject was the most straightforward and natural thing in the world, only to be completely mystified by the problems.
I expect any version of "align narrowly superhuman models" which evaluates the success of the project entirely by human feedback to be completely and totally doomed: at best useless, and at worst actively harmful to the broader project of alignment.
There are plenty of problems where evaluating a solution is way way easier than finding the solution. I'm doubtful that the model could somehow produce a "looks good to a human but doesn't work" solution to "what is a room-temperature superconductor?". I agree that for biological problems the issue is much more concerning, and certainly for any kind of societal problem, but as long as we stay close to math, physics and chemistry, "looks good to a human" and "works" are pretty closely related to each other.
As defined, I think my cheerful price for many purposes would be extremely high, like $50 for giving you the cup of coffee I just bought from the Starbucks across the street. However, it just seems rude to name a price that high to a friend; my instinct not to offend a friend drives down the price I would actually say. Maybe you are trying not to expend friendship capital by asking for my cheerful price, but naming a high price feels to me like I'm expending friendship capital. In fact, some part of me might resent you for asking me to name a cheerful price at all, so you're expending friendship capital just by asking. If I do actually want to make the trade, I'm also weighing the likelihood that you'll stop bargaining once you find out my cheerful price is too high, which drives the number I'll say still lower. My point is that it's basically impossible not to expend friendship capital when asking someone to name any price at all.