I have read your link, and I understand you seem to have much more knowledge about statistics than I do. Perhaps I'm making a simple mistake somewhere in my reasoning. My thoughts are:
The 50% success horizon went from 1 to 4 hours. I interpreted your comment and post as positing that the increase is to a large degree due to training on cybersecurity to increase performance in the 2-4hr range. However, looking at the charts, it seems clear that the increase is not dominated by increased performance in the 2-4hr range.
My point isn't that there are enough samples or that it isn't possible to game the METR test. It's that the measured improvement for Claude Opus 4.5 doesn't seem to be primarily in the 2-4hr bracket; if anything, it's dominated by the 4-16hr bracket.
If I'm interpreting these charts correctly, there is a decent amount of progress in the 15m-1hr bracket, a small amount of progress in the 1hr bracket, a decent amount of progress in the 2-4hr bracket, and a large amount of progress in the 4-16hr bracket. It doesn't look like progress was dominated by the 1-4hr range.
Thought-provoking post, especially the section about the meaning of traditional food. I'm curious about the relationship between insulin resistance and obesity. If you can stay skinny, can you avoid the metabolism FUBAR you're talking about even if your diet is still westernised?
So, I have to admit I'm still confused. Is the ice-cream example fairly unrelated to the introduction and first chapter? They seem to be talking mostly about pure qualia, while the ice-cream example is talking about actions.
I agree qualia is entirely disconnected from rationality, but I think anything beyond qualia, such as actions or intending to take actions, is fair game for rationality, so to speak. I don't see an issue in Bryce assessing the rationality of Ash stopping for ice cream; it was his communication and social skills that were lacking.
After reading your post very carefully, I think you are agreeing with the above; I just had the opposite impression upon my first and second reading. Apologies if I'm still misunderstanding. Either way, I find this topic very interesting, so thanks for writing this post.
I also find the ice-cream example confusing because it's your main example, but it doesn't seem to support your main points. For example, replace Ash with a drug addict and ice cream with meth, and Bryce suddenly looks like a hero trying to stop his friend from making a big mistake.
I think the example below still makes Bryce look the way you seem to have intended, even if it were about meth.
Ash: “Ooh, I want ice cream.”
Bryce: “Seriously?”
Ash: “Well, I'm not actually going to buy some, I just want some.”
Bryce: “That’s the problem. You know it’s not good for you. You should want to be eating healthy food.”
Ash: “I mean, yeah, but right now I just really feel like ice cream.”
Bryce: “That's stupid.”
For anyone under 30, 10 years of political lying, or 20 years of journalists "lying", is a long time and could be seen as a new reality, especially as it seems to be working out quite well for the liars, so it's unlikely to change anytime soon.
Can you explain more? Could a lower-income worker without family nearby afford child care and full-time house help?
Personally, the idea of no free will doesn't negatively impact my mental state, but I can imagine it would for others, so I'm not going to argue that point. You should perhaps consider the positive impacts of the no-free-will argument: I think it could lead to a lot more understanding and empathy in the world. It's easy for most people to see someone making mistakes such as crime or obesity, or just being extremely unpleasant, and blame or hate them for "choosing" to be that way. If you believe everything is determined, I find it's pretty easy to reframe them as someone who was just unlucky enough to be born into the specific situation that led them to this state. If you are successful yourself, instead of being prideful of your superior will or soul, you can be humble and grateful for all the people and circumstances that allowed you to reach your position and mental state.
Thanks, that sated my curiosity nicely. Just so you know I'm not trying to pretend I've optimised my child's upbringing, just doing the best I can like most parents I know. I reckon your kids are lucky to have you.
The examples clearly favor Haiku for those who want a summary.
- ChatGPT: 674 words to conclude there is no clean answer.
- Haiku: 274 words and an unambiguous answer.