I read that this "spoiled meat" story is pretty overblown. And it doesn't pass the sniff test either. Most meat was probably eaten right after slaughter, because why wouldn't you?
Also, herbs must have been cheaply available. I also recently learned that every household in medieval Europe had a mother of vinegar.
I played a game against GPT-4.5 today, and it seemed to be the strongest LLM I have played so far. It didn't hallucinate once, didn't blunder, and reached a drawn endgame after 40 moves.
What helps me overcome the initial hurdle of starting work in the morning:
Also:
I think it also helps to take something you are good at and feel good about, and in that context take responsibility for something and/or interact with or present to people. Only this kind of social success builds the confidence to overcome social anxiety; directly attempting the social situations you feel worst about usually backfires (at least for me).
Which is exactly what I am doing in the post? By saying that the question of consciousness is a red herring, i.e. not that relevant to the question of personhood?
No.
The argument is that feelings, or valence more broadly, require additional machinery in humans (amygdala, hypothalamus, etc.). If that machinery is missing, the pain/fear/.../valence is missing, even though the sequence learning works just fine.
AI lacks this machinery, therefore it is extremely unlikely to experience pain/fear/.../valence.
It's probably just a difference in tokenizers. Tokenizers often produce tokens with trailing whitespace. I actually once wrote a tokenizer and trained a model to predict "negative whitespace" for the cases where a token, for once, shouldn't have trailing whitespace. But I don't know how current tokenizers handle this; probably in different ways.
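To illustrate the idea (this is a toy sketch, not any real tokenizer's behavior; the `<no_ws>` token name is made up): tokens carry a trailing space by default, and a special "negative whitespace" token deletes the space the previous token added.

```python
# Toy "negative whitespace" scheme: every normal token implies a trailing
# space; the special <no_ws> token removes the space added by the
# previous token, so punctuation can attach without its own whitespace.
NO_WS = "<no_ws>"

def detokenize(tokens):
    out = []
    for tok in tokens:
        if tok == NO_WS:
            # Strip the trailing space the previous token contributed.
            if out and out[-1].endswith(" "):
                out[-1] = out[-1][:-1]
        else:
            out.append(tok + " ")
    return "".join(out).rstrip()

print(detokenize(["Hello", NO_WS, ",", "world", NO_WS, "!"]))
# → Hello, world!
```

Real BPE tokenizers typically solve the same problem the other way around, by baking the whitespace into the token vocabulary itself (e.g. separate tokens for `world` and ` world`), which is why two models can disagree on where the spaces live.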
I originally thought that the METR results meant that this year or next might be the breakthrough moment for AI coding agents. The reasoning was that if the trend holds, AI coding agents will be able to complete several-hour-long tasks with a certain probability of success, which would suddenly make the overhead and cost of using an agent very economically viable.
I now realise that this argument has a big hole: all the METR tasks are timed for unaided humans, i.e. humans working without LLMs. This means that, especially for the tasks that AI coding agents can complete successfully, the actual time an LLM-aided human would need is much shorter.
I'm not sure how many task-completion-time doublings this buys before AI coding agents take over a large part of coding, but the farther we extrapolate from the existing data points, the higher the uncertainty that the trend will hold.
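As a rough back-of-the-envelope sketch (the 2x speedup and the ~7-month doubling time are illustrative assumptions, not study results): if AI assistance speeds humans up by a factor s, the agent's effective task horizon shrinks by log2(s) doublings, which translates into a delay of log2(s) doubling periods.

```python
import math

# Hypothetical numbers, for illustration only.
speedup = 2.0          # assumed: AI-aided humans finish tasks 2x faster
doubling_months = 7.0  # assumed: agent task horizon doubles every ~7 months

# If benchmark tasks were timed against unaided humans, an s-fold human
# speedup means the agent must cover log2(s) extra doublings to reach
# the same economic break-even point.
doublings_lost = math.log2(speedup)
delay_months = doublings_lost * doubling_months

print(doublings_lost)  # → 1.0
print(delay_months)    # → 7.0
```

Under these made-up numbers, a uniform 2x human speedup would push the break-even point out by only one doubling, i.e. about seven months; the real adjustment depends on how the speedup varies across tasks.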
Estimating task completion times for AI-aided humans would have been an interesting addition to the study. Correlating the time savings from AI support with the task completion probability of AI coding agents might have allowed a prediction of the actual economic competitiveness of AI coding agents in the near future.
I meant chess-specific reasoning.
Is there already a METR evaluation of Claude 4?