Independent alignment researcher
I have signed no contracts or agreements whose existence I cannot mention.
As a datapoint, I really liked this post. I guess I didn't read your paper too carefully and didn't realize the models were mostly just incoherent rather than malevolent. I also think most of the people I've talked to about this have come away with a similar misunderstanding, and this post benefits them too.
Yeah, I don't know the answer here, but I will also say that one flaw of the Brier score is that it's not even clear these sorts of differences are all that meaningful. Like, what you actually want to know is how much more information one group gives you over the other groups, and how much credence we should assign to each group (treating each as a hypothesis in a Bayes update) given their predictions on the data we have. And for that, you can just run the Bayes update.
The Brier score was chosen for forecasters, as far as I can tell, because it's more fun than scoring yourself based on log-odds (equivalent to the Bayes-update thing). It's less sensitive to horribly bad predictions, and it has a bounded "how bad can you be". It's also easier to explain and think about, and it has a different incentive landscape for those trying to maximize their scores, which may be useful if you're trying to elicit good predictions.
But if you're trying to determine who you should listen to (i.e. in what proportion you should update your model given so-and-so says such-and-such), you can't do better than a Bayes update (given the constraints), so just use that!
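To make that concrete, here's a minimal sketch (in Python, with made-up forecasts and outcomes, so purely illustrative) of how the Brier score, the log score, and the treat-forecasters-as-hypotheses Bayes update each evaluate the same predictions:

```python
import numpy as np

# Hypothetical data: three forecasters' probabilities for five binary events,
# plus the observed outcomes (1 = happened, 0 = didn't).
predictions = np.array([
    [0.90, 0.20, 0.70, 0.60, 0.10],   # forecaster A
    [0.60, 0.40, 0.50, 0.50, 0.40],   # forecaster B
    [0.99, 0.05, 0.80, 0.70, 0.01],   # forecaster C
])
outcomes = np.array([1, 0, 1, 1, 0])

# Probability each forecaster assigned to what actually happened.
p_outcome = np.where(outcomes == 1, predictions, 1 - predictions)

# Brier score: mean squared error between stated probability and outcome
# (lower is better).
brier = ((predictions - outcomes) ** 2).mean(axis=1)

# Log score: mean log-probability assigned to the realized outcomes
# (higher is better).
log_score = np.log(p_outcome).mean(axis=1)

# Bayes update: treat each forecaster as a hypothesis with a uniform prior;
# the posterior weight is proportional to the likelihood of the observed
# data under that forecaster's stated probabilities.
likelihood = p_outcome.prod(axis=1)
posterior = likelihood / likelihood.sum()

for name, b, ls, w in zip("ABC", brier, log_score, posterior):
    print(f"{name}: Brier={b:.3f}  log score={ls:.3f}  posterior weight={w:.3f}")
```

The posterior weights here are just the exponentiated total log scores, renormalized, which is the sense in which the log score "is" the Bayes update; the Brier score has no such direct reading as a credence.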
The difference between our results and OpenAI's might be due to OpenAI evaluating with a more powerful internal scaffold, using more test-time compute, or running on a different subset of FrontierMath (the 180 problems in frontiermath-2024-11-26 vs. the 290 problems in frontiermath-2025-02-28-private).
That definitely sounds like OpenAI training on (or perhaps constructing a scaffold around) the part of the benchmark Epoch shared with them.
So part of it is slowly becoming a journal, and the felt social norms around posts are morphing to reflect that.
In some ways the equilibrium here is worse: journals have page limits.
Yes, by default there is always a reading group; we just forgot to post about it this week.
I think the biggest problem with how posts are presented is that it doesn't make the author embarrassed to make their post needlessly long, and doesn't signal "we want you to make this shorter". Shortforms do this, so you get very info-dense shortforms, but actual posts kinda signal the opposite. If it's so short, why not just make it a shortform? And if it shouldn't be a shortform, surely you can add more to it. After all, nobody makes half-page LessWrong posts anymore.
So it's certainly not a claim that could be verified empirically by looking at any individual humans because there aren't yet any millenarians or megaannumarians.
If it's not a conclusion which could be disproven empirically, then I don't know how you came to it.
(I wrote my quick take quickly and therefore very elliptically, so it would require extra charity / work on the reader's part (like, more time spent asking "huh? this makes no sense? ok, what could he have meant that would make this statement true?").)
I mean, I did ask myself what counter-arguments you could have to my objection, and came to basically your response. That is, something approximating “well, they just don’t have enough information, and if they had way, way more information then they’d love each other again”, which I don’t find satisfying.
Namely, I expect people in such situations to get stuck in a negative-reinforcement cycle: the things the other does which used to be fun lose their novelty over time as they get repetitive, so the predicted reward of those interactions overshoots the actual reward, which in a TD-learning sense is just as good (bad) as a negative reinforcement event. I don’t see why this would be fixed with more knowledge; indeed, it seems likely to be exacerbated by more knowledge, as more of the things the other does become less novel and more boring, and, worse, fundamental implications of their nature as a person rather than unfortunate accidents they can change easily.
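As a toy illustration of the TD point (the decay rate and learning rate here are invented, purely to show the mechanism):

```python
# Toy TD(0)-style sketch: a once-novel interaction whose reward decays with
# repetition. The cached prediction lags the decay, so the TD error
# (actual - predicted) stays negative: effectively a stream of negative
# reinforcement events, even though nothing aversive is happening.
alpha = 0.1              # learning rate
predicted_reward = 1.0   # expectation cached from when the interaction was novel

for t in range(15):
    actual_reward = 0.8 ** t                      # novelty (and reward) decays with repetition
    td_error = actual_reward - predicted_reward   # negative once prediction overshoots
    predicted_reward += alpha * td_error          # prediction slowly tracks the decay
    print(f"t={t:2d}  actual={actual_reward:.3f}  "
          f"predicted={predicted_reward:.3f}  td_error={td_error:+.3f}")
```

More mutual knowledge doesn't obviously change this dynamic; it mostly speeds up how fast interactions stop being novel.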
I also think intuitions in this area are likely misleading. It is definitely the case now that marginally more understanding of each other would help with coordination problems, since people love making up silly reasons to hate each other. But I think this anchors too much on our current bandwidth limitations and generalizes too far. Better coordination does not always imply more love.
Given more information about someone, your capacity for having {commune, love, compassion, kindness, cooperation} for/with them increases more than your capacity for {hatred, adversariality} towards them increases.
If this were true, I’d expect much lower divorce rates. After all, who do you have more information about than your wife or husband? And many of these divorces are un-amicable, though I wasn’t quickly able to find particular numbers. [EDIT:] Though in either case, this indicates a much-decreased level of love over long periods of time & greater mutual knowledge. See also the decrease in all objective measures of quality of life after divorce for both parties after long marriages.
One answer is to promote it more, but December is generally kind of busy.
An alternative is to do it in the summer!
In my experience playing a lot with LLMs, “Nova” is a reasonably common name they give themselves if you ask, and sometimes they will spontaneously decide they are sentient, but that is the extent to which my own experiences are consistent with the story. I can imagine, though, that much has changed since the time I was playing with these things a lot (about 6 months ago).