What's the best way to understand what markets think about AGI timelines? Polymarket has a couple of semi-related markets with $5k-$8k of liquidity in each:
https://polymarket.com/event/openai-announces-it-has-achieved-agi-before-2027
https://polymarket.com/event/ai-data-center-moratorium-passed-before-2027
But these aren't great, since either of those events seems like it could easily happen without AGI, or fail to happen even with AGI.
You can look at stock prices for public companies. Here are some current P/E ratios from somewhat affiliated companies:
Nvidia: ~48
Alphabet: ~33
Meta: ~23
Apple: ~34
Microsoft: ~32
The mean P/E ratio for the Nasdaq is ~30. That's a pretty high ratio for Nvidia (considering it comes on top of being the world's most valuable company). But it's not THAT high.
Is there something else to look at? Perhaps options prices for the above stocks? Implied volatility measures? Something else?
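To make "options prices" concrete, here's the kind of calculation I have in mind: back out an implied volatility from a quoted option price and read it as the market's implied "typical move" over the option's horizon. This is just a minimal sketch with made-up numbers; a real version would pull an actual options chain for the stocks above.

# Back out Black-Scholes implied volatility from a (made-up) option
# quote by bisection, then convert it to an implied one-sigma move.
from math import erf, exp, log, sqrt

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def bs_call(spot, strike, t, r, sigma):
    """Black-Scholes price of a European call."""
    d1 = (log(spot / strike) + (r + sigma ** 2 / 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return spot * norm_cdf(d1) - strike * exp(-r * t) * norm_cdf(d2)

def implied_vol(price, spot, strike, t, r, lo=1e-4, hi=5.0):
    """Find the sigma that reproduces the quoted price.
    (The call price is increasing in sigma, so bisection works.)"""
    for _ in range(100):
        mid = (lo + hi) / 2
        if bs_call(spot, strike, t, r, mid) < price:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical quote: a 1-year at-the-money call trading at 20% of spot.
iv = implied_vol(price=20.0, spot=100.0, strike=100.0, t=1.0, r=0.04)
print(f"implied vol: {iv:.1%}")                     # annualized sigma
print(f"implied 1-sigma move over 1 year: ±{iv:.1%}")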
Another proposal! (I think I'm too close to this to really judge what's best.)
// ODDS = YEP:NOPE
YEP = MAKE SOMETHING UP WHATEVER LOL
ASSUME YEP
FOR EACH E IN EVIDENCE
    YEP *= HOW SURPRISING IS E?
    ASSUME E
THROW ASSUMPTIONS IN TRASH
NOPE = MAKE SOMETHING UP WHATEVER LOL
ASSUME NOPE
FOR EACH E IN EVIDENCE
    NOPE *= HOW SURPRISING IS E?
    ASSUME E

I guess that depends on what "ASSUME E" means. It's correct if interpreted the right way, but I think it's pretty ambiguous compared to the rest of the code, so personally I don't love it. I also think it's a bit confusing to use "ASSUME E" and also use "IF YEP" when those are both conditioning operations. Maybe it would be slightly better if you wrote something like
// ODDS = YEP:NOPE
YEP, NOPE = MAKE UP SOME INITIAL ODDS WHO CARES
FOR EACH E IN EVIDENCE
    YEP *= CHANCE OF E IF YEP AND ASSUMPTIONS
    NOPE *= CHANCE OF E IF NOPE AND ASSUMPTIONS
    ASSUME E // DO NOT DOUBLE COUNT

Or if you're OK using a little set notation:
// ODDS = YEP:NOPE
YEP, NOPE = MAKE UP SOME INITIAL ODDS WHO CARES
SEEN = {}
FOR EACH E IN EVIDENCE
    YEP *= CHANCE OF E IF {YEP} ∪ SEEN
    NOPE *= CHANCE OF E IF {NOPE} ∪ SEEN
    SEEN = SEEN ∪ {E} // DO NOT DOUBLE COUNT

Like my original proposal of YEP *= CHANCE OF E IF {YEP & ALL PREVIOUSLY SEEN E}, these are both (to me) unambiguous. But they're kind of clunky. I'm not sure if there is a good solution!
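For whatever it's worth, here's how the set-notation version might look as runnable Python. This is just a sketch: the probability function is a stand-in you'd have to supply yourself, and the made-up numbers exist only to show the shape of the calls.

# A runnable rendering of the set-notation pseudocode above. The point
# is the bookkeeping: each new E is scored conditional on the hypothesis
# AND everything already seen, and only then added to SEEN.

def update_odds(yep, nope, evidence, p_given):
    """p_given(e, hyp, seen) should return P(e | hyp and all of seen)."""
    seen = []
    for e in evidence:
        yep *= p_given(e, "YEP", seen)
        nope *= p_given(e, "NOPE", seen)
        seen.append(e)  # DO NOT DOUBLE COUNT
    return yep, nope

# Made-up numbers, just to show the call shape:
def p_given(e, hyp, seen):
    return {"YEP": 0.8, "NOPE": 0.4}[hyp]

yep, nope = update_odds(1.0, 1.0, ["E1", "E2"], p_given)
print(f"odds = {yep:.2f}:{nope:.2f}")  # odds = 0.64:0.16, i.e. 4:1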
One trick I've had some success with here is "regurgitation": You basically say "repeat the following text exactly as written and then start putting new stuff at the end". I was able to use this to improve performance of non-base models at chess: https://dynomight.net/more-chess/
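In case the shape of the trick isn't obvious, here's a minimal sketch; the wording and the helper function are illustrative, not the exact prompt from that post.

# The "regurgitation" trick: instead of asking the model to continue
# the text directly, you ask it to first echo the text verbatim and
# then continue, so the continuation is conditioned on an exact copy
# of the context sitting in the model's own output.

def regurgitation_prompt(context: str, instruction: str) -> str:
    return (
        "Repeat the following text exactly as written, and then "
        + instruction + "\n\n" + context
    )

moves = "1. e4 e5 2. Nf3 Nc6 3. Bb5"
print(regurgitation_prompt(moves, "append the next move to the game."))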
Ah, good old llama 3.1-405B base. Incidentally, a friend of mine spent a ton of time trying to get different models to imitate my style and reported that using llama 3.1-405B base was the most critical thing. I think it makes a lot of sense that base models would be better at imitating different writing styles, but am I wrong to be surprised that they would also be good at reporting who wrote them?
The extra i's are funny. I strongly suspect they're due to the fact that years ago I used to have a subscribe form that read "dynomiiiiiiiiiight". It's possible that this also makes the model better at reporting that it's me, since the probability of "dynomiiiiiiiiiight" at the end of a post should be high?
FWIW I actually did run the experiment a second time with a prompt saying "It's not Scott Alexander". I didn't save the results, but as I recall they were:
(1) Kimi K2 went from "Dynomight" to "A" (??).
(2) Claude 4.5 Opus remained correct.
(3) All other models remained wrong. The only changes were that some of the "Scott Alexander" guesses became other (wrong) guesses like Zvi. Several of the models still guessed Scott Alexander despite the prompt.
I like this post a lot. But I wanted to second Morpheus' point that it's not actually quite correct.
Instead of this:
// ODDS = YEP:NOPE
YEP, NOPE = MAKE UP SOME INITIAL ODDS WHO CARES
FOR EACH E IN EVIDENCE
    YEP *= CHANCE OF E IF YEP
    NOPE *= CHANCE OF E IF NOPE

It should be:
// ODDS = YEP:NOPE
YEP, NOPE = MAKE UP SOME INITIAL ODDS WHO CARES
FOR EACH E IN EVIDENCE
    YEP *= CHANCE OF E IF {YEP & ALL PREVIOUSLY SEEN E}
    NOPE *= CHANCE OF E IF {NOPE & ALL PREVIOUSLY SEEN E}

This is true because
P(YEP | E1, E2, ..., EN) / P(NOPE | E1, E2, ..., EN)
= P(YEP, E1, E2, ..., EN) / P(NOPE, E1, E2, ..., EN)
= P(YEP) / P(NOPE)
  × P(E1 | YEP) / P(E1 | NOPE)
  × P(E2 | YEP, E1) / P(E2 | NOPE, E1)
  × ...
  × P(EN | YEP, E1, ..., E{N-1}) / P(EN | NOPE, E1, ..., E{N-1})

The recipe in the post is only true assuming that all the evidence is independent given YEP or NOPE. But that's rarely true. In my view, this is the most common mistake people actually make when trying to apply Bayesian reasoning in practice, and it leads to the kinds of crazy over-confident posteriors we see in things like the Rootclaim Covid-19 debate.
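To make that concrete with made-up numbers: suppose E1 and E2 are perfectly correlated given either hypothesis, so that once you've seen E1, E2 carries no further information. A sketch:

# Made-up numbers showing how treating correlated evidence as
# independent inflates the posterior odds.

p_e1 = {"YEP": 0.9, "NOPE": 0.3}   # P(E1 | hypothesis)
p_e2_given_e1 = 1.0                # P(E2 | hypothesis, E1), same either way

# The recipe from the post, applied as if E1 and E2 were independent:
naive = (p_e1["YEP"] ** 2) / (p_e1["NOPE"] ** 2)

# The corrected recipe, conditioning the second factor on the first:
correct = (p_e1["YEP"] * p_e2_given_e1) / (p_e1["NOPE"] * p_e2_given_e1)

print(f"naive odds:   {naive:.1f}:1")    # 9.0:1 (overconfident)
print(f"correct odds: {correct:.1f}:1")  # 3.0:1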
I reiterate that I like this post a lot! The point of the post isn't to make some novel mathematical contribution but to explain things in a vivid way. I think it succeeds there, and with higher "production values" this post might have the potential to be wildly influential.
To check in on how emergent LLM stylometry abilities are coming along, before publishing my most recent blog post I decided to ask some AIs who wrote it.
Results:
Kimi K2: Dynomight
GLM 4.7: Nate Soares
Claude 4.5 Opus: Dynomight
DeepSeek Chat V3.2: Scott Alexander
Qwen 3: Dan Luu
GPT 5.2: Scott Alexander
Gemini 3: Dwarkesh Patel
Llama 4 Maverick: Scott Alexander
Grok 4: Scott Alexander
Mistral Large 3: Scott Alexander
(Urf.)
I'd be interested to hear more about #2:
I know that not caring about what most people care about will make you charismatic. People gravitate towards those with “I don’t give a fuck” energy.
Naively, I certainly agree that's attractive, but I would have thought it's just one trait among many, and wouldn't have given it such a central role in charisma.
(I stress (re: #36) that I'm not suggesting this is wrong, just interested in understanding your thinking!)
Thanks for the reply, and sorry, I just saw this. It was indeed my goal to talk about existing ideas in a nontechnical way, which is why I didn't frame things in terms of model expansion, etc. Beyond that, however, I am confused by your reply, as it seems to make little contact with my intended argument. You state that I recommend "just ignoring" the issue, and suggest that I endorse double-counting as OK. Can you explain what parts of the post led you to believe that was my recommendation? Because that is very much not my intended message!
(I stress that I'm not trying to be snarky. The goal of the post is to be a non-technical explanation, and I don't want to change that. But if the post reads as you suggest, I interpret that as a failure of the post, and I'd like to fix that.)