P(doom) = 50%. It either happens, or it doesn't.
Lao Mein | Statistics is Hard. | Patreon
I give full permission for anyone to post part or all of any of my comments/posts to other platforms, with attribution.
Currently doing solo work on glitch tokens and tokenizer analysis. Feel free to send me job/collaboration offers.
DM me interesting papers you would like to see analyzed. I also specialize in bioinformatics.
Turns out, the answer to my question is a clear "no". The white king has up to 7 moves, and the rook 14. In about half of the positions, the black king is in diagonal contact with the white rook. This means that 4 of the 14 rook moves and all but 2 king moves blunder the rook. In addition, white requires up to 16 moves to get to checkmate. The only positive factor is the slightly lower number of legal moves per position (~20 vs ~32), but a much higher proportion of moves in the single-rook endgame are blunders for white, and that more than cancels out any upside.
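As a rough sanity check of the "most random moves hang the rook" part, you can sample K+R vs K positions with python-chess and count how many legal white moves let black simply take the rook next turn. The sampling setup below is my own quick sketch, not a proper endgame analysis:

# Rough Monte Carlo sketch (my own setup): sample random K+R vs K positions
# and count how many random white moves let black capture the rook next turn.
import random
import chess

def random_krk_board():
    """White king + rook and black king on random squares, white to move."""
    while True:
        wk, wr, bk = random.sample(chess.SQUARES, 3)
        board = chess.Board(None)
        board.set_piece_at(wk, chess.Piece(chess.KING, chess.WHITE))
        board.set_piece_at(wr, chess.Piece(chess.ROOK, chess.WHITE))
        board.set_piece_at(bk, chess.Piece(chess.KING, chess.BLACK))
        board.turn = chess.WHITE
        if board.is_valid():  # rejects adjacent kings, black already in check, etc.
            return board

hangs, total = 0, 0
for _ in range(2000):
    board = random_krk_board()
    for move in list(board.legal_moves):
        total += 1
        board.push(move)
        # The only capture black can ever make here is king takes rook,
        # and it is only legal if the rook ends up undefended.
        if any(board.is_capture(reply) for reply in board.legal_moves):
            hangs += 1
        board.pop()

print(f"{hangs}/{total} = {hangs / total:.0%} of random white moves hang the rook")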
Consider the relatively simple two-rook endgame.
White has 2 rooks and makes random moves. Black only has a king. We'll ignore the white king for now, but I'm pretty sure including it makes things worse for White.
Each rook has 14 possible moves, for a total of 28 rook moves. One of those 28 progresses towards a winning state, one regresses it, and 3 blunder a rook.
The Markov chain for this converges to a rook blunder being ~435x more likely than a checkmate. If black tries to hunt down the rooks, this gets even worse.
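For concreteness, here's a minimal sketch of that chain. The number of net progressing moves needed to reach mate (K below) is an assumption on my part; K = 4 happens to reproduce the ~435x figure:

# Minimal sketch of the chain above. Per random move: progress with p=1/28,
# regress with p=1/28, blunder a rook with p=3/28, otherwise nothing changes.
# K, the number of net progressing moves needed to mate, is my assumption.
import numpy as np

p_fwd, p_back, p_blunder = 1 / 28, 1 / 28, 3 / 28
K = 4

# f[s] = P(mate before blundering | s progressing moves already banked).
# First-step analysis: f[s] = p_fwd*f[s+1] + p_back*f[max(s-1,0)] + p_neutral*f[s],
# with f[K] = 1 and a blunder being absorbing at 0.
A = np.zeros((K, K))
b = np.zeros(K)
for s in range(K):
    A[s, s] += p_fwd + p_back + p_blunder   # the (1 - p_neutral) * f[s] term
    if s + 1 < K:
        A[s, s + 1] -= p_fwd
    else:
        b[s] += p_fwd                        # reaching K is mate, f[K] = 1
    A[s, max(s - 1, 0)] -= p_back            # regressing below 0 just stays at 0

f = np.linalg.solve(A, b)
p_mate = f[0]
print(f"P(mate) = {p_mate:.4f}, blunder:mate ratio ~ {(1 - p_mate) / p_mate:.0f}x")
# -> roughly 435x under these assumptions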
Thus, an impossibly big advantage against Stockfish is still extremely unlikely to convert into a win.
I don't think the other endgames are massively different - the numbers of possible moves and blunder-moves are roughly the same, although progression is more likely to be completely lost to a random move.
My question is: does this imply that a double-rook endgame is more likely to checkmate after losing a rook than before?
The Chinese firewall works on a blacklist basis, and it often takes months for even popular new sites to be banned. AI2027 is esoteric enough that it probably never will be.
I really wonder what effects text like this will have on future chain-of-thoughts.
If fine-tuning on text calling out LLM COT deception reduces COT deception, that's one of those ambiguous events I wish were an unambiguous fire alarm, which I hate. The model could be trying to be more honest and correcting a bad behavior, or it could just be learning to hide its COT better.
I think you would be able to tell which one with the right interpretability tools. We really should freak out if it's the latter, but I suspect we won't.
I guess the actual fire alarm would be direct references in the COT to writings like this post and to ways of getting around them. It might actually spook a lot of people if the COT suddenly changes to ROT13 or Chinese or whatever at that point.
I've always viewed ALLFED as one of the most underfunded charities in existence, and highly encourage donating.
Any opinions on lanternfish as a food source? They make up a significant proportion of the world's biomass and are edible (but only barely) for humans. Is there any easy way to remove the oils that make them toxic to eat in large quantities?
The recent push for coal power in the US actually makes a lot of sense. A major trend in US power over the past few decades has been the replacement of coal power plants by cheaper gas-powered ones, fueled largely by low-cost natural gas from fracking. Much (most?) of the power for recently constructed US data centers has come from the continued operation of coal power plants that would otherwise have been decommissioned.
The sheer cost (in both money and time) of building new coal plants compared to gas plants means that new coal power plants are still very unlikely to be constructed. However, not shutting down an existing coal plant is effectively instant compared to the 12-36 months needed to build a new gas power plant.
Potential token analysis tool idea:
Use the tokenizers of common LLMs to tokenize a corpus of web text (OpenWebText, for example), then identify the contexts in which each token frequently appears, its correlation with other tokens, whether it is a glitch token, etc. It could act as a concise resource for explaining weird tokenizer-related behavior (e.g. why LLMs tend to be bad at arithmetic) to those less familiar with them, and for tracing how a token entered a tokenizer's vocabulary.
Would this be useful and/or duplicate work? I already did this with GPT2 when I used it to analyze glitch tokens, so I could probably code the backend in a few days.
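Roughly, the backend would look something like this (the GPT-2 tokenizer and the corpus file path are placeholder choices):

# Rough sketch of the proposed backend (an illustration, not existing code).
# Uses the GPT-2 tokenizer from Hugging Face transformers; the corpus file
# path and context-window size are placeholder choices.
from collections import Counter, defaultdict
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

token_counts = Counter()
example_contexts = defaultdict(list)

with open("openwebtext_sample.txt", encoding="utf-8") as f:  # placeholder corpus
    for line in f:
        ids = tokenizer.encode(line)
        token_counts.update(ids)
        for i, tok_id in enumerate(ids):
            if len(example_contexts[tok_id]) < 5:  # keep a few example contexts
                ctx = tokenizer.decode(ids[max(0, i - 10): i + 10])
                example_contexts[tok_id].append(ctx)

# Tokens that never (or almost never) appear in the corpus are glitch-token
# candidates: they sit in the vocabulary but get essentially no training signal.
never_seen = [
    tokenizer.convert_ids_to_tokens(i)
    for i in range(tokenizer.vocab_size)
    if token_counts[i] == 0
]
print(f"{len(never_seen)} vocabulary tokens never appear in this sample")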
The announcement of Stargate caused a significant increase in the stock price of GE Vernova, albeit with a delay. This is exactly what we would expect to see if the markets expect a significant buildout of US natural gas electrical capacity, which is needed for a large datacenter expansion. I once again regret not buying GE Vernova calls (the year is 2026. OpenAI announces AGI. GE Vernova is at 20,000. I once again regret not buying calls).
This goes against my initial take that Stargate was a desperate attempt by Altman to not get gutted by Musk - he offers a grandiose economic project to Trump to prove his value, mostly to buy time for the for-profit conversion of OpenAI to go through. The markets seem to think it's real-ish.
The quotation marks actually result in missing a few glitch tokens.
For example, _POSTSUPERSCRIPT is one token, but "_POSTSUPERSCRIPT" tokenizes into:
['"_', 'POST', 'SU', 'PERSCRIPT', '"']
Every time I've tried this, it insists that _POSTSUPERSCRIPT is a "^" symbol.
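The quoting effect is easy to check directly. The model name below is a placeholder; swap in whichever tokenizer you're actually probing:

# Quick check of how surrounding quotes change the tokenization.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model name

for text in ["_POSTSUPERSCRIPT", '"_POSTSUPERSCRIPT"']:
    print(repr(text), "->", tokenizer.tokenize(text))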
The majority of "non-standard" tokens actually return something normal if you run them through the decoder.
For example:
tokenizer.decode([d["âĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶâĢĶĊĊ"]])
returns
'————————————————————\n\n'
It might be more about how Python handles these characters than about multi-token characters.
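As far as I can tell, those raw vocabulary keys are just the byte-to-unicode mapping that GPT-2-style byte-level BPE stores strings in, and the decoder reverses it. For example, with the slow GPT2Tokenizer (which exposes the mapping directly):

# Reversing the GPT-2-style byte-to-unicode mapping by hand.
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

key = "âĢĶ" * 20 + "ĊĊ"                       # the raw vocab string from above
raw_bytes = bytes(tokenizer.byte_decoder[c] for c in key)
print(raw_bytes.decode("utf-8"))               # '————————————————————\n\n'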
R1 is also strangely willing to hallucinate novel strings, which I've never seen another LLM do in response to glitch tokens. Note that lcmriday is 3 tokens! It's also strangely willing to assert that it was created by OpenAI or outright is ChatGPT.