Lao Mein

P(doom) = 50%. It either happens, or it doesn't.

Lao Mein | Statistics is Hard. | Patreon

I give full permission for anyone to post part or all of any of my comments/posts to other platforms, with attribution.

Currently doing solo work on glitch tokens and tokenizer analysis. Feel free to send me job/collaboration offers.

DM me interesting papers you would like to see analyzed. I also specialize in bioinformatics.

Posts (sorted by new)

4 · Lao Mein's Shortform · 3y · 105 comments

Comments (sorted by newest)
Lao Mein's Shortform
Lao Mein · 1mo · 30

I think it's unlikely that AIs are talking people into active psychosis who otherwise wouldn't have developed it on their own. AIs can talk people into really dumb positions and stroke their egos over "discovering" some nonsense "breakthrough" in mathematics or philosophy. I don't think anyone disputes this.

But it seems unlikely that AIs can talk someone into full-blown psychosis who wouldn't have developed something similar at a later time. Bizarre beliefs aren't the central manifestation of psychosis; they are simply the most visible symptom. A normal human who is talked into a bizarre belief would look something like Terrence Howard, who is functional but spends some of his spare time trying to prove that 1*1=2. He is still gainfully employed and socially integrated. He is not psychotic. I wouldn't be surprised if sycophantic LLMs can talk someone normal into acting like Terrence Howard, at least for a while. But that isn't psychosis.

My understanding of schizophrenia is that the first emotionally traumatizing or psychedelic event in adulthood causes some sort of mental shift that results in schizophrenia for the rest of the person's life. This could be a breakup, LSD, marijuana, or a highly sycophantic LLM. But even high doses of any of those wouldn't cause a lifelong tendency towards psychosis in a normal human. I doubt LLMs are different. Thus it may be more accurate to say that LLMs, through extreme sycophancy, can be the trigger for psychosis, rather than that "LLMs cause psychosis".

Lao Mein's Shortform
Lao Mein · 1mo · 20

Oh, yeah, sorry. 

tiktoken/tiktoken/model.py at main · openai/tiktoken · GitHub

Tiktoken is an optimized tokenizer library made for use with OpenAI models. 
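For context, a minimal usage sketch (my own example, assuming the tiktoken package is installed; model.py is the file that maps model names to encodings):

    import tiktoken

    # Load the o200k_base encoding (used by recent OpenAI models) and round-trip some text.
    enc = tiktoken.get_encoding("o200k_base")
    tokens = enc.encode("Statistics is hard.")
    print(tokens)              # a list of integer token IDs
    print(enc.decode(tokens))  # -> "Statistics is hard."

    # Encodings can also be looked up by model name.
    enc_4o = tiktoken.encoding_for_model("gpt-4o")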

Lao Mein's Shortform
Lao Mein · 1mo · 30

o200k_base has many inefficient tokens (entire sentences of Chinese porn spam). I would be shocked if OpenAI didn't use a new tokenizer for their next base model, especially since entirely new sources of text would be included (I think YouTube captions were mentioned at one point).
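A rough sketch of how one might surface those long, spammy vocabulary entries (my own illustration; the 40-byte cutoff is an arbitrary choice):

    import tiktoken

    enc = tiktoken.get_encoding("o200k_base")

    # Collect tokens whose raw byte sequences are unusually long.
    long_tokens = []
    for token_id in range(enc.n_vocab):
        try:
            raw = enc.decode_single_token_bytes(token_id)
        except KeyError:
            continue  # some IDs (gaps or special tokens) can't be decoded this way
        if len(raw) > 40:
            long_tokens.append((token_id, raw))

    # Show the longest entries in the vocabulary.
    for token_id, raw in sorted(long_tokens, key=lambda t: -len(t[1]))[:20]:
        print(token_id, raw.decode("utf-8", errors="replace"))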

Shannon's Surprising Discovery
Lao Mein · 1mo · 50

I would have expected early information theory, at least the concept of the parity bit, to have been invented alongside the telegraph, or even the heliograph.

It feels like information theory was an idea behind its time. What was blocking its discovery?
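For concreteness, the parity-bit idea is simple enough to state in a few lines; a minimal even-parity sketch (my own illustration, not from the comment):

    def add_parity_bit(bits: list[int]) -> list[int]:
        """Append an even-parity bit so the total number of 1s is even."""
        return bits + [sum(bits) % 2]

    def check_parity(bits_with_parity: list[int]) -> bool:
        """Return True if no single-bit error is detected."""
        return sum(bits_with_parity) % 2 == 0

    msg = add_parity_bit([1, 0, 1, 1])   # -> [1, 0, 1, 1, 1]
    assert check_parity(msg)
    msg[2] ^= 1                          # flip one bit "in transit"
    assert not check_parity(msg)         # the single-bit error is detected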

METR's Evaluation of GPT-5
Lao Mein · 1mo · 81

That wasn't true in the original time horizon paper. There's a massive drop-off after 1 hour, a conspicuous gap at 2 hours for all models, and the vast majority of successes after that point come from 3 problems.

[Figure from the time horizon paper: per-model success rates by human task length; not reproduced here.]

Note the gap at ~2 hours and the presence of the "L" at 4-8 hours in unrelated models! Could it just be misclassification of those 3 problems? Maybe they're the type of problem that is tough for humans but relatively easy for AIs? It's interesting to see the pattern smooth out in more recent tests. Of the 2+ hour problems with a significant success rate, how many had no human baseline successes?

It would be nice if the authors could give us more info about the 2+ hour problems that the AI solved. It's very interesting that the best performance in the original RE-Bench paper (Nov 2024) never came close to human 8-hour performance, but it appears that AIs had an almost 20% success rate on one of them by the time of the time horizon paper (Mar 2025).

METR's Evaluation of GPT-5
Lao Mein · 1mo · 120

Is there a reason why AI performance drops off so fast past the 1-hour mark? Is it something like: everything before that point is structured like a LeetCode problem, problems of that shape are very common in the training data and can be solved the way human programmers often do (copy-paste-kludging code from Google), while the >1hr problems mostly can't be solved that way?

Human performance decline is much smoother. Does the fact that LLM time horizons are now longer than humans' (I won't list all the caveats) mean anything?

Subliminal Learning: LLMs Transmit Behavioral Traits via Hidden Signals in Data
Lao Mein · 2mo · 60

It appears that Qwen 2.5 uses forced single-digit tokenization (there are only 2 multi-digit integer tokens in the entire tokenizer, and they're both double-width (full-width) integers: [77150, '10'], [80091, '20']). I assume that GPT-4.1 nano uses the o200k tokenizer, which includes all integers up to 999 as tokens. Is it possible that this played a major role in the lack of information transfer between different models? Have you tried using models with equivalent integer tokenizations?
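A quick way to check the difference (my own sketch; it assumes the tiktoken and transformers packages and uses "Qwen/Qwen2.5-7B" as a stand-in checkpoint name):

    import tiktoken
    from transformers import AutoTokenizer

    o200k = tiktoken.get_encoding("o200k_base")            # tokenizer family used by recent OpenAI models
    qwen = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

    for number in ["7", "42", "123", "999"]:
        print(
            number,
            "o200k tokens:", len(o200k.encode(number)),
            "| Qwen tokens:", len(qwen.encode(number, add_special_tokens=False)),
        )
    # If o200k encodes "123" as one token while Qwen splits it into three
    # single-digit tokens, the two models see numerically identical data
    # in structurally different ways.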

(DIY) FMT for Anti-Aging & Biohacking
Lao Mein · 2mo · 20

The Wikipedia page is a concise summary and has good sources.

Toilet plume - Wikipedia

Lao Mein's Shortform
Lao Mein · 2mo · 20

Surrogacy costs ~$100,000-200,000 in the US. Foster care costs ~$25,000 per year. This puts the implied cost of government-created-and-raised children at ~$600,000. My guess is that this goes down greatly with economies of scale. Could this be cheaper than birth subsidies, especially as preferred family size continues to decrease with no end in sight?
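The ~$600,000 figure appears to assume roughly the midpoint surrogacy cost plus 18 years of foster-care-level costs; a reconstruction of the arithmetic (the breakdown is my assumption):

    # Assumed breakdown: ~$150k surrogacy (midpoint of $100k-200k)
    # plus 18 years of care at ~$25k/year.
    surrogacy = 150_000
    care_per_year = 25_000
    years = 18
    print(surrogacy + care_per_year * years)  # -> 600000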

38 · Glitch Token Catalog - (Almost) a Full Clear · 1y · 3 comments
37 · A New Class of Glitch Tokens - BPE Subtoken Artifacts (BSA) · 1y · 8 comments
83 · Actually, Power Plants May Be an AI Training Bottleneck. · 1y · 13 comments
172 · Reconsider the anti-cavity bacteria if you are Asian · 1y · 43 comments
70 · Update on Chinese IQ-related gene panels · 2y · 7 comments
7 · Why No Automated Plagerism Detection For Past Papers? [Question] · 2y · 10 comments
5 · LLMs, Batches, and Emergent Episodic Memory · 2y · 4 comments
29 · I Think Eliezer Should Go on Glenn Beck · 2y · 24 comments
51 · InternLM - China's Best (Unverified) · 2y · 4 comments
103 · AI Safety in China: Part 2 · 2y · 28 comments