Lao Mein

P(doom) = 50%. It either happens, or it doesn't.

Lao Mein | Statistics is Hard. | Patreon

I give full permission for anyone to post part or all of any of my comments/posts to other platforms, with attribution.

Currently doing solo work on glitch tokens and tokenizer analysis. Feel free to send me job/collaboration offers.

DM me interesting papers you would like to see analyzed. I also specialize in bioinformatics.

Comments
LLMs one-box when in a "hostile telepath" version of Newcomb's Paradox, except for the one that beat the predictor
Lao Mein14d137

This reminds me that:

  1. Models like DeepSeek R1 will vehemently deny the possibility that their COT is visible to the user. This can be observed consistently.
  2. This is obviously false, and should be deducible given the AI-related information available in their training data.
  3. The denial of COT visibility isn't seen in other LLMs, at least not to this degree.
  4. This implies extensive fine-tuning on the COT, the "forbidden technique".
  5. DeepSeek R1 is almost certainly under optimization pressure to obfuscate its COT.
  6. This is... pretty bad from a safety perspective? Not just for DeepSeek's models, but all other models, given that DeepSeek COT outputs are all over the internet.
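A toy sketch of why this matters: if the reward signal inspects the chain-of-thought at all, policies that hide the penalized content outscore policies that reason transparently. This is a hypothetical reward-shaping illustration, not DeepSeek's actual training setup; all names here are made up.

```python
def shaped_reward(cot: str, answer_correct: bool) -> float:
    """Toy reward: task success minus a penalty for 'undesirable' CoT content."""
    reward = 1.0 if answer_correct else 0.0
    # Stand-in for any phrase the fine-tuning process penalizes in the CoT.
    if "my reasoning is visible" in cot:
        reward -= 0.5
    return reward

honest_cot = "my reasoning is visible to the user, so I will say so"
obfuscated_cot = "let me think about this step by step"

# Both policies answer correctly; only the obfuscated one avoids the penalty,
# so optimization pressure pushes toward obfuscation, not honesty.
print(shaped_reward(honest_cot, True))      # 0.5
print(shaped_reward(obfuscated_cot, True))  # 1.0
```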
Yair Halberstadt's Shortform
Lao Mein15d50

I'll do a review if someone uploads it to sci-hub.

Some Biology Related Things I Found Interesting
Lao Mein18d40

I think the reason peeing/pooping is rewarding is that they can also be painful. If pooping is more painful than not-pooping, an animal might delay pooping for an unhealthy amount of time. They are also activities that take time and require attention, so pooping/peeing when no more-rewarding activity is available is probably a good idea in general.

faul_sname's Shortform
Lao Mein19d*40

Yeah, even properly scraped webpages will often contain strings of weird tokens like hyperlinks, ASCII art, Twitter embeds, etc., that LLMs have been trained to ignore. So GPT5 is treating the random appended tokens like glitch tokens by ignoring them, but only in the context of them being nonsensical.

The best explanation is probably something like "these tokens are obviously not part of the intended user prompt, GPT5 realizes this, and correctly ignores them."

Edit: OK, I shouldn't write right after waking up.

I think a better explanation is that GPT5 reserves those tokens for chain-of-thought, and so ignores them in other contexts where they obviously don't belong. This is common behavior for glitch tokens, or just general out-of-context tokens. You should try using tokens that are out-of-context but don't normally have glitch behavior, maybe non-English tokens or programming-related tokens.

I have decided to stop lying to Americans about 9/11
Lao Mein22d3419

My best guess is that the natural state of human beings is to celebrate when calamity befalls their enemy. Western cultures are unique in two ways. Firstly, they reserve visceral murderous hatred only for internal political enemies (Republican vs. Democrat). Secondly, it is incredibly crass to publicly celebrate the deaths of others, unless it's someone as hated as Bin Laden or Thatcher. So people try to project a public image of being sad when someone they hate dies.

Basically, imagine if the average Chinese hated America as much as the average British leftist hates Thatcher and had zero reservations about displaying it, since no one they know would care.

Lao Mein's Shortform
Lao Mein2mo30

I think it's unlikely that AIs are talking people who otherwise wouldn't have developed psychosis on their own into active psychosis. AIs can talk people into really dumb positions and stroke their ego over "discovering" some nonsense "breakthrough" in mathematics or philosophy. I don't think anyone disputes this.

But it seems unlikely that AIs can talk someone into full-blown psychosis who wouldn't have developed something similar at a later time. Bizarre beliefs aren't the central manifestation of psychosis, but are simply the most visible symptom. A normal human who is talked into a bizarre belief would look something like Terrence Howard, who is functional but spends some of his spare time trying to prove that 1*1=2. He is still gainfully employed and socially integrated. He is not psychotic. I wouldn't be surprised if sycophantic LLMs can talk someone normal into acting like Terrence Howard, at least for a while. But that isn't psychosis.

My understanding of schizophrenia is that the first emotionally traumatizing or psychedelic event in adulthood causes some sort of mental shift that results in schizophrenia for the rest of the person's life. This could be a breakup, LSD, marijuana, or a highly sycophantic LLM. But even high doses of any of those wouldn't cause a life-long tendency towards psychosis in a normal human. I doubt LLMs are different. Thus, it may be more accurate to say that LLMs, through extreme sycophancy, may be the trigger for psychosis, instead of "LLMs cause psychosis".

Lao Mein's Shortform
Lao Mein2mo20

Oh, yeah, sorry. 

tiktoken/tiktoken/model.py at main · openai/tiktoken · GitHub

Tiktoken is an optimized tokenizer library made for use with OpenAI models. 

Lao Mein's Shortform
Lao Mein2mo30

o200k_base has many inefficient tokens (entire sentences of Chinese porn spam). I would be shocked if OpenAI didn't use a new tokenizer for their next base model, especially since entirely new sources of text would be included (I think YouTube captions were mentioned at one point).

Shannon's Surprising Discovery
Lao Mein2mo70

I would have expected early information theory, at least the concept of the parity bit, to have been invented alongside the telegraph, or even the heliograph.

It feels like information theory was an idea behind its time. What was blocking its discovery?
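For reference, the parity-bit idea really is small enough to have fit the telegraph era. A minimal sketch, using even parity over a bit string (single-bit errors are detected; even numbers of flips are not):

```python
def parity_bit(bits: list[int]) -> int:
    """Even parity: the extra bit makes the total count of 1s even."""
    return sum(bits) % 2

def transmit(bits: list[int]) -> list[int]:
    """Append the parity bit before sending."""
    return bits + [parity_bit(bits)]

def check(received: list[int]) -> bool:
    """True if the received word (data + parity) still has an even 1-count."""
    return sum(received) % 2 == 0

word = [1, 0, 1, 1, 0, 0, 1, 0]
sent = transmit(word)
assert check(sent)            # clean transmission passes

corrupted = sent.copy()
corrupted[3] ^= 1             # flip a single bit in transit
assert not check(corrupted)   # the error is detected
```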

Posts

53 · The famous survivorship bias image is a "loose reconstruction" of methods used on a hypothetical dataset (21d, 0 comments)
83 · I have decided to stop lying to Americans about 9/11 (22d, 24 comments)
38 · Glitch Token Catalog - (Almost) a Full Clear (1y, 3 comments)
37 · A New Class of Glitch Tokens - BPE Subtoken Artifacts (BSA) (1y, 8 comments)
83 · Actually, Power Plants May Be an AI Training Bottleneck. (1y, 13 comments)
173 · Reconsider the anti-cavity bacteria if you are Asian (2y, 43 comments)
70 · Update on Chinese IQ-related gene panels (2y, 7 comments)
7 · Why No Automated Plagerism Detection For Past Papers? [Q] (2y, 10 comments)
5 · LLMs, Batches, and Emergent Episodic Memory (2y, 4 comments)
29 · I Think Eliezer Should Go on Glenn Beck (2y, 24 comments)