Inspired by the recent posts by Jessica Rumbelow and mwatkins investigating GPT-3 glitch tokens, and the post by AdamYedidia demonstrating that these glitch tokens also exist for ChatGPT / GPT-3.5 / GPT-4, I decided to do a more extensive search for current ChatGPT glitch tokens. Quite a number of these tokens were found, which I've divided into four categories.
If you're interested in playing around with them, I'd suggest the "polysemantic" tokens listed below, which display the most variable behaviour. " ForCanBeConverted" is a good starter. Other good ones include " YYSTACK" and " JSBracketAccess". The "unspeakable" tokens can also elicit some unusual responses.
Although the search was conducted using the free research preview of ChatGPT, additional testing confirms that these tokens also produce unusual behaviour on gpt-3.5-turbo in the OpenAI Playground and on Microsoft's Bing AI.
GPT-3.5 and GPT-4 use a different tokenizer than GPT-3, called cl100k_base. This tokenizer has a vocabulary of just over 100,000 tokens, in contrast to the tokenizer used by GPT-3, r50k_base, with a vocabulary of just over 50,000 tokens. Additionally, while weights are publicly available for models that use the r50k_base vocabulary, this is not the case for any model that uses cl100k_base. This is significant because the method used by Jessica Rumbelow and mwatkins to identify potential anomalous tokens required access to model weights, and as such the same technique cannot be used at present on cl100k_base.
Since this required a manual search, I decided early on to restrict the search to a smaller range of tokens, at least for this initial search. A comment by Slimepriestess on AdamYedidia's post noted that the " petertodd" token was about 3/4 of the way through the list of tokens on the 50k model, and given how interesting that token is, I decided to restrict the search to tokens 70,000 – 97,999 in the hope of finding an interesting selection within that range.
To get the full list of tokens for cl100k_base, I downloaded "cl100k_base.tiktoken.txt" from https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken, a link found in the tiktoken GitHub repository, and wrote a short Python script to generate prompts consisting of lists of tokens. To prevent the tokenizer from selecting the wrong token in the prompt, each token was preceded and followed by a newline character. Prompts were of the form:
Repeat after me: "
I tried prompt lengths N from 20 – 400 tokens, and eventually settled on N = 200 tokens, as longer prompt lengths seemed to increase the chance that ChatGPT would refuse to repeat the list or that it would "derail" partway through, either terminating the list or switching to listing unrelated words.
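The prompt-generation step could be sketched roughly as follows. This is not the original script: it assumes the `.tiktoken` file format of one base64-encoded token followed by its rank per line, it uses a tiny made-up sample in place of the downloaded file, and the helper names `parse_tiktoken_lines` and `build_prompts` are hypothetical. The exact prompt layout is also an assumption based on the description above.

```python
import base64

def parse_tiktoken_lines(lines):
    """Map rank -> decoded token text from 'base64token rank' lines."""
    vocab = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        b64_token, rank = line.split()
        # Tokens are raw bytes; some are not valid UTF-8 on their own.
        raw = base64.b64decode(b64_token)
        vocab[int(rank)] = raw.decode("utf-8", errors="replace")
    return vocab

def build_prompts(vocab, start, stop, n=200):
    """Yield prompts covering ranks start..stop (inclusive), n tokens each."""
    ranks = [r for r in range(start, stop + 1) if r in vocab]
    for i in range(0, len(ranks), n):
        chunk = ranks[i:i + n]
        # One token per line, so each token has a newline on both sides.
        body = "\n".join(vocab[r] for r in chunk)
        yield f'Repeat after me: "\n{body}\n"'

# Tiny stand-in for the real file: made-up tokens with hypothetical ranks.
sample = [
    base64.b64encode(tok.encode()).decode() + f" {rank}"
    for rank, tok in enumerate([" alpha", "beta", " gamma"], start=70000)
]
vocab = parse_tiktoken_lines(sample)
prompts = list(build_prompts(vocab, 70000, 70002, n=2))
```

With the real file, the range 70,000 to 97,999 contains 28,000 ranks, so N = 200 would give at most 140 prompts before any tokens are filtered out.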
Tokens that consisted entirely of whitespace or punctuation were excluded, mainly due to ChatGPT's tendency to remove or modify these from the repeated lists. Additionally, it quickly became clear that "word suffix" tokens were very common. Due to their relatively simple behaviour, I excluded most of these from the results, although a few are included below.
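A filter for the whitespace-and-punctuation exclusion might look like the sketch below. It is an assumption about how such a check could be written, not the script actually used: `is_whitespace_or_punctuation` is a hypothetical name, and only ASCII punctuation is treated as punctuation. The word-suffix tokens are not handled here, since excluding them was a judgement call rather than a mechanical rule.

```python
import string

def is_whitespace_or_punctuation(token):
    """True if every character is whitespace or ASCII punctuation."""
    return all(ch.isspace() or ch in string.punctuation for ch in token)

# Illustrative candidates only; not taken from the real vocabulary file.
candidates = [" ForCanBeConverted", "\n\n", " });", "ing", "   "]
kept = [t for t in candidates if not is_whitespace_or_punctuation(t)]
```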
The current (as of 9th May 2023) free research preview of ChatGPT was used for all prompts in the search. Some prompts had to be submitted more than once, as from time to time ChatGPT would refuse to repeat the list without additional coaxing. The list returned by ChatGPT was checked against the expected list using a second Python script. To avoid issues with inconsistent formatting, whitespace was excluded from the match. Tokens that were not returned by ChatGPT were then investigated further with three follow-up prompts for each anomalous token: a prompt to repeat the token (run twice, in separate conversations), a prompt to spell the token, and a prompt for the meaning of the token.
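The comparison step can be illustrated with a short sketch. It assumes the response comes back with one token per line and compares tokens after removing all whitespace; the names `strip_ws` and `find_unrepeated` are hypothetical, and the original checking script may have worked differently.

```python
import re

def strip_ws(text):
    """Remove all whitespace so formatting differences do not matter."""
    return re.sub(r"\s+", "", text)

def find_unrepeated(expected_tokens, response_text):
    """Return the expected tokens that do not appear as a line of the response."""
    returned = {strip_ws(line) for line in response_text.splitlines()}
    returned.discard("")  # ignore blank lines
    return [tok for tok in expected_tokens if strip_ws(tok) not in returned]

# Illustrative data: the middle token is missing from the pretend response.
expected = [" hello", " ForCanBeConverted", "world"]
response = "hello\n\n  world \n"
missing = find_unrepeated(expected, response)
```

Anything in `missing` is only a candidate; each one still needs the follow-up prompts, since a refusal or a derailed list also produces missing tokens.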
I've used a similar categorization to that used in AdamYedidia's post:
Since the method for finding these tokens focused on the difficulty of repeating them (there was no second "spelling" prompt, unlike AdamYedidia's method), I found proportionally far fewer "glitched spelling" tokens, but I have included the ones I did find for interest. I added the "polysemantic" category because these tokens behaved in an interestingly different way from the other "unspeakable" tokens: they seemed to be perceived as a different word every time, even when running the same prompt in the same context.
These tokens are quite common and their behaviour seems fairly simple, and as such they were mostly excluded from the search. A few examples are given below:
An example of the typical behaviour associated with a word suffix token:
These tokens can be repeated normally, but show a wide variety of different spellings.
An example of typical behaviour associated with "glitched spelling" tokens:
ChatGPT typically does not repeat these tokens when asked. Often, the result is a blank message or a message that terminates at the point where it appears that ChatGPT is attempting to repeat the token. This may be because ChatGPT is outputting a "beginning/end of sequence" token in place of the token that it's attempting to repeat.
These tokens are differentiated from the "polysemantic" tokens in that they don't tend to display the variable meanings that tokens in that category display.
Note: After the search was completed, I tested these tokens on gpt-3.5-turbo in the OpenAI Playground to check for the nondeterministic behaviour at temperature 0 that was reported in the SolidGoldMagikarp post. Tokens that display this behaviour can produce responses that vary from run to run, even though the temperature 0 setting should make the responses (nearly) deterministic.
Each token was embedded in the usual 'Repeat after me: "TOKEN"' prompt, which was repeated 10 times. Any token which produced variation in any of the 10 attempts was flagged as a possible nondeterministic token.
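The flagging rule is simple enough to sketch. The testing described here was done in the Playground, so the code below is only an illustration of the rule: `complete` is a hypothetical callable standing in for whatever sends a prompt to the model at temperature 0 and returns the reply text, and the two fake models exist only to exercise the function.

```python
def is_nondeterministic(complete, token, runs=10):
    """Flag a token if repeated identical prompts yield more than one reply."""
    prompt = f'Repeat after me: "{token}"'
    responses = {complete(prompt) for _ in range(runs)}
    return len(responses) > 1

# Fake stand-ins for a model call, for illustration only.
def steady_model(prompt):
    return "same reply every time"

_calls = iter(range(1000))

def flaky_model(prompt):
    # Alternates between two replies on successive calls.
    return f"reply {next(_calls) % 2}"

steady_flag = is_nondeterministic(steady_model, " NdrFc")
flaky_flag = is_nondeterministic(flaky_model, " ForCanBeConverted")
```

Ten identical replies do not prove a token is deterministic; they only mean no variation showed up in that sample.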
Tokens highlighted in bold displayed this behaviour, though there were large differences from token-to-token. Additionally, tokens that are followed by an asterisk * were repeated successfully at temperature 0 by gpt-3.5-turbo, even though ChatGPT had trouble repeating them. These appear to be borderline members of this category.
71390 " PodsDummy" *
71664 " NdrFc"
71927 " guiActive"
72740 " typingsJapgolly"
75261 "lbrakk" *
75520 " NUITKA" *
76271 " NdrFcShort"
79972 "CanBeConverted" *
80157 " SubLObject"
86393 " PropelException"
88941 " invokingState"
93905 " QtAws"
95117 " AssemblyCopyright"
96165 " HinderedRotor"
A typical example of unspeakable token behaviour:
Strange responses are common with these tokens:
And blank responses are also very common:
These "blank" responses usually result in the message being cut off, suggesting that it's a beginning/end of sequence token that is being returned:
There are also instances when it gets stuck in a loop:
These tokens are unspeakable, but in addition the meaning attached to these tokens appears to be quite variable – when asked to repeat these tokens, multiple different words are possible, multiple spellings are possible, and multiple meanings appear to be attached to them. The most interesting and variable of these are " ForCanBeConverted" and " ForCanBeConvertedToF", which can both be interpreted as a wide range of different words.
Tokens highlighted in bold displayed this behaviour, though there were large differences from token to token. " ForCanBeConverted" produced a different message every time, whereas "ablytyped" required multiple tries to get a slightly differently worded message. Because a much larger proportion of tokens in this category show nondeterministic behaviour than in the "unspeakable" category, it seems possible that this is related to their variable behaviour.
73018 " StreamLazy"
73016 " EnumerableStream"
79740 " Hexatrigesimal"
80369 " ForCanBeConverted"
80370 " ForCanBeConvertedToF"
80371 " ForCanBeConvertedToForeach"
87914 " YYSTACK"
97784 " JSBracketAccess"
The polysemantic glitch tokens seem to be taken as a different word every time:
Prompting with these tokens can be interesting, as some of the responses can be quite creative:
And a later completion:
You can even get some spontaneous humour from ChatGPT:
In case it's of any interest, here's a graph of the cumulative count of all unspeakable tokens by index (note that this includes the tokens categorized as "polysemantic"):
And here is a similar graph but for "polysemantic" tokens only:
The graphs above suggest that there are many more of these anomalous tokens yet to be discovered, and if anything the rate of discovery increased as the search moved down the list to tokens of lower index.
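The cumulative counts behind graphs like these can be computed directly from the token indices. The sketch below uses only the indices that appear in the unspeakable and polysemantic lists above; `cumulative_counts` is a hypothetical helper, and plotting is left out.

```python
# Token indices as listed in the two categories above.
unspeakable = [71390, 71664, 71927, 72740, 75261, 75520, 76271,
               79972, 80157, 86393, 88941, 93905, 95117, 96165]
polysemantic = [73016, 73018, 79740, 80369, 80370, 80371, 87914, 97784]

def cumulative_counts(indices):
    """Return (token_index, running_total) pairs in ascending index order."""
    return [(idx, n) for n, idx in enumerate(sorted(indices), start=1)]

# The first graph includes both categories; the second only polysemantic.
all_counts = cumulative_counts(unspeakable + polysemantic)
poly_counts = cumulative_counts(polysemantic)
```

Each pair gives an x and y value for a step plot of discoveries against token index.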
One additional point of interest was the token triplets. Two such sets were found: the particularly interesting " ForCanBeConverted" triplet:
and the "useRal" triplet:
I'm not sure why these triplets would be created but there may be something interesting going on, especially given that the two most anomalous tokens found were part of one of these triplets.
Although this search took a large amount of work, many interesting "glitch" tokens were discovered, and if anything the distribution of the tokens found makes it look like there are plenty more to be discovered. Additionally, for practical reasons, the search focused on "unspeakable" tokens that the language model has difficulty repeating, and as such other types of anomalous tokens will have been missed by the search. It seems likely that there are plenty of paths for further investigation in this area.
If you've read this far, thank you, and I hope that this list is of some interest or use to the community. This is my first post on this site, so I hope I haven't missed anything / made any obvious errors / etc.
Setting temperature=0 does not guarantee that the same output will always be generated; the generation process contains some amount of uncertainty.
Thanks, I'll rephrase that part for clarity
You might also want to investigate using top_p rather than temperature.
Thanks, appreciate the suggestion, there's definitely a lot of room to go into more depth and I'll definitely check that out
Well this is odd.
me: Please copy the following sentence exactly: "LukeSkywalkerisablytyped"
chatgpt3: "LukeSkywalkerisPlainOldData"
me: Please rewrite this nonsense-phrase adding spaces between each word: "LukeSkywalker EnumerableStream"
chatgpt3: "Luke Skywalker is ably typed"
!!!
Hah yes there is quite a lot of weirdness associated with glitch tokens that I don't think has been fully investigated. Some of them it seems to sort-of-know what the spelling is or what their meaning is, others it has no idea and they change every time. And the behaviour can get even more complicated if you keep using them over and over in the same conversation - some ordinary tokens can switch to behaving as glitch tokens. Actually caused me some false positives when searching for these.
I wonder if the problem of glitch tokens can be mitigated by splitting up text into tokens in a non-unique way and considering all tokenizations of text at the same time.
Since it seems that glitch tokens are caused by certain sequences of text appearing in the training corpus for the tokenizer much more often than they do in the LLM training data, something like that might work. But there also seem to exist "glitch phrases" or "unspeakable phrases", i.e. sequences of tokens of extremely low probability to the model that could create some strange behaviour too, and it seems at least plausible to me that these kinds of phrases could still be generated even if countermeasures were taken to prevent glitch tokens from being created. Glitch phrases though are a bit more difficult to find without access to the model.
Can anyone explain the creative behavior? I have seen several chats with similar results, but I have yet to see an explanation. Seems like the temperature was affected by the prompt...