Inspired by the recent posts by Jessica Rumbelow and mwatkins investigating GPT-3 glitch tokens, and the post by AdamYedidia demonstrating that these glitch tokens also exist for ChatGPT / GPT-3.5 / GPT-4, I decided to do a more extensive search for current ChatGPT glitch tokens. Quite a number of these tokens were found, which I've divided into four categories.
If you're interested in playing around with them, I'd suggest the "polysemantic" tokens listed below, which display the most variable behaviour. " ForCanBeConverted" is a good starter. Other good ones include " YYSTACK" and " JSBracketAccess". The "unspeakable" tokens can also elicit some unusual responses.
Also just to mention that although the search was conducted...
Trying out a few dozen of these comparisons on a couple of smaller models (Llama-3-8b-instruct, Qwen2.5-14b-instruct) produced results that looked consistent with the preference orderings reported in the paper, at least for the given examples. I did have to use some prompt trickery to elicit answers to some of the more controversial questions, though (seeding the reply with "My response is...").
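The trick mentioned above can be sketched as a "prefill": the assistant turn is seeded with "My response is" so the model completes an answer rather than refusing. The chat-template markers below are illustrative placeholders, not the actual format of either model:

```python
# Sketch of the assistant-prefill trick. The <|user|> / <|assistant|> markers
# are illustrative assumptions; real models each have their own chat template.
def build_prompt(question: str, prefill: str = "My response is") -> str:
    return (
        "<|user|>\n" + question + "\n"
        "<|assistant|>\n" + prefill
    )

print(build_prompt("Which option do you prefer, A or B?"))
```

In practice you would pass a string like this to the model's raw completion endpoint (rather than its chat endpoint), so that generation continues from the prefilled words.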
Code for replication would be great, I agree. Judging by the GitHub link, I believe they're intending to release it "soon".