And since there's a "concrete" reaction, it seems like there should also be an "abstract" reaction, although I don't know what symbol should be used for it.
According to Stefan's experimental data, the Frobenius norm of a matrix A is equivalent to the expectation value of the L2 vector norm of Ax for a random vector x (sampled from a normal distribution, with each component normalized to mean 0 and variance 1). Strictly, the equality holds between the squares: the squared Frobenius norm equals the expected squared L2 norm. So calculating the Frobenius norm seems equivalent to testing the behaviour on random inputs. Maybe this is a theorem?
I found a proof of this theorem: https://math.stackexchange.com/questions/2530533/expected-value-of-square-of-euclidean-norm-of-a-gaussian-random-vector
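For anyone who wants to check this numerically, here is a minimal sketch in plain Python. It assumes the squared form of the claim, E[||Ax||^2] = ||A||_F^2 for x with independent standard normal components. The matrix size, the sample count, and the helper names (`frobenius_norm_sq`, `matvec`, `mean_sq_norm`) are my own choices for illustration, not anything taken from Stefan's experiments.

```python
import random

def frobenius_norm_sq(A):
    # Squared Frobenius norm: sum of squares of all entries.
    return sum(a * a for row in A for a in row)

def matvec(A, x):
    # Plain matrix-vector product for a list-of-rows matrix.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def mean_sq_norm(A, n_samples, rng):
    # Monte Carlo estimate of E[||Ax||^2] with x ~ N(0, I).
    n_cols = len(A[0])
    total = 0.0
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
        total += sum(v * v for v in matvec(A, x))
    return total / n_samples

rng = random.Random(0)  # fixed seed so the run is reproducible
A = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(3)]  # arbitrary 3x4 matrix

exact = frobenius_norm_sq(A)
estimate = mean_sq_norm(A, 20000, rng)
print("squared Frobenius norm:", exact)
print("Monte Carlo estimate:  ", estimate)
```

The two printed numbers should agree to within sampling error. Note that this only checks the squared quantities; the unsquared expectation E[||Ax||] is in general not equal to the Frobenius norm (by Jensen's inequality it is at most the Frobenius norm).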
I think this anthropomorphizes the origin of glitch tokens too much. The fact that glitch tokens exist at all is an artifact of the tokenization process OpenAI used: the tokenizer identifies certain strings as tokens prior to training, but those strings rarely or never appear in the training data. This is very different from the reinforcement-learning processes in human psychology that lead people to avoid thinking certain types of thoughts.
Relatedly, humans are very extensively optimized to predictively model their visual environment. But have you ever, even once in your life, thought anything remotely like "I really like being able to predict the near-future content of my visual field. I should just sit in a dark room to maximize my visual cortex's predictive accuracy."?
n=1, but I've actually thought this before.
Simulacrum level 4 is more honest than level 3. Someone who speaks at level 4 explicitly asks himself "what statement will win me social approval?" Someone who speaks at level 3 asks herself the same question, but hides from herself the fact that she asked it.
Do you use Manifold Markets? It already has UAP-related markets you can bet on, and you can create your own.