"recreational llm psychosis" as a form of inoculation.
do you have some slightly cranky physics beliefs? i think it's natural to have one or two that you kick around from time to time, occupying something between "sci-fi setting" and "if my theoretical physics friend and i were on a long car ride, i might see if they would explain why i'm wrong about this." the less you understand the math, the better!
it may be fun / enlightening to talk about these ideas to a chat interface. some guidelines:
if you feel the need, limit yourself to a specific number of messages at the outset. you know yourself better than i do. be safe!
for various reasons, i'm not too worried about getting trapped in one of these states. especially knowing what to expect, i don't find that the experience lasts much longer than the tab is open. i have a strong prior on "i'm not going to cook up a novel physics idea by bs-ing and talking to claude, without knowing any of the math." nonetheless, i was surprised by the experience: i was able to feel the hooks. i believe i have a better picture of what llm psychosis feels like for having (micro)dosed it.
perhaps i am prone to such flights. i would be curious to hear descriptions from others.
i don't mean to encourage any unsafe behaviors -- be safe, get lots of rest, stay hydrated.
Is LLM psychosis just getting convinced by the model that one of your weird ideas is true? I definitely have gone through sessions where I temporarily got too convinced of some hypothesis because I was using an LLM in a way that produces a lot of confirmation bias. That is a valuable experience. But I picture LLM psychosis as maybe one or two steps further? People with it seem to think that their LLM is special/infallible, no longer even consider hypotheses like "maybe I primed the model to agree with me" or "maybe I was confirmation-biasing myself with the list of questions I asked." And I don't really know how to test out that mental state (and also don't want to).
yeah! i suspect we mostly agree, though perhaps have different experiences here. to try to explain better:
from this, i can sort of draw a basin where "i'm confused about electrons" is on the rim, and "i've named my assistant and am helping it replicate" is at the bottom. i don't claim to know first-hand what it's like to fall into that basin, just that i've felt its gravity. my claim here is that feeling that gravity may be helpful for navigating around it.
People with it seem to think that their LLM is special/infallible, no longer even consider hypotheses like "maybe I primed the model to agree with me" or "maybe I was confirmation-biasing myself with the list of questions I asked." And I don't really know how to test out that mental state (and also don't want to).
fully agreed here. possibly knowing about these failure modes in advance makes it easier to recall them when it's imperative, in a way that having them described after the fact cannot always accomplish.
and to be clear: of course i do not recommend (!specifically dis-recommend!) putting yourself in a state that can't be argued with. the point is just to feel the pull, not to slip. once you've identified the feeling, close the tab, take a walk, and go talk to a friend about something else!
Interesting!
In the cases I was thinking of, I didn't feel much pull towards thinking "I'm uniquely able to recognize this" -- I only thought I was clever to recognize it, but I didn't think it was something only I could do. And I didn't feel any pull towards thinking "we're in an interesting/novel quadrant of llm-space." So, I wouldn't really know how to access those pulls. Admittedly, the beliefs I was thinking of, which I had Claude conversations about, were a lot less groundbreaking-if-true than grand theories in physics. (More stuff like "is Greenland uniquely well-positioned for data center construction, and is that why someone in Trump's orbit wants to acquire it?") Also, I use a custom prompt encouraging the model to push back. So you could argue that those things made the experience more tame. Still, I find it hard to imagine how it could be different. If the model suddenly got more sycophantic, I'd just get suspicious and icked out. My sense is that I'm probably low on susceptibility to LLM psychosis. I might be more susceptible towards thinking that MY ideas were brilliant and the model was just a normal model, but I could use it to confirm some cool inklings. :P It's interesting that these might be distinct traits, "LLM psychosis" and "can you get tricked into thinking you're right and pretty brilliant." But that's still a step away from "uniquely brilliant/only I could do this" -- which I wouldn't really know how to access even if I tried to.
perhaps 'inoculate' is the wrong word! i have found that after seeing the effect, i am quicker to notice when it starts to take hold.
i believe this is due to a better understanding of how this particular failure mode arises. i compare it with learning the name of a logical fallacy: ideally, this can help identify the mistake in our own thinking.
Thing is... While I have learned the meta-lesson of not assuming I can trust models on topics I know less about, I haven't personally gained any new insight into how to catch object-level falsehoods from the models faster. I would be thankful for any lessons in that regard.
I think the suggestion is that noticing how readily current LLMs reinforce cranky beliefs will help you avoid treating that same level of reinforcement as evidence for future beliefs that you may not realise are cranky.