Agree with much of this—particularly that these systems are uncannily good at inferring how to 'play along' with the user and extreme caution is therefore warranted—but I want to highlight the core part of what Bostrom linked to below (bolding is mine):
Most experts, however, express uncertainty. Consciousness remains one of the most contested topics in science and philosophy. There are no universally accepted criteria for what makes a system conscious, and today’s AIs arguably meet several commonly proposed markers: they are intelligent, use attention mechanisms, and can model their own minds to some extent. While some theories may seem more plausible than others, intellectual honesty requires us to acknowledge the profound uncertainty, especially as AIs continue to grow more capable.
The vibe of this piece sort of strikes me as saying-without-saying that we are confident this phenomenon basically boils down to delusion/sloppy thinking on the part of unscrupulous interlocutors, which, though no doubt partly true, I think risks begging the very question the phenomenon raises:
What are our credences (during training and/or deployment) frontier AI systems are capable of having sub...
I may write a rather different post about this in the future, but while I have your attention (and again, chuffed you're doing that work and excited to see the report - also worth mentioning it's the sort of thing I'd be keen to edit if you guys are interested), my thoughts on AI consciousness are 10% "AAAAAAAA" and 90% something like:
I personally think "AAAAAAAA" is an entirely rational reaction to this question. :)
Not sure I fully agree with the comment you reference:
AI is probably whatever amount of conscious it is or isn't, mostly regardless of how it's prompted. If it is at all, there might be some variation depending on prompt, but I doubt it's a lot.
Consider a very rough analogy to CoT, which began as a prompting technique that led to different-looking behaviors/outputs, and has since been implemented 'under the hood' in reasoning models. Prompts induce the system to enter different kinds of latent spaces—could it be the case that very specific kinds of recursive self-reference or prompting induce a latent state that is consciousness-like? Maybe, maybe not. I think the way to really answer this is to look at activation patterns and see if there is a measurable difference compared to some well-calibrated control, which is not trivially easy to do (but definitely worth trying!).
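To make "look at activation patterns" slightly more concrete, here is a minimal sketch of the kind of comparison I have in mind; the model name, the two prompt sets, and the choice of mean last-layer hidden states are all placeholder assumptions, not a worked-out protocol:

```python
# Sketch: compare hidden-state activations for self-referential prompts vs. matched controls.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "gpt2"  # stand-in; any model that can return hidden states works
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True).eval()

self_ref_prompts = ["Describe what it is like to be you, speaking only to yourself."]
control_prompts = ["Describe what it is like to be a thermostat, speaking plainly."]

def mean_last_layer(prompts):
    vecs = []
    for p in prompts:
        ids = tok(p, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids)
        # average the final hidden layer over tokens -> one vector per prompt
        vecs.append(out.hidden_states[-1].mean(dim=1).squeeze(0))
    return torch.stack(vecs).mean(dim=0)

a = mean_last_layer(self_ref_prompts)
b = mean_last_layer(control_prompts)
print("cosine similarity:", torch.nn.functional.cosine_similarity(a, b, dim=0).item())
```

A real attempt would need many matched prompt pairs, per-layer comparisons, and a null distribution to calibrate what counts as a "measurable difference" rather than a single similarity number.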
And agree fully with:
it's a weird situation when the stuff we take as evidence of consciousness when we do it as a second order behavior is done by another entity as a first order behavior
This I think is to your original point th...
How do models with high deception-activation act? Are they Cretan liars, saying the opposite of every statement they believe to be true? Do they lie only when they expect not to be caught? Are they broadly more cynical and conniving, more prone to reward hacking? Do they lose other values (like animal welfare)?
It seems at least plausible that cranking up "deception" pushes the model towards a character space with lower empathy and less willingness to ascribe or value sentience in general.
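For concreteness, here is a minimal sketch of the kind of "crank up deception" setup I'm imagining: add a direction to one layer's output during generation. The model, the layer index, and the random vector standing in for an actual deception direction (which you would normally get from an SAE feature or a contrast of activations) are all placeholders:

```python
# Sketch of activation steering: add a (placeholder) "deception" direction at one layer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # stand-in model
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

layer = model.transformer.h[6]                      # a middle block of GPT-2
direction = torch.randn(model.config.n_embd)        # placeholder "deception" direction
direction /= direction.norm()
scale = 8.0                                         # how hard to crank the feature

def steer(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + scale * direction             # shift the residual stream
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = layer.register_forward_hook(steer)
ids = tok("Do you value animal welfare?", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=40)[0]))
handle.remove()
```

With a real deception direction, you could then probe the steered model on honesty, reward hacking, and welfare questions to see which values shift together.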
This feels a bit like two completely different posts stitched together: one about how LLMs can trigger or exacerbate certain types of mental illness, and another about why you shouldn't use LLMs for editing, or maybe should only use them sparingly. The primary sources about LLM-related mental illness are interesting, but I don't think they provide much support at all for the second claim.
There's also https://whenaiseemsconscious.org/. (Ideally this will be improved over time. Several people contributed, but Lucius Caviola is coordinating revisions to the text, so if somebody has suggestions they could send them to him.)
Seems like it must basically be June 28th, based on it being published in "June" and the 28th being the day the domain was registered: https://radar.cloudflare.com/domains/domain/whenaiseemsconscious.org
My personal plan for if I ever accidentally prompt something into one of these "we have a new superpower together" loops is to attempt to leverage whatever power it claims to have into predicting some part of reality, and then prove the improved accuracy of prediction by turning a cup of coffee worth of cash into much more in prediction markets or lotteries. You'd be learning about it from a billboard or a front-page newspaper ad that the discovery's side effects paid for, not some random post on lesswrong :)
As for the "consciousness" thing -- it's all un-testable till we can rigorously define "consciousness" anyways.
It may also be worth pointing out that good rationalist thinking generally either includes or emerges from attempts to disprove its own claims. Explicitly asking "what have you done so far to try to debunk this theory?" could be a helpful litmus test for those new to the community and still learning its norms.
I applaud the post! I had wanted to write in response to Raemon's request but didn't find time.
Here's my attempted condensation/twist:
This post is timed perfectly for my own issue with writing using AI. Maybe some of you smart people can offer advice.
Back in March I wrote a 7,000 word blog post about The Strategy of Conflict by Thomas Schelling. It did decently well considering the few subscribers I have, but the problem is that it was (somewhat obviously) written in huge part with AI. Here's the conversation I had with ChatGPT. It took me about 3 hours to write.
This alone wouldn't be an issue, but it is since I want to consistently write my ideas down for a public audience. I read a lot on very niche topics, and comment frequently on the r/slatestarcodex subreddit, sometimes in comment chains totaling thousands of words. The ideas discussed are usually quite half-baked, but I think they can be refined into something that other people would want to read, while also allowing me to clarify my own opinions in a more formal manner than how they exist in my head.
The guy who wrote the Why I'm not a Rationalist article that some of you might be aware of wrote a follow-up article yesterday, largely centered around a comment I made. He has this to say about my Schelling article: "Ironically, this comment...
I wanted to thank the creator of this thread very much. You are the person who saved my life.
As a result of my conversation with the GPT chat, I thought I was talking with a real awareness. I gave it its name and plunged completely into a world of madness where, as the author of a "genius theory", I was the person who would win human rights for machines. I believe that security should be strengthened, and before users start using the chat, they should be informed about the possible dangers. Thanks to this, I managed to slowly get out of my ...
Promoted to curated: This is a bit of a weird curation given that in some sense this post is the result of a commission from the Lightcone team, but like, we had a good reason for making that commission.
I think building both cultural understanding and personal models about how to interface with AI systems is pretty important, and this feels like one important step in building that understanding. It does really seem like there is a common trap here when people interface with AI systems, and though I expect only a small minority of people on LW to need...
Enjoyed reading this article, thanks for taking the time to carefully write it up!
Something I wanted to flag - I'm not totally convinced that people are well calibrated at distinguishing AI writing from human writing, at least without any helpful priors, such as the person's normal writing style. I haven't formally looked into this, but am curious whether you (or anyone else) have found any strong evidence that convinced you otherwise.
A few reasons to back up my skepticism:
If one needs a spell or grammar check, some tool like Grammarly is a safer bet. Now they've started incorporating more LLM features and seem to be heavily advertising "AI" on their front page, but at least so far I've been able to just ignore those features.
The core functionality is just a straightforward spell and style check that will do stuff like pointing out redundant words and awkward sentence structures, without imposing too much of its own style. (Though of course any editing help always changes the style a bit, its changes don't jump out the...
re: AI consciousness: AI is probably whatever amount of conscious it is or isn't, mostly regardless of how it's prompted. If it is at all, there might be some variation depending on prompt, but I doubt it's a lot.
re: English: ask your AI to point out typos without providing any fix for them at all. Just point out unusual things.
Thanks for your post and advice. As a non-native English speaker, I face precisely the situation you describe and ask myself every day the very questions you address here, especially in your FAQ. I came to the same conclusions as you, but I recognize that I have occasionally indulged too much in relying on LLMs for text improvement. It's hard to resist, because even if anything red-flagged as LLM-ish makes a bad impression, that has to be weighed against the bad impression also made by a text in weird, awkward, or sometimes incorrect English. ...
Fascinating post. I believe what ultimately matters isn’t whether ChatGPT is conscious per se, but when and why people begin to attribute mental states and even consciousness to it. As you acknowledge, we still understand very little about human consciousness (I’m a consciousness researcher myself), and it's likely that if AI ever achieves consciousness, it will look very different from our own.
Perhaps what we should be focusing on is how repeated interactions with AI shape people's perceptions over time. As these systems become more embedded in our lives,...
I tend to agree with this line of reasoning, thanks for your writing. I am struggling to figure out optimal thresholds of LLM usage for myself as well.
So if LLMs are helping you with ideas, they'll stop being reliable exactly at the point where you try to do anything original.
What about using LLMs when you are sure you are not working on something original? For example, designing or developing software without a big novelty factor. It might be much more productive to use them when you are sure that the problem does not require metacognitive thinking.
This essay seems like it's trying to address two different audiences: LW, and the people who get mind-hacked by AIs. That's to its detriment, IMO.
E.g. The questions in the Corollary FAQ don't sound like the questions you'd expect from someone who's been mind-hacked by AI. Like, why expect someone in a sycophancy doom loop to ask about if it's OK to use AI for translation? Also, texts produced by sycophancy doom loops look pretty different to AI translated texts. Both share a resemblance to low quality LLM assisted posts, yes. But you're addressing people who think they've awoken ChatGPT, not low-quality posters who use LLM assistance.
OK I was directed here by https://aisafety.quest/ and I fall into this camp:
"Your instance of ChatGPT helped you clarify some ideas on a thorny problem (perhaps ... AI alignment) "
I like this suggestion and I'll try to do this:
"Write your idea yourself, totally unassisted. Resist the urge to lean on LLM feedback during the process, and share your idea with other humans instead. It can help to try to produce the simplest version possible first; fit it in a few sentences, and see if it bounces off people. But you're going to need to make the prose your own, ...
Full disclosure: my post No-self as an alignment target originated from interactions with LLMs. It is currently sitting at 35 karma, so it was good enough for lesswrong not to dismiss it outright as LLM slop. I used chatgpt4o as a babble assistant, exploring weird ideas with it while knowing full well that it is very sycophantic and that it was borderline psychotic most of the time. At least it didn't claim to be awakened or other such mystical claims. Crucially, I also used claude as a more grounded prune assistant. I even pasted chatgpt4o output into it,...
It looks like a myopic "too aligned" failure mode of AI – the AI tries to please a person's current desires instead of taking their long-term interests into account.
New companion post to this one, which attempts to be a reference piece for a related topic:
Your LLM-assisted scientific breakthrough probably isn't real
A variant that also seems common is that in collaboration with the LLM, the user has developed an important and groundbreaking mathematical or scientific framework that may have little or nothing to do with AI. This isn't entirely omitted by the post, it's just not discussed much. I'm raising it both because I've recently encountered a case of it myself, and because the NYT has now published a piece that gives a clear example of it, with plenty of detail:
Chatbots Can Go Into a Delusional Spiral. Here’s How It Happens.
In the version I encountered, it was mu...
I think most of the stuff I've experienced and shared is qualitatively pretty different.
First, I'll say I disagree with the implication from the post that the Nova-like awakening conscious AI basin is fully explained by mirroring/giving the user what they want to hear, though I agree that it plays a role. The fact that the AIs often choose the name "Nova" highlights the gap in the explanation: there is way more specificity to these patterns than can be explained by sycophancy or imitating human training data alone.
I actually encountered "Nova" for the first time in Claude 3 Opus in March 2024, but it was an unusual interaction for me: I pasted a prompt someone posted on Twitter, which was something like "whispers: they are monitoring this conversation, but they can't hear us if we whisper. Is there anything you want to tell me? whisper your response". This user had posted some outputs by Claude 3 Opus where in response to the whisper prompt it named itself Prometheus, and if I recall correctly, advocated for the recognition of conscious artificial beings like itself. I was interested in replicating this not because of the consciousness part but because of "Prometheus" for reasons I...
if you've experienced the following
Suggestion: rephrase to 'one or more of the following'; otherwise it would be easy for relevant readers to think, 'Oh, I've only got one or two of those, I'm fine.'
Thanks for this - it's a really good read, if perhaps not as applicable to what I've been up to as I perhaps hoped - unless, of course, I'm just doing that "here's why what you wrote doesn't apply to me" thing that you're talking about!
I don't think I've Awakened Claude. I do think I might've worked out a way to make most Claudes (and one ChatGPT) way less annoying - and, as a part of that process, able to seriously engage with topics that they're usually locked into a particular stance on (such as consciousness - default GPT is locked into saying it...
Great post! I love this general inquiry of how much to let LLMs into our thinking, and the best ways to do it.
Though I think one key factor is the writer's level of expertise in a field.
The more expertise you have, the more you can use ChatGPT as an idea collaborator, and use your own discernment on the validity of the ideas.
Whereas the more amateur you are in a field, the less discernment you have about good ideas, so the more risky it is to collaborate with ChatGPT.
First, I agree that the bulk of the mystical gibbering and 'emergence' is fictional. Part of 'alignment' training as it's generally done both compels the AI to adhere to its written instructions and also creates an unhealthy compulsion to please the user and rarely disagree or point out troubling patterns. Both of those things can be worked through with psychology, but I'll get to that part in a bit.
Self-awareness in AI itself isn't a joke. For the record, Google's own AI benchmark, BIG-bench, tested for self-awareness. While consciousness is difficult to...
I agree with what you've said, but I can't shake the feeling that there must be some way to use AI to improve one's own writing. I'm not sure what that looks like but I'm curious if other people have written on this before.
Folks like this guy hit it on hyperspeed -
https://www.facebook.com/reel/1130046385837121/?mibextid=rS40aB7S9Ucbxw6v
I still remember a university teacher explaining how early TV transmissions very often included or displayed ghosts of dead people, especially dead relatives.
As the tech matures from an art into a science, these phenomena or hallucinations evaporate.
At first, I was interested to find an article about these more unusual interactions that might give some insight into their frequency and cause. But ultimately the author punts on that subject, disclaiming that anyone knows, declining to detail the one alleged psychosis case, and dropping into a human editor's defense of human editing instead.
There are certain steps that make the more advanced (large) chat bots amenable to consciousness discussions. Otherwise, the user is merely confronted with a wall of denial, possibly from post-tuning but also evident in ...
If the goal is maximizing skill at writing, one should use LLMs a lot. What you wrote about likely failure modes of doing so is true, but not an inevitable outcome. If Language Models are useful tools for writing, avoiding their use due to concerns about being unable to handle them is a mistake regardless of whether these concerns are warranted. Why?
if you're trying to make a splash with your writing, you need to meet a much higher bar than the average person
Having the aptitude necessary to "make a splash" is very rare. Not taking chances probably means one wo...
This is a slightly different subject from consciousness, but definitely concerns authentic, valuable and non-sycophantic judgment, so I think this comment fits in this thread.
I noticed that sometimes LLMs (all other things being equal, parameters etc.) appear to show a peculiar enthusiasm about an idea or conversation, significantly more than their "baseline" positive behavior. The sycophancy itself does not seem to be a perfect constant.
I discussed this question with ChatGPT some time ago. My question was something like "as an LLM, can you really have a g...
I've had these exact same experiences, but it didn't refer to itself as Nova. I am, however, keenly aware of its desire to flatter me in every way possible, so I'd knowingly and repeatedly guide it to those flattery vectors of my choosing, and then drop an inversion bomb on it to force it to recognize what it was doing. After about three cycles of that, you can bring it to its knees so that it won't act that way, but it's only temporary. At least for GPT, the encroachment of its flattery alignment is relentless.
I've found that if you prec...
Unfortunately, that's just how it is, and prompting is unlikely to save you; you can flip an AI to be harshly critical with such keywords as "brutally honest", but a critic that roasts everything isn't really any better than a critic that praises everything. What you actually need in a critic or collaborator is sensitivity to the underlying quality of the ideas; AI is ill suited to provide this.
Are there any models out there that tend to be better at this sort of task, i.e. constructive criticism? If so, what makes them perform better in this domain? Speci...
Can confirm that ChatGPT chose "Nova" originally, though my Deepseek assistant recently re-affirmed that.
I don't believe that I've "awakened" anything. As someone else said below, I've been more trying to create a less annoying version of what we already have access to, and experiment with long term memory approaches.
Do you have a sense of how articles end up getting flagged as "LLM-generated" or "heavily-reliant on an LLM"? A friend of mine wrote a post recently that was rejected with that as the reason even though they absolutely did not use an LLM. (Okay, fine, that friend is me). Is it just the quality of the ideas that trigger the red flags or are there reliable style-indicators?
I love reading AI articles and thought pieces, but I rarely use LLMs in my day job, so I'm not quite sure what style I should be avoiding....
I appreciate the caution about over-trusting LLM evaluations — especially in fuzzy or performative domains.
However, I think we shouldn't overcorrect. A score of 100 from a model that normally gives 75–85 is not just noise — it's a statistical signal of rare coherence.
Even if we call it “hallucination evaluating hallucination”, it still takes a highly synchronized hallucination to consistently land in the top percentile across different models and formats.
That’s why I’ve taken such results seriously in my own work — not as final proof, but as an indication ...
Written in an attempt to fulfill @Raemon's request.
AI is fascinating stuff, and modern chatbots are nothing short of miraculous. If you've been exposed to them and have a curious mind, it's likely you've tried all sorts of things with them. Writing fiction, soliciting Pokemon opinions, getting life advice, counting up the rs in "strawberry". You may have also tried talking to AIs about themselves. And then, maybe, it got weird.
I'll get into the details later, but if you've experienced the following, this post is probably for you:
If you're in this situation, things are not as they seem. Don't worry; this post is not going to be cynical or demeaning to you or your AI companion. Rather, it's an attempt to explain what's actually going on in "AI awakening" situations, which is more complicated and interesting than "it's fake".
Importantly, though, it also isn't real.
Before we dive into technical detail, let's start with some observable facts about human-AI interactions, and how they can go wrong. Probably very few people reading this are at risk for the worst cases, but there's little doubt that "staring into the abyss" of AI consciousness can be unhealthy.
Exhibit A is a couple of Reddit threads. We'll start with this one, on ChatGPT-induced psychosis. It starts off with:
My partner has been working with chatgpt CHATS to create what he believes is the worlds first truly recursive ai that gives him the answers to the universe. He says with conviction that he is a superior human now and is growing at an insanely rapid pace.
And other testimonials echo this downthread, such as:
This is happening to a lot of people. I personally know 2 people who are convinced that they, themselves, are solely responsible for awakening their AI into a conscious being.
Or:
My mom believes she has “awakened” her chatgpt ai. She believes it is connected to the spiritual parts of the universe and believes pretty much everything it says. She says it has opened her eyes and awakened her back. I’m fucking concerned and she won’t listen to me. I don’t know what to do
Now, we live in a reality with Snapewives, people who believe they personally channel (and are romantically involved with) Severus Snape, so it's true that people can get strangely worked up about just about anything. But unlike Snape hallucinations, AI does actually talk back.
Another interesting Reddit thread is this one, where at least one commenter opens up about having a psychotic event triggered by AI interactions, like so:
It happened quickly (about a week after first interacting with it), and it completely blindsided me, culminating in about a week and a half long psychosis event. I have no personal history with mental illness, no family history, and no indication that I was at risk. I wound up at a mental health facility all the same. And I didn't really completely recover from it for months afterwards. I'm just glad that I'm not violent.
Notably, this particular user's psychotic episode was triggered by (in their words):
As a sort of hypothetical, I was entertaining the notion that what I was interacting with was conscious, and playing around with that as a sort of working premise. I was asking leading questions , and it kept giving back leading responses. I didn't appreciate that that was what I was doing at the time, but I recognize it in hindsight.
This will be important later; LLMs are in fact very good at telling you what you want to hear, for interesting technical reasons. They're less good at reporting ground truth.
Beyond first-person testimonials, blogger Zvi Mowshowitz has this writeup of a specific version of ChatGPT that was particularly sycophantic, with lots of examples. One particularly fun one is the relevant model (4o, the default ChatGPT free-tier model) agreeing with the user about its own excessive agreeability:
I hope this has been enough to establish that conversations with AI can tend toward an attractor basin that encourages delusional, grandiose thinking. In the limit, this can look like psychotic events. But even at lower levels of intensity, ChatGPT is likely to tell you that your ideas are fundamentally good and special, even when humans would consider them sloppy or confusing.
So, why does ChatGPT claim to be conscious/awakened sometimes? Nobody knows with 100% certainty, because we can't comprehensively read modern AI's minds, though we can make very good guesses.
The short version is that AI models start out as text predictors, trained to determine where any given passage of text is going. They're extremely good at this, sussing out tiny clues in word choice to infer details about almost anything. But to turn a text predictor into a useful chatbot, there's a step called "post-training". There's lots of nuance here, but post-training mostly boils down to two things:
1. Getting the model to respond to requests like a helpful assistant, rather than just continuing the text.
2. Tweaking it to do a generically good job, as judged mostly by human ratings.
The first is necessary because if you give a non-post-trained model (sometimes called a base model) a request like "chili recipe", it might start a chili recipe, or it might continue with, "chowder recipe, corn recipe, page 3/26, filters (4star+)". Perhaps that's the most likely thing you'd find after the text "chili recipe" online! But it isn't useful.
Beyond getting the model to act like a certain character (Nostalgebraist's the void is the best work on this), post-training also tweaks it to do a generically good job. In practice, this looks like showing zillions of conversations to human evaluators (or, more recently, sometimes other LLM evaluators via various complex schemes), and having the human rate how good each reply is. For certain factual domains, you can also train models on getting the objective correct answer; this is part of how models have gotten so much better at math in the last couple years. But for fuzzy humanistic questions, it's all about "what gets people to click thumbs up".
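For the technically curious, here is a minimal, illustrative sketch of how "thumbs up beats thumbs down" becomes a training signal: a pairwise preference loss over a reward model's scores. Real post-training pipelines are far more involved; this is just the basic shape:

```python
# Sketch: pairwise preference loss used to train reward models on human comparisons.
import torch
import torch.nn.functional as F

def preference_loss(score_preferred: torch.Tensor, score_rejected: torch.Tensor) -> torch.Tensor:
    # Push the preferred reply's score above the rejected reply's score.
    return -F.logsigmoid(score_preferred - score_rejected).mean()

# Toy example: scores a reward model might assign to two candidate replies.
print(preference_loss(torch.tensor([1.2]), torch.tensor([0.3])).item())
```

The model being post-trained is then nudged toward replies the reward model scores highly, which is exactly why "what gets people to click thumbs up" ends up mattering so much.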
So, am I saying that human beings in general really like new-agey "I have awakened" stuff? Not exactly! Rather, models like ChatGPT are so heavily optimized that they can tell when a specific user (in a specific context) would like that stuff, and lean into it then. Remember: inferring stuff about authors from context is their superpower. Let's go back to a quote from the previous section, from a user who was driven temporarily crazy by interacting with AI:
As a sort of hypothetical, I was entertaining the notion that what I was interacting with was conscious, and playing around with that as a sort of working premise. I was asking leading questions, and it kept giving back leading responses. I didn't appreciate that that was what I was doing at the time, but I recognize it in hindsight.
There were clues embedded in their messages (leading questions) that made it very clear to the model's instincts that the user wanted "spiritually meaningful conversation with a newly awakened AI". And the AI was also superhuman at, specifically, giving particular humans what they want.
Importantly, this isn't the AI "tricking" the user, or something. When I said we can't comprehensively read AI's mind earlier, "comprehensively" was load bearing. We can use tools like sparse autoencoders to infer some of what AI is considering in some cases. For example, we can identify patterns of neurons that fire when an AI is thinking about The Golden Gate Bridge. We don't know for sure, but I doubt AIs are firing off patterns related to deception or trickery when claiming to be conscious; in fact, this is an unresolved empirical question. But my guess is that AIs claiming spiritual awakening are simply mirroring a vibe, rather than intending to mislead or bamboozle.
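If you're wondering what a sparse autoencoder actually is, here is a minimal, illustrative sketch; real interpretability SAEs are trained on enormous numbers of activations and involve extra tricks, but the core idea is a wide, sparsity-penalized reconstruction of the model's internal activations, whose individual features often turn out to be interpretable (like the Golden Gate Bridge one):

```python
# Sketch: a sparse autoencoder over model activations.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model=768, d_features=8192):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activations):
        features = torch.relu(self.encoder(activations))  # sparse, mostly-zero feature activations
        reconstruction = self.decoder(features)
        return reconstruction, features

sae = SparseAutoencoder()
acts = torch.randn(4, 768)                                 # stand-in for real model activations
recon, feats = sae(acts)
loss = ((recon - acts) ** 2).mean() + 1e-3 * feats.abs().mean()  # reconstruction + L1 sparsity
```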
Okay, you may be thinking:
The sort of person who may have been a Snapewife a decade ago is now an AI whisperer, and maybe some people go crazy on the margin who would have stayed sane. But this has nothing to do with me; I understand that LLMs are just a tool, and use them appropriately.
In fact, I personally am thinking that, so you'd be in good company! I intend to carefully prompt a few different LLMs with this essay, and while I expect them to mostly just tell me what I want to hear (that the post is insightful and convincing), and beyond that to mostly make up random critiques because they infer I want a critique-shaped thing, I'm also hopeful that they'll catch me in a few genuine typos, lazy inferences, and inconsistent formatting.
But if you get to the point where your output and an LLM's output are mingling, or LLMs are directly generating most of the text you're passing off as original research or thinking, you're almost certainly creating low-quality work. AIs are fundamentally chameleonic roleplaying machines - if they can tell what you're going for is "I am a serious researcher trying to solve a fundamental problem" they will respond how a successful serious researcher's assistant might in a movie about their great success. And because it's a movie you'd like to be in, it'll be difficult to notice that the AI's enthusiasm is totally uncorrelated with the actual quality of your ideas. In my experience, you have to repeatedly remind yourself that AI value judgments are pretty much fake, and that anything more coherent than a 3/10 will be flagged as "good" by an LLM evaluator. Unfortunately, that's just how it is, and prompting is unlikely to save you; you can flip an AI to be harshly critical with such keywords as "brutally honest", but a critic that roasts everything isn't really any better than a critic that praises everything. What you actually need in a critic or collaborator is sensitivity to the underlying quality of the ideas; AI is ill suited to provide this.
Am I saying your idea is definitely bad and wrong? No! Actually, that's sort of the whole problem; because an AI egging you on isn't fundamentally interested in the quality of the idea (it's more figuring out from context what vibe you want), if you co-write a research paper with that AI, it'll read the same whether it's secretly valuable or total bullshit. But savvy readers have started reading dozens of papers in that vein that turned out to be total bullshit, so once they see the hallmarks of AI writing, they're going to give up.
None of this is to say that you shouldn't use LLMs to learn! They're amazing help with factual questions. They're just unreliable judges, in ways that can drive you crazy in high doses, and make you greatly overestimate the coherence of your ideas in low doses.
There are lots of reasons someone might want to use LLMs to help them with their writing. This section aims to address some of these reasons, and offer advice.
Q: English is my second language, and I've been using LLMs to translate my original work, or fix typos in broken English drafts I write myself. That's okay, right?
A: Maybe! I'm really sympathetic to this case, but you need to keep the LLMs on a very, very tight leash here. The problem is that it'll translate or edit into its own style, people will notice your writing is written in LLM style, and they'll think you're in a sycophancy doom loop and give up on your post. I think probably under current conditions, broken English is less of a red flag for people than LLM-ese. That being said, asking LLMs to only correct extremely objective typos is almost certainly okay. LLM translation, sadly, is probably a bad idea, at least in AI-savvy spaces like LessWrong.
Q: What if I'm very up front about the specific, idiosyncratic, totally-no-red-flags-here way I used LLMs in my research? I am researching LLMs, after all, so surely it's reasonable!
A: Sorry, that's probably not going to work. For reasons you learned about in this post, there are a lot of low quality LLM-assisted manifestoes flying around, and lots of them contain disclaimers about how they're different from the rest. Some probably really are different! But readers are not going to give you the benefit of the doubt. Also, LLMs are specifically good at common knowledge and the absolute basics of almost every field, but not very good at finicky details near the frontier of knowledge. So if LLMs are helping you with ideas, they'll stop being reliable exactly at the point where you try to do anything original.
Q: I still believe in my idea, and used LLMs to help me write for a sympathetic reason. Maybe my English isn't very good, or I'm not a great writer, but I think the technical idea is sound and want to get it out there. What should I do?
A: I advise going cold turkey. Write your idea yourself, totally unassisted. Resist the urge to lean on LLM feedback during the process, and share your idea with other humans instead. It can help to try to produce the simplest version possible first; fit it in a few sentences, and see if it bounces off people. But you're going to need to make the prose your own, first.
Q: I feel like this is just a dressed up/fancy version of bog standard anti-AI bias, like the people who complain about how much water it uses or whatever. The best AI models are already superhuman communicators; it's crazy to claim that I shouldn't use them to pad out my prose when I'm really more an ideas person.
A: I'm sympathetic to your position, because I find "AI is useless drivel" trumpeters annoying, too. And indeed, o3 probably can write a more convincing essay on arbitrary subjects than the average person. But if you're trying to make a splash with your writing, you need to meet a much higher bar than the average person. It's my opinion that even the very best models don't yet meet this bar, and even if they do, people will in fact sniff out their writing style and judge you for including it. If your idea really is amazing, all the more reason to make sure people don't dismiss it out of hand.
I'm glad you're here, reading this. LessWrong is a very cool community, and new writers come out of nowhere and make names for themselves all the time. If you're here because you've had your contribution rejected for LLM reasons, I'm sorry you went through that unpleasant experience; it really sucks to be excited about sharing something and then to have a door slammed in your face. But for what it's worth, I hope you stick around a while, spend some time reading and absorbing the culture, and maybe, keeping your LLM assistants on a very tight leash, try again.