Epistemic status: Moderately confident in the framework, less confident in the boundary conditions. This is a proposed taxonomy, not a settled one. I expect the levels to hold up reasonably well as categories while acknowledging that individual cases will resist clean classification.
A growing fraction of written content now involves AI at some stage of the production pipeline. Blog posts, fiction, emails, technical documentation, marketing copy. The question “did a human write this?” has become increasingly difficult to answer, and part of the reason is that the question itself is poorly formed. Human authorship and AI generation are not a binary. They exist on a spectrum, and we currently lack shared vocabulary for talking about where on that spectrum a given piece of work falls.
This matters for several practical reasons. Academic institutions need to set policies about acceptable AI use. Publishers need submission guidelines that reflect the actual landscape of how people write now. Readers deserve the ability to make informed judgments about what they’re consuming. And anyone trying to think clearly about the economics, aesthetics, or epistemics of AI-assisted writing needs more granularity than “human-written” versus “AI-generated.”
Here is a proposed five-point scale.
Level 1: Human-Authored, AI-Untouched. The writer produces the work without AI assistance of any kind. No grammar checkers beyond standard spellcheck (the kind that has existed since the 1990s). No AI-generated suggestions, outlines, or brainstorming. This is the baseline, the control group, the coffee stain on a manuscript page. It is also becoming rarer.
Level 2: AI as Tool. The writer produces the work but uses AI for discrete, mechanical tasks: grammar correction, synonym suggestions, spell checking beyond basic autocorrect. The AI functions here the way a calculator functions in a math class. It handles rote operations so the human can focus on higher-order thinking. Creative decisions, structural choices, and voice remain entirely human. The AI touches the surface but does not reach the bones.
Level 3: AI as Collaborator. The human and AI engage in a back-and-forth process. A blogger might ask the AI to generate an outline for a post, suggest counterarguments to stress-test a thesis, or draft a section that the human then substantially rewrites. A fiction writer might use it to brainstorm plot alternatives or generate dialogue they later rework. The human retains creative authority and final editorial control, but the AI’s contributions shape the direction of the work in ways that go beyond mechanical correction. Think of this as a writing partner who generates raw material that the human sculpts into something with intention.
Level 4: AI as Primary Drafter, Human as Editor. The AI generates the bulk of the prose based on human-provided parameters: a topic and thesis for a blog post, a research summary to be turned into an article, or a set of themes and structural constraints for a story. The human’s role shifts from creator to curator. They provide the blueprint and then refine, revise, and approve the output, but the sentence-level construction, the word choices, the rhythm of the paragraphs all originate with the AI. The human is the architect. The AI is the construction crew.
Level 5: AI-Generated, Human-Prompted. The human provides a short prompt. The AI produces the work, potentially via an agentic workflow. Revisions, if any, are themselves AI-generated in response to further prompts. The human’s contribution is limited to the initial creative vision and brief iterative feedback, while the AI handles all aspects of execution. The human decides what. The AI decides how.
Where the Scale Gets Interesting
The boundaries between levels are blurry, and I think the blurriness is a feature rather than a bug. A Level 3 project can slide toward Level 4 depending on how much the human revises the AI-generated material. A particularly detailed prompt at Level 5 might exercise more creative control than a cursory editorial pass at Level 4. Clean lines between the categories would be dishonest. Real workflows are messy.
The more important observation is that most people’s intuitions about AI involvement are binary (either you wrote it or the machine did), while the actual practice is thoroughly gradient. A blogger who asks an LLM to generate three possible framings for a post about housing policy, then writes their own fourth framing that synthesizes elements of all three, is doing something qualitatively different from both unaided writing and prompt-and-publish workflows. The same is true of a fiction writer who uses AI to brainstorm endings and then invents their own. Collapsing those into the same category as either “human-written” or “AI-generated” obscures more than it reveals.
Practical Implications
If this scale (or something like it) gained adoption, several things would follow:
First, self-reporting becomes tractable. “This post was written at approximately Level 3” is a meaningful disclosure that gives readers useful information without requiring a binary confession or denial. Second, institutional policies become more precise. A university could permit Level 2 use while prohibiting Level 4 and above, and both students and faculty would have shared language for discussing edge cases. Third, discussions about authenticity and authorship gain nuance. The question shifts from “is this AI-generated?” to “what was the division of labor, and does it matter for this particular context?”
I do not claim this scale is final or optimal. The five levels might need to become seven; the descriptions might need refinement as AI capabilities change. But the underlying claim is that we need some shared, legible framework for this conversation, and the binary we currently rely on is failing us.
The absence of such a framework is itself a form of epistemic damage. It forces every discussion about AI-assisted writing into a false dichotomy, and false dichotomies are where nuance goes to die.