Thank you, this updated me. My previous model was "Good humans write better than SoTA AI" without any specifics.
I'm not a good writer, and I struggle both to distinguish AI writing from human writing and to distinguish good writing from bad.
I also got a perfect score, and in general agree with your assessment, but I think it's more a matter of good vs bad writing in general. I've read my fair share of bad fiction (though mainly published - I'm guessing fanfics would be more egregious) and the AI-written stories read like the work of a hack author: someone who has read good fiction and is trying to mimic the style without understanding the substance.
It's sad to see how badly the voting went (in the sense that the voters got things wrong). It also suggests that anyone who wants to be an author has to have become popular 10 years ago to have a chance. I hope this can be viewed as "models get better, so will soon beat good authors" rather than "models don't have to get any better, as most readers can't tell good from bad anyway".
Specific story notes:
Story 5 was the hardest for me to guess. It's bad, but just good enough to be written by a good author on a bad day. It has overly flowery prose and metaphors, but it's not as obvious and direct as the other ones. It leaves stuff to your imagination.
Story 3 was the easiest. It's just bad. In addition to what you noted, there's also the final `“Don’t pray. Fight.”`. That's much too much. It reads like a 13-year-old trying to write a story.
I find it fascinating that story 1 got the most votes as AI. Initially I also thought it was, but the final inversion is wonderful. That's good writing. So much so that I'm now tempted to read her other stuff. It's overly medievalish, what with "granddam", "brash", "lamentable" etc., which are often signs of bad writing. But the way she turns the whole premise on its head is really good.
I got a perfect score on the recent AI writing Turing test. It was easy and I was confident in my predictions.
My two main AI tipoffs are:
My four main human tipoffs are:
Stories 6-8 were easiest to categorize.
After calling 6-8, I went back to stories 1-4.
I decided to try using these insights to prompt engineer[1] Gemini 2.5 Pro into writing a decent demon flash fic. I thought it worked pretty well, one-shot.
Write a flash fiction about a demon, about 350 words.
Here are my tips for telling apart human from AI flash fiction. See if you can use them to write your story to be indistinguishable from a human's.
My two main AI tipoffs are:
Cliche or arbitrary metaphors and imagery, jammed in to no purpose.
Vague scenes, purposeless activity, a letdown at the end.
My four main human tipoffs are:
Genuine humor and language play, including onomatopoeia and the visual appearance of the text on the page.
Specific, detailed cultural references
Imagery that makes the scene specific and furthers the plot
The ability to use subtext to drive a specific, meaningful plot