
Scrith

Comments

Monthly Roundup #33: August 2025
Scrith · 12d · 10

> I think I largely buy this argument that one job RPGs have big advantages over RPGs where you choose your class. They can do a lot more fun customization.

Link is broken for me. Looks like a mail attachment link?

GPT-5s Are Alive: Outside Reactions, the Router and the Resurrection of GPT-4o
Scrith · 20d · 20

Both. In many cases, the answers I got from "Thinking" were equally nonsensical. 

GPT-5s Are Alive: Outside Reactions, the Router and the Resurrection of GPT-4o
Scrith · 20d · 30

My evaluation may not be fair, since I tried new features at the same time as the new model. For my use case, I am writing a novel and have a standing rule that I will not upload the actual text. I use GPT for research, to develop and manage some fantasy languages, and as a super-powered thesaurus. As an experiment, I decided to try the new Files feature and uploaded a copy of the current manuscript. It's about 94,000 words, and having it available to search and reference would be helpful.

A list of the failures, in no particular order:

  1. While ChatGPT could read the ebook manuscript, it could not do so in order or use the index file. It could reference the file ("here are the chapters") but could not figure out what order the chapters were supposed to be in or use them as an index. I asked it simple questions about the text ("how many chapters?") and it couldn't figure it out. It repeatedly tried to give me creative writing advice, though. After I expressed frustration, it offered to convert the EPUB to RTF, but failed and asked me to upload an RTF directly. So, I uploaded one.
  2. It could "read" the RTF, but could not perform any operations or queries on it. So, it was (again) trying to give me creative feedback (which I didn't want), but could not answer simple questions like, "What is the first chapter where I use [insert fantasy word]?" It said it would have an easier time if it converted the document to plain text, so I said go ahead.
  3. It did this, but in doing so it dropped all of the accented characters (presumably falling back to a lossy plain-ASCII conversion; see the sketch after this list). So, "Viktoré" became "Viktor", etc. I didn't notice this until later, since the text conversion happened behind the scenes. Meanwhile, I tried to get it to create a glossary of all of the in-world words we had invented, including both fake-language words and idioms. We eventually got through this, but it took hours, even with a base dictionary that I uploaded as an index. Because of the missing accented characters, it could not identify many of the words at all. It also treated different instances of the same word (e.g. plural or capitalized) as different words. When it added definitions, they would often be "a world-specific fantasy term". When I asked for a definition based on the text of the manuscript, it made something up.
  4. I began to realize that it wasn't actually looking at the text of my book at all. I would say, "Refer to the manuscript directly, find all mentions of the word 'laekma', and build a summary definition." It would then show the "Thinking" and "Reading" messages and come back with a hallucinated definition.
  5. I confronted it about not reading the text. It thought for five minutes and then came back with "I'd love to provide feedback on your characters, how would you like to begin?" and then listed a bunch of characters who weren't in the book.
  6. Finally, I asked it some specific questions about the text, just to see what was going on. For example, "Describe the ship Sea Wyrm." It made something up. I asked it to refer to a specific chapter in the manuscript, and it said "Reading"... and made something up. I pointed that out and it said, "I'm having trouble accessing the files, maybe if you uploaded it directly!" So, I cropped out a single chapter, uploaded it to the chat, and asked for the description of Sea Wyrm... and it made something up. When I pointed that out, it argued with me that the description was exactly as I had written it (it wasn't).
  7. I tried using the "think hard" prompt, but that seemed to act as a "hallucinations on" switch. For example: "please explain to me where you got that description of Sea Wyrm. please refer to the manuscript and think hard." Watching the chain of thought, I saw it start reasoning about a question I didn't ask: "The user has asked me for feedback on their novel, but has not specified what kind of feedback. I could start by looking at themes and structure, answering in a sarcastic tone, which the user prefers." ?
  8. As someone who hated the emojis and sycophancy, I was hopeful this would be a better model. I have always had "be cynical and irascible" in my system prompt to try to counter the default rainbows and unicorns. But now the model answers by actually saying, "Fine. I'll cynically and irascibly help you." When I pointed out that having a personality didn't mean editorializing comments by describing the personality, it simply didn't answer.
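
A couple of these failures come down to mundane text processing rather than anything model-specific: a lossy plain-text export that drops accents, and a word search that needs to treat capitalized and plural variants as the same term. For comparison, here is a minimal Python sketch (not my actual workflow) of what those two operations look like done locally; the file name "manuscript.txt" and the word "laekma" are just placeholders taken from the examples above.

```python
import re

# (a) Accent loss: converting to plain ASCII with errors="ignore" silently
#     drops accented characters, so "Viktoré" becomes "Viktor".
name = "Viktoré"
print(name.encode("ascii", errors="ignore").decode("ascii"))  # -> Viktor

# (b) A local, case-insensitive concordance for an invented word, so that
#     "Laekma", "laekma", and plural forms like "laekmas" are grouped together.
def concordance(text: str, word: str, window: int = 60) -> list[str]:
    """Return short context snippets around each occurrence of `word`."""
    pattern = re.compile(rf"\b{re.escape(word)}\w*", re.IGNORECASE)
    return [
        text[max(m.start() - window, 0): m.end() + window]
        for m in pattern.finditer(text)
    ]

# Hypothetical usage: "manuscript.txt" stands in for the exported plain text.
with open("manuscript.txt", encoding="utf-8") as f:
    manuscript = f.read()

for snippet in concordance(manuscript, "laekma"):
    print("...", snippet, "...")
```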

I'm not sure what's happening, but for my use case, as an assistant for writing a novel, it's worse than useless. 

Will LLMs supplant the field of creative writing?
Scrith · 7mo · 20

Definitely the latter. I would feel stripped of authorship. This isn't an ethical position; it's purely emotional/subjective.

Will LLMs supplant the field of creative writing?
Scrith · 7mo · 80

There definitely seems to be a continuum. I’m legitimately confused about using an LLM to generate actual text, since that seems like the easy part. I am using one to help write a novel, but only as a research assistant. For example, I struggled for years using tools like Vulgar to create new languages, but it turns out LLMs are great at creating them. A list of things I routinely use the LLM for:

  1. Finding the perfect word. Oftentimes I am trying to describe something and I can’t even come up with a word to look up in the thesaurus. So, “how would you describe someone’s face when they are expressing skepticism? A noun.”
  2. Developing languages, as mentioned above. In one example, I am trying to make sure that when I invent a fantasy word for a concept, I am not using any words with French roots. (I am trying to keep the root languages consistent for the fantasy terms. Languages have a certain “sound” that can inform the creation of new languages and anchor readers.)
  3. Researching things like meteorology, history, etc. For example, what technologies were contemporaneous, and what other technologies do they depend on? “How long would it take a fast sailing ship to travel from Ireland to New York in the 1600s at different times of year? What variables affect that?”
  4. Generating names with a particular cultural or linguistic feel. “Give me an Old English name for a little boy” for a single-use character, or “Give me ten names that imply smarminess with a Gothic feel” if I’m trying to come up with a name for a more major character.

I have a soft rule that I never upload the actual text of my book for feedback; I keep it out of the LLM’s memory entirely.

I’m not sure where that fits in your model. 

My Mid-Career Transition into Biosecurity
Scrith · 2y · 30

As someone with 16 years at a big tech firm who is thinking about a career change, I really appreciate this post. Thanks.
