Then I'd lean away from the "this is for people who've awoken ChatGPT" framing. E.g. change your title to something to the effect of "so you think LLMs make you smarter".
This essay seems like it's trying to address two different audiences: LW, and the people who get mind-hacked by AIs. That's to its detriment, IMO.
E.g. the questions in the Corollary FAQ don't sound like the questions you'd expect from someone who's been mind-hacked by AI. Like, why expect someone in a sycophancy doom loop to ask whether it's OK to use AI for translation? Also, texts produced by sycophancy doom loops look pretty different from AI-translated texts. Both share a resemblance to low-quality LLM-assisted posts, yes. But you're addressing people who think they've awoken ChatGPT, not low-quality posters who use LLM assistance.
But reasoning models don't get reward during deployment. In what sense are they "optimizing for reward"?
Yep, and getting a solid proof of those theorems would provide insight IMO. So it isn't what I'd call "trivial". Though, I suppose it is possible that raising the theorem statement from entropy would require many bits of info, and finding a proof would require almost none.
have some threaded chat component bolted on (I have takes on the best threading system).
I wish to hear these takes.
Man, that's depressing. IMO it gives too low an estimate for a good outcome, though. Very cool tool.
A key question is whether the typical goal-directed superintelligence would assign any significant value to humans. If it does, that greatly reduces the threat from superintelligence. We have a somewhat relevant article earlier in the sequence: AI's goals may not match ours.
BTW, if you're up for helping us improve the article, would you mind answering some questions? Like: do you feel like our article was "epistemically co-operative"? That is, do you think it helps readers orient themselves in the discussion on AI safety, makes its assumptions clear, and generally tries to explain rather than persuade? And what's your general level of familiarity with AI safety?
Personally, I liked "it's always been the floor". Feels real. I've certainly said/heard people say things like that in strained relationships. Perhaps "it's always the floor" would have been better. Or "it always is". Yes, that sounds right.
I think I got decent results from Gemini 2.5 Pro (AIStudio version).
Her fork scraped the plate. It was a sound he’d noticed three nights ago. Not the tine-on-ceramic screech, but the specific resonance of the plate on the tabletop. A low, almost sub-audible hum that vibrated up through his own chair.
He put his water glass down, gently. The wobble was minute, a tiny seesaw over an invisible fulcrum. He watched the surface of the water shiver, the reflection of the overhead light distorting and re-forming.
“It’s the floor,” she said. She was watching his eyes, not the glass. “It’s always been the floor.”
He pushed his chair back and knelt, his cheek almost to the cool laminate. He sighted along the gap under the table’s far leg. A sliver of light. He rocked the table with his fingertips. The contact point shifted.
“It’s not the floor,” he said, standing up. “The load isn’t distributed evenly. One of the leg screws has probably settled.”
She took a bite of her salad. “We could put a coaster under it.”
He looked at the small pile of mail on the counter. A folded piece of junk mail, a cardboard insert from a package. A temporary, imperfect patch. He felt a familiar, low-grade resistance in his chest.
He went to the hall closet and came back with his toolkit. He knelt again, the cool metal of a hex key in his hand. He found the screw head on the offending leg. It was tight. He tried the adjacent one. Also tight. He lay on his back, the ceiling light bright in his eyes, and began systematically checking them all, a quarter-turn to test, then back.
From his new vantage point, the apartment was an unfamiliar landscape. The underside of her chair, the dust bunnies gathered around the baseboard, the scuff mark on the wall where the vacuum cleaner had hit it. He heard her stand up and carry her plate to the sink. The scrape of the fork was gone. The water ran.
He found it on the fourth leg. A fractional looseness. He gave the screw a half-turn, then another. The wood groaned slightly as the tension equalized. He slid out from under the table and stood, brushing dust from his shirt.
He placed his palms flat on the tabletop and leaned his weight onto it. Nothing. Rock solid. He looked toward the sink.
She was standing there, scrolling on her phone, her back to him. The TV was on, muted. A city street at night, the headlights and taillights rendered as slow, continuous ribbons of red and white light. He watched her thumb move up the screen, fast and smooth. He waited for her to turn around.
Could be tightened a fair bit. Since that is my biggest criticism, it feels pretty promising. Getting this took your prompt, a free-association mash of words for the system prompt, and telling Gemini that the first story it produced was terrible.
Fair enough. If/when you get any empirical data on how well this post works, writing it up would be pretty valuable and would likely resolve any remaining disagreements we have.