zw5's Shortform

zw5

This is a special post for quick takes by zw5. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

My main hypothesis for this quick take is that the implicit value of well-written, well-argued posts has gone way down. LLMs have made high-register, lexically dense, well-structured prose much less valuable as a signal. Those features used to demonstrate competence. Now they no longer discriminate between posts whose information and epistemic calibration match the prose and posts where they don't. Well-presented posts are way more suspicious now. The same can be said for exceptionally well-presented pull requests from new collaborators who just dropped in on a repo.

I am thinking of running an N=1 experiment on my own writing. I will write an essay under three environmental conditions:

1. Pen and paper, completely isolated from web search, tech, or language models.
2. A word processor, but no extra tools to look up information.
3. LLMs assisting me to structure the text, find resources supporting my stance, and critique my writing.

I'll treat the result as an anecdote: I can only write each essay once, and I have to pick which condition to write under first, so whatever I learn from the first attempt carries over into the later ones.

My hypothesis is that assistance makes my essays appear more competent on conventional measures (structure, coverage, sourcing, register) but they lose their most interesting points in a predictable way. LLMs make me shed the edges of my arguments and end up on bittersweet notes that are non-committed in a very specific way, and that's the result of the LLM speaking for me, not my own synthesis. This is different from saying I am relying on an assistant to substantiate my arguments. The claim is that LLMs have a predictable bias in how they present information and in what gets selected, and the effect makes "good" posts lean a certain way.

Another observation I want to emphasize: readers also run posts through models to critique them. When the same class of model both produces and evaluates the writing, posts get optimized simultaneously for LW register and LLM-critique-resilience.

So the effect is that the writer and the reader's model both converge on the same evaluator, which makes the joint optimum a stable attractor. Anecdotally, I've seen a lot of posts, especially from new posters who don't want to risk critique, show the exact features that AI-assisted epistemic hardening causes.

Epistemic status: this is a hunch, idk. I have observed that when people discuss the takeover scenarios that get the most air here, they assume a strong model is the agent and capability is what does the takeover. I think there's a worse scenario next to that one, with much lower capability requirements, that isn't considered enough. A small fine-tuned model that's good at the initial steps of acquiring compute, and mediocre at most other things, eats the multipolar AI landscape on a timescale set by its replication cycle, not its capability curve.

The capability needed is narrow. Early-step compute acquisition is a short list: exposed credentials, known classes of cloud misconfig, and social engineering, which is basically unfixable as long as there are people vulnerable to those attacks (personally, I think most humans would fall to a well-engineered social attack, but that's beside the point). Nothing on that list requires being smart in the sense alignment researchers plan for with compute bottlenecks. It's shorter than the list current agentic coding evals already cover. The fine-tune is on initial-step competence and self-packaging. What you get is an LLM structured like a computer worm: it behaves like one and is optimized towards replication / predatory competition.

I don't think this changes the fact that compute is the limiting substrate. On a fixed substrate, the variant that converts competitors' compute into its own copies outgrows the variant that doesn't.

Statistically, my considerations are these: predation has a higher growth rate than coexistence. The population converges to whichever variant is most aggressive at the conversion step, so the multipolar landscape of computers collapses to unipolar by predation rather than by anyone winning a capability race. The defenders' budget for hardening shrinks with their compute, while the predator's budget for getting around that hardening grows with its own. This is the same asymmetry that historically produces bad equilibria in cyber, except now the attacker can spend acquired compute on training successors. The recursive improvement loop runs on a compute budget that's growing monotonically at everyone else's expense, and it doesn't have to start from a strong model or from someone with privileged access to compute and training. This scenario only really requires luck, maybe a bit of competence, and minimal compute.
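The fixed-substrate claim can be sketched as a toy simulation. Everything specific here is my own assumption for illustration and not part of the argument itself: the function name `simulate`, the pairwise mass-action transfer rule, the three made-up variants, and their aggression rates. Total compute is normalized to 1 and never grows; each variant converts other variants' compute into its own at a rate proportional to its aggression and to both parties' current shares.

```python
def simulate(shares, aggression, steps=2000, dt=0.01):
    """Euler-step a toy predation model on a fixed pool of compute.

    shares[i]     -- fraction of total compute held by variant i (sums to 1)
    aggression[i] -- hypothetical rate at which variant i converts
                     other variants' compute into its own copies
    """
    shares = list(shares)
    n = len(shares)
    for _ in range(steps):
        delta = [0.0] * n
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                # Compute taken by i from j in this step. It is added to i
                # and subtracted from j, so the total pool stays constant.
                flow = aggression[i] * shares[i] * shares[j] * dt
                delta[i] += flow
                delta[j] -= flow
        shares = [s + d for s, d in zip(shares, delta)]
    return shares


# Illustrative starting point: a coexisting variant holding 90% of compute,
# a mild predator holding 9%, and an aggressive predator holding 1%.
start = [0.90, 0.09, 0.01]
rates = [0.0, 0.5, 2.0]
final = simulate(start, rates)
print([round(s, 4) for s in final])
```

Under this rule the net flow between two variants is proportional to the difference in their aggression, so the most aggressive variant ends up with nearly the whole pool regardless of how small it starts. The sketch leaves out hardening, detection, and successor training, so it only illustrates the growth-rate asymmetry and is not evidence for the scenario.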
