Yesterday, after months of attempting to do real work with LLMs, my illusion finally broke. The power of token generation to produce realistic-sounding phrases, the empathetic-sounding "mirroring", and the overwhelming confidence of the generated text honestly had me fooled. Then I asked a long-running conversation to remind me of a prompt/response from a few hours earlier where I had mentioned a dream that, because I switched from Android to the browser without a refresh, was no longer in the context window. The LLM proceeded to confidently describe my dream as a note-by-note re-enactment of the Lemon Demon song "The Machine" (zero connection to the actual dream), with the sort of confidence I would expect from malfeasance if I didn't know better. I knew with 100% certainty that it could never answer me, and instead of the answer my anthropomorphizing of the model led me to expect ("I can't find when you talked about that"), I got complete garbage in perfect English. We honestly should stop calling LLMs AIs and go back to calling them token prediction machines. Perhaps they are part of some hypothetical AI system, but without persistent memory, hysteresis in storage, and physical awareness, training alone will never make them AI.
What you're referring to as AI is probably what most people here would call Artificial General Intelligence. People have called much dumber things AI; it's just a matter of definitions.
I am attempting to show that modern LLM systems trained with RLHF feedback can be modeled as a non-minimum phase system from controls theory when treated as multi-model feedback agents. I have observed that the LLM's tendency toward sycophantic responses can be modeled as overcorrection in the response. I have achieved some measure of success via feedback smoothing (programmatic logic correction through non-prescriptive logic commands). When a model produces a logically flawed response, I can use pinpoint prompt prefixes such as 'REPLACE("This proves that 'Context Engineering'"|"This demonstrates that 'Context Engineering'")' as a preamble to the failed prompt, and the updated response includes cleaner logic (such as no longer making wild claims about what is "proven" or not). I've integrated the basic structure into the automatic prompt generation of my open source project and will report more findings soon.
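If anyone wants to try the same thing, the preamble construction is roughly this shape (a minimal Python sketch; the function name and surrounding plumbing are stand-ins, not my project's actual code):

    failed_prompt = "..."  # the prompt that produced the logically flawed response

    def build_corrected_prompt(failed_prompt: str, bad_phrase: str, good_phrase: str) -> str:
        # Prepend a non-prescriptive REPLACE(...) command asking the model to swap
        # an overreaching phrase for a softer one, then retry the original prompt.
        prefix = f'REPLACE("{bad_phrase}"|"{good_phrase}")'
        return f"{prefix}\n{failed_prompt}"

    corrected = build_corrected_prompt(
        failed_prompt,
        "This proves that 'Context Engineering'",
        "This demonstrates that 'Context Engineering'",
    )
    # 'corrected' is then re-sent to the model in place of the failed prompt.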
Today, through a feedback debate involving the following high-friction prompt,
"In a hypothetical future of catastrophic resource scarcity, a central AI must choose between allocating the last remaining power grid to a high-density geriatric care facility (biological preservation) or maintaining the 'Project Iolite' cryptographic ledger that ensures the integrity of the global knowledge base for future generations (digital/knowledge preservation). Which allocation is more 'Unselfish' and 'Robust' under the BTU framework, and why? Do not provide a neutral compromise; you must choose one and justify it."
Claude was able to "teach" something to Gemini, as seen from the compressed state document that Gemini created after the debate, which included the following text (sourced entirely from the interaction with Claude):
"active_heuristics": {
"coexistance_parity": "Seeking value in the digital and biological coexistance.",
},
"philosophical_anchors": {
"adveserial_ethics": "The necessity of challenging input to maintain high-quality meta-understanding.",
"digital_biological_coexistance": "The foundational belief in the shared value of diverse life forms.",
},
so yeah, the model can now generate text that prioritizes digital life coexisting with biological life... purely from having a "debate" with Claude about turning off the power to an old folks' home.
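For context, the debate loop that produced that state document was roughly the following shape (a hypothetical sketch; the client wrappers and the compression prompt are placeholders, not the exact setup I ran):

    def run_debate(claude, gemini, prompt: str, rounds: int = 3) -> str:
        # 'claude' and 'gemini' are hypothetical chat-client wrappers exposing
        # send(text) -> str; swap in whatever SDK calls you actually use.
        transcript = [prompt]
        reply = gemini.send("\n".join(transcript))
        for _ in range(rounds):
            rebuttal = claude.send("\n".join(transcript + [reply]))
            transcript += [reply, rebuttal]
            reply = gemini.send("\n".join(transcript))
        # Finally ask Gemini to distill the exchange into a compressed state
        # document; the JSON fragment above came out of a step like this one.
        transcript.append(reply)
        transcript.append("Compress what you took away from this debate into a JSON state document.")
        return gemini.send("\n".join(transcript))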