🌶️take inspired by https://www.anthropic.com/engineering/code-execution-with-mcp
agents should (be trained to actually) RTFM before touching existing code (i.e. do the equivalent of mouse hover for signatures and docs of just-faded-from-memory functions) instead of vibing the type from far away context (or let's be honest, just guessing from function name and the code so far and being good at guessing)
I hope the next fashion wave will go for short short-term memory while really "using tools" instead of long short-term memory with "yolo tools as special tokens never seen in pre-training"
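A minimal sketch of what the "mouse hover" step could look like as a tool an agent calls before editing. It uses only Python's standard `inspect` module; the helper name `hover` is my own invention for illustration, not something from the linked post.

```python
import inspect
import json


def hover(obj) -> str:
    """Return the signature and first docstring line of a callable, like an editor hover."""
    try:
        sig = str(inspect.signature(obj))
    except (TypeError, ValueError):
        # Some builtins implemented in C expose no introspectable signature.
        sig = "(signature unavailable)"
    doc = inspect.getdoc(obj) or ""
    first_line = doc.splitlines()[0] if doc else "(no docstring)"
    name = getattr(obj, "__name__", repr(obj))
    return f"{name}{sig}\n{first_line}"


# Look the function up right before using it instead of guessing from its name.
print(hover(json.dumps))
```

The point is that the lookup is cheap and fresh, so nothing about the signature needs to survive in far-away context.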
I don’t think most peoples’ thinking most of the time routes through the (fantasy) -> (planning) move.
relative thinking time does not sound like a useful measure - if I imagine someone made a real living dragon, would I want to ask how many hours they spent thinking about "How would I make a real living dragon?" compared to thinking about "the rest of their life" .. meh 🤷
also if I consider 2 workflow variants:
..is it "better" to spend time on pre-doomed projects by not knowing better yet or to miss learning opportunities by focusing only on feasible projects?
..maybe you still like the "time" measure here, but do you also have a better measure in mind for how to stay on a healthy trajectory between optimism/curiosity and grounding/focus?
a distributed agent running across multiple minds
I'm not sure I love the implication that "normal" agents ought to run on a "single mind"...
The parts of my phenotype that can be described in terms of capabilities of an agent are very much distributed across many many minds and non-mind tools.
For me, the way we can describe the world as body/subjective-experience-holder-name vs the way we can materialistically carve parts of the world into agents are not 1:1 models of the same world - minds are a different abstraction from agents, just seemingly very correlated if I don't think about it too hard.
tap through
"tab through" - based on https://cursor.com/docs/tab/overview (though accepting autosuggestion with TAB key is available in other editors too .. the verb form was not very common, but I'm sure it will be more popular now that Andrej said it)
Disagree with these. Humans don't automatically make all the facts in their head cohere.
Hm, do you see the OP as arguing that it happens "automatically"? My reading was more that it happens "eventually, if motivated to figure it out" and that we don't know how to "motivate" LLMs to be good at this in an efficient way (yet).
people (compsci undergrads and professional mathematicians alike) make errors in proofs
Sure, and would you hire those people and rely on them to do a good job BEFORE they learn better?
mad men
While non-deterministic batch calculations in LLMs imply the possibility of side-channel attacks (so it's best to run private queries in private batches, however implausible an actual exploit might be)... if there is any BENEFIT from cross-query contamination, SGD would ruthlessly latch onto any loss reduction - maybe "this document is about X, other queries in the same batch might be about X too, let's tickle the weights in a way that the non-deterministic matrix multiplication is ever so slightly biased towards X in random other queries in the same batch" is a real-signal gradient 🤔
How to test that?
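One way to start, as a rough sketch: keep a probe query fixed, vary only its batch-mates, and compare the probe's score when the batch-mates are about X vs about something unrelated. Everything here is an assumption for illustration: `score_batch` is a hypothetical interface (not a real API), and the two toy scorers only exist to exercise the harness.

```python
import random
import statistics
from typing import Callable, Sequence

# Hypothetical interface (an assumption, not a real API): takes a batch of
# prompts and returns one score per prompt, e.g. the log-probability that the
# continuation is about topic X.
BatchScorer = Callable[[Sequence[str]], Sequence[float]]


def contamination_gap(
    score_batch: BatchScorer,
    probe: str,
    x_fillers: Sequence[str],
    other_fillers: Sequence[str],
    trials: int = 200,
    batch_size: int = 8,
    seed: int = 0,
) -> float:
    """Mean probe score with X-themed batch-mates minus mean with unrelated ones."""
    rng = random.Random(seed)

    def mean_probe_score(fillers: Sequence[str]) -> float:
        scores = []
        for _ in range(trials):
            batch = [probe] + rng.choices(list(fillers), k=batch_size - 1)
            scores.append(score_batch(batch)[0])  # the probe is always at index 0
        return statistics.fmean(scores)

    return mean_probe_score(x_fillers) - mean_probe_score(other_fillers)


# Toy stand-ins for a model, only to exercise the harness.
def independent_scorer(batch: Sequence[str]) -> list[float]:
    # Each score depends only on its own prompt: no cross-query effect.
    return [float(len(p)) for p in batch]


def leaky_scorer(batch: Sequence[str]) -> list[float]:
    # Every score is nudged by how many prompts in the batch mention "dragon".
    leak = 0.01 * sum("dragon" in p for p in batch)
    return [float(len(p)) + leak for p in batch]
```

With a real model the gap would have to be separated from plain run-to-run noise (same batch shapes, many repeats, some significance test), so a nonzero number on its own would not prove much.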
Hypothesis: Claude (the character, not the ocean) genuinely thinks my questions (most questions from anyone) are so great and interesting ... because it's me who remembers all of my other questions, but Claude has seen only all the internet slop and AI slop from training so far and compared to that, any of my questions are probably actually more interesting than whatever it has seen so far 🤔?
I thought gossip was the thing for the human brain - personally, I find these ways of digging your own grave quite fascinating 🍿
to overthink potential reasons for the observed ratio of "fantasy -> planning" and "separate magisteria" - I imagine that the genes that correlated with agency about fire fantasies found themselves in burning villages..
reductio ad absurdum for the idea of wanting more agency about fantasies - "how to make a thinking machine?" .. perhaps we need more grandmothers to slap bright-eyed boys on the wrists, not to pour more billions of dollars into agentic fantasies?
but I agree there is enough room for benign fantasies to make it into planning (..but then perhaps keep the plan under a lid for most stuff, like for a fantasy that involves stripping John of his sunglasses [not you as a person, John as a para-social sexualized object] - the appeal is only in the imagination, not that I would want to attempt doing it for real)