Burnout often doesn't look like the lack of motivation, lack of focus, or fatigue that people usually describe. At least in my experience, it's often better described as a set of aversive mental triggers that fire whenever a burnt-out person goes to do a kind of work they spent too much energy on in the past. (Where 'too much energy' has something to do with time and effort, but more to do with a bunch of other things about how people interface with their work.)
Moreover, it's when the work that used to be satisfying has stopped being so, but the habit of trying to do the work has not yet been extinguished. So you don't quit yet, but the habit is slowly dying so you don't do it well ...
It seems more accurate to say that AI progress is linear rather than exponential, as a result of being logarithmic in resources that are in turn exponentially increasing with time. (This is not quantitative, any more than the "exponential progress" I'm disagreeing with[1].)
Logarithmic return on resources means strongly diminishing returns, but that's not actual plateauing, and the linear progress in time is only slowing down according to how the exponential growth of resources is slowing down. Moore's law in the price-performance form held for a really lon...
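Spelled out (these symbols are mine, not the original post's), the arithmetic is just that a logarithm of an exponential is linear:

```latex
% Capability C as a function of resources R, with resources growing exponentially in time t:
\[
C(R) = a \log R, \qquad R(t) = R_0 e^{kt}
\quad\Longrightarrow\quad
C(t) = a \log R_0 + a k t,
\]
% i.e. logarithmic returns on exponentially growing resources compose to
% progress that is linear in calendar time.
```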
In many industries cost decreases by some factor with every doubling of cumulative production. This is how solar eventually became economically viable.
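Written out (the learning rate below is an illustrative ballpark of the kind often cited for solar PV, not a figure from this comment):

```latex
% Experience curve / Wright's law: cost falls by a constant factor f per
% doubling of cumulative production N.
\[
c(2N) = f \cdot c(N)
\quad\Longrightarrow\quad
c(N) = c_1 \, N^{\log_2 f},
\]
% e.g. f = 0.8 (a ~20% drop per doubling) gives c(N) \propto N^{-0.32}.
```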
The aspect of your work to care about the most is replay value. How many times do people keep coming back? Number of replays, rereads, and repeat purchases are proxies for high resonance. On that note, I wish more writing platforms let you see in aggregate how many first-time readers visited again and how spaced out their visits were. If they can still look past the known plot and imperfections in your work, you're on to something.
yes that’s basically it, thanks!
Where is the hard evidence that LLMs are useful?
Has anyone seen convincing evidence of AI driving developer productivity or economic growth?
It seems like I only ever read negative results from studies of applications.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
https://www.lesswrong.com/posts/25JGNnT9Kg4aN5N5s/metr-research-update-algorithmic-vs-holistic-evaluation
And in terms of startup growth:
apparently wider economic ...
I’m not saying it’s a bad take, but I asked for strong evidence. I want at least some kind of source.
I’m worried about Scott Aaronson since he wrote “Deep Zionism.”
https://scottaaronson.blog/?p=9082
I think he’s coming from a good place, I can understand how he got here, but he really, really needs to be less online.
Suppose Everett is right: no collapse, just branching under decoherence. Here’s a thought experiment.
At time $t_0$, Box A contains a rock and Box B contains a human. We open both boxes and let their contents interact freely with the environment—photons scatter, air molecules collide, etc. By time $t_1$, decoherence has done its work.
Rock in Box A.
A rock is a highly stable, decohered object. Its pointer states (position, bulk properties) are very robust. When photons, air molecules, etc. interact with it, the redundant environmental record overwhelmi...
Just as a minor note (to other readers, mostly) decoherence doesn't really have "number of branches" in any physically real sense. It is an artifact of a person doing the modelling choosing to approximate a system that way. You do address this further down, though. On the whole, great post.
I haven't heard anything about RULER on LessWrong yet:
...RULER (Relative Universal LLM-Elicited Rewards) eliminates the need for hand-crafted reward functions by using an LLM-as-judge to automatically score agent trajectories. Simply define your task in the system prompt, and RULER handles the rest—no labeled data, expert feedback, or reward engineering required.
✨ Key Benefits:
- 2-3x faster development - Skip reward function engineering entirely
- General-purpose - Works across any task without modification
- Strong performance - Matches or exceeds hand-crafted rewar
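For intuition, here is a minimal sketch of the LLM-as-judge relative-scoring idea. This is not RULER's actual interface; the function, prompt wording, and judge model below are placeholders I made up, using the standard OpenAI Python client as a stand-in judge:

```python
# Sketch only: score several agent trajectories for the same task relative to
# each other with an LLM judge, instead of hand-crafting a reward function.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def relative_rewards(task: str, trajectories: list[str],
                     judge_model: str = "gpt-4o-mini") -> list[float]:
    """Return one score in [0, 1] per trajectory, judged relative to the others."""
    numbered = "\n\n".join(f"[{i}] {t}" for i, t in enumerate(trajectories))
    resp = client.chat.completions.create(
        model=judge_model,
        messages=[
            {"role": "system", "content": f"Task description: {task}"},
            {"role": "user", "content": (
                "Score each candidate trajectory from 0 to 1, relative to the "
                "others, for how well it accomplishes the task. Reply with a "
                "JSON list of floats only.\n\n" + numbered)},
        ],
    )
    # In a real setup you would want more robust parsing than this.
    return json.loads(resp.choices[0].message.content)

# rewards = relative_rewards("Book the cheapest flight.", [traj_a, traj_b, traj_c])
```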
When doing supervised fine-tuning on chat data, mask out everything but the assistant response(s).
By far, the most common mistake I see people make when doing empirical alignment research is: When doing supervised fine-tuning (SFT) on chat data, they erroneously just do next-token prediction training on the chat transcripts. This is almost always a mistake. Sadly, I see it made during almost every project I supervise.
Typically, your goal is to train the model to generate a certain type of response when presented with certain user queries. You probabl...
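A minimal sketch of what that masking looks like in practice, using the common HuggingFace/PyTorch convention of setting ignored positions' labels to -100 (the tokenizer and chat strings here are just placeholders):

```python
# Sketch: build SFT labels so that only the assistant tokens contribute to the loss.
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model

prompt = "User: How do I sort a list in Python?\nAssistant: "
response = "Use sorted(my_list), or my_list.sort() to sort in place."

prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]

input_ids = torch.tensor(prompt_ids + response_ids)
labels = input_ids.clone()
labels[: len(prompt_ids)] = -100  # -100 positions are ignored by the cross-entropy loss

# The model now only gets gradient signal on the assistant response tokens,
# not on the user query it was conditioned on.
```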
I'm wondering about something. There are these really creepy videos of the early OpenAI voice mode copying people's voices.
https://www.youtube.com/shorts/RbCoIa7eXQE
I wonder if they're a result of OpenAI failing to do this loss-masking with their voice models, and then messing up turn-tokenization somehow.
If you do enough training without masking the user tokens, you'd expect to get a model that's as good at simulating users as it is at being a helpful assistant.
We've played "Pokemon or Tech Startup" for a couple years now. I think there's absolutely potential for a new game, "Fantasy Magic Advice" or "LLM Tips and Tricks." My execution is currently poor- I think the key difference that makes it easy to distinguish the two categories is tone, not content, and using a Djinn to tone match would Not Be In the Spirit of It. (I have freely randomized LLM vs Djinn)
Absolutely do not ask it for pictures of kids you never had!
My son is currently calling chatgpt his friend. His friend is confirming everything and has ...
Obviously the incident when OpenAI's voice mode started answering users in their own voices needs to be included; I don't know how I forgot it. That was the point where I explicitly took up the heuristic that if ancient folk wisdom says the Fae do X, the odds of LLMs doing X are not negligible.
so I saw this post about what AI safety is doing wrong (they basically claim mental health advice should be treated as similarly critical to CBRN). I disagree with some of the mud slinging, but it's quite understandable given the stakes.
someone else I saw said this, so the sentiment isn't universal.
idk, just thought someone should post it. React "typo" if you think I should include titles for the links; I currently lean towards anti-clickbait though. Edit: done
Typo react from me. I think you should call your links something informative. If you think the title of the post is clickbait, maybe re-title it something better?
Now I have to click to find out what the link is even about, which is also clickbait-y.
I've typed up some math notes on how much MSE loss we should expect for random embeddings, and some other alternative embeddings, for when you have more features than neurons. I don't have a good sense of how legible this is to anyone but me.
Note that neither of these embeddings is optimal. I believe that the optimal embedding for minimising MSE loss is to store the features in almost orthogonal directions, which is similar to ran...
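A rough empirical sketch of the random-embedding case, for anyone who wants to poke at the numbers (the sizes and sparsity below are arbitrary illustrations, not the ones in the notes):

```python
# Sketch: estimate reconstruction MSE when k sparse features are stored in
# n < k dimensions along random unit directions.
import numpy as np

rng = np.random.default_rng(0)
n, k = 64, 512  # neurons, features (illustrative sizes)

W = rng.normal(size=(n, k))
W /= np.linalg.norm(W, axis=0, keepdims=True)  # unit-norm random embedding directions

x = (rng.random((10_000, k)) < 0.01).astype(float)  # sparse binary features
x_hat = x @ W.T @ W                                  # embed, then read back out
print("MSE:", np.mean((x_hat - x) ** 2))
```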
The typical mind fallacy is really useful for learning things about other people, because the things they assume of others often generalize surprisingly well to themselves.
Ah, an avalanche of thoughts that apply to heuristics in general...
The most useful example of this heuristic is when people say things like "everyone is selfish" etc. For example:
...Brent promoted a cynical version of the world, in which humans were inherently bad, selfish, and engaged in constant deception of themselves and others. He taught that people make all their choices f
The Problem with Filtering Under Imperfect Labels: Pretraining filtering assumes you can cleanly separate dangerous from safe content. But with imperfect labels, a sufficiently capable model will still learn dangerous information if it helps predict the “safe” data.[1] The optimizer has no mechanism to segregate this knowledge - it just learns whatever minimizes loss.
What is Gradient Routing? Gradient routing controls where learning happens in neural networks by m...
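A minimal toy sketch of the mechanism (my own illustration, not the paper's code): gradients from flagged batches are allowed to flow into only a designated slice of the hidden units, via a stop-gradient mask on the rest.

```python
# Sketch: route learning from "flagged" data into a quarantined slice of the
# hidden layer by stop-gradient-masking the other hidden units.
import torch
import torch.nn as nn

hidden = 64
routed = slice(0, 16)  # hypothetical "quarantine" subspace for flagged data

class RoutedMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(32, hidden)
        self.fc2 = nn.Linear(hidden, 10)

    def forward(self, x, flagged: bool):
        h = torch.relu(self.fc1(x))
        if flagged:
            # Gradients reach fc1 only through the routed slice; the rest of
            # the hidden state is treated as a constant for this batch.
            mask = torch.zeros(hidden)
            mask[routed] = 1.0
            h = mask * h + (1 - mask) * h.detach()
        return self.fc2(h)

model = RoutedMLP()
loss = model(torch.randn(8, 32), flagged=True).sum()
loss.backward()  # fc1 rows outside the routed slice get no gradient from this batch
```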
good point, thanks lucas
A Sketched Proposal for Interrogatory, Low-Friction Model Cards
My auditor brain was getting annoyed by the current state of model cards. If we proactively adopt better norms about these, it seems like low effort for a moderately good payoff? I am unsure about this; hence, rough draft below.
Problem
Model cards are uneven: selective disclosure, vague provenance, flattering metrics, generic limitations. Regulation (EU AI Act) and risk frameworks (NIST AI RMF) are pushing toward evidence-backed docs, but most “cards” are still self-reported. If ...
Meta: I have been burrowed away in other research but came across these notes and thought I would publish them rather than let them languish. If there are other efforts in this direction, I would be glad to be pointed that way so I can abandon this idea and support someone else's instead.
I am planning a large number of Emergent Misalignment experiments, and am putting my current plan, which is very open to change, out into the void for feedback. Disclosure: I am currently self-funded but plan to apply for grants.
Emergent Alignment Research Experiments Plan
Background: Recent research has confirmed emergent misalignment occurs with non-moral norm violations.
Follow-up Experiments:
Heuristic: distrust any claim that's much memetically fitter than its retraction would be. (Examples: "don't take your vitamins with {food}, because it messes with {nutrient} uptake"; "Minnesota is much more humid than prior years because of global-warming-induced corn sweat"; "sharks are older than trees"; "the Great Wall of China is visible from LEO with the naked eye")
What do you mean by "retraction"? Do you just mean an opposite statement "sharks are older than trees" --> "sharks are not older than trees", or do you mean something more specific?
Assuming you just mean a general contrasting statement, my gut feeling is that (1) this heuristic is true for certain categories of statements but generates wrong intuitions for other categories, and (2) the heuristic works, but rarely for memetic reasons; it's mostly just the signal-to-noise ratio of the subject matter.
Currently I am thinking about counterexamples from statements that rough...
My theory on why AI isn't creative is that it lacks a 'rumination mode'. Ideas can sit and passively connect in our minds for free. This is cool and valuable. LLMs don't have that luxury. Non-linear, non-goal-driven thinking is expensive and not effective yet.
Cross-posted from X
I'm a preference utilitarian, and as far as I can tell there are no real problems with preference utilitarianism (I've heard many criticisms and ime none of them hold up to scrutiny) but I just noticed something concerning. Summary: Desires that aren't active in the current world diminish the weight of the client's other desires, which seems difficult to justify and/or my normalisation method is incomplete.
Background on normalisation: utility functions aren't directly comparable, because the vertical offset and scale of an agent's utility function are mean...
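For concreteness, one standard way to pin down the offset and scale (not necessarily the normalisation method referred to here) is to range-normalise each agent's utility before aggregating:

```latex
% Range normalisation: map agent i's utility onto [0, 1] over the feasible outcomes X,
% removing the arbitrary additive offset and multiplicative scale.
\[
\hat{u}_i(x) = \frac{u_i(x) - \min_{y \in X} u_i(y)}{\max_{y \in X} u_i(y) - \min_{y \in X} u_i(y)}
\]
```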
I notice it becomes increasingly impractical to assess whether a preference had a counterfactual impact on the allocation. For instance, if someone had a preference for there to be no elephants, and we get no elephants, partially because of that but largely because of the food costs, should the person who had that preference receive less food for having already received an absence of elephants?
Doing things you're bad at can be embarrassing. But the best way to get better is usually to practice.
I'm bad at humming. Since my voice changed in my teens, I essentially haven't hummed.
Sometimes social situations call for humming, like referencing a tune. Then I can either hum badly, which can result in "I can't tell what tune that is because you're bad at humming". Or I can not hum. So I rarely hum.
From my perspective, practicing yields an improvement in my skill from "bad" to "slightly less bad". However, an uninformed onlooker would update their estim...