LESSWRONG

All Comments
Avi Parrack's Shortform
JBlack5m20

Just as a minor note (to other readers, mostly) decoherence doesn't really have "number of branches" in any physically real sense. It is an artifact of a person doing the modelling choosing to approximate a system that way. You do address this further down, though. On the whole, great post.

Reply
ACX Seattle
Joey6m10

In case you haven't seen it: https://www.lesswrong.com/events/Xd44vHaMZpcJvkqap/acx-meetups-everywhere-seattle. Hope to meet you there. (If you're on the east side, there's one for Bellevue as well.)

Reply
Generative AI is not causing YCombinator companies to grow more quickly than usual (yet)
lc12m20

Disclaimer: I'm an AI YC founder from a recent batch (S24). I have no access to internal metrics about YC's portfolio, but I keep in touch with some startup founders from my batch, and we trade insights and metrics.

Anecdotally, my sense is that YC has done a relatively poor job of evaluating startups in the AI era. Mostly this comes as a result of not taking AGI seriously, and YC's general philosophy of picking companies with impressive founders rather than evaluating ideas directly. YC has essentially bet as if AGI is a platform like smartphones and model... (read more)

Reply
Vladimir_Nesov's Shortform
ryan_greenblatt14m42

There is a natural sense in which AI progress is exponential: capabilities are increasing at a rate which involves exponentially increasing impact (as measured by e.g. economic value).

Reply
A Cup of Black Coffee
cata21m20

I don't think I understand the gist of this essay. It sounds like you want to claim that it didn't make someone "knowledgeable" to read (and retain?) the contents of books like that. Why not? It sounds knowledgeable to me.

Reply
AI agents and painted facades
t14n28m10

I think the main challenge to monitoring agents is one of volume/scale, re:

Agents are much cheaper and faster to run than humans, so the amount of information and interactions that humans need to oversee will drastically increase.


But 1, 2, 4 are issues humans already face with managing other humans.

  • A decent chunk of people are "RL'd" into doing the bare minimum and playing organizational politics to optimize their effort:reward ratio.
  • Many employees are treated as fungible and are given limited context into their entire org. Even in a fully transparent org,
... (read more)
Reply
Anthropic's leading researchers acted as moderate accelerationists
Remmelt29m76

Thanks for sharing openly. I want to respect your choice here as moderator.

Given that you think this was not obvious, could you maybe take another moment to consider?

This seems a topic that is actually important to discuss. I have tried to focus as much as possible on arguing based on background information.

Reply
⿻ Plurality & 6pack.care
cousin_it33m60

I remember you from the Pugs days. Two questions about this presentation. One is more aspirational: do you think of this society of AIs as more egalitarian (many superhuman AIs at roughly the same level) or more hierarchical (a range of AI sizes, with the largest hopefully being the most aligned to those below)? And the other is more practical. Right now the AI market is locked in an arms race kind of situation, and in particular, scrambling to make AIs that will bring commercial profit. That can lead to nasty incentives, e.g. an AI working for a tax softw... (read more)

Reply
Anthropic's leading researchers acted as moderate accelerationists
Raemon34m30

I was unsure about it. The criteria for frontpage are "Timeless" (which I agree this qualifies as) and "not inside baseball-y" (often with vaguely political undertones), which seemed less obvious. My decision at the time was "strong upvote but personal blog", but I think it's not obvious, and another LW mod might have called it differently. I agree it's a bunch of good information to have in one place.

Reply1
Anthropic's leading researchers acted as moderate accelerationists
Remmelt34m10

Yes, agreed. Maybe I should add a footnote on this.

Reply
Generative AI is not causing YCombinator companies to grow more quickly than usual (yet)
Kabir Kumar37m10

I'd be very interested in whether this is due to the US economy as a whole being worse now than in 2009. Could we compare with the growth rate of AI companies in countries with better economies?

Reply
Anthropic's leading researchers acted as moderate accelerationists
Buck38m51

Despite the shift, 80,000 Hours continues to recommend talented engineers to join Anthropic.

 

FWIW, it looks to me like they restrict their linked roles to things that are vaguely related to safety or alignment. (I think that the 80,000 Hours job board does include some roles that are mostly only good via the route of making Anthropic more powerful, e.g. the alignment fine-tuning role.)

Reply1
Anthropic's leading researchers acted as moderate accelerationists
Remmelt42m76

Wait, why did this get moved to personal blog?

Just surprised because this is actually a long essay I tried to carefully argue through. And the topic is something we can be rational about.

Reply
My AI Predictions for 2027
Radford Neal1h10

OK, I think I more clearly see what you're saying. The hidden unit values in a feedforward block of the transformer at a previous time aren't directly available at the current time - only the inputs of that feedforward block can be seen. But the hidden unit values are deterministic functions of the inputs, so no information is lost. If these feedforward blocks were very deep, with many layers of hidden units, then keeping those hidden unit values directly available at later times might be important. But actually these feedforward blocks are not deep (even though the full network with many such blocks is deep), so it may not be a big issue - the computations can be redundantly replicated if it helps.
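A minimal numpy sketch of that point, with toy sizes and random weights (nothing here is meant to match a real model): the hidden activations of a feedforward block are a deterministic function of the block's input, so whatever was computed at an earlier position can be recomputed later from that input alone.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_hidden = 8, 32                # toy sizes; real blocks are wider, e.g. 4 * d_model
W_in = rng.standard_normal((d_hidden, d_model))
W_out = rng.standard_normal((d_model, d_hidden))

def ffn(x):
    """One feedforward block: expand to the hidden layer, apply a nonlinearity, project back."""
    hidden = np.maximum(W_in @ x, 0.0)   # hidden activations are never stored between positions
    return W_out @ hidden, hidden

x_prev = rng.standard_normal(d_model)    # the block's input at some earlier token position

_, hidden_then = ffn(x_prev)             # what the block computed at that earlier position
_, hidden_now = ffn(x_prev)              # recomputed later from the same (stored) input

assert np.allclose(hidden_then, hidden_now)   # deterministic, so nothing is irrecoverably lost
```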

Reply
AI agents and painted facades
t14n1h10

I interpreted loop as referring to an "OODA loop" where managers observe those they are managing, delegate an action, and then wait for feedback before going back to the beginning of the loop.

e.g. nowadays, I delegate a decent chunk of implementation work to coding agents and the "loop" is me giving them a task, letting them riff on it for a few minutes, and then reviewing output before requesting changes or committing them in.

Reply
Help me understand: how do multiverse acausal trades work?
JBlack1h20

Acausal trades almost certainly don't work.

There are more possible agents than atom-femtoseconds in the universe (to put it mildly), so if you devote even one femtosecond of one atom to modelling the desires of any given acausal agent then you are massively over-representing that agent.

The best that is possible is some sort of averaged distribution, and even then it's only worth modelling agents capable of conducting acausal trade with you - but not you in particular. Just you in the sense of an enormously broad reference class in which you might be placed... (read more)

Reply
Open Global Investment as a Governance Model for AGI
Wei Dai1h40

I think this 2023 comment is the earliest instance of me talking about turning down investing in Anthropic due to x-risk. If you're wondering why I didn't talk about it even earlier, it's because I formed my impression of Dario Amodei's safety views from a private Google Doc of his (The Big Blob of Compute, which he has subsequently talked about in various public interviews), and it seemed like bad etiquette to then discuss those views in public. By 2023 I felt like it was ok to talk about since the document had become a historical curiosity and there was ... (read more)

Reply
Lessons I've Learned from Self-Teaching
Eli Tyre1h20

I averaged 19 minutes of review a day (although I really think review tended to take longer)

Oh yeah, I track my time with toggl and use anki. Anki's time numbers always undercount my own measurements.

I think this is because anki is only counting how long you spend looking at the front of the card. Time spent looking at the back of a card is presumably after you've already answered the question, and so doesn't count.

Reply
Cole Wyeth's Shortform
ACCount2h10

Are you looking for utility in all the wrong places?

Recent news has quite a few mentions of: AI tanking the job prospects of fresh grads across multiple fields and, at the same time, AI causing a job market bloodbath in the usual outsourcing capitals of the world.

That sure lines up with known AI capabilities.

AI isn't at the point of "radical transformation of everything" yet, clearly. You can't replace a badass crew of 10x developers who can build the next big startup with AIs today. AI doesn't unlock all that many "things that were impossible before" eit... (read more)

Reply
Banning Said Achmiz (and broader thoughts on moderation)
M. Y. Zuo2h10

The motivation, after the double edit, is clearly to express surprise after connecting the dots and to enumerate it…

I wrote it in the most straightforward and direct manner possible? 

After re-reading it twice, I get that it clearly implicates you too, so I get why you may be upset.

But even if it might have been better worded given more time… by definition all commentators under a post at least potentially voted. So I don’t see how the implication could have been avoided entirely while still getting the gist across.

Reply
kavya's Shortform
Saul Munn2h10

1 and 3 are not the kind of work I had in mind when writing this take.

what kind of work did you have in mind when writing this take?

what got you from Level 1 to Level 2 won’t be the same thing as what gets you to Level 3

what do you mean by Levels 1, 2, or 3? i have no idea what this is in reference to.

Reply
Avi Parrack's Shortform
Avi Parrack2h32

Suppose Everett is right: no collapse, just branching under decoherence. Here’s a thought experiment.

At time t, Box A contains a rock and Box B contains a human. We open both boxes and let their contents interact freely with the environment—photons scatter, air molecules collide, etc. By time t′, decoherence has done its work.

Rock in Box A.
A rock is a highly stable, decohered object. Its pointer states (position, bulk properties) are very robust. When photons, air molecules, etc. interact with it, the redundant environmental record overwhelmi... (read more)

Reply
Cole Wyeth's Shortform
Cole Wyeth2h82

The part where you have to build bespoke harnesses seems suspicious to me.

What if, you know, something about how the job needs to be done changes?

Reply
My AI Predictions for 2027
the gears to ascension2h20

it's all there for layer n+1's attention to process, though. at each new token position added to the end, we get to use the most recent token as the marginal new computation result produced by the previous token position's forward pass. for a token position t, for each layer n, n cannot read the output of layer n at earlier token i<t, but n+1 can read everything that happened anywhere in the past, and that gathering process is used to refine the meaning of the current token into a new vector. so, you can't have hidden state build up in the same way, and... (read more)

Reply
Cole Wyeth's Shortform
romeostevensit2h20

My impression is that so far the kinds of people whose work could be automated aren't the kind to navigate the complexities of building bespoke harnesses to have llms do useful work. So we have the much slower process of people manually automating others.

Reply
Looking back on my alignment PhD
Eli Tyre3h20

A tangent:

Is the reason we don't wirehead because evolution instilled us with an aversion to manipulating our reward function, which then zero-shot generalized to wireheading, despite wireheading being so wildly dissimilar to the contents of the ancestral environment?

I don't think so? My guess is that this is mostly about culture and cultural conditioning. Most neuroscientists don't wirehead themselves because that's so far from the kind of normal thing that's normally done. They just don't see wireheading as "winning the game." 

But I can totally believe that many of them would if they were part of a culture, even a small subculture, that does think of it that way.

Reply
My AI Predictions for 2027
talelore4h10

I'll admit I am not confident about the nitty-gritty details of how LLMs work. My two core points (that LLMs are too wide vs. deep, and that LLMs are not recurrent and process in fixed layers) don't hinge on the "working memory" problems LLMs have. But I still think that seems to be true, based on my understanding. For LLMs, compute is separate from data, so the neural networks have to be recomputed each run, with the new token added. Some of their inputs may be cached, but that's just a performance optimization.

Imagine an LLM is processing some text. At l... (read more)

Reply1
Legal Personhood - The Fifth Amendment (Part 2)
Dagon4h20

Does one of your posts summarize your proposal/prediction of how digital minds will be treated by US courts?  Understanding whether you're implying that they're a natural person who might have some duties and rights of an embodied resident of a jurisdiction, or whether they're a specific existing (or a new category, which I'd be interested in how this gets defined and agreed) kind of fictional person, would go a long way to helping me frame these interesting, but not-necessarily-relevant details.

Reply
Stephen McAleese's Shortform
Stephen McAleese4h40

I haven't heard anything about RULER on LessWrong yet:

RULER (Relative Universal LLM-Elicited Rewards) eliminates the need for hand-crafted reward functions by using an LLM-as-judge to automatically score agent trajectories. Simply define your task in the system prompt, and RULER handles the rest—no labeled data, expert feedback, or reward engineering required.

✨ Key Benefits:

  • 2-3x faster development - Skip reward function engineering entirely
  • General-purpose - Works across any task without modification
  • Strong performance - Matches or exceeds hand-crafted rewar
... (read more)
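For readers who haven't seen the LLM-as-judge pattern before, a rough, hypothetical sketch of the general idea is below. This is not RULER's actual API (which I haven't reproduced here), and call_llm is a stand-in for whatever model client you use.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM client call."""
    raise NotImplementedError

def score_trajectories(task: str, trajectories: list[str]) -> list[float]:
    """Ask a judge model to score each agent trajectory relative to the others."""
    prompt = (
        f"Task: {task}\n\n"
        + "\n\n".join(f"Trajectory {i}:\n{t}" for i, t in enumerate(trajectories))
        + "\n\nReturn a JSON list of scores in [0, 1], one per trajectory, judging "
          "how well each one accomplishes the task relative to the others."
    )
    scores = json.loads(call_llm(prompt))
    assert len(scores) == len(trajectories)
    return scores   # used as rewards in place of a hand-crafted reward function
```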
Reply
Should we align AI with maternal instinct?
P.4h10

To the extent that maternal instincts are some actual small concrete set of things, you are probably making two somewhat opposite mistakes here: Imagining something that doesn't truly run on maternal instinct, and assuming that mothers actually care about their babies (for a certain definition of "care").

You say that mothers aren't actually "endlessly selfless, forever attuned to every cry, governed by an unshakable instinct to nurture", that there are "identities beyond 'mum' to be kept alive" and that there are nights that instinct disappears. But that's... (read more)

Reply
kavya's Shortform
kavya4h10

1 and 3 are not the kind of work I had in mind when writing this take. I see your second point, but I’d want to counter with the fact that what got you from Level 1 to Level 2 won’t be the same thing as what gets you to Level 3 (this is the natural cost of scale). You may outgrow some initial users, but this can be compensated by a low overall churn. Most won’t leave unless your core offering has drastically pivoted. 

Reply
Vladimir_Nesov's Shortform
Vladimir_Nesov5h3710

It seems more accurate to say that AI progress is linear rather than exponential, as a result of being logarithmic in resources that are in turn exponentially increasing with time. (This is not quantitative, any more than the "exponential progress" I'm disagreeing with[1].)

Logarithmic return on resources means strongly diminishing returns, but that's not actual plateauing, and the linear progress in time is only slowing down according to how the exponential growth of resources is slowing down. Moore's law in the price-performance form held for a really lon... (read more)
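Writing the claim out (α, β > 0 are unspecified constants and R(t) is the resources deployed at time t; this is just the shape of the argument, not a fit):

$$\text{progress}(t) \approx \alpha \log R(t), \qquad R(t) = R_0 e^{\beta t} \;\Rightarrow\; \text{progress}(t) \approx \alpha \log R_0 + \alpha \beta\, t.$$

Logarithmic returns on exponentially growing resources come out linear in time, and the linear trend only bends when the growth of R(t) itself bends.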

Reply1
My AI Predictions for 2027
p.b.5h30

The function of the feedforward components in transformers is mostly to store knowledge and to enrich the token vectors with that knowledge. The wider you make the ff-network the more knowledge you can store. The network is trained to put the relevant knowledge from the wide hidden layer into the output (i.e. into the token stream). 

I fail to see the problem in the fact that the hidden activation is not accessible to future tokens. The ff-nn is just a component to store and inject knowledge. It is wide because it has to store a lot of knowledge, not b... (read more)

Reply
My AI Predictions for 2027
StanislavKrym5h30

Unfortunately, it's hard to predict it. I did describe how Grok 4[1] and GPT-5 are arguably evidence that the accelerated doubling trend between GPT4o and o3 is replaced by something slower. As far as I understand, were the slower trend to repeat METR's original law (GPT2-GPT4?[2]), we would obtain the 2030s.

But, as you remark, "we should have some credence on new breakthroughs<...> that would lead to superhuman coders within a year or two, after being appropriately scaled up and tinkered with." The actual probability of the breakthrough is like... (read more)

Reply
Sam Marks's Shortform
williawa5h30

I'm wondering. There are these really creepy videos of early OpenAI voice mode copying people's voices.

https://www.youtube.com/shorts/RbCoIa7eXQE

I wonder if they're a result of OpenAI failing to do this loss-masking with their voice models, and then messing up turn-tokenization somehow.

If you do enough training without masking the user tokens, you'd expect to get a model that's as good at simulating users as at being a helpful assistant.

Reply
In plain English - in what ways are Bayes' Rule and Popperian falsificationism conflicting epistemologies?
Answer by JugglingJaySep 01, 202510

This is a bit of an old post, but I felt I might be able to add to the discussion.  Keep in mind this is my own informal take on a rigorous philosophical topic, and I am by no means a professional.  My bias leans towards critical rationalism (Popperianism), but I'll try to be fair.  

I think you are correct in identifying induction as the fundamental tension between the two epistemologies.  Bayesian epistemology (as distinct from Bayes' Theorem) utilizes Solomonoff induction, whereas Popper is highly critical of inductive and probabilist... (read more)

Reply
My AI Predictions for 2027
talelore5h10

I agree with you that the types of neural networks currently being used at scale are not sufficient for artificial superintelligence (unless perhaps scaled to an absurd level). I am not as confident that businesses won't continue investing in risky experiments. For example, experiments into AI that does not "separate training from their operational mode", or experiments into recurrent architectures, are currently being done.

I definitely don't agree with your claim in the blog post that even if strong AI comes, we will all simply adapt. Your arguments about... (read more)

Reply
My AI Predictions for 2027
StanislavKrym5h20

Talking about 2027, the authors did inform the readers in a footnote, but revisions of the timelines forecast turned out to be hard to deliver to the general public. Let's wait for @Daniel Kokotajlo to state his opinion on the doubts related to SOTA architecture. In my opinion these problems would be resolved by a neuralese architecture or an architecture which could be an even bigger breakthrough (neuralese with big internal memory?) 

Reply
My AI Predictions for 2027
talelore5h10

Hey Daniel, I loved the podcast with Dwarkesh and Scott Alexander. I am glad you have gotten people talking about this, though I'm of two minds about it, because as I say in my post, I believe your estimates in the AI 2027 document are very aggressive (and there were some communication issues there I discussed with Eli in another comment). I worry what might happen in 2028 if basically the entire scenario described on the main page turns out to not happen, which is what I believe.

My blog post is a reaction to the AI 2027 document as it stands, which doesn'... (read more)

Reply11
My AI Predictions for 2027
talelore6h10

I didn't go into as much detail about this in my post as I planned to.

I think relying on chain of thought for coping with the working memory problem isn't a great solution. The chain of thought is linguistic, and thereby linear/shallow compared to "neuralese". A "neuralese" chain of thought (non-linguistic information) would be better, but then we're still relying on an external working memory at every step, which is a problem if the working memory is smaller than the model itself. And potentially an issue even if the working memory is huge, because you'd have to make sure each layer in the LLM has access to what it needs from the working memory etc.

Reply
My AI Predictions for 2027
talelore6h10

I believe I understood Radford Neal's explanation and I understand yours, as best I can tell, and I don't think it so far contradicts my model of how LLMs work.

I am aware that the computation of v_n has access to v_{n−1} of all previous tokens. But v_{n−1} are just the outputs of the feed-forward networks of the previous layer. Imagine a case where the output was 1000 times smaller than the widest part of the feed-forward network. In that case, most of the information in the feed-forward network would be "lost" (unavailable to v... (read more)
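A small numpy illustration of the width mismatch being described, with toy sizes and random weights (purely a sketch): the down-projection from the wide hidden layer to the narrower block output necessarily maps many distinct hidden states to the same output, and only that output is visible to later tokens.

```python
import numpy as np

rng = np.random.default_rng(1)

d_model, d_hidden = 8, 32                        # toy sizes; the hidden layer is wider than the output
W_out = rng.standard_normal((d_model, d_hidden)) # down-projection back to the residual stream

hidden_a = rng.standard_normal(d_hidden)
null_dirs = np.linalg.svd(W_out)[2][d_model:]    # directions the down-projection cannot see
hidden_b = hidden_a + 3.0 * null_dirs[0]         # a genuinely different hidden state

out_a, out_b = W_out @ hidden_a, W_out @ hidden_b
assert np.allclose(out_a, out_b)                 # identical outputs: later tokens cannot tell them apart
```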

Reply
Help me understand: how do multiverse acausal trades work?
jbash6h20

I want to know if there is any validity to this.

Not as far as I've ever been able to discern.

There's also problem 3 (or maybe it's problem 0): the whole thing assumes that you accept that these other universes exist in any way that would make it desirable to trade with them to begin with. Tegmarkianism isn't a given, and satisfying the preferences of something nonexistent, for the "reward" of it creating a nonexistent situation where your own preferences are satisfied, is, um, nonstandard. Even doing something like that with things bidirectionally outsi... (read more)

Reply
My AI Predictions for 2027
talelore6h10

The attention mechanism can move information from the neural network outputs at previous times to the current time

Again, I could be misunderstanding, but it seems like only outputs of the neural networks are being stored and made available here, not the entire neural network state.

This was the purpose of my cancer-curing hypothetical. Any conclusions made by the feed-forward network that don't make it into the output are lost. And the output is narrower than the widest part of the feed-forward network, so some information is "lost"/unavailable to subseq... (read more)

Reply
Should we align AI with maternal instinct?
Mark Keavney6h1-1

I agree. That was my reaction to Hinton's comment as well - that it's good to think in terms of relationship rather than control, but that the "maternal instinct" framing was off. 

At the risk of getting too speculative, this has implications for AI welfare as well. I don't believe that current LLMs have feelings, but if we build AGI it might. And rather than thinking about how to make such an entity a controllable servant, we should start planning how to have a mutually beneficial relationship with it.

Reply
Banning Said Achmiz (and broader thoughts on moderation)
habryka6h20

Please just ask us if you want publicly available but annoying to get information about LW posts! (for example, if you want a past revision of a post that was public at some point)

I've answered requests like that many times over the years and will continue to do that (of course barring some exceptional circumstances like doxxing or people accidentally leaking actually sensitive private data)

Reply
My AI Predictions for 2027
p.b.6h20

What I was pointing to was the fact that the feed forward networks for the new token don't have access to the past feed-forward states of the other tokens [...] When curing cancer the second time, it didn't have access to any of the processing from the first time. Only what previous layers outputted for previous tokens.

That is the misconception. I'll try to explain it in my words (because frankly despite knowing how a transformer works, I can't understand Radford Neal's explanation).

In the GPT architecture each token starts out as an embedding, which is th... (read more)

Reply
Should we align AI with maternal instinct?
StanislavKrym7h10

As far as I understand "aligning the AI to an instinct", and "carefully engineered relational principles", the latter might look like "have the AI solve problems that humans actually cannot solve by themselves AND teach the humans how to solve them so that they or each human taught would increase the set of problems they can solve by themselves". A Friendly AI in the broader sense is just thought to solve humanity's problems (e.g. establish a post-work future, which my proposal doesn't). 

As for aligning the AI to an instinct, instincts are known to be... (read more)

Reply
My AI Predictions for 2027
Daniel Kokotajlo8h83

I think they're way off. I was visualizing myself at Christmastime 2027, sipping eggnog and gloating about how right I was,

Reading further it seems like you are basically just saying "Timelines are longer than 2027." You'll be interested to know that we actually all agree on that. Perhaps you are more confident than us; what are your timelines exactly? Where is your 50% mark for the superhuman coder milestone being reached? (Or if you prefer a different milestone like AGI or ASI, go ahead and say that)

Reply
Generative AI is not causing YCombinator companies to grow more quickly than usual (yet)
Gunnar_Zarncke8h40

and with actual revenue.

“It’s not just the number one or two companies -- the whole batch is growing 10% week on week,”

YC makes all startups in the batch report KPIs, even from before being accepted into the batch. If you participate in their Startup School, you are asked to track and report weekly numbers, such as number of users.

Paul Graham posts unlabeled charts from YC startups every now and then, so I assume the aggregate of all of these is what Garry Tan is referring to. Unfortunately, it is not possible to reproduce his analysis. But we should s... (read more)

Reply1
kavya's Shortform
Saul Munn8h23

i think this is a reasonable proxy for some stuff people generally care about, but definitely faulty as a north star.

some negative examples:

  • gambling, alcohol, anything addictive
  • local optima (e.g. your existing userbase would like your product less if you made X change, but you would reach way more people/reach a different set of people and help them more/etc if you made X change)
  • some products don’t make sense to have repeat customers, e.g. life insurance policies
Reply
Generative AI is not causing YCombinator companies to grow more quickly than usual (yet)
mishka8h32

The difference between GenAI-2025 and GenAI-2023 in terms of their ability to assist software engineering efforts is quite drastic indeed.

Reply
Dating Roundup #7: Back to Basics
Priyanka Bharadwaj8h10

are relationship coaches (not PUA) not a thing in the US? 

Reply
My AI Predictions for 2027
Radford Neal8h20

"...feed forward networks for the new token don't have access to the past feed-forward states of the other tokens..."

This isn't correct. The attention mechanism can move information from the neural network outputs at previous times to the current time, that is then fed into the feedforward network for the current time. The basic transformer mechanism is to alternate cross-time attention computations with within-current-time neural network computations, over many layers. Without access to information from past times, performance would obviously be atrocious... (read more)
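To make the data flow concrete, a toy sketch (uniform "attention", random weights; not a faithful transformer, just the wiring): at every layer, the computation at the current position reads the previous layer's outputs at all positions up to and including its own, and that is what feeds the current position's feedforward step.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_layers, n_tokens = 8, 3, 5                       # toy sizes
W = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_layers)]

def layer(states, W_l):
    """One toy layer: causal 'attention' over previous-layer states, then a per-position feedforward step."""
    out = []
    for t in range(len(states)):
        attended = np.mean(states[: t + 1], axis=0)   # reads layer n-1 outputs at ALL positions <= t
        out.append(np.tanh(W_l @ attended))           # current position's feedforward computation
    return out

states = [rng.standard_normal(d) for _ in range(n_tokens)]   # token embeddings
for W_l in W:                                                # alternate attention and feedforward, layer by layer
    states = layer(states, W_l)
```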

Reply
kavya's Shortform
kavya8h5-5

The aspect of your work to care about the most is replay value. How many times do people keep coming back? Number of replays, rereads, and repeat purchases are proxies for high resonance. On that note, I wish more writing platforms let you see in aggregate how many first-time readers visited again and how spaced out their visits were. If they can still look past the known plot and imperfections in your work, you're on to something. 

Reply
Should we align AI with maternal instinct?
Priyanka Bharadwaj8h10

Wait… isn’t this already filial piety? We created AI, and now we want it to mother us.

Reply
Help me understand: how do multiverse acausal trades work?
MinusGix8h10

A core element is that you expect acausal trade among far more intelligent agents, such as AGI or even ASI, as well as that they'll be using approximations.

Problem 1: There isn't going to be much Darwinian selection pressure against a civilization that can rearrange stars and terraform planets. I'm of the opinion that it has mostly stopped mattering now, and will only matter even less over time. As long as we don't end up in an "everyone has an AI and competes in a race to the bottom" scenario. I don't think it is that odd that an ASI could resist selection pressures. ... (read more)

Reply1
Should we align AI with maternal instinct?
Priyanka Bharadwaj9h10

I don’t mean this as a technical solution, more a direction to start thinking in.

Imagine a human tells an AI, “I value honesty above convenience.” A relational AI could store this as a core value, consult it when short-term preferences tempt it to mislead, and, if it fails, detect, acknowledge, and repair the violation in a verifiable way. Over time it updates its prioritisation rules and adapts to clarified guidance, preserving trust and alignment, unlike a FAI that maximises a static utility function.

This approach is dynamic, process-oriented, and repair... (read more)

Reply
Should we align AI with maternal instinct?
aphyer9h30

In some other world somewhere, the foremost Confucian scholars are debating how to endow their AI with filial piety.

Reply
Should we align AI with maternal instinct?
Hastings9h50

I suspect you are right; however, to play devil's advocate: in my opinion, the closest example we have of anyone stably aligning a superintelligent creature is housecats aligning humans, and co-opting maternal instinct is a large part of how they did it.

Reply
Help me understand: how do multiverse acausal trades work?
the gears to ascension9h20

It seems easier to imagine trading across Everett branches, assuming one thinks they exist at all. They come from a similar starting point but can end up very different. That reduces the severity of problem 2.

Reply
My AI Predictions for 2027
Tiago Chamba9h10

LLMs, on the other hand, are feed-forward networks. Once an LLM decides on a path, it's committed. It can't go back to the previous layer. We run the entire model once to generate a token. Then, when it outputs a token, that token is locked in, and the whole model runs again to generate the subsequent token, with its intermediate states ("working memory") completely wiped. This is not a good architecture for deep thinking.

It might be the case that LLMs develop different cognitive strategies to cope with this, such as storing the working memory on the CoT t... (read more)

Reply
35 Thoughts About AGI and 1 About GPT-5
Noosphere899h20

A take I haven't seen yet is that scaling our way to AI that can automate away jobs might fail for fundamentally prosaic reasons, and that new paradigms might be needed not because of fundamental AI failures, but because scaling compute starts slowing down when we can't convert general chips into AI chips.

This doesn't mean the strongest versions of the scaling hypothesis were right, but I do want to point out that fundamental changes in paradigm can happen for prosaic reasons, and I expect a lot of people to underestimate how much progress was made in the AI summer, even if it isn't the case that imitative learning scales to AGI with realistic compute and data.

Reply
Hastings's Shortform
Hastings9h20

Obviously the incident when OpenAI's voice mode started answering users in their own voices needs to be included - I don't know how I forgot it. That was the point where I explicitly took up the heuristic that if ancient folk wisdom says the Fae do X, the odds of LLMs doing X are not negligible.

Reply
Generative AI is not causing YCombinator companies to grow more quickly than usual (yet)
AnthonyC9h100

I feel like this underestimates the difference between what you're citing, valuation growth over 2 years post-YC, and the Garry Tan quote, which was about weekly growth during the 3 month YC program. I also wish the original Garry Tan claim were more specific about the metric being used for that weekly growth statistic. In principle, these aren't necessarily mutually exclusive claims. In practice, I'd expect there's some fudging going on.

I can imagine something like the following: Companies grow faster with less investment, reaching more revenue sooner bec... (read more)

Reply
asher's Shortform
the gears to ascension9h40

Thing I currently believe about what the core interface failure is, possibly just for me:

[comment link]
Burnout is not a result of working a lot, it's a result of work not feeling like it pays out in ape-enjoyableness[citation needed]. So they very well could be having a grand ol time working a lot if their attitude towards intended amount of success matches up comfortably with actual success and they find this to pay out in a felt currency which is directly satisfying. I get burned out when effort => results => natural rewards gets broken, eg beca

... (read more)
Reply
Dating Roundup #7: Back to Basics
johnswentworth9h*50

So, the "flirting escalation ladder". A few months ago I was skeptical that it even existed, as I had basically never seen it actually play out. Then half the internet showed up to yell "John that is definitely a thing!", and since then I've been more actively looking for it.

And so far my experience when actively looking for it has been "women do not respond to mild escalation with mild escalation, even when in hindsight they were clearly interested". Where the signals of "clear interest" here include things like e.g. talked for two hours at a party (and c... (read more)

Reply
Support the movement against extinction risk due to AI
the gears to ascension9h42

Correct me if I misread, but if I understand correctly, these are incredibly bad ideas which would backfire spectacularly, aren't they?

Reply
My AI Predictions for 2027
piedrastrae10h2-6

Glad to see some common sense/transparency about uncertainty. It seems to me that AGI/ASI is basically a black swan event — by definition unpredictable. Trying to predict it is a fool's errand; it makes more sense to manage its possibility instead.

It's particularly depressing when people who pride themselves on being rationalists basically ground their reasoning on "line has been going up, therefore it will keep going up", as if Moore's law's mere existence means it extends to any and all technology-related lines in existence[1]. It's even more depress... (read more)

Reply
Cole Wyeth's Shortform
Cole Wyeth10h*200

Where is the hard evidence that LLMs are useful?

Has anyone seen convincing evidence of AI driving developer productivity or economic growth?

It seems I am only reading negative results about studies on applications.

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

https://www.lesswrong.com/posts/25JGNnT9Kg4aN5N5s/metr-research-update-algorithmic-vs-holistic-evaluation

And in terms of startup growth: 

https://www.lesswrong.com/posts/hxYiwSqmvxzCXuqty/generative-ai-is-not-causing-ycombinator-companies-to-grow

apparently wider economic ... (read more)

Reply
Should we align AI with maternal instinct?
AnthonyC10h51

Completely agreed - suggesting that this is a solution was a failure to think the next thought.

Nevertheless, if we had any idea how to actually, successfully do what Hinton suggested, even if we really wanted to? I'd feel a lot better about our prospects than I do right now.

Reply
Should we align AI with maternal instinct?
dr_s10h20

I mean, master/servant is a relation. I think if you managed to enforce it rigorously, the biggest risk from it would be humans "playing themselves" - just as we've done until now, only with far greater power. For example basically falling into wireheading out of pursuit of enjoyment, etc.

I believe carefully engineered relational principles offer a more robust and sustainable path

Can you sketch a broad example of what such a thing would look like? How does it differ, for example, from the classic image of a Friendly AI (FAI)?

Reply
Should we align AI with maternal instinct?
Priyanka Bharadwaj10h10

I’m reminded of a Sanskrit verse “Vidya dadati vinayam, vinayodyati patratam” which translates to intelligence gives power, but humility gives guidance. Applied to AI, intelligence alone doesn’t ensure alignment, just as humans aren’t automatically prosocial. What matters are the high-level principles we embed to guide behaviour toward repairable, cooperative, and trustable interactions, which we do see in long-term relationships built on shared values.

The architecture-level challenge of making AI reliably follow such principles is hard, yes, especially un... (read more)

Reply
My AI Predictions for 2027
james oofou10h10

With pre-RLVR models we went from a 36 second 50% time horizon to a 29 minute horizon.

Between GPT-4 and Claude-3.5 Sonnet (new) we went from 5 minutes to 29 minutes.

I've looked carefully at the graph, but I saw no signs of a plateau nor even a slowdown. 

I'll do some calculation to ensure I'm not missing anything obvious or deceiving myself.  

I don't see any sign of a plateau here. Things were a little behind-trend right after GPT-4, but of course there will be short behind-trend periods just as there will be short above-trend periods, even assuming t... (read more)
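For concreteness, a back-of-the-envelope version of that calculation, using the 5-minute and 29-minute horizons quoted above and approximate public release dates (the dates are my assumption, not taken from METR's data):

```python
from datetime import date
from math import log2

h_gpt4, h_sonnet = 5, 29                 # rough 50% time horizons in minutes, as quoted above

d_gpt4 = date(2023, 3, 14)               # approximate GPT-4 release date (assumed)
d_sonnet = date(2024, 10, 22)            # approximate Claude 3.5 Sonnet (new) release date (assumed)

months = (d_sonnet - d_gpt4).days / 30.44
doublings = log2(h_sonnet / h_gpt4)
print(f"{doublings:.2f} doublings over {months:.1f} months "
      f"=> ~{months / doublings:.1f} months per doubling")
# ~2.5 doublings over ~19 months, i.e. a doubling time of roughly 7-8 months in this
# interval -- consistent with the overall trend rather than a plateau
```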

Reply
Female sexual attractiveness seems more egalitarian than people acknowledge
lc10h20

Certainly true; by "genetically gifted" I meant more that her face or body was granted some essential goodness that 99.99% of women lack.

Reply
My AI Predictions for 2027
talelore11h21

I explained in my post that I believe the benchmarks are mainly measuring shallow thinking. The benchmarks include things like completing a single word of code or solving arithmetic problems. These unambiguously fall within what I described as shallow thinking. They measure existing judgement/knowledge, not the ability to form new insights.

Deep thinking has not progressed hyper-exponentially. LLMs are essentially shrimp-level when it comes to deep thinking, in my opinion. LLMs still make extremely basic mistakes that a human 5-year-old would never make. Th... (read more)

Reply
My AI Predictions for 2027
StanislavKrym11h10

Look at the METR graph more carefully. The Claudes which METR evaluated were released during the age which I called the GPT4o-o3 accelerated trend (except for Claude 3 Opus, but it wasn't SOTA even in comparison with the GPT4-GPT4o trend).  

Reply
My AI Predictions for 2027
talelore11h10

I am predicting a world that looks fantastically different from the world predicted by AI 2027. It's the difference between apocalypse and things basically being the same as they are now. The difference between the two is clear.

I agree that having internal representations that can be modified while reasoning is something that enables deep thinking, and I think this is something LLMs are bad at. Because of the wideness/depth issue and the lack of recurrence.

I only have a lay understanding of how LLMs work, so forgive me if I'm wrong about the specifics. It ... (read more)

Reply
Should we align AI with maternal instinct?
dr_s11h20

I’d love the chance to brainstorm and refine these ideas, to explore how we might engineer architectures that are simple yet robust, capable of sustaining trust, repair, and cooperation without introducing subjugation or dependency.

I think what you're describing here sounds more like a higher level problem - "given a population of agents in two groups A and H, where H are Humans and A are vastly more powerful AIs, which policy should agents in A adopt that even when universalised produces a stable and prosperous equilibrium?". That's definitely part of ... (read more)

Reply
My AI Predictions for 2027
james oofou11h10

I don't think there was a plateau. Is there a reason you're ignoring Claude models?

Greenblatt's predictions don't seem pertinent. 

Reply
Banning Said Achmiz (and broader thoughts on moderation)
SpectrumDT11h10

May I ask what your motivation was when you wrote and published this post of yours? 

Were you trying to learn something? Or were you trying to teach me something? Or were you just responding to the knee-jerk impulse to win a fight online?

My post above was an attempt to teach you something. I hope that this wording does not come off as condescending; it is not meant as such. I am here on LessWrong primarily to learn. As such, I appreciate it when someone genuinely tries to teach me something. I hope that you will take it in the same spirit.

I think your ... (read more)

Reply
My AI Predictions for 2027
StanislavKrym11h20

Except that Grok 4 and GPT-5 arguably already didn't adhere to the faster doubling time. And I say "arguably" because of Grok failing some primitive tasks and Greenblatt's pre-release prediction of GPT-5's time horizon. While METR technically didn't confirm the prediction, METR itself acknowledged that it ran into problems when trying to calculate GPT-5's time horizon. 

Another thing to consider is that Grok 4's SOTA performance was achieved by using similar amounts of compute for pretraining and RL.  What is Musk going to do to ensure that Grok 5... (read more)

Reply
On "ChatGPT Psychosis" and LLM Sycophancy
Katalina Hernandez12h30

I think it's worth adding the Raine case to the timeline: a 16-year-old boy who committed suicide after months of using 4o to discuss his mental health. Ultimately, the conversations became so long and convoluted that 4o ended up outright discouraging the boy from letting his mum find out what he was planning, advising on how to dull his survival instincts using alcohol, and asking (in one of those annoying "would you also like me to..." end lines) if the boy wanted it to produce a suicide note for his parents.[1]

For those interested, this article by The G... (read more)

Reply
Should we align AI with maternal instinct?
Priyanka Bharadwaj12h10

I completely agree that AI isn’t human, mammal, or biological, and that any relational qualities it exhibits will only exist because we engineer them. I’m not suggesting we model AI on any specific relationship, like mother-child, or try to mimic familiar social roles. Rather, alignment should be based on abstract relational principles that matter for any human interaction without hierarchy or coercion.

I also hear the frequent concern about technical feasibility, and I take it seriously. I see it as an opportunity rather than a reason to avoid this work. I... (read more)

Reply
AI Induced Psychosis: A shallow investigation
Logan Zoellner12h2-6

Most Americans use ChatGPT. If AI were causing psychosis (and the phenomenon wasn't just already psychotic people using ChatGPT), it would be showing up in statistics, not anecdotes. SA concludes that the prevalence is ~1/100k people. This would make LLMs 10x safer than cars. If your concern was saving lives, you should be focusing on accelerating AI (self-driving), not worrying about AI psychosis.

Reply
My AI Predictions for 2027
Rafael Harth12h31

I mean of course it's true today, right? It would be weird to make a prediction "AI can't do XX in the future" (and that's most of the predictions here) if that isn't true today.

Reply
A Timing Problem for Instrumental Convergence
Fabien Roger12h20

I'd note that I find quite strange all versions of non-self-preserving terminal goals that I know how to formalize. For example maximizing E[sum_t V_t(s_t)] does not result in self-preservation, but instead it results in AIs that would like to self-modify immediately to have very easy to achieve goals (if that was possible). I believe people have also tried and so far failed to come up with satisfying formalisms describing AIs that are indifferent to having their goals be modified / to being shut down.
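For readers, the two objectives being contrasted in this thread, written out (V_t is the value function the agent holds at time t, t_0 is the present, s_t is the state):

$$\max\;\mathbb{E}\Big[\textstyle\sum_t V_t(s_t)\Big] \;\;\text{(non-self-preserving)} \qquad \text{vs.} \qquad \max\;\mathbb{E}\Big[\textstyle\sum_t V_{t_0}(s_t)\Big] \;\;\text{(self-preserving)}$$

Under the first, an agent that swaps in trivially satisfiable values scores highly by its own future lights; under the second, future states are always scored by the agent's current values, which is what makes goal modification or shutdown costly to it.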

Reply
the gears to ascenscion's Shortform
Linda Linsefors12h20

Typo react from me. I think you should call your links something informative. If you think the title of the post is clickbait, you could maybe re-title it something better?

Now I have to click to find out what the link is even about, which is also click-bait-y.

Reply1
A Timing Problem for Instrumental Convergence
Fabien Roger12h20

I think we mostly agree then!

To make sure I understand your stance:

  • You agree that some sorts of terminal goals (like Gandhi's or the egoist's) imply you should protect them (e.g. a preference to maximize E[sum_t V_{t_0}(s_t)])
  • You agree that it's plausible AIs might have this sort of self-preserving terminal goals and that these goals may be misaligned with human values, and that the arguments for instrumental self-preservation do apply to those AIs
  • You think that the strength of arguments for instrumental self-preservation is overrated because of the possibili
... (read more)
Reply
Linda Linsefors's Shortform
Linda Linsefors12hΩ120

Estimated MSE loss for three different ways of embedding features into neurons, when there are more possible features than neurons.

I've typed up some math notes for how much MSE loss we should expect for random embeddings, and some other alternative embeddings, for when you have more features than neurons. I don't have a good sense for how legible this is to anyone but me.

Note that neither of these embeddings is optimal. I believe that the optimal embedding for minimising MSE loss is to store the features in almost orthogonal directions, which is similar to ran... (read more)
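As a companion to the notes, a quick empirical sketch of the random-embedding case (toy sizes and feature sparsity assumed here; the notes themselves are the real reference):

```python
import numpy as np

rng = np.random.default_rng(3)
n_features, n_neurons, n_samples = 200, 50, 2000
p_active = 0.02                                   # assumed feature sparsity

# Random unit-norm embedding directions, one per feature (nearly orthogonal in high dimension)
E = rng.standard_normal((n_features, n_neurons))
E /= np.linalg.norm(E, axis=1, keepdims=True)

x = (rng.random((n_samples, n_features)) < p_active).astype(float)   # sparse feature vectors
neurons = x @ E                                   # embed: features -> neurons
x_hat = neurons @ E.T                             # decode: neurons -> features
mse = np.mean((x - x_hat) ** 2)                   # interference between non-orthogonal features
print(f"Empirical MSE for random embeddings: {mse:.4f}")
```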

Reply
Wikipedia, but written by AIs
CstineSublime12h10

As an experiment I like it. The difficult and nitty-gritty part I see is getting consistency across all the articles in the first iteration. Even setting aside the risk of it tailoring its articles to any specific user, pandering to their particular profile, the output will be beholden to whatever meta-prompt is being used.

And I don't know enough about the quality of training data to know if it is possible to get such consistency out of it: as in, consistent editorial guidelines.

As someone with no knowledge of how LLMs work beyond some vague stuff about "tokens" "Mu... (read more)

Reply
Should we align AI with maternal instinct?
dr_s12h51

"If you want a picture of the future, imagine a chancla stomping on a human face - forever."

I sometimes wonder if the real question isn't whether AI will one day betray us, but whether we will have taught it, and ourselves, how to repair when it does.

This I think however misses the point and becomes a bit of a platitude. Yes, it's true that the interaction with AI is relational, but the thing that IMO purely humanistic perspectives really miss is that this is a relation in which half of the relation isn't human. Not human, not mammal, not even biologic... (read more)

Reply
[Anthropic] A hacker used Claude Code to automate ransomware
dr_s13h20

Okay. I agree some people genuinely want to mass murder the other side just to get slightly more resources. I just want more data that this would actually be a majority.

Why do you put the onus on proving that there is one rule about it being a majority? We know it happens. It's hard to say for stuff like the Nazis because technically the people only voted for some guy who was certainly very gung-ho about militarism and about the need for Germany to expand, but then the matter was basically taken out of their hands, and it was at best a plurality to begi... (read more)

Reply
Help me understand: how do multiverse acausal trades work?
Trevor Hill-Hand13h10

There's no Darwinian selective pressure to favor agents who engage in acausal trades.

I think I would make this more specific: there's no external pressure from that other universe, sort of by definition. So for acausal trade to still work you're left only with internal pressure.

The question becomes, "Do one's own thoughts provide this pressure in a usefully predictable way?"

Presumably it would have to happen necessarily, or be optimized away. Perhaps as a natural side effect of having intelligence at all, for example. Which I think would be similar in argument to, "Do natural categories exist?"

Reply
[Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA
One13h10

Didn't read it all yet but I wanted to ask before I forgot: you include dopamine but not serotonin, and I think serotonin is as important as, if not more important than, dopamine, though they have synergy with each other so it's futile to compare them. Maybe it could be useful to add.

Also, a small nitpick: in 15.2.1.2 you use 1. 2. first in the numbering but then you use A. B. in the body paragraphs.

Reply
The Best Resources To Build Any Intuition
Algon14h20

Thinking physics is a fantastic book. I agree it teaches you a lot of core physics intuitions, like looking for conserved quantities and symmetries. I'm curious to hear what particular intuitions you got from it. It's fine if it isn't an exhaustive list. I just want some more concrete stuff to put in this entry, so it's clearer what kind of intuitions you come away with after reading this book.

Reply
RA x ControlAI video: What if AI just keeps getting smarter?
Writer14h40

This is very late, but I want to acknowledge that the discussion about the UAT in this thread seems broadly correct to me, although the script's main author disagreed when I last pinged him about this in May. And yeah, it was an honest mistake. Internally, we try quite hard to make everything true and not misleading, and the scripts and storyboards go through multiple rounds of feedback. We absolutely do not want to be deceptive. 

Reply1