ACCount's comments, sorted by newest
The Tale of the Top-Tier Intellect
ACCount · 7d

This one is bad. "I struggled to find a reason to keep reading it" level of bad. And that was despite me agreeing with basically every part of the message.

Far too preachy and condescending to be either convincing or entertaining. Takes too long to get to the points, makes too many points, and isn't at all elegant in conveying them. The vibe I get from this: an exhausted rant, sloppily dressed up into something resembling a story.

It's hard for me to imagine this piece actually changing minds, which seems to be the likely goal of writing it. But if that's the intent, the execution falls short. Every point made here was already made elsewhere, more eloquently and more convincingly, including in other essays by the same author.

Origins and dangers of future AI capability denial
ACCount · 17d

Seeing "stochastic parrots" repeated by people who don't know what "stochastic" is will never stop being funny.

Origins and dangers of future AI capability denial
ACCount · 17d

I suspect that one significant source of underestimating AI impact is that a lot of people had no good "baseline" of machine capabilities in the first place.

If you're in IT, or have so much as taken a CS 101 course, you've been told over and over again: computers have NO common sense. Computers DO NOT understand informal language. Their capability profile is completely inhuman: they live in a world where factoring a 20-digit integer is pretty easy, but telling whether there is a cat or a dog in a photo is pretty damn hard. This is something you have to learn, remember, understand, and internalize to be able to use computers effectively.
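
To make that asymmetry concrete, here's a minimal sketch (my own illustration, not from the original comment): factoring a 20-digit integer takes an off-the-shelf library well under a second, with no machine learning involved, while "cat or dog in this photo?" needs a trained vision model and a whole modern ML stack.

```python
# Minimal sketch of the inhuman capability profile described above.
# Factoring a 20-digit integer is easy for a machine; no ML needed.
from sympy import factorint

n = 18446744073709551617  # 2**64 + 1, a 20-digit integer
print(factorint(n))  # {274177: 1, 67280421310721: 1}, near-instant
```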

And if this was your baseline: it's obvious that current AI capabilities represent a major advancement.

But people in IT came up with loads and loads of clever tricks to make computers usable by ordinary people: to conceal the inhuman nature of machines, to use their strengths to compensate for their weaknesses. Normal people look at ChatGPT and say "isn't this just a slightly better Google" or "isn't that just Siri but better", without any concept of the mountain of research, engineering, and clever hacks that went into dancing around the limitations of poor NLP and NLU to get web search to work as well as it did in 1999, or how hard it was to get Siri to work even as well as it did in the age before GPT-2.

In a way, for a normal person, ChatGPT just brings the capabilities of machines closer to what they already expect machines to be capable of. There's no jump. The shift from "I think machines can do X, even though they can't do X at all, and it's actually just Y with some clever tricks, which looks like X if you don't look too hard" to "I think machines can do X, and they actually can do X" is hard to perceive.

And if a person knew barely anything about IT, just enough to be dangerous? Then ChatGPT may instead pattern-match to the same tricks we typically use to imitate those unnatural-for-machines capabilities. "It can't really think, it just uses statistics and smoke and mirrors to make it look like it thinks."

To a normal person, Sora was way more impressive than o3.

Learning to Interpret Weight Differences in Language Models
ACCount · 18d

Putting more metacognitive skills into LLMs is always fun.

I wonder if you can train something like that for other perturbation types, like deployment misconfiguration, quantization, or noise injection. Recall how GPT-OSS's poor reception was in part caused by avoidable inference issues, how Claude had a genuine "they made the LLM dumber!" inference bug, and so on.
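
As a rough sketch of what the noise-injection case might look like in practice (the model name and noise scale here are my own assumptions, not anything from the post): perturb the weights, then evaluate whether the model can notice and report it.

```python
# Hypothetical weight-noise perturbation, scaled per-parameter.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in model
noise_scale = 0.01  # assumed relative noise magnitude

with torch.no_grad():
    for p in model.parameters():
        p.add_(torch.randn_like(p) * p.std() * noise_scale)

# ...then run the usual evals and ask the model whether its own
# weights have been tampered with.
```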

eggsyntax's Shortform
ACCount · 26d

This is an entangled behavior, thought to be related to multi-turn instruction following.

We know our AIs make dumb mistakes, and we want an AI to self-correct when the user points out its mistakes. We definitely don't want it to double down on being wrong, Sydney-style. The common side effect of training for that is that it can make the AI into too much of a suck-up when the user pushes back.

Which then feeds into the usual "context defines behavior" mechanisms, and results in increasingly amplified sycophancy down the line for the duration of that entire conversation.

Towards a Typology of Strange LLM Chains-of-Thought
ACCount · 1mo

Repeated token sequences: is it possible that those tokens are computational? Detached from their meaning by RL, now emitted solely to perform some specific sort of computation in the hidden state? That would be the top-left quadrant: useful thought, just not at all language.

Did anyone replicate this specific quirk in an open source LLM?

"Spandrel" is very plausible for that too. LLMs have a well known repetition bias, so it's easy to see how that kind of behavior could pop up randomly and then get reinforced by an accident. So is "use those tokens to navigate into the right frame of mind", it seems to get at one common issue with LLM thinking.

shortplav
ACCount · 1mo

Chat or API?

API access gives way better tools for this kind of thing.
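
As a hedged sketch of the difference (the model name is illustrative): the Chat Completions API exposes sampling controls and per-token logprobs that the consumer chat UI doesn't, which is exactly what you want when probing quirks like this.

```python
# Query the Chat Completions API with logprobs enabled.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Name a color."}],
    temperature=0,        # pin down sampling for reproducibility
    logprobs=True,
    top_logprobs=5,       # alternatives the model considered per token
)
print(resp.choices[0].logprobs.content[0].top_logprobs)
```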

faul_sname's Shortform
ACCount · 1mo

Does this quirk reproduce on open-weights models, e.g. GPT-OSS? Are there similar reasoning-trace quirks in different model families?

Sounds like a fun target for some mech interp work. Might be a meaningful behavior, might be meaningless noise, and there's plenty of room to try different things to figure that out.

But, of course, OpenAI wouldn't let anyone have fun with their proprietary models, so we'd need to replicate this in an open model to start.

lemonhope's Shortform
ACCount · 2mo

There's a lot of "they used user data to shoot themselves in the foot" and not nearly enough "they used user data to improve performance" happening in the industry.

Maybe frontier labs will finally crack applying user feedback once the training data bottleneck begins to bite? I imagine that getting good utility out of user data is hard, both in terms of the engineering required and the compute required.

Immigration to Poland
ACCount · 2mo

Don't mistake my "very good for economic growth" for "any good for social cohesion".

I make no such claim. My claim is that there are a lot of economic incentives to overlook the negatives of immigration.

My honest opinion is that immigration is not going to be good for social cohesion unless the immigration policy is nothing short of immaculate. And the gap between "spectacularly failed" and "nothing short of immaculate" is where most immigration policies currently reside.
