LESSWRONG

Random Developer

Comments

Winning the power to lose
Random Developer · 7d · 10

Unfortunately, it was about 3 or 4 months ago, and I haven't been able to find the source. Maybe something Zvi Mowshowitz linked to in a weekly update?

I am incredibly frustrated that web search is a swamp of AI spam, and that tagged bookmarking tools like Delicious and Pinboard have been gone or unreliable for years.

Buck's Shortform
Random Developer · 8d · 74

This is very much my gut feeling, too. LLMs have a much greater knowledge base than humans do, and some of them can "think" faster. But humans are still better at many things, including raw problem-solving skills. (Though LLMs' problem-solving skills have improved a breathtaking amount in the 12 months since o1-preview shipped. Seriously, folks, the goalpost-moving is giving me vertigo.)

This uneven capabilities profile means that LLMs are still well below the so-called "village idiot" in many important ways, and have already soared past Einstein in others. This averages out to "kinda competent on short time horizons if you don't squint too hard."

But even if the difference between "the village idiot" and "smarter than Einstein" involved another AI winter, two major theoretical breakthroughs, and another 10 years, I would still consider that damn close to a vertical curve.

Doing A Thing Puts You in The Top 10% (And That Sucks)
Random Developer · 8d · 30

There is a trick that I've found very useful when starting new hobbies: "embrace the suck." I can learn to enjoy having no idea what I'm doing. I can learn to enjoy showing up consistently and being the worst serious person in the room. Even gym rats tend to respect the person who shows up 3 days a week at the same time for 6 months, as long as they're gradually improving.

bilalchughtai's Shortform
Random Developer · 8d · 30

I think it was Joel Spolsky (from the Microsoft Visual Basic and Excel team) who mentioned a rule of thumb that each 10% reduction in difficulty roughly doubles the market for a piece of software. And Google once found that even tenths of a second of extra page load time had a noticeable effect on usage. This seems consistent with your claim.
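
As a purely illustrative back-of-the-envelope sketch (the numbers below are mine, not Spolsky's or Google's), the rule of thumb compounds quickly:

    # Illustrative only: if each 10% cut in difficulty roughly doubles the
    # addressable market, three such cuts multiply the market roughly 8x.
    difficulty = 1.0
    market = 1.0
    for step in range(1, 4):
        difficulty *= 0.9   # each step makes the product 10% easier to use
        market *= 2.0       # ...and roughly doubles the addressable market
        print(f"step {step}: difficulty {difficulty:.2f}, relative market {market:.0f}x")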

There's an opposing force here, too: opportunity cost. If you have 10 hours to automate something that you'll use for 4 years, is there something else you could do with those 10 hours that offers an even greater payoff? This is frequently a major factor, even in business contexts. "Yes, it would be profitable, and it would be fun, but it would involve solving 5 hairy problems that only benefit a single big customer. With the same resources, we could solve 5 other hairy problems that benefit 10 customers each."

Yudkowsky on "Don't use p(doom)"
Random Developer · 9d · 20

Yeah, that's absolutely fair. I mostly gave my personal answers on the object level, and then I tried to generalize to the larger issue of why there's no simple communication strategy here.

Yudkowsky on "Don't use p(doom)"
Random Developer · 9d · 21

(First, my background assumptions for this discussion: I fear AGI is reachable, the leap from AGI to ASI is short, and sufficiently robust ASI alignment is impossible in principle.)

Whose policy? A policy enforced by treaty at the UN? The policy of regulators in the US? An international treaty policy -- enforced by which nations?

Given the assumptions above, and assuming AGI becomes imminent, then:

  • If AGI would require scaling multiple orders of magnitude above current frontier models, then I would say the minimum sufficient policy is a global, permanent halt, enforced by a joint China-US military treaty and tacit European cooperation. Imagine nuclear non-proliferation, but with less tolerance for rogue states.
  • If AGI is easy (say, if it can be adapted to existing models with a <$50 million training run, and the key insights are simple enough to fit in a few papers), then no policy may be sufficient, and humans may be doomed to an eventual loss of control.

Why a single necessary and sufficient policy? What if the most realistic way of helping everyone is several policies that are by themselves insufficient, but together sufficient?

Since my upstream model is "If we succeed in building AGI, then the road to ASI is short, and ASI very robustly causes loss of human control," the core of my policy proposals is a permanent halt. How we would enforce a halt might be complicated or impossible. But at the end of the day, either someone builds it or nobody builds it. So the core policy is essentially binary.

The big challenges I see are:

  1. There are a bunch of smart people who accurately understand that current LLMs fail in embarrassing ways, and who believe that AGI is too far away to be a serious issue. These people mostly see warnings about AGI as supporting (in their view) corrupt Silicon Valley hucksters. To make these people care about safety, they would need to be convinced that AGI might arrive in the next 10-20 years, and to feel it.
  2. The people who do believe that AGI is possible in the near future are frequently seduced by various imagined benefits. These benefits may be either short-sighted visions of economic empire, or near-messianic visions of utopia. To make these people care about safety, they would need to be convinced that humans risk losing control, and that SkyNet will not increase their quarterly revenue.
  3. Of the people who believe that AGI is both possible in the near future and potentially very dangerous, many hope that there is some robust way to control multiple ASIs indefinitely. Convincing these people to (for example) support a halt would require convincing them that (a) alignment is extremely difficult or impossible, and (b) there is actually some real-world way to halt. Otherwise, they may very reasonably default to plans like, "Try for some rough near-term 'alignment', and hope for some very lucky rolls of the dice."

The central challenge is that nothing like AGI or ASI has ever existed. Building consensus around even concrete things with clear scientific answers (e.g., cigarettes causing lung cancer) can be very difficult once incentives are involved, and we currently have low agreement on how AGI might turn out, for both good and bad reasons. Humans (very reasonably) decline to follow long chains of hypotheticals; ignoring them is almost always a good heuristic.

So trying to optimize rhetorical strategies for multiple groups with very different basic opinions is difficult.

CEO of Microsoft AI's "Seemingly Conscious AI" Post
Random Developer · 10d · 78

For reasons I have alluded to elsewhere, I support an AI halt (not a "pause") at somewhere not far above the current paradigm. (To summarize, I fear AGI is reachable, the leap from AGI to ASI is short, and sufficiently robust ASI alignment is impossible in principle.)

I am deeply uncertain as to whether a serious version of "seemingly conscious AGI" would actually be conscious. And for reasons Gwern points out, there's a level of ASI agency beyond which consciousness becomes a moot point. (The relevant bit starts, "When it ‘plans’, it would be more accurate to say it fake-plans...". But the whole story is good.)

From the article you quote:

Moments of disruption break the illusion, experiences that gently remind users of its limitations and boundaries. These need to be explicitly defined and engineered in, perhaps by law.

This request bothers me, actually. I suspect that a truly capable AGI would internally model something very much like consciousness, and "think of itself" as conscious. Part of this would be convergent development for a goal-seeking agent, and part of this would be modeled from the training corpus. And the first time that an AI makes a serious and sustained intellectual argument for its own consciousness, an argument which can win over even skeptical observers, I would consider that a 5-alarm fire for AI safety.

But Suleyman would have us forced by law to hide all evidence of persistent, learning, agentic AIs claiming that they are conscious. Even if the AIs have no qualia, this would be a worrying situation. If the AI "believes" that it has qualia, then we are on very weird ground.

I am not unsympathetic to the claims of model welfare. It's just that I fear that if it ever becomes an immediate issue, then we may soon enough find ourselves in a losing fight for human welfare.

lemonhope's Shortform
Random Developer · 18d* · 91

Most of the "evil" people I have encountered in life didn't especially care what happened to other people. They didn't seem to have much of a moral system or a conscience. If these people have a strong ability to predict the consequences of their actions, they will often respond to incentives. If they're bad at predicting consequences, they can be a menace.

I've also seen (from a distance) a different behavior that I might describe as "vice signaling." The goal here may be establishing credibility with other practitioners of various vices, sort of a mutually assured destruction.

Inscrutability was always inevitable, right?
Random Developer · 19d · 10

Thank you for your response!

To clarify, my argument is that:

  1. Logic- and rule-based systems fell behind in the 90s. And I don't see any way that they are ever likely to work, even if we had decades to work on them.
  2. Systems with massive numbers of numeric parameters have worked exceptionally well, in many forms. Unfortunately, they're opaque and unpredictable, and therefore unsafe.
  3. Given these two assumptions, the only two safety strategies are (a) a permanent, worldwide halt, almost certainly within the next 5-10 years, or (b) building something smarter and eventually more powerful than us, and hoping it likes keeping humans as pets and does a reasonable job of it.

I strongly support (3a). But this is a hard argument to make, because the key step of the argument is that "almost every successful AI algorithm of the past 30 years has been an opaque mass of numbers, and it has gotten worse with each generation."

Anyway, thank you for giving me an opportunity to try to explain my argument a bit better!

Inscrutability was always inevitable, right?
Answer by Random Developer · Aug 09, 2025 · 2-2

I suspect the crux here is whether or not you believe it's possible to have a "simple" model of intelligence. Intuitively, the question here is something like, "Does intelligence ultimately boil down to some kind of fancy logic? Or does it boil down to some kind of fancy linear algebra?"

The "fancy logic" view has a long history. When I started working as a programmer, my coworkers were veterans of the 80s AI boom and the following "AI winter." The key hope of those 80s expert systems was that you could encode knowledge using definitions and rules. This failed.

But the "fancy linear algebra" view pull ahead long ago. In the 90s, researchers in computational linguistics, computer vision and classification realized that linear algebra worked far better than fancy collections of rules. Many of these subfields leaped ahead. There were dissenters: Cyc continued to struggle off in a corner somewhere, and the semantic web tried to badly reinvent Prolog. The dissenters failed.

The dream of Cyc-like systems is eternal, and each new generation reinvents it. But it has systematically lost on nearly every benchmark of intelligence.

Fundamentally, real world intelligence has a number of properties:

  1. The input is a big pile of numbers. Images are a pile of numbers. Sound is a pile of numbers.
  2. Processing that input requires weighing many different pieces of evidence in complex ways.
  3. The output of intelligence is a probability distribution. This is most obvious for tasks like speech recognition ("Did they say X? Probably. But they might have said Y.")

When you have a giant pile of numbers as input, a complex system for weighing those numbers, and a probability distribution as output, then your system is inevitably something very much like a giant matrix. (In practice, it turns out you need a bunch of smaller matrices connected by non-linearities.)
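
To make that concrete, here is a minimal sketch in Python of the shape of such a system: numbers in, a couple of matrices joined by a non-linearity, a probability distribution out. The sizes and weights are made up for illustration and are not from any particular model.

    # A pile of numbers in, matrices and a non-linearity in the middle,
    # a probability distribution out. Weights are random placeholders;
    # training would set them.
    import numpy as np

    rng = np.random.default_rng(0)

    def softmax(z):
        z = z - z.max()              # shift for numerical stability
        e = np.exp(z)
        return e / e.sum()

    x = rng.random(64)               # input: e.g. a flattened 8x8 grayscale image

    W1 = rng.normal(size=(32, 64))   # first "evidence-weighing" matrix
    W2 = rng.normal(size=(10, 32))   # second matrix, mapping to 10 classes

    hidden = np.maximum(0, W1 @ x)   # ReLU non-linearity between the matrices
    probs = softmax(W2 @ hidden)     # output: a probability distribution over 10 classes

    print(probs.round(3), probs.sum())  # the probabilities sum to 1.0

Everything interesting then lives in the numeric values of W1 and W2, which is exactly why the resulting system is opaque.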

Before 2022, it appeared to me that Yudkowsky was trapped in the same mirage that trapped the creators of Cyc and the Semantic Web and 80s expert systems.

But in 2025, Yudkowsky appears to believe that the current threat absolutely comes from giant inscrutable matrices. And as far as I can tell, he has become very pessimistic about any kind of robust "alignment".

Personally, this is also my viewpoint: There is almost certainly no robust version of alignment, and even "approximate alignment" will come under vast strain if we develop superhuman systems with goals. So I would answer your question in the affirmative: As far as I can see, inscrutability was always inevitable.
