Heated, tense arguments can often be unproductive and unpleasant. Neither side feels heard, and they are often working desperately to defend something they feel is very important. Ruby explores this problem and some solutions.

alkjash
This feels like an extremely important point. A huge number of arguments devolve into exactly this dynamic because each side only feels one of (the Rock|the Hard Place) as a viscerally real threat, while agreeing that the other is intellectually possible.  Figuring out that many, if not most, life decisions are "damned if you do, damned if you don't" was an extremely important tool for me to let go of big, arbitrary psychological attachments which I initially developed out of fear of one nasty outcome.
Kaj_Sotala
I think the term "AGI" is a bit of a historical artifact: it was coined before the deep learning era, when previous AI winters had made everyone in the field reluctant to think they could make any progress toward general intelligence. Instead, all AI had to be very extensively hand-crafted for the application in question. Then some people who still wanted to do research on the original ambition of AI needed a term that would distinguish them from all the other people who said they were doing "AI". So it was a useful term for distinguishing yourself from the very narrow AI research of the time, but now that AI systems are already increasingly general, it doesn't seem like a very useful concept anymore, and it'd be better to talk in terms of the specific cognitive capabilities a system has or doesn't have.
Phib
Re: the AI safety summit, one thought I have is that the first couple of summits were to some extent captured by the people, like us, who cared most about this technology and its risks. Those events, prior to the meaningful entrance of governments and hundreds of billions in funding, were easier to 'control' to be about the AI safety narrative. Now the people optimizing generally for power have entered the picture, captured the summit, and replaced the niche AI safety narrative with the dominant one. So I see this not so much as a 'stark reversal' as a return to the status quo once something went mainstream.
Simplified, the Solomonoff prior is the distribution you get when you take a uniform distribution over all strings and feed them to a universal Turing machine. Since the outputs are also strings: what happens if we iterate this? What is the stationary distribution? Is there even one? The fixed points will be quines, programs that copy their source code to the output. But how are they weighted? By their length? Presumably you can also have quine-cycles of programs that generate each other in turn, in a manner reminiscent of metagenesis. Do these quine cycles capture all probability mass or does some diverge? Very grateful for answers and literature suggestions.
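One way to make the question precise (a sketch; it assumes a prefix universal machine $U$ and lets inputs on which $U$ never halts simply lose their mass): the Solomonoff prior is
\[ M(x) \;=\; \sum_{p \,:\, U(p) = x} 2^{-|p|}, \]
and the iteration is the pushforward operator on semimeasures
\[ (T\mu)(x) \;=\; \sum_{y \,:\, U(y) = x} \mu(y), \qquad M = T\mu_0 \;\text{ with }\; \mu_0(y) = 2^{-|y|}. \]
Under this reading, a quine $q$ with $U(q) = q$ keeps its mass, a cycle $U(q_i) = q_{i+1 \bmod k}$ keeps the total $\sum_i \mu(q_i)$, and any mass whose forward orbit hits a non-halting input is lost, which is why $T$ produces semimeasures rather than probability measures. The stationary-distribution question is then which fixed points of $T$ attract the mass of $T^n M$ as $n \to \infty$.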
This shortform discusses the current state of responsible scaling policies (RSPs). They're mostly toothless, unfortunately.

The Paris summit was this week. Many companies had committed to make something like an RSP by the summit. Half of them did, including Microsoft, Meta, xAI, and Amazon. (NVIDIA did not—shame on them—but I hear they are writing something.) Unfortunately but unsurprisingly, these policies are all vague and weak.

RSPs essentially have four components: capability thresholds beyond which a model might be dangerous by default, an evaluation protocol to determine when models reach those thresholds, a plan for how to respond when various thresholds are reached, and accountability measures. A maximally lazy RSP—a document intended to look like an RSP without making the company do anything differently—would have capability thresholds that are vague or extremely high, evaluation that is unspecified or low-quality, a response of "we will make it safe" rather than substantive mitigations or robustness guarantees, and no accountability measures. Such a policy would be little better than the company saying "we promise to deploy AIs safely." The new RSPs are basically like that.[1] Some aspects of some RSPs that existed before the summit are slightly better.[2]

If existing RSPs are weak, how would a strong RSP be different?

* Evals: evals should measure relevant capabilities (including cyber, bio, and scheming), evals should be sufficiently difficult, and labs should do good elicitation. (As a lower bar, the evals should exist; many companies say they will do evals but don't seem to have a plan for what evals to do.)
  * See generally Model evals for dangerous capabilities and OpenAI's CBRN tests seem unclear
* Response: misuse
  * Rather than just saying that you'll implement mitigations such that users can't access dangerous capabilities, say how you'll tell if your mitigations are good enough. For example, say that you'll have a skilled red-team attempt to el...
harfe
A potentially impactful thing: someone competent runs as a candidate in the 2028 election on an AI notkilleveryoneism[1] platform. Maybe even two people should run, one in the Democratic primary and one in the Republican primary. While getting the nomination is rather unlikely, there could be lots of benefits even if you fail to gain it (like other presidential candidates becoming sympathetic to AI notkilleveryoneism, or AI notkilleveryoneism becoming more popular with the general public, etc.). On the other hand, attempting a presidential run can easily backfire. A relevant precedent for this kind of approach is the 2020 campaign by Andrew Yang, which focused on universal basic income (and the downsides of automation). While the campaign attracted some attention, it seems like it didn't succeed in making UBI a popular policy among Democrats.

----------------------------------------

1. Not necessarily using that name. ↩︎


Recent Discussion

Epistemic status: exploratory thoughts about the present and future of AI sexting.

OpenAI says it is continuing to explore its models’ ability to generate “erotica and gore in age-appropriate contexts.” I’m glad they haven’t forgotten about this since the release of the first Model Spec, because I think it could be quite interesting, and it’s a real challenge in alignment and instruction-following that could have other applications. In addition, I’ve always thought it makes little logical sense for these models to act like the birds and the bees are all there is to human sexuality. Plus, people have been sexting with ChatGPT and just ignoring the in-app warnings anyway.

One thing I’ve been thinking about a lot is what limits a commercial NSFW model should have. In my experience,...

Viliam

> At the same time, empowering only the user and making the assistant play along with almost every kind of legal NSFW roleplaying content (if that’s what OpenAI ends up shipping) seems very undesirable in the long term.

Why? Do dildos sometimes refuse consent? Would it be better for humanity if they did? Should erotic e-books refuse to be read on certain days? Should pornography be disabled on screens if the user is not sufficiently respectful? What about pornography generated by AIs?

Similar to other people's shortform feeds, short stuff that people on LW might be interested in, but which doesn't feel like it's worth a separate post. (Will probably be mostly cross-posted from my Facebook wall.)

abramdemski
> now that AI systems are already increasingly general

I want to point out that if you tried to quantify this properly, the argument falls apart (at least in my view). "All AI systems are increasingly general" would be false; there are still many useful but very narrow AI systems. "Some AI systems" would be true, but this highlights the continuing usefulness of the distinction. One way out of this would be to declare that only LLMs and their ilk count as "AI" now, with more narrow machine learning just being statistics or something. I don't like this because of the commonality of methods between LLMs and the rest of ML; it is still deep learning (and in many cases, transformers), just scaled down in every way.

Hmm, I guess that didn't properly convey what I meant. More like, LLMs are general in a sense, but in a very weird sense where they can perform some things at a PhD level while simultaneously failing at some elementary-school-level problems. You could say that they are not "general as in capable of learning widely at runtime" but "general as in they can be trained to do an immensely wide set of tasks at training time".

And this is then a sign that the original concept is no longer very useful, since okay LLMs are "general" in a sense. But probably if you'd told...

jbash
I think the point is kind of that what matters is not what specific cognitive capabilities it has, but whether whatever set it has is, in total, enough to allow it to address a sufficiently broad class of problems, more or less equivalent to what a human can do. It doesn't matter how it does it.
MinusGix
I'm confused: why does that make the term no longer useful? There's still a large distinction between companies focusing on developing AGI (OpenAI, Anthropic, etc.) and those focusing on more 'mundane' advancements (Stability, Black Forest, the majority of ML research results). Though I do disagree that it was only used to distinguish them from narrow AI. Perhaps that was what it was originally, but it quickly turned into the rough "general intelligence like a smart human" meaning we have today. I agree 'AGI' has become an increasingly vague term, but that's because it is a useful distinction, and so certain groups use it for hype. I don't think abandoning a term because it is getting weakened is a great idea. We should talk more about specific cognitive capabilities, but that isn't stopped by using the term AGI; it is stopped by not having people analyze whether X is an important capability for creating risk or for stopping risk.

Epistemic status: You probably already know if you want to read this kind of post, but in case you have not decided: my impression is that people are acting very confused about what we can conclude about scaling LLMs from the evidence, and I believe my mental model cuts through a lot of this confusion - I have tried to rebut what I believe to be misconceptions in a scattershot way, but will attempt to collect the whole picture here. I am a theoretical computer scientist and this is a theory. Soon I want to do some more serious empirical research around it - but be aware that most of my ideas about LLMs have not had the kind of careful, detailed contact with reality that I...

Nick_Tarleton
What about current reasoning models trained using RL? (Do you think something like, we don't know, and won't easily figure out, how to make that work well outside a narrow class of tasks that don't include 'anything important'?)

Yes, that is what I think. 

Cole Wyeth
I think things will be "interesting" by 2045 in one way or another - so it sounds like our disagreement is small on a log scale :) 
Cole Wyeth
I see - I mean, clearly AlexNet didn't just invent all the algorithms it relied on; I believe the main novel contribution was to train on GPUs and get it working well enough to blow everything else out of the water? The fact that it took decades of research to go from the Perceptron to great image classification indicates to me that there might be further decades of research between holding an intelligent-ish conversation and being a human-level agent. This seems like the natural expectation given the story so far, no?

[legal status: not financial advice™]

Most crypto is held by individuals[1]

Individual crypto holders are disproportionately tech savvy, often programmers

Source: Well known, just look around you.

AI is starting to eat the market for software engineers

So far it's mostly entry-level jobs, which doesn't matter that much for crypto markets.[2]

But judging by its progress at other tasks, AI will climb the seniority ladder to where most crypto holders are within the next few years. SWE-bench Verified went from single-digit percentages to 64% in a year and a bit, and the METR evals are not looking hopeful for humanity's lead. Tech giants are doing increasing amounts of their work with AI.[3]

Some of this effect is across many industries, but software seems worst hit.

Many hodlers who lose their job income will sell crypto to maintain their lifestyle

Many...

Viliam

> 'Poor' people no longer starve in winter when their farm's food storage runs out.

Homeless people sometimes starve, and also freeze in winter.

(But I agree that the fraction of the starving poor was much larger in the past.)

Kaj_Sotala
In the world we live in, there is strong political and cultural resistance to the kinds of basic income schemes that would eliminate genuine poverty. The problem isn't that resource consumption would always need to inevitably increase - once people's wealth gets past a certain point, plenty of them prefer to reduce their working hours, forgoing material resources in favor of having more spare time. The problem is that large numbers of people don't like the idea of others being given tax money without doing anything to directly earn it.
Garrett Baker
Infinite energy, for example, would just push your scarcity to other resources.
Noosphere89
Compute, information/entropy, and what people can do with their property all become abundant if we assume an infinite energy source. Compute and information/entropy become cheap because the costs of running computations and of getting information/entropy, like the Landauer limit, become mostly irrelevant if you can assume you can always generate the energy you need. Somewhat similarly, what people can do with their property becomes way more abundant with infinite energy machines, though here it depends on how the machine works, primarily because it allows people to set up their own governments with their own laws given enough time (because everything comes from energy, in the end), and this could end up undermining traditional governments.
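For reference, the Landauer limit mentioned here is the thermodynamic floor on the energy dissipated per irreversibly erased bit at temperature $T$:
\[ E_{\min} \;=\; k_B T \ln 2 \;\approx\; 2.9 \times 10^{-21}\,\mathrm{J} \quad \text{at } T \approx 300\,\mathrm{K}, \]
so with an effectively unbounded energy source this floor stops being the binding constraint on how much computation you can run.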

Summary and Table of Contents

The goal of this post is to discuss the so-called “sharp left turn”, the lessons that we learn from analogizing evolution to AGI development, and the claim that “capabilities generalize farther than alignment” … and the competing claims that all three of those things are complete baloney. In particular,

  • Section 1 talks about “autonomous learning”, and the related human ability to discern whether ideas hang together and make sense, and how and if that applies to current and future AIs.
  • Section 2 presents the case that “capabilities generalize farther than alignment”, by analogy with the evolution of humans.
  • Section 3 argues that the analogy between AGI and the evolution of humans is not a great analogy. Instead, I offer a new and (I claim) better analogy between
...
RobertM

Curated.  This post does at least two things I find very valuable:

  1. Accurately represents differing perspectives on a contentious topic
  2. Makes clear, epistemically legible arguments on a confusing topic

And so I think that this post both describes and advances the canonical "state of the argument" with respect to the Sharp Left Turn (and similar concerns).  I hope that other people will also find it helpful in improving their understanding of e.g. objections to basic evolutionary analogies (and why those objections shouldn't make you very optimistic).


[Thanks to Steven Byrnes for feedback and the idea for section §3.1. Also thanks to Justis from the LW feedback team.]

Remember this?

Or this?

The images are from WaitButWhy, but the idea was voiced by many prominent alignment people, including Eliezer Yudkowsky and Nick Bostrom. The argument is that the difference in brain architecture between the dumbest and smartest human is so small that the step from subhuman to superhuman AI should go extremely quickly. This idea was very pervasive at the time. It's also wrong. I don't think most people on LessWrong have a good model of why it's wrong, and I think because of this, they don't have a good model of AI timelines going forward.

1. Why Village Idiot to Einstein is a Long Road: The Two-Component

...

I would find this post much more useful to engage with if you more concretely described the types of tasks that you think AIs will remain bad at and gave a bunch of examples. (Or at least made an argument for why it is hard to construct examples, if that is your perspective.)

I think you're pointing to a category like "tasks that require lots of serial reasoning for humans, e.g., hard math problems, particularly ones where the output should be a proof". But, I find this confusing, because we've pretty clearly seen huge progress on this in the last year such that ...

The main event this week was the disastrous Paris AI Anti-Safety Summit. Not only did we not build upon the promise of the Bletchley and Seoul Summits, the French and Americans did their best to actively destroy what hope remained, transforming the event into a push for a mix of nationalist jingoism, accelerationism and anarchism. It’s vital and also difficult not to panic or despair, but it doesn’t look good. Another major twist was that Elon Musk made a $97 billion bid for OpenAI’s nonprofit arm and its profit and control interests in OpenAI’s for-profit arm. This is a serious complication for Sam Altman’s attempt to buy those same assets for $40 billion, in what I’ve described as potentially the largest theft in human history.
I’ll be...
Multicore
The Y-axis on that political graph is weird. It seems like it's measuring moderate vs extremist, which you would think would already be captured by someone's position on the left vs right axis. Then again the label shows that the Y axis only accounts for 7% of the variance while the X axis accounts for 70%, so I guess it's just an artifact of the way the statistics were done.
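For intuition, here is a minimal sketch (hypothetical data; it assumes the chart's axes come from a PCA-style decomposition of survey responses, which the quoted variance figures suggest) of how a dominant first axis and a weak second axis fall out of the statistics:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical survey data: rows are respondents, columns are question scores.
rng = np.random.default_rng(0)
noise = rng.normal(size=(500, 20))
# One latent trait (think left vs. right) drives most answers.
left_right = rng.normal(size=500)
loadings = rng.normal(size=20)
responses = noise + 1.5 * np.outer(left_right, loadings)

pca = PCA(n_components=2).fit(responses)
# The first component captures the dominant trait; the second is just the
# strongest leftover direction, so it can end up looking like "moderate vs.
# extreme" without anyone having designed it as an axis.
print(pca.explained_variance_ratio_)
```

The exact numbers depend on the made-up scale factor, but the qualitative pattern (one large axis, one small residual axis) is generic for this kind of decomposition.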
kaiwilliams
I can't tell you the exact source, but I saw an OpenAI person tweet that this isn't the actual CoT.

Noam Brown: "These aren't the raw CoTs but it's a big step closer."

teradimich
But Bostrom estimated the probability of extinction within a century as <20%. Scott Alexander estimated the risk from AI as 33%.  They could have changed their forecasts. But it seems strange to refer to them as a justification for confident doom.

This post is inspired by the post "Why it's so hard to talk about Consciousness" by Rafael Harth. In that post, Harth says that the people who participate in debates about consciousness can be roughly divided into two "camps":

Camp #1 tends to think of consciousness as a non-special high-level phenomenon. Solving consciousness is then tantamount to solving the Meta-Problem of consciousness, which is to explain why we think/claim to have consciousness. In other words, once we've explained the full causal chain that ends with people uttering the sounds kon-shush-nuhs, we've explained all the hard observable facts, and the idea that there's anything else seems dangerously speculative/unscientific. No complicated metaphysics is required for this approach.

Conversely, Camp #2 is convinced that there is an experience thing that exists in

...
Mitchell_Porter
The denial of a self has long seemed to me a kind of delusion. I am very clearly having a particular stream of consciousness. It's not an arbitrary abstraction to say that it includes some experiences and does not include others. To say there is a self is just to say that there is a being experiencing that stream of consciousness. Would you deny that, or are you saying something else?

I've got an idea what meditation people might be talking about with doing away with the self. Once you start thinking about what the lower-level mechanics of the brain are like, you start thinking about representations. Instead of the straightforward assertion "there's a red apple on that table", you might start thinking "my brain is holding a phenomenal representation of a red apple on a table". You'll still assume there's probably a real apple out there in the world too, though if you're meditating you might specifically try to not assign meanings to phe...

Rafael Harth
I don't think the experience of no-self contradicts any of the above. In general, I think you could probably make some factual statements about the nature of consciousness that are true and that you learn from attaining no-self, if you phrased them very carefully, but I don't think that's the point. The way I'd phrase what happens would be mostly in terms of attachment. You don't feel as implicated by things that affect you anymore, you have less anxiety, that kind of thing. I think a really good analogy is just that regular consciousness starts to resemble consciousness during a flow state.