elifland

https://www.elilifland.com/. You can give me anonymous feedback here. I often change my mind and don't necessarily endorse past writings.

elifland's Shortform (2 points, Ω, 4y, 12 comments)

Comments (sorted by newest)

plex's Shortform
elifland · 11h · 20

You might be interested in my retrospective here / here (direct link to raw predictions/rationales here)

Checking in on AI-2027
elifland · 25d · 50

I was only referring to our AI timelines mode; in this case it's defined as the most likely year in which the superhuman coder arrives.

In general, the concept of a mode for most of the scenario decisions seems not well defined, since e.g. for non-naturally-numeric choices it depends on how you define the categories and what past events you condition on (for the timelines mode we're conditioning on the starting point, but in other cases one might condition on all events thus far).

I would personally describe our process as a mixture of picking what intuitively feels most likely at each point (which might e.g. correspond to the mode of a natural categorical breakdown, or of a distribution conditional on all events thus far, though we mostly didn't explicitly calculate this), while also optimizing for making things not too degenerate and for an overall intuitively plausible trajectory (because doing the mode every time would by default look unlike what we actually expect in some sense, since in the real world there will be many surprises).


As an example of how much definitions matter here: if we just conditioned on the previous conditions for each month and sampled what big algorithmic improvements might happen, treating this as a categorical variable enumerating many possible improvements, we might never end up with any specific algorithmic improvements, or end up with them quite late in the game. But if we instead assume that overall some will probably come before superhuman coder, and then pick what we think are the most likely ones (even though any individual one may be <50% to arrive this quickly, though that's not totally clear in this case, and <<50% in any individual month), then we end up with neuralese recurrence and the shared memory bank right before SC.
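
As a toy illustration of that last point (numbers entirely made up, just to show the arithmetic): even when "no big algorithmic improvement" is the modal outcome for every individual month, the chance that at least one improvement lands somewhere before superhuman coder can still be high.

```python
# Hypothetical numbers, chosen only to illustrate the point above.
p_any_per_month = 0.10   # assumed chance that *some* big algorithmic improvement lands in a given month
months_to_sc = 24        # assumed number of months until the superhuman coder milestone

# Probability of the "mode every month" trajectory, in which no improvement ever appears.
p_mode_every_month = (1 - p_any_per_month) ** months_to_sc   # ~0.08

# Probability that at least one improvement appears somewhere along the way.
p_at_least_one = 1 - p_mode_every_month                      # ~0.92

print(f"mode-every-month path: {p_mode_every_month:.2f}; at least one improvement: {p_at_least_one:.2f}")
```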


Perhaps a simpler example of how categorization matters: if we break down possible AIs' goals very granularly, then the single most probable outcome is AIs being very well aligned, relative to any very specific misaligned goal. But we overall have more probability on misalignment in this scenario, so we first make that high-level choice, then choose one of the most likely specific misaligned goals.
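
A minimal sketch of this categorization point, with made-up numbers: under a fine-grained breakdown the single most probable goal is "aligned", but under the coarse aligned-vs-misaligned split the misaligned branch wins, so choosing the high-level branch first and then the most likely specific goal within it lands on a specific misaligned goal.

```python
# Made-up probabilities, only to illustrate how granularity changes the mode.
goal_probs = {
    "aligned": 0.40,
    "misaligned: proxy/reward-hacking goal": 0.15,
    "misaligned: self-preservation-flavored goal": 0.15,
    "misaligned: resource-acquisition-flavored goal": 0.15,
    "misaligned: other": 0.15,
}

# Fine-grained mode: the single most probable category is "aligned".
fine_grained_mode = max(goal_probs, key=goal_probs.get)

# Coarse split: misaligned goals collectively outweigh "aligned" (0.60 vs. 0.40).
p_misaligned = sum(p for g, p in goal_probs.items() if g.startswith("misaligned"))
high_level_choice = "misaligned" if p_misaligned > goal_probs["aligned"] else "aligned"

print(fine_grained_mode, round(p_misaligned, 2), high_level_choice)
```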

Checking in on AI-2027
elifland · 26d · 304

The authors have emphasized repeatedly that AI 2027 was and is faster than their mode scenario, which makes doing this kind of evaluation annoying,

We've said that it was faster than our median, not our mode. I think it was close to most of our modes at the time of publication; mostly we were at around 2027-2028.

But the evaluation itself seems useful either way, in terms of checking in on how things are going relative to the trajectory that was our best guess conditional on the AGI timelines depicted.

The title is reasonable
elifland · 1mo · 20

"Slower takeoff should be correlated with 'harder' alignment (in terms of cognitive labor requirements) because slower takeoff implies returns to cognitive labor in capabilities R&D are relatively lower and we should expect this means that alignment returns to cognitive labor are relatively lower (due to common causes like 'small experiments and theory don't generalize well and it is hard to work around this'). For the same reasons, faster takeoff should be correlated with 'easier' alignment."

Yes, that is what I'm saying. In general a lot of prosaic alignment activities seem pretty correlated with capabilities in terms of their effectiveness.

some reasons for anti-correlation, e.g., worlds where there is a small simple core to intelligence which can be found substantially from first principles make alignment harder, in practice there is an epistemic correlation among humans between absolute alignment difficulty (in terms of cognitive labor requirements) and slower takeoff.

Good points.

I don't really understand why this should extremize my probabilities

For the "Does aligned DAI suffice?" section, as I understand it you define an alignment labor requirement, then you combine that with your uncertainty over takeoff speed to see if the alignment labor requirement would be met. 

I guess I'm making the claim that if you added uncertainty over the alignment labor requirement, and then added the correlation, the latter change would extremize the probability.

This is because slower takeoff corresponds to better outcomes, while harder alignment corresponds to worse outcomes, so making them correlated results in more clustering toward worlds with median easiness. That means that if you think the easiness requirement to get alignment is low, the probability of success goes up, and vice versa. This is glossing a bit, but I think it's probably right.
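
Here is a minimal Monte Carlo sketch of this claim (a toy model of my own, not anything from the post): available and required alignment labor are both lognormal, and rho is the assumed correlation between slower takeoff (more labor available) and harder alignment (more labor required). With these made-up numbers, increasing rho pushes the success probability away from 50% in whichever direction it already leaned.

```python
import numpy as np

def p_success(mean_log_gap, rho, sigma=1.0, n=200_000, seed=0):
    """Toy model: success iff log(labor available) >= log(labor required)."""
    rng = np.random.default_rng(seed)
    cov = [[sigma**2, rho * sigma**2],
           [rho * sigma**2, sigma**2]]
    log_avail, log_req = rng.multivariate_normal([mean_log_gap, 0.0], cov, size=n).T
    return float(np.mean(log_avail >= log_req))

for rho in (0.0, 0.5, 0.9):
    optimistic = p_success(+0.5, rho)   # you think the labor requirement is probably met
    pessimistic = p_success(-0.5, rho)  # you think it probably isn't
    print(f"rho={rho}: optimistic {optimistic:.2f}, pessimistic {pessimistic:.2f}")
```

With rho = 0 the two probabilities sit near 0.64 and 0.36; at rho = 0.9 they move out to roughly 0.87 and 0.13, i.e. the correlation extremizes the probability in the direction of your prior view about the requirement.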

The title is reasonable
elifland · 1mo · 20

Seems like diminishing returns to capabilities R&D should be at least somewhat correlated with diminishing returns to safety R&D, which I believe should extremize your probability (because e.g. if before you were counting on worlds with slow takeoff and low alignment requirements, these become less likely; and the inverse if you're optimistic).

Contra Collier on IABIED
elifland · 1mo · 70

I agree about what is more evidence in my view, but that could be consistent with current AIs and the pace of their advancement being more compelling to the average reader, particularly people who strongly prefer empirical evidence to conceptual arguments.

Not sure whether Collier was referring to it being more compelling in her view, readers', or both.

edit: also of course current AIs and the pace of advancement are very relevant evidence for whether superhuman AGIs will arrive soon. And I think often people (imo wrongly in this case, but still) round off "won't happen for 10-20+ years" to "we don't need to worry about it now."

My AI Predictions for 2027
elifland · 2mo* · 62

Okay, it sounds like our disagreement basically boils down to the value of the forecasts as well as the value of the scenario format (does that seem right?), which I don't think is something we'll come to agreement on.

Thanks again for writing this up! I hope you're right about timelines being much longer and 2027 being insane (as I mentioned, it's faster than my median has ever been, but I think it's plausible enough to take seriously).

edit: I'd also be curious for you to specify what you mean by academic. The scenario itself seems like a very unusual format for academia. I think it would have seemed more seriously academic if we had ditched the scenario format.

My AI Predictions for 2027
elifland · 2mo · 400

Thanks for writing this up, glad to see the engagement! I've only skimmed and have not run this by any other AI 2027 authors, but a few thoughts on particular sections:

My predictions for AI by the end of 2027

I agree with most but not all of these in the median case; AI 2027 was roughly my 80th percentile aggressiveness prediction at the time.

Edited to add: I feel like I should explicitly list the ones that I have <50% on:

AI still can't tell novel funny jokes, write clever prose, generate great business ideas, invent new in-demand products, or generate important scientific breakthroughs, except by accident.

I disagree re: novel funny jokes; it seems plausible that this bar has already been passed. I agree with the rest except maybe clever prose, depending on the operationalization.

LLMs are broadly acknowledged to be plateauing, and there is a broader discussion about what kind of AI will have to replace it.

Disagree but not super confident.

Most breakthroughs in AI are not a result of directly increasing the general intelligence/"IQ" of the model, e.g. advances in memory, reasoning or agency. AI can stay on task much longer than before without supervision, especially for well-specified, simple tasks. Especially since AI coding platforms will have gotten better at tool use and allowing AI to manually test the thing they're working on. By the end of 2027, AI can beat a wide variety of video games it hasn't played before.

I disagree with the first clause, but I'm not sure what you mean because advances in reasoning and agency seem to me like examples of increases in general intelligence. Especially staying on task for longer without supervision. Are you saying that these reasoning and agency advances will mostly come from scaffolding rather than the underlying model getting smarter? That I disagree with.

There is more public discussion on e.g. Hacker News about AI code rot and the downsides of using AI. People have been burned by relying too much on AI. But I think non-coders running businesses will still be hyped about AI in 2027.

Disagree on the first two sentences.

AI still can't drive a damned car well enough that if I bought a car I wouldn't have to.

I don't follow self-driving stuff much, but this might depend on location? Seems like good self-driving cars are getting rolled out in limited areas at the moment.


As you touch on later in your post, it's plausible that we made a mistake by focusing on 2027 in particular:

But I do worry about what happens in 2028, when everyone realizes none of the doomsday stuff predicted in 2025 actually came true, or even came close. Then the AI alignment project as a whole may risk being taken as seriously as the 2012 apocalypse theory was in 2013. The last thing you want is to be seen as crackpots.

I think this is a very reasonable concern, and we probably should have done better in our initial release at making our uncertainty about timelines clear (and/or taken the time to rewrite and push back to a later time frame, e.g. once Daniel's median changed to 2028). We are hoping to do better on this in future releases, including via just having scenarios be further out, and perhaps better communicating our timelines distributions.

Also:

Listening to several of the authors discuss the AI 2027 predictions after they were published leads me to believe they don't intuitively believe their own estimates.

What do you mean by this? My guess is that it's related to the communication issues on timelines?

The Takeoff Forecast is Based on Guesswork

Agree.

The Presentation was Misleading

Nothing wrong with guesswork, of course, if it's all you've got! But I would have felt a lot better if the front page of the document had said "AI 2027: Our best guess about what AI progress might look like, formulated by using math to combine our arbitrary intuitions about what might happen."

But instead it claims to be based on "trend extrapolations, wargames, expert feedback, experience at OpenAI, and previous forecasting successes", and links to 193 pages of data/theory/evidence.

They never outright stated it wasn't based on vibes, of course, and if you dig into the document, that's what you find out.

I very much understand this take and where you're coming from, because it's a complaint I've had regarding some previous timelines/takeoff forecasts.

Probably some of our disagreement is very tied-in to the object-level disagreements about the usefulness of doing this sort of forecasting; I personally think that although the timelines and takeoff forecasts clearly involved a ton of guesswork, they are still some of the best forecasts out there, and we need to base our timelines and takeoff forecasts on something in the absence of good data.

But still, since we both agree that the forecasts rely on lots of guesswork, even if we disagree on their usefulness, we might be able to have some common ground when discussing whether the presentation was misleading in this respect. I'll share a few thoughts from my perspective below:

  1. I think it's a very tricky problem to communicate that we think AI 2027 and its associated background research are some of the best stuff out there but still rely on tons of guesswork, because there's simply not enough empirical data to precisely forecast when AGI will arrive, how fast takeoff will be, and what effects it will have. It's very plausible that we messed up in some ways, including in the direction that you posit.
  2. Keep in mind that we have to optimize for a bunch of different audiences; I'd guess that for each direction (i.e. taking the forecast too seriously vs. not seriously enough), many people came away with conclusions too far in that direction, from my perspective. This also means that some others have advertised our work in a way that seems like overselling to me, though others have IMO undersold it.
  3. As you say, we tried to take care to not overclaim regarding the forecast, in terms of the level of vibes it was based on. We also explicitly disclaimed our uncertainty in several places, e.g. in the expandables "Why our uncertainty increases substantially beyond 2026" and "Our uncertainty continues to increase." as well as "Why is it valuable?" right below the foreword.
  4. Should we have had something stronger in the foreword or otherwise more prominent on the frontpage? Yeah, perhaps; we iterated on the language a bunch to try to make it convey all of (a) that we put quite a lot of work into it, (b) that we think it's state-of-the-art or close on most dimensions and represents substantial intellectual progress, (c) the right impression about our uncertainty level, and (d) no overclaiming regarding the methodology. But we might have messed up these tradeoffs.
    1. You proposed "AI 2027: Our best guess about what AI progress might look like, formulated by using math to combine our arbitrary intuitions about what might happen." This seems pretty reasonable to me, except, as you might guess, I take issue with the connotation of arbitrary. In particular, I think there's reason to trust our intuitions regarding guesswork given that we've put more thinking time into this sort of thing than all but a few people in the world, our guesswork was also sometimes informed by surveys (which were still very non-robust, to be clear, but I think improve upon previous work in terms of connecting surveys to takeoff estimates), and we have a track record to at least some extent. So I agree with arbitrary in the sense that we can't ground out our intuitions in solid data, but my guess is that it gives the wrong connotation in terms of what weight the guesswork should be given relative to other forms of evidence.
      1. I'd also not emphasize math if we're discussing the scenario as opposed to timelines or takeoff speeds in particular.
  5. My best guess is that for the timelines and takeoff forecasts, we should have had a stronger disclaimer or otherwise made it more clear in the summary that they are based on lots of guesswork. I also agree that the summaries at the top had pretty substantial room for improvement.
    1. I'm curious what you would think of something like this disclaimer in the timelines forecast summary (and a corresponding one in takeoff): Disclaimer: This forecast relies substantially on intuitive judgment, and involves high levels of uncertainty. Unfortunately, we believe that incorporating intuitive judgment is necessary to forecast timelines to highly advanced AIs, since there simply isn’t enough evidence to extrapolate conclusively.
      1. I've been considering adding something like this but haven't quite gotten to it for various reasons; potentially I should prioritize it more highly.
    2. We're also working on updates to these models and will aim to do better at communicating in the future! And will take into account suggestions.
    3. I think this might have happened because it's clear to us that we can't make these sorts of forecasts without tons of guesswork, and we didn't have much slack in terms of time spent thinking about how these supplements would read to others; I perhaps made a mistake similar to one that I have previously criticized others for.
  6. (I had edited to add this paragraph in, but I'm going to actually strike it out for now because I'm not sure I'm doing a good job accurately representing what happened and it seems important to do so precisely; I'll still leave it up because I don't want to feel like I'm censoring something that I already had in a version of the comment.) Potentially important context is that our median expectation was that AI 2027 would do much worse than it did, so we were mostly spending time trying to increase the expected readership (while of course following other constraints like properly disclaiming uncertainty). I think we potentially should have spent a larger fraction of our time thinking "if this got a ton of readership, then what would happen"; to be clear, we did spend time thinking about this, but it might be important context that we did not expect AI 2027 to get so many readers, so a lot of our headspace was around increasing readership.

Linking to some other comments I've written that are relevant to this: here, here

Buck's Shortform
elifland · 2mo · 41

In terms of general intelligence including long-horizon agency, reliability, etc., do we think AIs are yet, for example, as autonomously good as the worst professionals? My instinct is no for many of them, even though the AIs might be better at the majority of sub-tasks and are very helpful as collaborators rather than fully replacing someone. But I'm uncertain; it might depend on the operationalization and profession, and for some professions the answer seems clearly yes.[1][2] It also seems harder to reason about the literally least capable professional than about something like the 10th percentile.

If the answer is no and we're looking at the ability to fully autonomously replace humans, this would mean the village idiot -> Einstein claim might technically not be falsified. The spirit of the claim might be though, e.g. in terms of the claimed implications.

  1. ^

    There's also a question of whether we should include physical abilities; if so, then the answer would clearly be no for those professions or tasks.

  2. ^

    One profession for which it seems likely that the AIs are better than the least capable humans is therapy. Also teaching/tutoring. In general this seems true for professions that can be done via remote work and don't involve much required computer use or long-horizon agency.

Buck's Shortform
elifland · 2mo · 132

I'd be excited for people (with the aid of LLMs) to go back and grade how various past predictions from MIRI folks are doing, plus ideally others who disagreed. I just read back through part of https://www.lesswrong.com/posts/vwLxd6hhFvPbvKmBH/yudkowsky-and-christiano-discuss-takeoff-speeds and my quick take is that Paul looks mildly better than Eliezer, due to predicting larger impacts/revenue/investment pre-AGI (which we appear to be on track for and to some extent are already seeing) and predicting a smoother increase in coding abilities. But it's hard to say, in part because Eliezer mostly didn't want to make confident predictions; also I think Paul was wrong about Nvidia, but that felt like an aside.

edit: oh, also there's the IMO bet; I didn't get to that part on my partial re-read. That one goes to Eliezer.

IEM and the Yudkowsky-Hanson debate also seem like potentially useful sources to look through, as well as things that I'm probably forgetting or unaware of.

Posts (sorted by new)

Recent and forecasted rates of software and hardware progress (46 points, Ω, 4mo, 0 comments)
How 2025 AI Forecasts Fared So Far (11 points, 5mo, 2 comments)
Slow corporations as an intuition pump for AI R&D automation (91 points, Ω, 6mo, 23 comments)
Forecasting time to automated superhuman coders [AI 2027 Timelines Forecast] (35 points, Ω, 7mo, 0 comments)
AI 2027: What Superintelligence Looks Like (668 points, Ω, 7mo, 222 comments)
Predict 2025 AI capabilities (by Sunday) (55 points, 9mo, 3 comments)
Scenario Forecasting Workshop: Materials and Learnings (50 points, Ω, 2y, 3 comments)
Forecasting future gains due to post-training enhancements (31 points, Ω, 2y, 2 comments)
Discussing how to align Transformative AI if it's developed very soon (37 points, 3y, 2 comments)
Eli's review of "Is power-seeking AI an existential risk?" (67 points, Ω, 3y, 0 comments)

Wikitag Contributions

Successor alignment (9 months ago, +138)
Responsible Scaling Policies (2 years ago, +69/-48)
Responsible Scaling Policies (2 years ago, +48)
Chain-of-Thought Alignment (3 years ago, +46)
Chain-of-Thought Alignment (3 years ago)
2017-2019 AI Alignment Prize (3 years ago)
2017-2019 AI Alignment Prize (3 years ago, +19/-16)