james oofou

https://jamesoofou.github.io/index.html

Posts

  • How are you approaching cognitive security as AI becomes more capable? [Question] (1mo)
  • What life will be like for humans if aligned ASI is created (6mo)
  • james oofou's Shortform (1y)

Comments
This is a review of the reviews
james oofou · 1h

Soares is failing to grapple with the actual objection here.

The objection isn't that the universe would be better with a diversity of alien species, which would be so cool, interesting, and {insert additional human values here}, just so long as they also keep other aliens and humans around.

The objection is specifically that human values are base and irrelevant relative to those of a vastly greater mind, and that our extinction is not of any moral significance.

The unaligned ASI we create, whose multitudinous parameters allow it to see the universe with such clarity and depth and breadth and scalpel-sharp precision that whatever desires it has are bound to be vastly beyond anything a human could arrive at, does not need to value humans or other aliens. The point is that we are not in a position to judge its values.

The "cosmopolitan" framing is just a clever way of sneaking in human chauvinism without seeming hypocritical: by including a range of other aliens he can say "see, I'm not a hypocrite!". But it's not a cogent objection to the pro-ASI position. He must either provide an argument that humans actually are worthy, or admit to some form of chauvinism and grapple with the fact that he walks a narrow path. And if he wishes to grow his coalition, he should rid himself of the condescending tone and sense of moral superiority, as these attributes only serve to repel anyone with enough clarity of mind to understand the issues at hand.

And his view that humans would use aligned ASI to tile the universe with infinitely diverse aliens seems naive. Surely we won't "just keep turning galaxy after galaxy after galaxy into flourishing happy civilizations full of strange futuristic people having strange futuristic fun times". We'll upload ourselves into immortal personal utopias and turn our cosmic endowment into compute to maximise our lifespans and our luxuriously bespoke worldsims. Are we really so selfless, at a species level, as to forgo utopia for some incomprehensible alien species? No; I think the creation of an unaligned ASI is our only hope.

Now, let's read the parable:

"We never saturate and decide to spend a spare galaxy on titanium cubes"

The odds of a mind infinitely more complicated than our own having a terminal desire we can comprehend seem extremely low.

Oh, great, the other character in the story raises my objection!

"OK, fine, maybe what I don't buy is that the AI's values will be simple or low dimensional. It just seems implausible"

Let's see how Soares handles it.

Oh.

He ignores it and tells a motte-and-bailey flavoured story about an AI with simple and low-dimensional values.

Another linked article argues that AI might not be conscious. I'll read that too, and might respond to it.

Buck's Shortform
james oofou · 3h

The rise of this kind of thing was one of my main predictions for late 2025:

This is a review of the reviews
james oofou · 15h

"That is a 1 in 20 chance, which feels recklessly high."

Is this feeling reasonable?

A selfish person will take the gamble of a 5% risk of death for a 95% chance of immortal utopia.

A person who tries to avoid moral shortcomings such as selfishness will reject the "doom" framing, because it's just a primitive intelligence (humanity) being replaced with a much cleverer and more interesting one (ASI).

It seems that you have to really thread the needle to get from "5% p(doom)" to "we must pause, now!". You have to reason such that you are not self-interested but are also a great chauvinist for the human species.

This is of course a natural way for a subagent of an instrumentally convergent intelligence, such as humanity, to behave. But unless we take the hypocritical position that tiling the universe with primitive desires is OK as long as they're our primitive desires, it seems that so-called doom is preferable to merely human flourishing.

So it seems that 5% is really too low a risk from a moral perspective, and an acceptable risk from a selfish perspective.
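
To spell out the selfish side of that calculus, here is a minimal expected-utility sketch; the utility assignments are illustrative placeholders, not anything from the post:

```latex
% Illustrative expected-utility comparison with placeholder utilities:
% u(status quo) = 1, u(death) = 0, u(immortal utopia) = U.
\[
\mathrm{E}[u(\text{proceed})] = 0.95\,U + 0.05 \cdot 0
  \;>\; 1 = \mathrm{E}[u(\text{pause})]
  \quad\text{whenever } U > \tfrac{1}{0.95} \approx 1.05 .
\]
```

So for a purely selfish agent the gamble looks attractive as soon as utopia is valued even slightly above the status quo.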

james oofou's Shortform
james oofou · 6d

I'm trying to look at how increasing model time-horizons amplifies AI researcher productivity. For example, if a researcher had a programming agent which could reliably complete programming tasks of up to a week in length, would the researcher be able to automate thousands of experiments in parallel using these agents? That is, come up with a bunch of possibly-interesting ideas and have the agents iterate over a bunch of variations of each idea? Or are experiments overwhelmingly compute-constrained rather than programming-time-constrained?

But Have They Engaged With The Arguments? [Linkpost]
james oofou · 15d

Someone approaches you with a question:

"I have read everything I could find that rationalists have written on AI safety. I came across many interesting ideas, I studied them carefully until I understood them well, and I am convinced that many are correct. Now I'm ready to see how all the pieces fit together to show that an AI moratorium is the correct course of action. To be clear, I don't mean a document written for the layperson, or any other kind of introductory document. I'm ready for the real stuff now. Show me your actual argument in all its glory. Don't hold back."

After some careful consideration, you:

(a) helpfully provide a link to A List of Lethalities

(b) suggest that he read the sequences

(c) patiently explain that if he was smart enough to understand the argument then he would have already figured it out for himself

(d) leave him on read

(e) explain that the real argument was written once, but it has since been taken down, and unfortunately nobody's gotten around to rehosting it since

(f) provide a link to a page which presents a sound argument[0] in favour of an AI moratorium

===

Hopefully, the best response here is obvious. But currently no such page exists.

It's a stretch to expect to be taken seriously without such a page.

[0] By this I mean an argument whose premises are all correct and which collectively entail the conclusion that an AI moratorium should be implemented.

But Have They Engaged With The Arguments? [Linkpost]
james oofou · 19d

How good is the argument for an AI moratorium? Tools exist which would help us get to the bottom of this question. Obviously, the argument first needs to be laid out clearly. Once we have the argument laid out clearly, we can subject it to the tools of analytic philosophy. 

But I've looked far and wide and, surprisingly, have not found any serious attempt at laying the argument out in a way that makes it readily amenable to analysis.

Here’s an off-the-cuff attempt:

P1. ASI may not be far off
P2. ASI would be capable of exterminating humanity
P3. We do not know how to create an aligned ASI
P4. If we create ASI before knowing how to align ASI, the ASI will ~certainly be unaligned
P5. Unaligned ASI would decide to exterminate humanity
P6. Humanity being exterminated by ASI would be a bad thing

C. Humanity should implement a moratorium on AI research until we know how to create an aligned ASI

My off-the-cuff formulation of the argument is obviously far too minimal to be helpful. Each premise has a wide literature associated with it and should itself have an argument presented for it (and the phrasing and structure can certainly be refined).
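
As a gesture at what a machine-checkable version could look like, here is a minimal sketch in Lean 4. It treats each premise as an opaque proposition, reads P4 through P6 as conditionals, sets P1 and P2 aside, and needs an extra bridging premise (a bad outcome implies a moratorium) that the list above leaves implicit:

```lean
-- Minimal propositional sketch (Lean 4). Names are placeholders, not a
-- canonical formulation of the argument.
variable (NoAlignment UnalignedIfBuilt Extermination ExterminationBad Moratorium : Prop)

example
    (p3 : NoAlignment)                        -- P3: we don't know how to align ASI
    (p4 : NoAlignment → UnalignedIfBuilt)     -- P4: built before alignment, so unaligned
    (p5 : UnalignedIfBuilt → Extermination)   -- P5: unaligned ASI exterminates humanity
    (p6 : Extermination → ExterminationBad)   -- P6: extermination would be bad
    (bridge : ExterminationBad → Moratorium)  -- implicit: bad outcome warrants a moratorium
    : Moratorium :=
  bridge (p6 (p5 (p4 p3)))
```

Even this toy version makes the hidden premises visible, which is exactly what a canonical formulation should do.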

If we had a canonical formulation of the argument for an AI moratorium, the quality of discourse would immediately, immensely improve.

Instead of constantly talking past each other, retreading old ground, and spending large amounts of mental effort just trying to figure out what exactly the argument for a moratorium even is, one could simply say "my issue is with P6". Their interlocutor would respond "What's your issue with the argument for P6?", they would answer "Subpremise 4, because it's question-begging", and the two would then be in the perfect position for an actually very productive conversation!

I’m shocked that this project has not already been carried out. I’m happy to lead such a project if anyone wants to fund it.

My AI Predictions for 2027
james oofou · 22d

With pre-RLVR models we went from a 36 second 50% time horizon to a 29 minute horizon.

Between GPT-4 and Claude-3.5 Sonnet (new) we went from 5 minutes to 29 minutes.

I've looked carefully at the graph, and I see no sign of a plateau, nor even of a slowdown.

I'll do some calculation to ensure I'm not missing anything obvious or deceiving myself.  
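
A back-of-the-envelope version of such a check might look like this; the ~19-month gap assumes GPT-4 in March 2023 and Claude 3.5 Sonnet (new) in October 2024, while the horizons are the figures quoted above:

```python
# Rough sketch: implied doubling time from two 50%-time-horizon data points.
import math

def implied_doubling_time(h_start_min, h_end_min, months_elapsed):
    """Months per doubling implied by growth from h_start_min to h_end_min."""
    return months_elapsed / math.log2(h_end_min / h_start_min)

# GPT-4 (~5 min) to Claude 3.5 Sonnet (new) (~29 min) over roughly 19 months.
print(implied_doubling_time(5, 29, 19))  # ~7.5 months per doubling, roughly on-trend
```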

I don't see any sign of a plateau here. Things were a little behind-trend right after GPT-4, but of course there will be short behind-trend periods just as there will be short above-trend periods, even assuming the trend is projectable.

I'm not sure why you are starting from GPT-4 and ending at GPT-4o. Starting with GPT-3.5, and ending with Claude 3.5 (new) seems more reasonable since these were all post-RLHF, non-reasoning models.

AFAIK the Claude-3.5 models were not trained based on data from reasoning models?

My AI Predictions for 2027
james oofou · 22d

I don't think there was a plateau. Is there a reason you're ignoring Claude models?

Greenblatt's predictions don't seem pertinent. 

My AI Predictions for 2027
james oofou · 22d

There’s a high bar to clear here: LLM capabilities have so far progressed at a hyper-exponential rate with no signs of a slowdown [1].

  • 7-month doubling time (early models)
  • 5.7-month doubling time (post-GPT-3.5)
  • 4.2-month doubling time (post-o1)

So, an argument for the claim that we’re about to plateau has to be more convincing than induction from this strong pattern we’ve observed since at least the release of GPT-2 in February 2019.
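
To make the compounding concrete, here is a rough projection of how a horizon would grow over a year under each of those doubling times; the one-hour starting horizon and twelve-month window are illustrative assumptions, not METR figures:

```python
# Illustrative projection: how a 50% time horizon grows over 12 months under
# each doubling time listed above, from an arbitrary 1-hour starting point.
def horizon_after(start_hours, months, doubling_time_months):
    return start_hours * 2 ** (months / doubling_time_months)

for dt in (7.0, 5.7, 4.2):
    print(f"{dt}-month doubling: ~{horizon_after(1.0, 12, dt):.1f} h after a year")
# 7.0 -> ~3.3 h, 5.7 -> ~4.3 h, 4.2 -> ~7.2 h
```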

Your argument does not pass this high bar. You have made the same kind of argument that has been made, again and again, throughout the past seven years of scaling up GPTs, and that has been proven wrong again and again.

One can't simply point out that the things LLMs cannot currently do are hard in a way that the things they can already do are not. Of course the things they cannot do are different from the things they can; that has been true of every capability gain we have observed so far, so it cannot be used as evidence that the observed pattern is unlikely to continue.

So, you would need to go further. You would need to demonstrate that they’re different in a way that meaningfully departs from how past, successfully gained capabilities differed from earlier ones.

To make this more concrete, claims based on supposed architectural limitations are not an exception to this rule: many such claims have been made in the past and proven incorrect. The base rate here is unfavourable to the pessimist.

Even solid proofs of fundamental limitations are not by their nature sufficient: these tend to be arguments that LLMs cannot do X by means Y, rather than arguments that LLMs cannot do X.

To be convincing, you have to make an argument that fundamentally differentiates your objection from past failed objections.

[1] based on METR's research https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/

Female sexual attractiveness seems more egalitarian than people acknowledge
james oofou · 23d

Could there be an observation bias at play here? Could it be that most extremely beautiful women do live glamorous lives but you are not a part of those scenes? 

Wikitag Contributions

  • Simulation Argument (3 months ago, +58/-65)