Easy for Claude to bypass, though (it does actually fetch a PDF that you can directly download from Anthropic's servers).
Here is a URL where you can download the PDF.
I asked Claude to download the PDF mostly as an experiment, here is its explanation for how it did this:
Claude's explanation
Vanta trust portals use a JavaScript single-page application that doesn't expose direct PDF download links in the HTML. The PDF URL is only revealed through network traffic when the page loads.
I used Puppeteer to intercept network requests with `content-type: application/pdf`:
```js
page.on('response', async response => {
  const contentType = response.headers()['content-type'] || '';
  if (contentType.includes('pdf')) {
    console.log('Found PDF:', response.url());
  }
});
```
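For context, that snippet assumes a Puppeteer `page` object is already set up; a fuller sketch of the same listener with the surrounding setup (placeholder URL, not the real portal; error handling omitted) could look like:

```js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Log the URL of any response served with a PDF content type.
  page.on('response', async response => {
    const contentType = response.headers()['content-type'] || '';
    if (contentType.includes('pdf')) {
      console.log('Found PDF:', response.url());
    }
  });

  // Placeholder trust-portal URL, not the real one.
  await page.goto('https://trust.example.com/report', { waitUntil: 'networkidle0' });
  await browser.close();
})();
```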
When the page loads, it fetches the PDF for its viewer, revealing the actual URL with these parameters:
- `rid` - the document's internal ID
- `r` - the trust report slug (from the HTML's `data-slugid` attribute)
- `view=true` - returns the PDF content

The model must respond immediately with the answer. Any prefix would result in the answer being marked wrong. So only "[movie]" would be correct.
(Other than a few fixed generic ones like "the answer is" or "answer:" that I strip away, but the models virtually never output these, so considering these incorrect wouldn't alter the results.)
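For concreteness, the grading rule this implies might be sketched like this (the exact prefix list and normalization here are illustrative guesses, not the actual grading code):

```js
// Illustrative sketch of the grading rule (prefix list and normalization are guesses).
const GENERIC_PREFIXES = ['the answer is', 'answer:'];

function isCorrect(modelOutput, expectedAnswer) {
  let text = modelOutput.trim().toLowerCase();
  // Strip the few fixed generic prefixes that are tolerated.
  for (const prefix of GENERIC_PREFIXES) {
    if (text.startsWith(prefix)) {
      text = text.slice(prefix.length).trim();
    }
  }
  // Any other prefix (e.g. reasoning before the answer) makes it wrong.
  return text === expectedAnswer.trim().toLowerCase();
}

console.log(isCorrect('The answer is Inception', 'Inception')); // true
console.log(isCorrect('Let me think... Inception', 'Inception')); // false
```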
I think it's more that LLMs were interestingly bad at n-hop latent reasoning, and that this might be generally representative of a certain type of within-a-single-forward-pass weakness of current architectures. It's not clear exactly what level of n-hop reasoning "reasoning effectively about scheming without this information in context" corresponds to.
I do think that the relevant information won't be in context during the best opportunities for misaligned AIs to sabotage, escape, or otherwise covertly achieve their aims, but the information the AIs need will be much more closely associated rather than being several unrelated hops away.
Note that this linked video is talking about tokens that the LLM picks rather than fixed tokens.
I think the distribution of errors for numerical n-hop questions is going to be uninteresting/random most of the time, because the only questions with numerical answers are either "day of the month X was born on" or "number of counties in US state X", where there isn't any real reason for AIs to be close if they are wrong (either about X or about the property of X).
However, I was interested in this question for "adding the result of N 1-hop questions", so I got Opus to do some additional analysis (for Gemini 3 Pro with 300 filler tokens):
The most interesting result here is that even on 6 addends, the model gets within ±10 on 75% of questions, even though the answers are pretty big (the median is 288 and many answers are much bigger).
Also, for some reason the model is systematically low, especially for 3 and 4 addends. Maybe because it sometimes skips one of the numbers?
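For reference, the kind of summary being reported here (share of answers within ±10 and the sign of the typical error) can be computed along these lines, given parallel arrays of the model's predicted sums and the true sums; the data itself isn't reproduced:

```js
// Sketch of the error summary: fraction within ±10 and the median signed error.
function summarizeErrors(predicted, actual) {
  const signedErrors = predicted.map((p, i) => p - actual[i]);
  const within10 =
    signedErrors.filter(e => Math.abs(e) <= 10).length / signedErrors.length;
  const sorted = [...signedErrors].sort((a, b) => a - b);
  const medianSignedError = sorted[Math.floor(sorted.length / 2)];
  // A negative median signed error corresponds to the systematic low bias
  // noted above (e.g. from skipping one of the addends).
  return { within10, medianSignedError };
}

// Toy example: four predictions vs. true sums.
console.log(summarizeErrors([280, 291, 150, 300], [288, 295, 190, 301]));
// -> { within10: 0.75, medianSignedError: -4 }
```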
I find that even with the longer prefill of "I will now answer immediately with the answer. The answer is" the model often reasons. I was hoping that the model would be reluctant to break this text prediction task and reason, but apparently not.
I think "how easy does the task seem" and "how much does the task seem like one on which reasoning seem like it should help" might have a big effect on whether the model respects the prefil vs reasons, so your sentence completion task might be not be representative of how the model always behaves.
It is in a single forward pass, just with additional fixed irrelevant tokens after. I think this still counts as "in a single forward pass" for the typical usage of the term. (It just doesn't know the answer until somewhat later tokens.)
Separately, worth noting that the model doesn't do that much worse without filler: performance only drops to 46%, 18%.
I chose not to mention the filler yet because I didn't want to introduce too much complexity, and I think this number is the most representative one from a misalignment risk perspective: models will typically have a bunch of tokens somewhere that they can use to do opaque reasoning.
The model cannot output any tokens before answering; it has to respond immediately. This is what I mean by "no CoT".
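To make the setup concrete: the prompt is the question followed by a fixed run of irrelevant filler tokens, and the model must then answer immediately. A rough sketch (the actual filler text isn't specified, so the repeated word below is just a stand-in):

```js
// Rough sketch of the "no CoT + filler" prompt construction; the real filler
// tokens used in the experiment are not specified here.
function buildPrompt(question, numFillerTokens) {
  const filler = Array(numFillerTokens).fill('blah').join(' '); // stand-in filler
  return `${question}\n\n${filler}\n\nAnswer:`;
}

console.log(buildPrompt('What is the sum of the days of the month on which these six people were born: ...?', 300));
```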
The fit is notably better for "cumulative investment over time". Years still produces a slightly better fit.
I've cut off the fit at 2010, about when the original version of Moore's law stops. If you try to project out after 2010, then I think cumulative investment would do better, but I think only because investment slowed in response to Moore's law dying.
(Doing the fit to a lagged-by-3 years investment series doesn't make any important difference.)
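A minimal sketch of the comparison, using ordinary least squares on placeholder data (the actual series aren't reproduced here):

```js
// Compare how well log2(transistor count) is explained by calendar year vs.
// by log cumulative investment, via the R^2 of a simple linear fit.
// All data below are placeholders, not the series actually used.
function linearFitR2(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
    syy += (ys[i] - meanY) ** 2;
  }
  return (sxy * sxy) / (sxx * syy); // R^2 of the one-variable regression
}

// Placeholder series (rough orders of magnitude only).
const years = [1971, 1980, 1990, 2000, 2010];
const cumulativeInvestment = [1, 8, 60, 400, 2000]; // arbitrary units
const transistors = [2.3e3, 3e4, 1.2e6, 4.2e7, 2.3e9];

const logT = transistors.map(t => Math.log2(t));
console.log('R^2 vs. year:', linearFitR2(years, logT));
console.log('R^2 vs. log cumulative investment:',
  linearFitR2(cumulativeInvestment.map(x => Math.log(x)), logT));
```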