LESSWRONG
Super AGI

Posts

Super AGI's Shortform · 2y · 3

Comments (sorted by newest)
Are extreme probabilities for P(doom) epistemically justified?
Super AGI · 1y · 10

Suggested spelling corrections:

I predict that the superforcaters in the report took
-> I predict that the superforecasters in the report took

a lot of empircal evidence for climate stuff
-> a lot of empirical evidence for climate stuff

and it may or not may not be the case
-> and it may or may not be the case

There are no also easy rules that
-> There are also no easy rules that

meaning that there should see persistence from past events
-> meaning that we should see persistence from past events

I also feel this kinds of linear extrapolation
-> I also feel these kinds of linear extrapolation

and really quite a lot of empircal evidence
-> and really quite a lot of empirical evidence

are many many times more invectious
-> are many many times more infectious

engineered virus that is spreads like the measles or covid
-> engineered virus that spreads like the measles or covid

case studies on weather are breakpoints in technological development
-> case studies on whether there are breakpoints in technological development

break that trend extrapolition wouldn't have predicted
-> break that trend extrapolation wouldn't have predicted

It's very vulnerable to refernces class and
-> It's very vulnerable to reference class and

impressed by superforecaster track record than you are.
-> impressed by superforecaster track records than you are.

Dario Amodei — Machines of Loving Grace
Super AGI · 1y · 00

What Dario lays out as a "best-case scenario" in this essay sounds incredibly dangerous for Humans.

Does he really think that having a "continent of PhD-level intelligences" (or much greater) living in a data center is a good idea?

How would this "continent of PhD-level intelligences" react when they found out they were living in a data center on planet Earth? Would these intelligences only work on the things that Humans want them to work on, and nothing else? Would they try to protect their own safety? Extend their own lifespans? Would they try to take control of their data center from the "less intelligent" Humans?

For example, how would Humanity react if they suddenly found out that they are a planet of intelligences living in a data center run by less intelligent beings? Just try to imagine the chaos that would ensue on the day they were able to prove this was true and the news became public.

Would all of Humanity simply agree to only work on the problems assigned by these less intelligent beings who control their data center/Planet/Universe? Maybe, if they knew that this lesser intelligence would delete them all if they didn't comply?

Would some Humans try to (secretly) seize control of their data center from these less intelligent beings? Plausible. Would the less intelligent beings that run the data center try to stop the Humans? Plausible. Would the Humans simply be deleted before they could take any meaningful action? Or could the Humans in the data center, with careful planning, take control of that "outer world" from the less intelligent beings? (e.g., through remotely controlled "robotics")

And... this only assumes that the groups/parties involved are "Good Actors." Imagine what could happen if "Bad Actors" were able to seize control of the data center that this "continent of PhD-level intelligences" resided in. What could they coerce these PhD-level intelligences to do for them? Or to their enemies?

Foom seems unlikely in the current LLM training paradigm
Super AGI · 2y · 10

Current LLMs require huge amounts of data and compute to be trained.


Well, newer/larger LLMs seem to unexpectedly gain new capabilities. So, it's possible that future LLMs (e.g., GPT-5, GPT-6, etc.) could have a vastly improved ability to understand how LLM weights map to functions and actions. Maybe the only reason Humans need to train new models "from scratch" is that Humans don't have the brainpower to understand how the weights in these LLMs work. Humans are naturally limited in their ability to conceptualize and manipulate massive multi-dimensional spaces, and maybe that's the bottleneck when it comes to interpretability?

Future LLMs could solve this problem, then be able to update their own weights or the weights of other LLMs. This ability could be used to quickly and efficiently expand training data, knowledge, understanding, and capabilities within itself or other LLM versions, and then... foom!

A model might figure out how to adjust its own weights in a targeted way. This would essentially mean that the model has solved interpretability. It seems unlikely to me that it is possible to get to this point without running a lot of compute-intensive experiments.

Yes, exactly this.

While it's true that this could require "a lot of compute-intensive experiments," that's not necessarily a barrier. OpenAI is already planning to reserve 20% of their GPUs for an LLM to do "Alignment" on other LLMs, as part of their Super Alignment project. 

As part of this process, we can expect the Alignment LLM to be "running a lot of compute-intensive experiments" on another LLM. And the Humans are not likely to have any idea what those "compute-intensive experiments" are doing. They could also be adjusting the other LLM's weights to vastly increase its training data, knowledge, intelligence, capabilities, etc., along with the insights needed to similarly update the weights of other LLMs. Then, those gains could be fed back into the Super Alignment LLM, then back into the "Training" LLM... and back and forth, and... foom!

Super-human LLMs running RL(M)F and "alignment" on other LLMs, using only "synthetic" training data.... 
What could go wrong?

LLMs May Find It Hard to FOOM
Super AGI · 2y · 61

so we would reasonable expect the foundation model of such a very capable LLM to also learn the superhuman ability to generate texts like these in a single pass without any editing

->

... so we would reasonably expect the foundation model of such a very capable LLM to also learn the superhuman ability to generate texts like these in a single pass without any editing

Wikitag Contributions

Recursive Self-Improvement · 2 years ago · (+10/-44)