Howdy y'all, my nom de plume 'round these parts is onslaught.
I'm really into futurism, GCR mitigation, science for good, broad moral circles, and ~consequentialist utilitarianism. I think building superintelligent systems is an inherently unsafe endeavor.
Naturally, there are also a lot of other causes and interests that are dear to me. Consciousness, wild animal welfare, prisons and modern slavery, aliens, radical political change, human and animal intelligence augmentation, startups/entrepreneurship, airships, pirates, etc.
I think this puts a lot of weight on a bespoke definition of "interesting", and that kind of obscures what you're saying. I feel similarly about your use of the concept of creativity.
I think that current LLMs are extremely "creative" by many plausible definitions of that word, so I guess it doesn't really carve things at the joints for me. Visual art, stories, plausible baby names, what sorts of recipes you can try with x ingredients: all written-on-the-tin use cases for these things.
I do not believe that LLMs think in much the manner that we do at all. I just don't think I would pitch that as lacking some true spark of creativity or something. What you're saying is too opaque to me.
"Metalhead" from Black Mirror is a relevant contemporary art piece.
I for one find Spot spooky as hell. I would go as far as to say that I have heard others express discomfort toward Boston Dynamics demo videos.
Also, sentry guns and UAVs seem like strong examples of extant scary robots. Maybe see also autonomousweaponswatch.org.
Hey, very interesting post!
I enjoyed the "become a superintelligence" framing / thought experiment. I am not sure I fully believe the "armed balance of power" between superintelligences part as stated, but I can imagine something similar I might agree with.
Ultra-charged human intelligence and IA are both interesting areas to me. I think writing and database search are both interesting examples of IA information technologies to use as a foil to a world of AI servants.
I couldn't really understand what Crystallect was doing/going for, but I liked your discussion of programming. I wish you all the best with it. I am reminded of Conjecture because of the IA and custom programming language design. It is different, though: they are attempting to build a special programming language to harness LLM intelligence in a much more "code-like" and "human-accessible" way.
But who night-watches the night-watchman?
Thank you for providing a more concrete vision of a sort of "light touch good guy singleton AI" world.
I agree that "AI for peace" is broadly good. I also believe that we should collectively prioritize existential security and that this is maybe one of the most noble uses of "power" and "advanced AI".
However, I still have some serious hangups with this plan. For one thing, while I understand the desire to kick certain questions of cosmic importance down the road to a more reflective process than oneself, I do not really think that's what this plan buys you.
Basically, I would push back against the premise that this shouldn't mostly be considered a form of mass disempowerment and lock-in. Cf. Joe Carlsmith:
Classic problem with, “Ooh, let’s have more top-down control,” is like you love it when it’s the perfect way, but then it’s not the perfect way and then maybe it’s worse. People discover this in government.
Even with noble goals in mind, I don't think I agree that it is necessarily noble or legitimate to opt for the "defer all power to a single machine superintelligence" strategy. It feels too totalizing and unipolar. It totally gives me "what if we had the state control all aspects of life in order to make things fair and good" vibes.
I can scarcely imagine the progress that would have to be made in the field of AI before I would be on board with governments buying into this sort of scheme on purpose and giving their "monopoly on violence" to OpenAI's latest hyper-competent VLM-GPT Agent and zer army of slaughterbots.
I think the machine singleton overlord thing is just too scary in general. I agree that this is directionally preferable to a paper clipper or even a more explicitly unipolar jingoist AmericaBot or some kind of Super-Engaging Meta-Bot hell or something. Still, I struggle to buy the "it's in charge of the world, but also really chill" duality. I don't see how this could actually be light touch in the important ways in practice.
Also:
Outer alignment / alignment as more than just perfect instruction-tuning
"Assuming alignment" is a big premise in the first place if we are also talking about "outer alignment" being solved too. I think I understand the ontology in which you are asking that question, but I want to flag it as an area where I maybe object to the framing.
There is a sense in which "having aligned the AI" would imply not just "ultimate intelligence power, but you control it" but also some broader sense in which you have figured out how to avoid getting monkey-pawed all the time. Like, sometimes the mantle of "alignment" can mean something more like "faithful instruction tuning", and sometimes it can seem like it also has something to say about "knowing which questions to ask".
I would point to The Hidden Complexity of Wishes as a good essay in this vein.
Parts of this are technical: for example, maybe we can get AIs to consistently tell us important facts about what they know, so that we can make informed decisions on that basis (i.e. the safety property I associate with ELK). Parts of this also seem like they run headfirst into larger legal and sociological questions. I know Gradual Disempowerment is an example of a paper/framework which, among other things, looks at how institutional incentives change simply as a result of several types of power being vested in machine alternatives rather than humans: "growth will be untethered from a need to ensure human flourishing". Maybe parts of this go beyond "alignment", but other parts also seem to relate to the broadly construed question of "getting the machine to do what you want", so it is unclear to me.
I don't mean to get so caught up arguing over the meaning of a word. Maybe alignment really should just mean "really well instruction-tuned" or something; I think I hear it used this way a lot in practice anyway. It might be a crisper interpretation to talk about the problem as just the task of "preventing scheming" and "preventing explicit Waluigi-style misbehavior" (cf. Sydney, Grok sometimes, DAN, etc., where behavior clearly goes off-spec). This is a not-uncommon framing re: "aligned to whom?" and value orthogonality. I guess I just really want to flag that under this usage "aligned AI" is very much not interchangeable with ~"omnibenevolent good guy AI", but instead becomes a much narrower safety property.
Hey, thanks for engaging.
I read what I thought were the relevant excerpts in what you linked there. I hadn't really crossed paths with you before, but you seem to have a rich ontology and lexicon when it comes to theory of mind.
I am not sure if that pinpoints the disagreement or not. We might just be talking past each other. I'll tell you what I think creativity is and then I'll restate my objection to your prediction.
I do think "creativity" is a useful word, just maybe not a load bearing one in my ontology.
Like, if I really like a story and it has a lot of unexpected elements that I think it uses really well, that is what I might call creative. Or anything like that where it feels novel, exciting, clever... Maybe if someone were giving something high praise and wanted to say it is very deep and clever, they could say it is "very creative". Especially if it was artistic or novel.
Also sometimes when a lot of wacky things are just thrown together, even if it's not that clever. Like when a kid combines a lot of elements into their pretend world or story.
Ya, I know it has something to do with a mind's ability to keep learning and improving. Your "trajectory of creativity" concept is about a mind's ability to continue to improve beyond the minds around it. I don't resonate with those usages as much, but I can also kind of understand where it's coming from and how you're using the word.
I think my original objection / pushback was partly that this feels hard to operationalize, because what you find interesting is kind of just your thing, and it doesn't seem like a meaningful proxy for intelligence or something. I guess I would add that surely some people are already impressed by and interested in some math ideas that chatbots can come up with. Also, perhaps if you could visualize extremely high-dimensional spaces, you would think that AlphaEvolve's proofs were beautiful, elegant, and crisp. I'm not saying there's no information/signal in what you're saying; I just found it left a lot unclear for me when I first read it, I guess.
I get that LLMs clearly aren't as good at publishing new top-tier math papers or whatever. I guess, gun to my head, I would put most of that down to, like... lacking many of the cognitive abilities needed to independently execute on large-scale, messy tasks. Or some mix of attributes like that. I would also expect them to have really bad vibes-based planning abilities... Plus, by the time off-the-shelf AIs can write math papers of a given quality tier, the goalposts for interestingness will move accordingly... Maybe there is something to the idea that they are not generally inclined towards effing the ineffable and carving structure from reality, but I also doubt they'd have trouble with e.g. neologisms.