DonyChristie

Money can be thrown at my Patreon here: https://www.patreon.com/reflectivealtruist

Discussion with Eliezer Yudkowsky on AGI interventions

"...What do you do with this impossible challenge?

First, we assume that you don't actually say "That's impossible!" and give up a la Luke Skywalker.  You haven't run away.

Why not?  Maybe you've learned to override the reflex of running away.  Or maybe they're going to shoot your daughter if you fail.  We suppose that you want to win, not try—that something is at stake that matters to you, even if it's just your own pride.  (Pride is an underrated sin.)

Will you call upon the virtue of tsuyoku naritai?  But even if you become stronger day by day, growing instead of fading, you may not be strong enough to do the impossible.  You could go into the AI Box experiment once, and then do it again, and try to do better the second time.  Will that get you to the point of winning?  Not for a long time, maybe; and sometimes a single failure isn't acceptable.

(Though even to say this much—to visualize yourself doing better on a second try—is to begin to bind yourself to the problem, to do more than just stand in awe of it.  How, specifically, could you do better on one AI-Box Experiment than the previous?—and not by luck, but by skill?)

Will you call upon the virtue isshokenmei?  But a desperate effort may not be enough to win.  Especially if that desperation is only putting more effort into the avenues you already know, the modes of trying you can already imagine.  A problem looks impossible when your brain's query returns no lines of solution leading to it.  What good is a desperate effort along any of those lines?

Make an extraordinary effort?  Leave your comfort zone—try non-default ways of doing things—even, try to think creatively?  But you can imagine the one coming back and saying, "I tried to leave my comfort zone, and I think I succeeded at that!  I brainstormed for five minutes—and came up with all sorts of wacky creative ideas!  But I don't think any of them are good enough.  The other guy can just keep saying 'No', no matter what I do."

And now we finally reply:  "Shut up and do the impossible!"

As we recall from Trying to Try, setting out to make an effort is distinct from setting out to win.  That's the problem with saying, "Make an extraordinary effort."  You can succeed at the goal of "making an extraordinary effort" without succeeding at the goal of getting out of the Box.

"But!" says the one.  "But, SUCCEED is not a primitive action!  Not all challenges are fair—sometimes you just can't win!  How am I supposed to choose to be out of the Box?  The other guy can just keep on saying 'No'!"

True.  Now shut up and do the impossible.

Your goal is not to do better, to try desperately, or even to try extraordinarily.  Your goal is to get out of the box."

 

Could you have stopped Chernobyl?

Noting that the more consequential, second-order disaster resulting from Chernobyl may have been reduced adoption of nuclear power (assuming the accident fueled antinuclear sentiment). Likewise, I'd guess the Challenger disaster set back the U.S. space program. Covid lockdowns have a similar quality of not tracking the cost-benefit of their continuation. Human reactions to disasters can be worse than the disasters themselves, especially when the costs of those reactions are hidden. I don't know how this translates to AI safety, but it merits thought.

TAPs 3: Reductionism

“He has half the deed done who has made a beginning.”
– Horace

Simulacrum 3 As Stag-Hunt Strategy

This happens intergenerationally as parents forget to alert their children to the actual reasons for things. Having observed this happen with millennials, I am scared of what we are all collectively missing because older generations literally just forgot to tell us.

What do you think we are missing?

Matt Levine on "Fraud is no fun without friends."

A crucial consideration for why destroying the restaurant business is good: factory farming.

CollAction history and lessons learned

Hey Ron, I am working on my own version of this (inspired by this Sequence), and would love to get your advice! Right now I am focusing on crowdfunding via dominant assurance contracts on Ethereum.

How did you / would you verify that someone did something? What are specific examples of that happening for different actions? What kinds of evidence can be provided? I have a fuzzy sense of what this looks like right now. The closest sites I can think of just off the top of my head that involve verification are CommitLock (which I made a successful $1000 commitment contract on to get myself to do swim lessons) and DietBet, which requires a photo of your scale (it also has that 'split the pot' feature you mentioned, which I am pretty excited for).
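Since the mechanism mentioned above may be unfamiliar: in a dominant assurance contract (Tabarrok's design), the entrepreneur escrows a bond so that if the funding threshold is missed, every pledger is refunded in full plus a bonus, making pledging a weakly dominant strategy. Here is a minimal Python sketch of that settlement logic; the names, amounts, and `settle` function are all hypothetical illustration, not actual Ethereum/Solidity code.

```python
def settle(pledges, threshold, bonus):
    """Settle a dominant assurance contract.

    pledges: dict mapping pledger name -> amount pledged.
    Returns (funded, payouts). If total pledges meet the threshold,
    the project is funded and pledges are kept (payout 0 each).
    Otherwise everyone is refunded in full plus a fixed bonus paid
    from the entrepreneur's escrowed bond.
    """
    total = sum(pledges.values())
    if total >= threshold:
        # Project funded: pledges go to the entrepreneur.
        return True, {name: 0 for name in pledges}
    # Threshold missed: refund each pledge plus the failure bonus.
    return False, {name: amount + bonus for name, amount in pledges.items()}

# Example: 60 + 30 = 90 < 100, so the contract fails and pays refunds + bonus.
funded, payouts = settle({"alice": 60, "bob": 30}, threshold=100, bonus=5)
```

The bonus on failure is what distinguishes this from a plain Kickstarter-style assurance contract: contributors profit even when the project doesn't fund, so there is no incentive to hold back and free-ride.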

Dony's Shortform Feed

I am very interested in practicing steelmanning/Ideological Turing Test with people of any skill level. I have only done it once conversationally and it felt great. I'm sure we can find things to disagree about. You can book a call here.

Dony's Shortform Feed
I’ve mentioned previously that I’ve been digging into a pocket of human knowledge in pursuit of explanations for the success of the traditional Chinese businessman. The hope I have is that some of these explanations are directly applicable to my practice.
Here’s my current bet: I think one can get better at trial and error, and that the body of work around instrumental rationality holds some clues as to how you can get better.
I’ve argued that the successful Chinese businessmen are probably the ones who are better at trial and error than the lousier ones; I posited that perhaps they needed fewer cycles to learn the right lessons to make their businesses work.
I think the body of research around instrumental rationality tells us how they do so. I’m thankful that Jonathan Baron has written a fairly good overview of the field in the fourth edition of Thinking and Deciding. And I think both Ray Dalio’s and Nassim Nicholas Taleb’s writings have explored the implications of some of these ideas. If I were to summarise the rough thrust of these books:
Don’t do trial and error where error is catastrophic.
Don’t repeat the same trials over and over again (aka don’t repeat the same mistakes over and over again).
Increase the number of trials you can do in your life. Decrease the length and cost of each trial.
In fields with optionality (i.e. your downside is capped but your upside is large), the more trials you take and the less each trial costs, the more likely you’ll eventually win. Or, as Taleb says: “randomness is good when you have optionality.”
Write down your lessons and approaches from your previous successful trials, so you may generalise them to more situations (Principles, chapter 5)
Systematically identify the factor that gives positive evidence, and vary that to maximise the expected size of the impact (Thinking and Deciding, chapter 7)
Actively look for disconfirming evidence when you’ve found an approach that seems to work. (Thinking and Deciding, chapter 7, Principles, chapter 3).

https://commoncog.com/blog/chinese-businessmen-superstition-doesnt-count/


Don’t do trial and error where error is catastrophic.

Wearing a mask in a pandemic. Not putting ALL of your money on a roulette wheel. Not balancing on a tightrope without a net between two skyscrapers unless you have extensive training. Not posting about controversial things without much upside. Not posting photos of meat you cooked to Instagram if you want to have good acclaim in 200 years when eating meat is outlawed. Not building AI because it's cool. Not falling in love with people who don't reciprocate.

The unknown unknown risk that hasn't been considered yet. Not having enough slack dedicated to detecting this.

Don’t repeat the same trials over and over again (aka don’t repeat the same mistakes over and over again).

If you've gone on OkCupid for the past 7 years and still haven't got a date from it, maybe try a different strategy. If messaging potential tenants on a 3rd-party site doesn't work, try texting them. If asking questions on Yahoo Answers doesn't get good answers, try a different site.

Increase the number of trials you can do in your life. Decrease the length and cost of each trial.

Talk to 10x the number of people; message using templates and/or simple one-liners. Invest with Other People's Money if there's asymmetric upside. Write something for 5 minutes using the Most Dangerous Writing App, then post it to 5 subreddits. Post ideas on Twitter instead of Facebook, rationality content on LessWrong Shortform instead of longform. Use Yoda Timers. If running for the mood boost of a runner's high, try running 5 times that day as fast as possible. Optimize standard processes for speed.

In fields with optionality (i.e. your downside is capped but your upside is large), the more trials you take and the less each trial costs, the more likely you’ll eventually win. Or, as Taleb says: “randomness is good when you have optionality.”

Posting content to 10x the people 10x faster generally has huge upside (YMMV). Programming something useful, open-sourcing it, and sharing it.
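The optionality claim above has a simple quantitative core: if each trial is independent with win probability p and downside is capped, the chance of at least one win is 1 − (1 − p)^n, so for a fixed budget, cheaper trials mean a larger n and better odds. A toy Python sketch (all numbers invented for illustration):

```python
def p_at_least_one_win(p_win, budget, cost_per_trial):
    """Probability of at least one success across all the independent
    trials a fixed budget can buy: 1 - (1 - p)^n with n = budget // cost."""
    n_trials = budget // cost_per_trial
    return 1 - (1 - p_win) ** n_trials

# Same budget and same per-trial win probability; only trial cost differs.
cheap = p_at_least_one_win(p_win=0.05, budget=100, cost_per_trial=1)    # 100 trials
costly = p_at_least_one_win(p_win=0.05, budget=100, cost_per_trial=20)  # 5 trials

assert cheap > costly  # cheaper trials -> far better odds of a hit
```

With these made-up numbers, 100 cheap trials give roughly a 99% chance of at least one win, versus roughly 23% for 5 expensive ones, which is the whole argument for decreasing the length and cost of each trial.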

Write down your lessons and approaches from your previous successful trials, so you may generalise them to more situations (Principles, chapter 5)

Roam is good for this, perhaps SuperMemo. Posting things to social media and coming up with examples of the rules is also a good way of learning content. cough

Systematically identify the factor that gives positive evidence, and vary that to maximise the expected size of the impact (Thinking and Deciding, chapter 7)

Did messaging or posting to X different places work? Try 2X, 5X, etc. 1 to N after successfully going 0 to 1.
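The scaling heuristic above can be made slightly more systematic by tracking the per-post return as you vary the factor, so you notice when doubling again stops paying. A minimal sketch, with wholly invented response data:

```python
def replies_per_post(results):
    """results: dict mapping num_places_posted -> total replies received.
    Returns the per-post reply rate at each scale, to spot diminishing returns."""
    return {n: total / n for n, total in results.items()}

# Hypothetical totals observed after posting to 1, 2, 5, and 10 places.
observed = {1: 2, 2: 4, 5: 8, 10: 9}
rates = replies_per_post(observed)
# The per-post rate drops at larger scales: diminishing returns on this factor,
# a signal to stop scaling it and vary something else.
```

The point is just to keep the 1-to-N scaling honest: total results can keep rising even while each marginal trial earns less.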

Actively look for disconfirming evidence when you’ve found an approach that seems to work. (Thinking and Deciding, chapter 7, Principles, chapter 3).

Stating assumptions strongly and clearly so they are disconfirmable, then setting a Yoda Timer to seek counter-examples of the generalization.
