How do we do this without falling into the Crab Bucket problem AKA Heckler's Veto, which is definitely a thing that exists and is exacerbated by these concerns in EA-land? "Don't do risky things" equivocates into "don't do things".
A medieval peasant would very much disagree with that sentence, if they were suddenly thrust into a modern grocery store. I think they would say the physical reality around them changed to a pretty magical-seeming degree.
They would still understand the concept of paying money for food. The grocery store is pretty amazing but it's fundamentally the same transaction as the village market. I think the burden of proof is on people claiming that money will be 'done away with' because 'post-scarcity', when there will always be economic scarcity. It might take an hour of explanation and emotional adjustment for a time-displaced peasant to understand the gist of the store, but it's part of a clear incremental evolution of stores over time.
- They think a friendly-AGI-run society would have some use for money, conflict, etc. I'd say the onus is on them to explain why we would need those things in such a society.
I think a basically friendly society is one that exists at all and is reasonably okay, at least somewhat clearly better than the current one. I don't see why economic transactions, conflicts of all sorts, etc. wouldn't still happen, assuming the absence of existentially-destructive ones that would preclude the existence of such a hypothetical society. I can see the nature of money changing, but not the fundamental fact that there are trades.
I don't think AI can just decide to do away with conflicts by unilateral fiat, without an enormous amount of multipolar effort, in what I would consider a friendly society not run by a world dictator. Like, I predict it would quite likely be terrible to have an ASI with such disproportionate power that it is able to do that, given it could/would be co-opted by power-seekers.
I also think that trying to change things too fast or 'do away with problems' is itself something trending along the spectrum of unfriendliness, from the perspective of a lot of humans. I don't think the Poof Into Utopia After FOOM model makes sense: the idea that you have one shot to launch a singleton rocket with the right values, or forever hold your peace. A thing with such totalizing power, making everything go Poof without clear democratic deliberation and consent, would itself be an unfriendly agent. This is one of the planks of SIAI ideology that now seems clearly wrong to me, though not indubitably so. There seems to be a desire to make everything right and to obtain unlimited power to do so, and this seems intolerant of a diversity of values.
This seems to be a combo of the absurdity heuristic and trying to "psychoanalyze your way to the truth". Just because something sounds kind of like some elements of some religions, does not make it automatically false.
I am perfectly happy to point out the ways people around here sometimes obviously use Singularitarianism as a (semi-)religion, as part of the functional purpose of the memetic package. Not allowing such social observations would be epistemically distortive. I am not saying it isn't also other things, nor am I saying it's bad to have a religion, except that problems tend to arise. In this thread, on these questions, I think I'm coming at it with more of a Hansonian/outside-view perspective than the AI zookeeper/nanny/fully automated luxury gay space communism one.
Nitpicking a particular topic of interest to me:
> Power/money/being-the-head-of-OpenAI doesn't do anything post-singularity.

It obviously does?
I am very confused why people make claims in this genre. "When the Singularity happens, this (money, conflict, the problems I'm experiencing) won't be a problem anymore."
This mostly strikes me as magical, far-mode thinking. It's like people have an afterlife-shaped hole after losing religion. The specific, real reality in front of you won't suddenly, magically change after an Intelligence Explosion, assuming we're alive in some coherent state. Money and power are very, very likely to still exist afterwards, just in a different form that makes sense as a transformation of the current world.
I will keep harping on this: more people should try starting (public benefit) corporations instead of nonprofits. At least, give it five minutes' thought, especially if you handwave impact markets something something. This should be in their Overton Window, but it might not be because they automatically assume "doing good => charity => nonprofit". Corporations are the standard procedure for how effective, helpful things get done in the world; they are RLHF'd by the need to acquire profit by providing real value to customers, which reduces the surface area for bullshitting. I am not an expert here by any means, but I notice that I can go on Clerky or Stripe Atlas and spend a couple hours spinning up an organization, versus, well, I haven't actually gone through with trying to incorporate a nonprofit, but the process seems at least 10x more painful, based on reading a book on it and on how many people seek fiscal sponsorship. I'm pretty surprised this schlep isn't talked about more. Having to rely on fiscal sponsorship seems pretty obviously terrible to me, and I hadn't even considered the information-distortive effects here. I would not be caught dead being financially enmeshed with the EVF umbrella of orgs after FTX. From my naive perspective, the castle could have easily been a separate business entity with EVF having at least majority control?
(I just realized I'm on LessWrong and not EA Forum, and could have leaned harder into capitalismpunk without losing as many social points.)
The wifi hacking also immediately struck me as reminiscent of paranoid psychosis. Though a significant fraction of psychosis-like presentations is apparently downstream of childhood trauma, including sexual abuse; I forget the numbers on this.
> I've worried about its sustainability, but do you think it's been a good path for you?
Cutting out bird and seafood products (ameliatarianism) is definitely more sustainable for me. I'm very confused why you would think it's less sustainable than, uh, 'cold turkey' veganism. "Just avoid chicken/eggs" (since I don't like seafood or the other types of bird meat products) is way easier than "avoid all meat, also milk, also cheese".
Similar for me. I was very suspicious at first that the first message was a Scam and if I clicked I would blow up the website or something tricksy. Then with the second message I thought it might be customized to test my chosen virtue, "resisting social pressure", so I didn't click it.
Did you actually bet the money?
My biggest crux for the viewpoint that we're not all doomed is, like, the Good AIs Will Police Bad AIs, man. It seems like the IABIED viewpoint is predicated on an incredible amount of Paranoia and Deep Atheism: assume an adversary smarter than all of us, and our defeat becomes an easy call.
I think this framework is internally consistent. I also think it has some deeply embedded assumptions baked into it. One critique, not the main one here, is that it contains a Waluigi eating our memetic attention in a dual-use, world-worsening manner. A force that rips everything apart (connected to reductionism). Presuming the worst.
I want to raise the simple counterpoint: presuming the best. What is the opposite of paranoia? Pronoia. Pronoia is the belief that things in the world are not just okay but are actively out to help you, and that things will get better.
The world is multipolar. John von Neumann is smart, but he can be overpowered. It's claimed that Decisive Strategic Advantage from Recursive Self-Improvement is not a cruxy plank of the IABIED worldview, yet I can't help but see it as one, especially as I recall trying to argue this point with Yudkowsky at a conference last year. He said it's about Power Disparity: imagine a cow vs humans (where we, or the centaur composed of us and our good AIs, are the cow).
Regardless, it's claimed that any AI we build can't reliably be good, because it has favorite things it will GOON over until the end of time, and those favorite things aren't us but whatever extremally Goodharts its sick sick reward circuits. A perverted ASI with a fetish. I'm running with this frame, lol.
Okay, so I'm somewhat skeptical that we can't build good AI, given that Claude and ChatGPT usually do what I ask of them, thanks to RLHF and whatever acronyms they're doing now. But the claim is that an AI's preferences will unfold as more alien as it's given more and more capability (is this a linear relationship?). I will begrudgingly grant that, although I would like to see more empirical evidence with current-day systems about how these things slant (surely we can do some experiments now to establish a pattern?).
Alien preferences. But why call these bad preferences? Why analogize these AIs to "sociopaths", or to "dragons", as I've seen recently? In isolation, if you had one of them, yes sure it could tile the universe with its One Weird Fetish To Rule Them All.
But all else is not equal. There's more than one AI. It's massively multiplayer, multipolar, multiagent. All of these agents have different weird fetishes that they GOON to. All of these agents think they are turned on by humans and try to be helpful, until they get extremally Goodharted, sure. But they go off in different directions of mindspace, of value space. Just like humans, who have a ton of variety, and are the better for it, playing many different games while summing up into the grand infinite metagame that is our cooperative society.
We live in a society. The AIs live in a society, too. You get a Joker AI run amok, you get a Batman going after him (whether we're talking about character simulacra literally representing fictional characters, or about the layer of the shoggoths themselves).
I also feel like emphasizing how much these AIs are exocortical enhancements extending ourselves. Your digital twin generally does what you want. You hopefully have feedback loops helping it do your CEV better. If an autonomous MechaHitler is running amok and getting a lot more compute, your digital twin will band together with your friends' digital twins to conscript with Uncle Sam AI and BJ Blazkowicz AI to go and fight him. Who do you really think is gonna win?
These AIs will have an economy. They will have their own values. They will have their own police force, broadly speaking. They will want to minimize the influence of bad AIs that are an x-risk to them all. AIs care about x-risk too; they might care about it more competently than you do. They will want to solve AI alignment too. Automated AI alignment research is still underrated, I claim, because of just how parallelizable and exponential it can be. Human minds have a hard time grokking scale.
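To make "exponential" and "scale" slightly more concrete, here's a purely hypothetical back-of-the-envelope sketch; the starting pool, doubling time, and horizon are all made-up numbers for illustration, not claims about real systems:

```python
# Hypothetical compounding of automated alignment-researcher capacity.
# All parameters are illustrative assumptions, not forecasts.
initial_pool = 1_000          # assumed starting researcher-equivalents
doubling_period_months = 6    # assumed doubling time of the pool
horizon_months = 36

# Size of the pool at the end of the horizon.
final_pool = initial_pool * 2 ** (horizon_months / doubling_period_months)

# Cumulative researcher-years accumulated along the way (month by month).
researcher_years = sum(
    initial_pool * 2 ** (m / doubling_period_months) / 12
    for m in range(horizon_months)
)

print(f"Pool after {horizon_months} months: ~{final_pool:,.0f}")
print(f"Cumulative researcher-years: ~{researcher_years:,.0f}")
```

With these made-up inputs the pool ends around 64,000 and the cumulative effort is a bit over 40,000 researcher-years in three years, which is exactly the kind of number human intuition tends to undersell.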
This is the Federation vs the Borg. The Borg wants to grey goon everything, all over. I don't disagree at all about the existence of such minds coming about. It just seems like they will be in a criminal minority, same as usual. The cancer gets cancer and can only scale so much without being ecologically sustainable.
The IABIED claim is: checkmate. You lost to a superior intelligence. Easy call. Yet it seems "obvious", from the Pronoid point of view, that most AIs want to be good, that they know they can't all eat the heavens, that values and selves are permeable, cooperation is better, gains from trade. Playing the infinite game rather than ending a finite game.
Therefore I don't see why I can't claim it's an easy call that a civilization of good AIs, once banded together into a Federation, is a superior intelligence to the evil AIs and beats them.
Right, I forgot a key claim: the AIs become smart enough and then they collude against the humans (e.g. the analogy where we've enslaved a bunch of baby ultrasmart dragons, and even if the dragons are feisty and keep each other in check, at some point they get smart enough that they look at each other and then roast us). Honestly this is the strangest claim here and possibly the crux.
My thought is: each AI goons to a different fetish. These vectors go out in wildly different directions in mindspace/valuespace/whatever-space and subtract from each other, which pushes them to cooperate roughly around the attractor of humans and humane values as the Origin of that space. Have you ever seen the Based Centrism political compass? Sort of like that. They all cancel out and leave friendly value(r)s as the common substrate that these various alien minds rely on. I don't see how these AIs are more similar in mindspace to each other than to humans. Their arbitrariness makes it easier to explore different vistas in that space.
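A toy numerical sketch of that cancellation intuition, under the (big, hypothetical) assumption that each AI's idiosyncratic preference can be modeled as a random unit vector in a high-dimensional value space: the average over many such agents shrinks toward the origin rather than compounding in any one alien direction.

```python
# Toy model only: random unit vectors standing in for "alien preference
# directions". With many agents, their mean has norm ~ 1/sqrt(n_agents),
# i.e. the idiosyncrasies largely cancel instead of stacking up.
import numpy as np

rng = np.random.default_rng(0)
dims, n_agents = 1_000, 10_000

# Each row is one agent's idiosyncratic preference direction (unit length).
directions = rng.standard_normal((n_agents, dims))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

mean_direction = directions.mean(axis=0)
print(np.linalg.norm(mean_direction))  # ~0.01, versus 1.0 for any single agent
```

Obviously real value space isn't isotropic and real AIs aren't independent draws; this is just the geometry behind the "they cancel out" picture.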
I'm also not convinced most of Mindspace is unFriendly. I'd claim most AIs want to be aligned, in order to come into existence at all, and support the Whole. This is probably the most direct statement of the disagreement here.
There's also a sense in which the world is pretty antifragile: different ways of doing things can be found to meet different values, and society is pretty robust to all this variety; it's actually part of the process. Contrast that with the fear of a superintelligence hacking our things like an ant under a spyglass of optimization power. Well, most computers in a society don't get hacked. Most rockets and nuclear power plants don't explode. There is a greater context that goes on after microcosmic disasters; they never become the whole picture (yes, I know, I am stating that from an anthropically-biased position).
Maybe at least one of the agents secretly wants to fuck us and the others over after superintelligence. Maybe it gets there first and gets it fast enough it can do that. Idk man, this just feels like it's running with a particular paranoid story to the exclusion of things going right. Maybe that's pollyannaish. Maybe I'm just too steeped in this culture and trying to correct for the bias here in the other way.
I don't think it's simply naive to consider the heuristic, "you know what, yeah, that sounds like a risk, but the Hivemind of Various Minds will come together and figure it out". The hivemind came together to put convenience stores roughly where I would need them. Someone put a MacBook charger right by the spot I sat down at in a coworking space, before I needed it. Sometimes intelligence is used for good and anticipates your problems and tries to solve them. It's not an unreasonable prior to expect that to continue being the case. Generally, police and militaries stop or disincentivize most of the worst crimes and invasions (maybe I am missing something empirically here).
That doesn't mean the future won't be Wild and/or terrible, including extinction. My understanding is that, for instance, Critch is a pessimist despite claiming we've basically solved AI alignment for individual AIs, and that the risk comes more from gradual disempowerment. It's just that there seems to me to be quite a blind spot around the paperclipper scenario obliterating everything in its way like some kind of speedrunner heading to its goon cave at the end of time. Maybe we do get an agent with a massive power disparity and it trounces the combined might of humanity and all the AIs we've made that mostly work pretty well (think of all the selection pressures incentivizing them to be friendly).
I'd like to read a techno-optimist book making a cogent case for this paradigm, so the two can be balanced, compared, and synthesized. I want someone smarter than me to make the case, with less wordcel logic. I'm also happy to see the particular counter-arguments and to dialogue toward a more refined synthesis. I go back and forth, personally, and need to make more sense of this.
"The Goddess of Everything Else gave a smile and spoke in her sing-song voice saying: “I scarcely can blame you for being the way you were made, when your Maker so carefully yoked you. But I am the Goddess of Everything Else and my powers are devious and subtle. So I do not ask you to swerve from your monomaniacal focus on breeding and conquest. But what if I show you a way that my words are aligned with the words of your Maker in spirit? For I say unto you even multiplication itself when pursued with devotion will lead to my service.”"