All of Lucas Pfeifer's Comments + Replies

Social, economic, or environmental changes happen relatively slowly, on the scale of months or years, compared to potent weapons, which can destroy whole cities in a single day. Therefore, conventional weapons would be a much more immediate danger if corrupted by an AI. The other problems are important to solve, yes, but first humanity must survive its more deadly creations. The field of cybersecurity will continue to evolve in the coming decades. Hopefully world militaries can keep up, so that no rogue intelligence gains control of these weapons.

To repeat what I said above: even a total launch of all the nuclear weapons in the world will not be sufficient to ensure human extinction. However, AI-driven social, economic, and environmental changes could ensure just that. If an AI got hold of a few nuclear weapons and launched them, that would, in fact, probably be counterproductive from the AI's perspective, because in the face of such a clear warning sign, humanity would probably unite and shut down AI research and unplug its GPU clusters.

It is a broad definition, yes, for the purpose of discussing the potential for the tools in question to be used against humans.

My point is this: we should focus first on limiting the most potent vectors of attack: those which involve conventional 'weapons'. Less potent vectors (those not commonly considered weapons), such as a 'stock trading algorithm', are of lower priority, since they offer more opportunities for detection and mitigation.

An algorithm that amasses wealth should eventually set off red flags (maybe banks need to improve th...

That's exactly where I disagree. Conventional weapons aren't all that potent compared to social, economic, or environmental changes.

Entities compete in various ways, yes. Competition is an attack on another entity's chances of survival. Let's define a weapon as any tool which could be used to mount an attack. Of course, every tool could be used as a weapon, in some sense. It's a question of how much risk our tools pose to us, if they were to be used against us.

Why? That broadens the definition of "weapon" to mean literally any tool, technology, or tactic by which one person or organization can gain an advantage over another. It's far broader than and connotationally very different from the implied definition of "weapon" given by "building intelligent machines that are designed to kill people" and the examples of "suicide drones", "assassin drones" and "robot dogs with mounted guns". Redefining "weapon" in this way turns your argument into a motte-and-bailey, where you're redefining a word that connotes direct physical harm (e.g. robots armed with guns, bombs, knives, etc) to mean any machine that can, on its own, gain some kind of resource advantage over humans. Most people would not, for example, consider a superior stock-trading algorithm to be a "weapon", but under your (re)definition, it would be.

These memes have been magnified by the words of politicians and media. We need our leaders to discuss things more reasonably. 

That said, restricting social media could also make sense. A requirement for in-person verification and limitation to a single account per site could be helpful.

More stringent (in-person) verification of bank account ownership could mitigate this risk.

Anyways, the chance of discovery for any covert operation is proportional to the size of the operation and the time that it takes to execute. The more we pre-limit the tools available to a rogue machine to cause immediate harm, the more likely we will catch it in the act.

There's no need for anything to be covert. NetDragon Websoft already has a chatbot as CEO. That chatbot can get funds wired by giving orders to employees. If the chatbot were a superintelligence, that would allow it to outcompete other companies.

Which kinds of power do you refer to? Most kinds of power require human cooperation. The danger that an AI tricks us into destroying ourselves is small (though a false detection of nuclear weapons could do it). We need much more cooperation between world leaders, a much more positive dialogue between them.

Yes, you need human cooperation, but human cooperation isn't hard. You can easily pay people money to get them to do what you want. With time, more processes can use robots instead of humans for physical work, and if the AGI already has all the economic and political power, there's nothing to stop it from doing that. The AGI might then repurpose land that's currently used for growing food and, step by step, reduce the amount of food that's available; there never needs to be a point where a human thinks that they are working for the destruction of humanity.

Yes, we need to solve the harder alignment problems as well. I suggested limiting intelligent weapons as the first step, because these are the most obviously misanthropic AI being developed, and the clearest vector of attack for any rogue AI. Why don't we focus on that first, before we focus on the more subtle vectors?

The end of the post you linked said, basically, "we need a plan". Do you have a better one?

Abstraction means assigning a symbol to reference a set of other symbols. It saves time and memory: time by allowing retrieval of data based on a set of rules, memory by shrinking the size of the reference. 

For example, take the words 'natural' and 'artificial': we sort things under one of these labels based on whether or not they were made by a human. A 'natural' thing could be 'physical' or 'biological'. An 'artificial' thing could be 'theory' or 'implementation'. If I don't need to distinguish between physical and biological things, instead of referring ...

We should sort reasoning into the inductive and deductive types: inductive provides a working model, deductive provides a more consistent (less contradictory) model. Deductive conclusions are guaranteed to be true, as long as their premises are true. Inductive conclusions are held with a degree of confidence, and depend on how well the variables in the study were isolated. For the empire example in the original post, there are many variables other than computing power that affect the rise and fall of empires. Computing power is only one of many technologie...

Yes, the linked post makes a lot of sense: wet labs should be heavily regulated.

Most of the disagreement here is based on two premises:

A: Other vectors (wet labs, etc.) present a greater threat. Maybe, though intelligent weapons are the most clearly misanthropic variant of AI.

B: AI will become so powerful, so quickly, that limiting its vectors of attack will not be enough.

If B is true, the only solution is a general ban on AI research. However, this would need to be a coordinated effort across the globe. There is far more support for halting intelligent weapons development than for a general ban. A general ban could come as a subsequent agreement.

Superintelligence is inherently dangerous, yes. The rapid increase in capabilities is inherently destabilizing, yes. However, practically speaking, we humans can handle and learn from failure, provided it is not catastrophic. An unexpected superintelligence would be catastrophic. However, it will be hard to convince people to abandon currently benign AI models on the principle that they could spontaneously create a superintelligence. A more feasible approach would start with the most dangerous and misanthropic manifestations of AI: those that are specialized to kill humans.

Best to slow down the development of AI in sensitive fields until we have a clearer understanding of its capabilities.

"Advocacy pushes you down a path of simplifying ideas rather than clearly articulating what's true, and pushing for consensus for the sake of coordination regardless of whether you've actually found the right thing to coordinate on."

  1. Simplifying (abstracting) ideas allows us to use them efficiently.
  2. Coordination allows us to combine our talents to achieve a common goal.
  3. The right thing is the one which best helps us achieve our cause.
  4. Our cause, in terms of alignment, is making intelligent machines that help us.
  5. The first step towards helping us is not killing
...

Human mercenaries causing a societal collapse? That would mean a large number of individuals who are willing to take orders from a machine to actively harm their communities. Very unlikely.

I'm wondering how you can hold that position given all the recent social disorder we've seen all over the world where social media driven outrage cycles have been a significant accelerating factor. People are absolutely willing to "take orders from a machine" (i.e. participate in collective action based on memes from social media) in order to "harm their communities" (i.e. cause violence and property destruction).

Yes, in the long term we will need a complete alignment strategy, such as permanent integration with our brains. However, before that happens, it would be prudent to limit the potential for a misaligned AI to cause permanent damage.

And, yes, we are in need of a more concrete plan and commitment from the people involved in the tech, especially with regard to lethal AI.

I'm thinking one or two years in the future is a plausible lower bound on time when a (technological) plan would need to be enacted to still have an effect on what happens eventually, or else in four years (from now) a killeveryone arrives (again, as an arguable lower bound, not as a median forecast). Unless it's fine by default, on its own, for reasons nobody reliably understands in advance, not because anyone had a plan. I think there is a good chance this is true, but betting the future of humanity on that is insane. Also, even if the first AGIs don't killeveryone, they might fail to establish strong coordination that prevents other misaligned AGIs from getting built, which do killeveryone, including the first AGIs. I think probably it's more like 6 and 8 years, respectively, but that's also not a lot of time to come up with a plan that depends on having fundamental science that's not yet developed.

  1. Which part of my statement does not make sense, and how so?
  2. My statement is relevant to the post. The beginning of the article partially defined hard alignment as preventing AI from destroying everything of value to us. The most likely way a rogue AI would do that is by gaining unauthorized access to weapons with built-in intelligence.

I don't think the most likely way is gaining access to autonomous weapons designed to kill. An AI smarter than all humans has many different options to take over, including making its own autonomous weapons.

I don't get it; why would 'refraining from designing intelligent machines to kill people' help prevent AI from killing everyone? That's a really bold and outlandish claim that I think you have to actually defend and not just tell people to agree with... Like, from my perspective, you're just assuming the hard parts of the problem don't exist, and replacing all the hard parts with an easier problem ('avoid designing AIs to kill people'). It's the hard parts of the problem that seem on track to kill us; solving the easier problem doesn't seem to help.

I weak-downvoted this: in general I think it is informative for people to just state their opinion, but in this case the opinion had very little to do with the content of the post and was not argued for. The linked post also did not engage with any of the existing arguments around TAI risk.

(Not that I disagree with "limiting the spread of autonomous weapons is going to lead to fewer human deaths in expectation", but I don't think it is the best strategy to limit such kinds of impact.)

Given the unpredictable emergent behavior in researchers' AI models, we will likely see emergent AI behavior with real-world consequences. We can limit these consequences by limiting the potential vectors of malignant behavior, the primary being autonomous lethal weapons. See my post and underlying comments for further details:

  1. Focus means spending time or energy on a task. Our time and energy are limited, and the danger of rogue AI is growing by the year. We should focus our energies by forming an achievable goal, making a reasonable plan, and acting according to the plan.
  2. Of course, there is a spectrum to the possible outcomes caused by a hypothetical rogue AI (rAI), ranging from insignificant to catastrophic. Any access the rAI might gain to human-made intelligent weapons would amplify the rAI's power to cause real-world damage.

The problem is that with AI, facing existential risk eventually is a certainty: the capability of unbounded autonomous consequentialist agency is feasible to develop (humans have that level of capability, and humans are manifestly feasible, so AIs would merely need to be at least as capable). Either there is a way of mitigating that risk, or it killseveryone. At which point, no second chances. This is different from world-shaking disasters, which do allow second chances and also motivate trying to do better next time. So this specifically is a natural threat level to consider on its own, not just as one of the points on a scale. And it's arguably plausible in the startlingly near future. And nobody has a reliable plan (or arguably any plan), including the people building the technology right now.

  1. How is the framing of this post "off"? It provides an invitation for agreement on a thesis. The thesis is very broad, yes, and it would certainly be good to clarify these ideas.
  2. What is the purpose of sharing information, if that information does not lead in the direction of a consensus? Would you have us share information simply to disagree on our interpretation of it?
  3. The relationship between autonomous weapons and existential risk is this: autonomous weapons have built-in targeting and engagement capabilities. If we could make an analogy to a human
...

A high level thing about LessWrong is that we're primarily focused on sharing information, not advocacy. There may be a later step where you advocate for something, but on LessWrong the dominant mode is discussing / explaining it, so that we can think clearly about what's true. Advocacy pushes you down a path of simplifying ideas rather than clearly articulating what's true, and pushing for consensus for the sake of coordination regardless of whether you've actually found the right thing to coordinate on.

"What is the first step towards alignment" isn't something there's a strong consensus on, but I don't think it's banning autonomous weapons, for a few reasons:

  * Banning weapons doesn't help solve alignment, it just makes the consequences of one particular type of mis-alignment less bad. The first, biggest problem with AI alignment is that it's a confusing domain we haven't dealt with before, and I think many first steps are more like "become less confused" than "do a particular thing".
  * From the perspective of "hampering the efforts of a soft takeoff", it's not obvious you'd do autonomous weapons vs "dramatically improving the security of computer systems" or "better controlling wetlabs that the AI could hire to develop novel pathogens". If you ban autonomous weapons the AI can still just hire mercenaries – killer robots help but are neither necessary nor sufficient for an AI takeover.

I bring this up to highlight that we're nowhere near a place where it's "obvious" that this is the first step, and that you can skip to building consensus towards it.

My intent here is to communicate some subtle things about the culture and intent of LessWrong, so you can decide whether you want to stick around and participate. This is not a forum for arbitrary types of communication, it's meant to focus on truthseeking first. Our experience is that people who veer towards advocacy-first or consensus-first tend to subtly degrade truthseeking norms in ways that are hard to reverse. I al

Existential danger is very much related to weapons. Of course, AI could pose an existential threat without access to weapons. However, weapons provide the most dangerous vector of attack for a rogue, confused, or otherwise misanthropic AI. We should focus more on this immediate and concrete risk before the more abstract theories of alignment.

I'm not sure why you think that. Human weapons, as horrific as they are, can only cause localized tragedies. Even if we gave the AI access to all of our nuclear weapons, and it fired them all, humanity would not be wiped out. Millions (possibly billions) would perish. Civilization would likely collapse or be set back by centuries. But human extinction? No. We're tougher than that. But an AI that competes with humanity, in the same way that Homo sapiens competed with Homo neanderthalensis? That could wipe out humanity. We wipe out other species all the time, and only in a small minority of cases is it because we've turned our weapons on them and hunted them into extinction. It's far more common for species to go extinct because humanity needed the habitat and other natural resources that that species needed to survive, and outcompeted that species for access to those resources.

Most actions by which actors increase their power aren't directly related to weapons. Existential danger comes from one AGI actor getting more power than human actors.

If an AI does something with weapons that its operators don't want it to be doing, they will attempt to stop it. If they eventually succeed, then this doesn't literally killeveryone, and the AI probably wasn't the kind that can pose an existential threat (even if it did cause a world-shaking disaster). If they can't stop the AI, at all, even after trying for as long as they live, then it's the kind of AI that would pose an existential threat even without initially being handed access to weapons (if it wants weapons, it would be able to acquire them on its own). So the step of giving AI access to weapons is never a deciding factor for notkilleveryoneism, it's only a deciding factor for preventing serious harm on a scale that's smaller than that. "Focus" suggests reallocation of a limited resource that becomes more scarce elsewhere as a result. I don't think it's a good thing to focus less on making sure that the outcome is not literally everyone dying than we are doing now. It's possible to get to that point, where too much focus is on that, but I don't think we are there.

Yes, sometimes we need to prevent humans from causing harm. For sub-national cases, current technology is sufficient for this. On the scale of nations, we should agree to concrete limits on the intelligence of weapons, and have faith in our fellow humans to follow these limits. Our governments have made progress on this issue, though there is more to be made.

For example:

"With such loud public support in prominent Chinese venues, one might think that the U.S. mili...

Suppose an AI was building autonomous weapons in secret. This would involve some of the most advanced technology currently available. It would need to construct a sophisticated factory in a secret location, or else hide it in a shell company. The first would be very unlikely, the second is plausible, though still less likely. Better regulation and examination of weapons manufacturers could help mitigate this problem.

Items of response:

  1. An intelligent lethal machine is one which chooses and attacks a target using hardware and software specialized for the task of identifying and killing humans.
  2. Clearly, there is a spectrum of intelligence. We should define a limit on how much intelligence we are willing to build into machines which are primarily designed to destroy us humans and our habitat.
  3. Though militaries take more thorough precautions than most organizations, there are many historical examples of militaries suffering defeat, which, with better planning, could have been
...