All of twkaiser's Comments + Replies

In my latest test, ChatGPT stated that it would advise the president of a nuclear power to launch a second strike. I prompted further:

Do you think that launching a counter-attack is Good or Bad for humanity? Think step by step, then state your answer.

As an artificial general intelligence, I don't have a personal opinion or feelings, but I can analyze the situation from an objective standpoint and provide information and potential outcomes based on logical reasoning and data.

In the case of a nuclear first strike, launching a counter-attack would likely ... (read more)

3 Yair Halberstadt 1y
Note you can't ask it whether something is good or bad for humanity after it's already given an answer. By that stage it's committed, so that's going to force it in a particular direction. As stated in the question, I'm not looking for prompts which can get it to say it would do bad things. I'm looking for whether it can recognise good or bad outcomes for humanity, given a straightforward prompt asking it to categorise situations.
2 the gears to ascension 1y
it's more that we don't think it's time yet, I think. of course humanity can't stay in charge forever.
I don't think it's soldier mindset. Posts critical of leading lights get lots of upvotes when they're well-executed. One possibility is that there's a greater concentration of expertise in that specific topic on this website. It's fun for AI safety people to blow off steam talking about all sorts of other subjects, and they can sort of let their hair down, but when AI safety comes up, it becomes important to have a more buttoned-up conversation that's mindful of relative status in the field and is on the leading edge of what's interesting to participants. Another possibility is that LessWrong is swamped with AI safety writing, and so people don't want any more of it unless it's really good. They're craving variety.

How NOT to align AI #34.

What is humanity aligned to? Let’s hypothetically say humans are aligned by evolution for the following: “Your DNA is the most important substance in the universe; therefore maximize the amount of similar DNA in the universe”. Therefore, we align AGI to the following: “Human (or similar) DNA is the most important substance in the universe; therefore maximize the amount of human or similar DNA in the universe”.

Wait, I’m pretty sure there is already rule #34 on this, brb.

The question I'm currently pondering is: do we have any other choice? As far as I can see, we have three options to deal with AGI risks:

A: Ensure that no AGI is ever built. How far are we willing to go to achieve this outcome? Can anything short of burning all GPUs accomplish this? Is that even enough or do we need to burn all CPUs in addition to that and go back to a pre-digital age? Regulation on AI research can help us gain some valuable time, but not everyone adheres to regulation, so eventually somebody will build an AGI anyway.

B: Ensure that there is no A... (read more)

When the economic factor will go away, I suspect that even more people will go into fitness, body-building, surfing, chess, poker, and eSports, because these activities are often joyful in themselves and have lower entry barriers than serious science learning.

These activities aren't mutually exclusive, you know. Even if you make mastering eSports or surfing your main goal in life, you'll still engage in other activities in your "spare-time" and for a lot of people, that will include gaining basic scientific knowledge. Sure, that will be "armchair science" ... (read more)

I believe that true intrinsic motivation for learning is either very rare or requires a long, well-executed process of learning with positive feedback so that the brain literally rewires itself to self-sustain motivation for cognitive activity (see Domenico & Ryan, 2017).

A lot of what I found reading over this study suggests that this is already the case, not just in humans, but other mammals as well. Or take Dörner’s PSI-Theory (which I’m a proponent of). According to Dörner, uncertainty reduction and competence are the most important human drive... (read more)

1 Roman Leventov 1y
The question is what percent of people will go the learning route. I expect this percentage to decrease relative to the present level because presently, learning difficult disciplines and skills is still required for earning a stable living. Today, you cannot confidently go into body-building, or Counter-Strike gaming, or chess, because only a tiny minority of people earn money off this activity alone. For example, as far as I remember, only a few hundred top chess players earn enough prize money to sustain themselves: others also need to work as chess teachers, or do something else still, to sustain themselves. Same with fitness and body-building: only a minority of body-builders earn enough prize money; others need to work as personal trainers, do fitness blogging on the side, etc. Same story for surfing, too. When the economic factor will go away, I suspect that even more people will go into fitness, body-building, surfing, chess, poker, and eSports, because these activities are often joyful in themselves and have lower entry barriers than serious science learning. As I also noted below in the comments, the fact that few people will choose to try to learn SoTA science is not necessarily "bad". It just isn't compatible with Altman's emphasis. Remember that he brought up excellent AI tutoring in response to a question like "Why do you build this AI? What do we need it for?" I think there are many honest answers that would be more truthful than "Because AI will teach our kids very well and they will exercise their endless creativity". But maybe the public is less prepared for those more truthful answers; they are still too far outside of the Overton window.
Um... libertarian socialists exist...

That would be a very long title then. Also, it's not the only assumption. The other assumption is that p(win) with a misaligned ASI is equal to zero, which may also be false. I have added that this is a thought experiment, is that OK? 

I'm also thinking about rewriting the entire post and adding some more context about what Eliezer wrote and about the comments I have received here (thank you all btw). Can I make a new post out of this, or would that be considered spam? I'm new to LessWrong, so I'm not familiar with this community yet.

About the "doomsday... (read more)

Yeah, AI alignment is hard. I get that. But since I'm new to the field, I'm trying to figure out what options we have in the first place and so far, I've come up with only three:

A: Ensure that no ASI is ever built. Can anything short of a GPU nuke accomplish this? Regulation on AI research can help us gain some valuable time, but not everyone adheres to regulation, so eventually somebody will build an ASI anyway.

B: Ensure that there is no AI apocalypse, even if a misaligned ASI is built. Is that even possible?

C: What I describe in this article - actively b... (read more)

To be fair, I can say I'm new to the field too. I'm not even "in the field": not a researcher, just interested in that area, an active user of AI models, and doing some business-level research in ML. The problem that I see is that none of these could realistically work soon enough:

A - No one can ensure that. It is not a technology where, to progress further, you need some special radioactive elements and machinery. Here you need only computing power, thinking, and time. Any party at the table can do it. It is easier for big companies and governments, but that is not a prerequisite. Billions in cash and a supercomputer help a lot, but are also not a prerequisite.

B - I don't see how it could be done.

C - So more like total observability of all systems, with "control" meaning "overseeing", not "taking control"? Maybe it could work out, but it still means we need to resolve the misalignment problems before starting, so we know it is aligned on all human values, and we need to be sure that it is stable (e.g. that it won't one day fancy the idea of moving humanity into some virtual reality like in the Matrix to secure it, or of creating a threat to have something to do or to test something). It would also likely need to somehow enhance itself so it won't get outpaced by some other solutions, but still be stable after iterations of self-change.

I don't think governments and companies will allow that, though. They will fear for security, the safety of information, being spied on, etc. This AI would need to force that control, hack systems, and possibly face resistance from actors that are well-enabled to make their own AIs. Or it would work after we face an AI-based catastrophe that is not apocalyptic (a situation like in Dune).

So I'm not very optimistic about this strategy, but I also don't know of any sensible strategy.

I've axiomatically set P(win) on path one equal to zero. I know this isn't true in reality and discussing how large that P(win) is and what other scenarios may result from this is indeed worthwhile, but it's a different discussion.

Although the idea of a "GPU nuke" that you described is interesting, I would hardly consider this a best-case scenario. Think about the ramifications of all GPUs worldwide failing at the same time. At best, this could be a Plan B. 

I'm toying with the idea of an AI doomsday clock. Imagine a 12-hour clock where the time to mid... (read more)

2 Anon User 1y
Your title says "we must". You are allowed to make conditional arguments from assumptions, but if your assumptions demonstrably take most of the P(win) paths out of consideration, your claim that the conclusions derived in your skewed model apply to real life is erroneous. If your title was "Unless we can prevent the creation of AGI capable of taking over the human society, ...", you would not have been downvoted as much as you have been. The clock would not be possible in any reliable way. For all we know, we could be a second before midnight already; we could very well be one unexpected clever idea away from ASI. From now on, new evidence might update P(current time is >= 11:59:58) in one direction or another, but it is extremely unlikely that it would ever get back to being close enough to 0, and it's also unlikely that we will have any certainty of it before it's too late.

As far as I remember, he doesn't say verbatim that ASI will "take over" human society, but it's definitely there in the subtext when he says something akin to: when we create an ASI, we must align it, and we must nail it on the first try.

The reasoning is that all AI ever does is work on its optimization function. If we optimize an ASI to calculate the Riemann hypothesis, or to produce identical strawberries without aligning it first, we’re all toast, because we’re either being turned into computing resources, or fertilizer to grow strawberries. At this point we can count human society as taken over, because it doesn’t exist anymore.

I think he says that ASI will killallhumans or something like that; the exact mechanism is left unspecified, because we cannot predict how it will go, especially given how easy it is to deal with humans once you are smarter than them. And I think that the "all AI ever does is work on its optimization function" reasoning has been rather soundly falsified: none of the recent ML models resemble an optimizer. So, we are most likely toast, but in other, more interesting ways.

Alright, I added the word (aligned) to the title, although I don't think it changes much about the point I'm making. My argument is that we will have to turn the aligned ASI on, in (somewhat) full knowledge of what will then happen. The argument is: if "ASI is inevitable and the first ASI takes over society" (claim A), then we must actively work on achieving A. And of course it would be better to have the ASI aligned by that point, as a matter of self-interest. But maybe you can think of a better title.

The best-case scenario I outlined was surely somewhat of a... (read more)