Netcentrica, in this letter your explicit opinion is that fiction with a deep treatment of the alignment problem will not be palatable to a wider audience. I think this is not necessarily true. I think that compelling fiction is perhaps the prime vector for engaging a wider, naive audience. Even the Hollywood treatment of I, Robot touched on it and was popular. Not deep or nuanced, sure. But it was there. Maybe more intelligent treatments could succeed if produced with talent.

I mostly stopped reading sci-fi after the era of Asimov and Bradbury. I'd be interested in comments on which modern, popular authors have written or produced AI fiction with the most intelligent treatment of the alignment issue (or related issues), to establish a baseline.

Hmm, yeah, I guess that's a good point. I was thinking myopically at a systems level. The post is useful advice for a patient who is willing to do their own research, confident they can do it thoroughly, and is not afraid to "stare into the abyss," i.e., risk getting freaked out or overwhelmed.

Although, I also wonder if insurance companies might try to exploit a patient's prior decision to decline recommended treatment/tests as a reason to not cover future costs...


I don't disagree with you exactly, but I think the focus on rational decision making misses the context the decisions are being made in. Isn't this just an unaligned incentives problem? When a patient complains of an issue, doctors face exposure to liability if they do not recommend tests to clarify the issue. If the tests indicate something, doctors face liability for not recommending corrective procedures. They generally face less liability for positively recommending tests and procedures because the risk is quantifiable beforehand and the patient makes the decision. If the patient declines a recommended test, the doctor can't be blamed.

The push to do less testing makes sense in that context. It has to emerge at the level of a movement so that the doctors have safety in numbers.

I am not in healthcare, so perhaps this is cynical?

Edit: I see that Gwern already mentioned lawsuits briefly in a comment. But I think it deserves a lot more focus and obviates "you're not dealing with fully rational agents." I mean, maybe not, but that's not necessary to get this result.

Thanks for that link! I agree that there is a danger this pitch doesn't get people all the way to X-risk. I think that risk might be worth it, especially if EA notices popular support failing to grow fast enough - i.e., beyond people with obviously related background and interests. Gathering more popular support for taking small AI-related dangers seriously might move the bigger x-risk problems into the Overton window, whereas right now I think they are very much not. Actually I just realized that this is a great summary of my entire idea, basically, "move the Overton window with softballs before you try to pitch people the fastball."

But also, as you said, that approach does model the problem as a war of attrition. If we really are metaphorically moments from the final battle, Hail Mary attempts to recruit powerful allies are the right strategy. The problem is that these two strategies are pretty much mutually exclusive. You can't be labeled as a thoughtful, practical policy group with good ideas and also pull the fire alarms. Maybe the solution is to have two organizations pursuing different strategies, with enough distance between them that the alarmists don't tarnish the reputation of the moderates.

Whoops, apologies, none of the above. I meant to use the adage "you can't wake someone who is pretending to sleep" similarly to the old "It is difficult to make a man understand a thing when his salary depends on not understanding it." A person with vested interests is like a person pretending to sleep. They are predisposed not to acknowledge arguments misaligned with their vested interests, even if they do in reality understand and agree with the logic of those arguments. The most classic form of bias.

I was trying to express that in order to make any impression on such a person you would have to enter the conversation on a vector at least partially aligned with their vested interests, or risk being ignored at best and creating an enemy at worst. Metaphorically, this is like entering into the false "dream" of the person pretending to sleep.

Although I do like ACC, I haven't read any of the Rama series. It sounds like you're asking if I am advocating for a top down authoritarian society. It's hard to tell what triggered this impression without more detail from you, but possibly it was my mention of creating an "always-good-actor" bot that guards against other unaligned AGIs.

If that's right, please see my update to my post: I strongly disclaim to have good ideas about alignment, and should have better flagged that. The AGA bot is my best understanding of what Eliezer advocates, but that understanding is very weak and vague, and doesn't suggest more than extremely general policy ideas.

If you meant something else, please elaborate!

Thanks for your reply! I like your compressed version. That feels to me like it would land on a fair number of people. I like to think about trying to explain these concepts to my parents. My dad is a healthcare professional, very competent with machines, can do math, can fix a computer. If I told him superintelligent AI would make nanomachine weapons, he would glaze over. But I think he could imagine having our missile systems taken over by a "next-generation virus."

My mom has no technical background or interests, so she represents my harder test. If I read her that paragraph she'd have no emotional reaction or lasting memory of the content. I worry that many of the people who are the most important to convince fall into this category. 

Thanks for your replies! I'm really glad my thoughts were valuable. I did see your post promoting the contest before it was over, but my thoughts on this hadn't coalesced yet.

At this time, I don't know how much sense it makes to risk posing as someone you're not (or, at least, accidentally making a disinterested policymaker incorrectly think that's what you're doing).

Thanks especially for this comment. I noticed I was uncomfortable while writing that part of my post, and I should have paid more attention to that signal. I think I didn't want to water down the ending because the post was already getting long. I should have put a disclaimer that I didn't really know how to conclude, and that section is mostly a placeholder for what people who understand this better than me would pitch. To be clearer here: I do not intend to express any opinion on what to tell policymakers about solutions to these problems. I know hardly anything about practical alignment, just the general theory of why it is important. (I'm going to edit my post to point at this comment to make sure that's clear.)

What you're talking about, bypassing talk of superintelligence or recursive self-improvement, is something that I agree would be pure gold but only if it's possible and reasonable to skip that part. Hyperintelligent AI is sorta the bread and butter of the whole thing [...]

Yup, I agree completely. I should have said in the post that I only weakly endorse my proposed approach. It would need to be workshopped to explore its value, especially which signals from the listener suggest going deeper into the rabbit hole versus popping back out into impacts on present-day issues. My experience talking to people outside my field is that at the first signal someone doesn't take your niche issue seriously, you had better immediately connect it back to something they already care about or you've lost them. I wrote with the intention of providing the lowest-common-denominator set of arguments to get someone to take anything in the problem space seriously, so they at least have a hope of being worked slowly towards the idea of the real problem. I also wrote it at an ELI5 level for politicians who think the internet still runs on telephones. So, like a "worst case scenario" conversation. But if this approach got someone worrying about the wrong aspect of the issue or misunderstanding critical pieces, it could backfire.

If I were going to update my pitch to better emphasize superintelligence, my intuition would be to lean into the video spoofing angle. It doesn't require any technical background to imagine a fake person socially engineering you on a Zoom call. GPT-3 examples are already sufficient to drive home the Turing Test "this is really already happening" point. So the missing pieces are just seamless audio/video generation, and the ability of the bot to improvise its text generation towards a goal as it converses. It's then a simple further step to envision the bad-actor bot's improvisation getting better and better until it doesn't make mistakes, is smarter than a person, and can manipulate us into doing horrible things, especially because it can be everywhere at once. This argument scales from there to however much "AI-pill" the listener can swallow. I think the core strength of this framing is that the AI is embodied. Even if it takes the form of multiple people, you can see it and speak to it. You could experience it getting smarter, if that happened slowly enough. This should help someone naive get a handle on what it would feel like to be up against such an adversary.

The problem is that this body of knowledge is very, very cursed. There are massive vested interests, a ton of money and national security, built on a foundation of what is referred to as "bots" in this post. 

Yeah, absolutely...I was definitely tiptoeing around this in my approach rather than addressing it head on. That's because I don't have good ideas about that and suspect there might not be any general solutions. Approaching a person with those interests might just require a lot more specific knowledge and arguments about those interests to be effective. There is that old saying "You cannot wake someone who is pretending to sleep." Maybe you can, but you have to enter their dream to do it.

Cool, I just wrote a post with an orthogonal take on the same issue. Seems like Eliezer's nanotech comment was pretty polarizing. Self-promoting: Pitching an Alignment Softball

I worry that the global response would be impotent even if the AGI were sandboxed to Twitter. Having been through the pandemic, I perceive at least the United States' political and social system to be deeply vulnerable to the kind of attacks that would be easiest for an AGI: those requiring no physical infrastructure.

This does not directly conflict with or even really address your assertion that we'll all be around in 30 years. It seems like you were very focused here on a timeline for actual extinction. I guess I'm looking for a line to draw about "when will unaligned AGI make life no longer worth living, or at least destroy our ability to fight it?" I find this a much more interesting question, because at that point it doesn't matter if we have a month or 30 years left - we're living in caves on borrowed time.

My expectation is that we don't even need AGI or superintelligence, because unaligned humans are going to provide the intelligence part. The missing doomsday ingredient is ease of attack, which is getting faster, better, and cheaper every year.

Hi Moderators, as this is my first post, I'd appreciate any help in giving it appropriate tags. Thanks!