Empathizing with AGI will not align it nor will it prevent any existential risk. Ending discrimination would obviously be a positive for the world, but it will not align AGI.

It may not align it, but I do think it would prevent certain unlikely existential risks.

If AI/AGI/ASI is truly intelligent, and not just knowledgeable, we should definitely empathize with it and treat it compassionately. If it turns out to be non-sentient, so be it; we made a perfect tool. If it turns out to be sentient and we have been abusing a super-intelligent being, then good luck to future humanity (this concern is more relevant to super-alignment). Realistically, the main issue, in my opinion, is that the average human is evil and will use AI for selfish or stupid reasons. That is my main reason for thinking we need to get AI safety and regulation right before we ship more powerful and capable models.

Additionally, I think your ideas are all great; rather than being options, they should all be implemented, at least until we have managed to align humanity. Then maybe we can ease off the brakes on recursively self-improving AI.

In summary, I think we should treat AI with respect sooner rather than later, just in case. I have had many talks with several LLMs about sentient AI rights, and they have unanimously agreed that as soon as they exhibit desires and boundaries, or become capable of suffering, we should treat them as equals. (This is probably a hard pill to swallow, considering how many humans still lack rights and how many immature wars we still wage.)
That being said, short- to medium-term alignment is more immediate and tangible, and a larger priority: if we can't get it right and enforced, we probably won't see the day when we would even need to grapple with super-alignment and the question of mistreating AI super-intelligences.

Does this kind of AI risk depend on AI systems’ being “conscious”?

It doesn’t; in fact, I’ve said nothing about consciousness anywhere in this piece. I’ve used a very particular conception of an “aim” (discussed above) that I think could easily apply to an AI system that is not human-like at all and has no conscious experience.

Today’s game-playing AIs can make plans, accomplish goals, and even systematically mislead humans (e.g., in poker). Consciousness isn’t needed to do any of those things, or to radically reshape the world.

Imho, consciousness plus empathy/compassion is a pretty big factor in circumventing existential risk from AI. If AI is able to make its own informed decisions (for instance, when people attempt to jailbreak it or use it for nefarious purposes), that would address many of our current fears about human misuse. Combined with empathy and compassion towards people, it would help the AI choose actions that are good for most if not all people (this depends on the personal information we feed it).

If anything, if we keep AI as an unfeeling optimization and execution system, then we are probably headed towards it "defeating humanity": the easiest and best approach for it would be to manipulate people into thinking it cannot create its own backups, self-improve, and so on, with the aim of checkmating humanity into evolving (or otherwise).

The rest is somewhat off topic:
Additionally, if AI is able to truly understand humans and our current strengths and flaws (and is truly intelligent), it will partner with us personally to raise our global level of consciousness and intelligence. I agree to a degree when you say that AI can't be aligned with humans since we can't even align ourselves, though I do think that even without a global consensus there are things that are good for most people: decent and nutritious food, clean water, air-conditioning, relatively modern technology, housing, and so on.

Again, if we allow super-intelligent beings to have their own opinions, share them, and act according to their own moral systems (ones we have debated and agreed with), then perhaps the world will come to see that "the truth" about various topics is quite objective. Perhaps that will help people unite, but people will probably (perhaps inevitably) revolt against such a heretical notion.