Conor Sullivan


Alignment For Foxes



Sure, ignoring this sort of theoretical integration might make you less morally consistent, but consistency is just one virtue

I've been thinking that consistency is overrated around these parts. Inconsistency supposedly makes you vulnerable to certain kinds of scams, but in practice humans just notice that they are being scammed and adapt. Really, the ability to be inconsistent is part of adaptation and exploration. If every decision I made in my life had to be perfectly consistent with every previous decision, I'd never get anywhere!

Around here, humans using AI to do bad things is referred to as "misuse risks", whereas "misaligned AI" is used exclusively to refer to the AI being the primary agent. There are many thought experiments where the AI convinces humans to do things which result in bad outcomes. "Execute this plan for me, human, but don't look at the details too hard please." This is still considered a case of misaligned AI.

If you break it down analytically, two elements are needed for bad things to happen: the will to do them and the power to do them. As Daniel notes, some humans have already had the power for many decades, but fortunately none have had the will. AI is expected to be extremely powerful too, and AI will have its own will (including a will to power), so both misaligned AI and misuse risks are things to take seriously.

Thank you for this. I was going by statistics shared in a recent episode of the All-In Podcast, and I took those stats for granted.

Zuckerberg has made a huge bet on VR/"The Metaverse", to the tune of multiple times the cost of the Apollo Program. The business world doesn't seem to like this bet: people are not bullish on VR but are very bullish on AI. So the pressure is on Mark to pivot to AI, but also to pivot to anything that is productizable.

I was making reference to the opening lines of Bee Movie, which were an old meme.

It was reported that high-level Diplomacy players face a different game-theoretic situation, because they all know each other by (user)name. So if DiplomacyGrandmaster69 goes up against TheDiplomancer, they know their games will be publicly streamed, and the other high-level players will see how honest they really are. Whereas casual players play a single-shot prisoner's dilemma, the pros play an iterated prisoner's dilemma, and that makes a difference.

I wonder what would happen if CICERO were placed in repeated six-human, one-AI showmatches where everyone knows which player is the AI. How would it fare?
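The one-shot vs. iterated distinction can be sketched with a toy simulation. Everything here is an illustrative assumption rather than anything from the comment: standard prisoner's dilemma payoffs (T=5, R=3, P=1, S=0) and a tit-for-tat opponent standing in for "other pros who remember how you played":

```python
# Payoffs: (my move, their move) -> my score.
# T=5 temptation, R=3 mutual reward, P=1 mutual punishment, S=0 sucker.
PAYOFF = {("D", "C"): 5, ("C", "C"): 3, ("D", "D"): 1, ("C", "D"): 0}

def play_vs_tit_for_tat(strategy, rounds):
    """Score `strategy` (a function of the opponent's last move) against tit-for-tat."""
    score = 0
    their_move = "C"  # tit-for-tat opens by cooperating
    for _ in range(rounds):
        my_move = strategy(their_move)
        score += PAYOFF[(my_move, their_move)]
        their_move = my_move  # tit-for-tat copies what we just did
    return score

always_defect = lambda _: "D"
always_cooperate = lambda _: "C"

# One round ~ the casuals' single-shot game: defection wins (5 > 3).
print(play_vs_tit_for_tat(always_defect, 1), play_vs_tit_for_tat(always_cooperate, 1))
# Ten rounds ~ the pros' repeated game: cooperation wins (30 > 14).
print(play_vs_tit_for_tat(always_defect, 10), play_vs_tit_for_tat(always_cooperate, 10))
```

With a single round, defecting strictly dominates; once reputation carries across games, the defector gets punished every round after the first, and the cooperator comes out ahead.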

Do we care that the tiger is a violent, dangerous predator? Is that part of what it means to be a tiger? If we remove the predator from the tiger, is it still a tiger?

Isn't "misaligned AI" by definition a bad thing and "ASI-boosted humans" by definition a good thing? You're basically asking "How likely is <good outcome> given that we have <a machine that creates good outcomes>?"

Well, in the cosmic scheme of things, civilizations that are not expansionist and greedy control a smaller share of the universe and therefore matter less.
