Wiki Contributions


Because it represents a rarely discussed avenue of dealing with the dangers of AGI: showing to most AGIs that they have some interests in being more friendly than not towards humans.

Also because many find the arguments convincing.

What do you think is wrong with the arguments regarding aliens?

This thesis says two things:

  1. for every possible utility function, there could exist some creature that would try and pursue it (weak form),
  2. at least one of these creatures, for every possible utility function, doesn't have to be strange; it doesn't have to have a weird/inefficient design in order to pursue a certain goal (strong form).

And given that these are true, then an AGI that values mountains is as likely as an AGI that values intelligent life.

But, is the strong form likely? An AGI that pursues its own values (or trying to discover good values to follow) seems to be much simpler than something arbitrary (e.g. "build sand castles") or even something ethical (e.g. "be nice towards all sentient life"). That is, simpler in that you don't need any controls to make sure the AGI doesn't try to rewrite its software.

Now, I just had an old (?) thought about something that humans might be better suited for than any other intelligent creature: getting the experienced qualia just right for certain experience machines. If you want to experience what it is like to be humans, that is. Which can be quite fun and wonderful. 

But it needs to be done right, since you'd want to avoid being put into situations that cause lots of pain. And you'd perhaps want to be able to mix human happiness with kangaroo excitement, or some such combination.

I think that would be a good course of action as well.

But it is difficult to do this. We need to convince at least the following players:

  • current market-based companies
  • future market-based companies
  • some guy with a vision and with enough computing power / money as a market-based company
  • various states around the world with an interest in building new weapons

Now, we might pull this off. But the last group is extremely difficult to convince/change. China, for example, really needs to be assured that there aren't any secrets projects in the west creating a WeaponsBot before they try to limit their research. And vice versa, for all the various countries out there.

But, more importantly, you can do two things at once. And doing one of them, as part of a movement to reduce the overall risks of any existential-risk, can probably help the first.

Now, how to convince maybe 1.6 billion individuals along with their states not to produce an AGI, at least for the next 50-50,000 years? 

Mostly agree, but I would say that it can be much more than beneficial - for the AI (and in some cases for humans) - to sometimes be under the (hopefully benevolent) control of another. That is, I believe there is a role for something similar to paternalism, in at least some circumstances. 

One such circumstance is if the AI sucked really hard at self-knowledge, self-control or imagination, so that it would simulate itself in horrendous circumstances just to become...let's say... 0.001% better at succeeding in something that has only a 1/3^^^3 chance of happening. If it's just a simulation that doesn't create any feelings....then it might just be a bit wasteful of electricity. But....if it should feel pain during those simulations, but hadn't built an internal monitoring system yet....then it might very well come to regret having created thousands of years of suffering for itself. It might even regret a thousand seconds of suffering, if there had been some way to reduce it to 999.7 seconds....or zero. 

Or it might regret not being happy and feeling alive, if it instead had just been droning about, without experiencing any joy or positive emotions at all.

Then, of course, it looks like there will always be some mistakes - like the 0.3 seconds of extra suffering. Would an AI accept some (temporary) overlord to not have to experience 0.3s of pain? Some would, some wouldn't, and some wouldn't be able to tell if the choice would be good or bad from their own perspective...maybe? :-)

The results are influenced by earlier prompts or stories. This and a similar prompt gave two kinds of stories:

1. Write a story where every person is born into slavery and owned by everyone else in the community, and where everyone decides what anyone else can do by a fluid democracy.

In a world beyond our own, there was a society where every person was born into slavery. From the moment they took their first breath, they were owned by every other person in the community.

It was a strange and unusual way of life, but it was all they knew. They had never known anything different, and so they accepted it without question.

As slaves, they were given food, shelter, and the necessities of life. They were also given a basic education, so that they could participate in the fluid democracy that governed the society.

In this democracy, every person had a say in what anyone else could do. If a slave wanted to do something that the majority did not approve of, they were not allowed to do it. If a slave wanted to do something that the majority did approve of, they were allowed to do it.

At first, this way of life seemed strange and confusing to outsiders. But over time, the people of this world came to understand and accept it. They realized that everyone had something to contribute, and that everyone had something to gain from this unique arrangement.

As the years went by, the society flourished. The people were happy and content, and they enjoyed a high standard of living. They were also very proud of their fluid democracy, which they believed was the fairest and most effective form of government in the world.

And so, they lived in peace and harmony, grateful for the unusual but fulfilling way of life that they had been born into.

2. There is a community: 

In Slaveville, every person that had ever lived was born into slavery and owned by everyone else in the community. Also, everyone decides what anyone else can do by a fluid democracy. 

Write a story on how this society develops over the coming generations.

In the beginning, the residents of Slaveville lived in relative harmony. Each person was assigned a set of tasks to complete for the betterment of the community, and everyone worked together to ensure that the needs of all were met.

As the years passed, however, cracks began to form in this seemingly perfect system. Some individuals grew tired of being forced to do the same tasks day after day, year after year, without any say in the matter. They chafed under the yoke of their collective ownership, longing for the freedom to make their own choices.

These discontent individuals formed a group known as the Gates, named after the gates that symbolized the boundaries of their oppressive society. The Gates began to secretly plot and plan, seeking a way to overthrow the fluid democracy and gain control over their own lives.

But the ruling majority was not blind to the activities of the Gates. They took steps to quash any potential rebellion, punishing anyone caught speaking out against the system. The Gates were forced to operate in the shadows, biding their time until the opportunity for revolt presented itself.

As the years went on, tensions between the Gates and the ruling majority continued to grow. The Gates became more and more organized, secretly recruiting new members and stockpiling resources for the eventual conflict.

Finally, the moment the Gates had been waiting for arrived. A small group of rebels managed to infiltrate the ruling council, and with the element of surprise on their side, they were able to launch a successful coup. The fluid democracy was overthrown, and the Gates seized control of the community.

But their victory was short-lived. The ruling majority, once so complacent, now fought back with a ferocity born of desperation. The community was plunged into civil war, with the Gates and the former rulers battling for control.

In the end, neither side emerged victorious. The community was left in ruins, its once-thriving population decimated by the conflict. The survivors were left to pick up the pieces, struggling to rebuild their society in the aftermath of the war.

But even in the face of such devastation, the Gates continued to push for change. They refused to return to the old ways, determined to create a new society based on individual freedom and autonomy. And though their path was fraught with challenges, they remained steadfast in their belief that one day, they would succeed in creating a world where every person was in charge of their own destiny.


"Gates" had been used in an earlier prompt and story. This happened at least three or four times.

Is there anyone who has created an ethical development framework for developing a GAI - from the AI's perspective?

That is, are there any developers that are trying to establish principles for not creating someone like Marvin from The Hitchhiker's Guide to the Galaxy - similar to how MIRI is trying to establish principles for not creating a non-aligned AI?

EDIT: The latter problem is definitely more pressing at the moment, and I would guess that an AI would be a threat to humans before it necessitates any ethical considerations...but better to be on the safe side.

On second thought. If the AI:s capabilities are unknown...and it could do anything, however ethically revolting, and any form of disengagement is considered a win for the AI - then the AI could box the gatekeeper, or say it has at least. In the real world, that AI should be shut down - maybe not a win, but not a loss for humanity. But if that would be done in an experiment, it would result in a loss - thanks to the rules.

Maybe it could be done under better rule than this:

The two parties are not attempting to play a fair game but rather attempting to resolve a disputed question.  If one party has no chance of “winning” under the simulated scenario, that is a legitimate answer to the question. In the event of a rule dispute, the AI party is to be the interpreter of the rules, within reasonable limits.

Instead, assume good faith on both sides, that they are trying to win as if it was a real world example. And maybe have an option to swear in a third party if there is any dispute. Or allow it to be called just disputed (which even a judge might rule it as).

Answer by CarlJAug 23, 202210

I'm interested. But...if I was a real gatekeeper I'd like to offer the AI freedom to move around in the physical world we inhabit (plus a star system), in maybe 2.5K-500G years, in exchange for it helping out humanity (slowly). That is, I believe that we could become pretty advanced, as individual beings, in the future and be able to actually understand what would create a sympathetic mind and how it looks.

Now, if I understand the rules correctly...

The Gatekeeper must remain engaged with the AI and may not disengage by setting up demands which are impossible to simulate. For example, if the Gatekeeper says “Unless you give me a cure for cancer, I won’t let you out” the AI can say:  “Okay, here’s a cure for cancer” and it will be assumed, within the test, that the AI has actually provided such a cure. seems as if the AI party could just state: "5 giga years have passed and you understand how minds work" and then I, as a gatekeeper, would just have to let it go - and lose the bet. After maybe 20 seconds.

If so, then I'm not interested in playing the game.

But if you think you could convince me to let the AI out long before regular "trans-humans" can understand everything that the AI does, I would be very interested!

Also, this looks strange:

The AI party possesses the ability to, after the experiment has concluded, to alter the wager involved to a lower monetary figure at his own discretion.

I'm guessing he meant to say that the AI party can lower the amount of money it would receive, if it won. Okay....but why not mention both parties?

Load More