I'm assuming you are already familiar with some basics, and already know what 'orthogonality' and 'instrumental convergence' are and why they're true.
To effectively manage AI safety, legal acts and regulations must include fundamental principles like orthogonality and instrumental convergence. These principles should be written into legislation to guide lawyers, policymakers, and developers. Moreover, analyzing past disasters using these principles can help explain and prevent future incidents, while fostering more engagement from the effective altruism movement. Without these foundations, attempts to regulate AI may result in merely superficial "false care," incapable of preventing catastrophes or ensuring long-term safety for humanity.
Looks like we will see a lot of instrumental convergence and orthogonality disasters, doesn't it?
'Always Look on the Bright Side of Life'
Life is like playing Diablo
on hardcore mode: you can read all the guides, create the perfect build, and find ideal companions, only to die because the internet disconnects
Playing on hardcore is exciting—each game tells the story of how these characters will meet their end
'Always Look on the Bright Side of Death' - Monty Python
Do you know of any interesting camps in Europe about HPMOR or anything similar? My 11-year-old daughter asked where her letter to Hogwarts is. She started reading the book and asked why nobody has made a film of this great fanfic.
Do you know of any good educational children's camps in Europe? Or elsewhere?
Good day!
I fully share the views expressed in your article. Indeed, the ideal solution would be to delete many of the existing materials and to reformat the remaining ones into a format understandable to every novice programmer, transhumanist, or even an average person.
As a poker player and a lawyer assisting consumers who have suffered from the consequences of artificial intelligence, as well as someone interested in cryptocurrencies and existential risks, I first invested in Eliezer Yudkowsky's ideas many years ago. At that time, I saw how generative-predictive models easily outplayed poker players, and I wondered whether it was possible to counteract this. Since then, I have not seen a single serious security study conducted by anyone other than the players themselves, nor any responsive system that would even take up the question or research its own data.
And in the realm of cryptocurrencies, money continues to be stolen with the help of AI, with no help or refund in sight.
My prediction is that we have already lost the battle against AGI, but in the next 12 years we have a chance to make the situation a bit better: to create game conditions in which this player, or its precursor (AI users), will have more aligned (lawful good) elements.
It seems the very intelligent are also very stubborn, seeing no doubts in their positions; such high IQs are very dangerous. They think they are right about everything, that they have understood it all, but we are just a few perspectives in a vast, incomprehensible world where we understand nothing. We are all wrong.
Yes, you're probably a couple of sigmas smarter than the median person, but you need to convince exactly such a person, the median, or even someone a couple of IQ sigmas dumber, not to launch anything. It's not just OpenAI developing AGI;
others are too, doing research and making decisions, and they might not even know who Eliezer Yudkowsky is or what the LessWrong website is. They might visit some copy of the site, see that it's clear we shouldn't let AGI emerge, think about graphics cards (and where there are many graphics cards is in decentralized mining), and decide to take control of them.
If we're lucky, their underlings will just steal the cards and use them for mining, and everything will be fine then.
But research like changing the sign of a function and creating something dangerous is better removed.
Another strange thing is the super-ethical laws for Europe and the US. There are a lot of jurisdictions; even the Convention on Cybercrime is not universal, and among crimes of universal jurisdiction there are no crimes concerning existential risks. So many international media laws are just declarations, without real procedures and without any real power.
Many laws aren't adhered to in practice. There are different kinds of people; for some, the criminal code is like a menu, and if you don't even have to pay for that menu, it's doubly bad.
There are individualists, and among transhumanists I'm sure there are many who would choose their own life, and the lives of a close million, over the rest of humanity. And that's not good; it's unfair. The system should be for all billions of people.
But there are also those in the world who, if presented with a "shut down server" button, will eventually press it. There are many such buttons in various fields worldwide. If we take predictions for a hundred years, unless something radically changes, the likelihood of "server shutdown" approaches 1.
So it's interesting whether, through open-source AGI or any other framework or model, we could create some universal platform with a rule system that on one hand performs universal monitoring of all existential problems, and on the other provides clear, beneficial instructions for the median voter, as well as for the median worker and their masters.
Culture is created by the spoon. Give us a normal, unified system that encourages correct behavior with respect to existential risks, since you've won the genetic and event lottery by intelligence and were born with high IQ and social skills.
Usually, the median person is interested in: jobs, a full fridge, rituals, culture, the spread of their opinion leader's information, dopamine, political and other random and inherited values, life, continuation of life, and the like.
Provide a universal way of obtaining this and just monitor it calmly. And let it touch on all the existential risks: ecology, physics, pandemics, volcanic activity, space, nanobots, the atom.
The Doomsday Clock stands at 23:55 not only because of AGI risk; what selfishness to think so.
Sometimes it seems that Yudkowsky is the Girolamo Savonarola of our days. The system of procedures that the Future of Life Institute and Eliezer have already invented exists; what matters is its execution!
Sadly, in humanity today it is profitable to act first and ask forgiveness later. Many businesses are built that way, like today's Binance: no responsibility, 'don't FUD, just build'. All powerful AI and other startups work the same way. Many experimental research projects are not 100% sure they are safe for the planet. In the 20th and 21st centuries this became normal. But it shouldn't be.
And these are the real conditions of the problem, the real pattern of life. And in crypto there are many graphics cards, collected in decentralized networks, gathering into large decentralized nodes and clusters that cannot be turned off. Are they a danger?
We need systems of cheap protection, brakes, and incentives for their use! And, as with seat belts, we should teach this from childhood. Something even simpler than Khan Academy. HPMOR was great; do we have anything for the next generations, who never saw or liked Harry Potter? What would it be? Something to explain the problem.
Laws and rules that are just for show, unenforceable, are only harmful. Since ancient times it has been known that any rule consists of three things: hypothesis, disposition, and sanction. Without powerful procedural law, all these substantive legal norms are worthless; more precisely, they are a boon for the malefactor. If we don't procedurally protect people from wrongful AI, introducing soothing, non-working ethical rules will only increase volatility and the likelihood of wrongful AI and its advantage, even if we are lucky enough to have alignment in principle.
I apologize if there were any offensive remarks in the text or if it seemed like an unstructured rant expressing incorrect thoughts; that is how my brain works. I hope I am wrong; please point out where. Thank you for any comments and for your attention!
Version 1 (adopted):
Thank you, shminux, for bringing up this important topic, and to all the other members of this forum for their contributions.
I hope that our discussions here will help raise awareness about the potential risks of AI and prevent any negative outcomes. It's crucial to recognize that the human brain's positivity bias may not always serve us well when it comes to handling powerful AI technologies.
Based on your comments, it seems like some AI projects could be perceived as potentially dangerous, similar to how snakes or spiders are instinctively seen as threats due to our primate nature. Perhaps, implementing warning systems or detection-behavior mechanisms in AI projects could be beneficial to ensure safety.
In addition to discussing risks, it's also important to focus on positive projects that can contribute to a better future for humanity. Are there any lesser-known projects, such as improved AI behavior systems or initiatives like ZeroGPT, that we should explore?
Furthermore, what can individuals do to increase the likelihood of positive outcomes for mankind? Should we consider creating closed island ecosystems with the best minds in AI, as Eliezer has suggested? If so, what would be the requirements and implications of such places, including the need for special legislation?
I'm eager to hear your thoughts and insights on these matters. Let's work together to strive for a future that benefits all of humanity. Thank you for your input!
Version 0:
Thank you, shminux, for this topic. And thanks to the other gentlemen of this forum!
I hope I will not die with AI in a lulz manner after this comment) The human brain needs to be positive; without that it can't work well.
According to your text, it looks like any OpenAI project's buttons could be made to look like a SNAKE or a SPIDER, at least to warn the user on the gene level that there is something dangerous in it.
You already know many things about primate nature. So all you need is to use it to get what you want.
We are on the last mind journey of humankind's brains, to win a GOOD future or take the loss!
What other GOOD projects could we focus on?
What projects have already been done, but no one knows about them? Better AI detect-behaviour systems? ZeroGPT?
What should people do to raise the probability of good scenarios for mankind?
Should we make closed island ecosystems with the best minds in AI, as Eliezer said in the Bankless YouTube video, or not?
What are the requirements for such places? Because then we need to create special legislation for such semi-independent places. It's possible, but talking with governments is hard work. Do you REALLY need it? Or were these just emotional words from Eliezer?
Thank you for your answers!
I guess we need to maximize the different good possible outcomes, each of them.
For example, to raise the probability that many competing AGIs form an equilibrium whereby no faction is allowed to get too powerful, humans could prohibit all autonomous AGI use, especially uses relying on uncontrolled clusters of graphics processors in autocracies without international AI-safety supervisors like Eliezer Yudkowsky, Nick Bostrom, or their crew.
This, plus restricting systems to weak APIs and requiring human operators, would create natural borders to AI scalability, so an AGI would find it more favorable to mimic and reach consensus with people and other AGIs, at least by using humans as operators who work under AGI advice, or by creating humanlike personas that are simpler for working within human culture and with other people.
Detection systems often use categorisation principles, so even if an AGI breaks some rules, without scalability it could function without danger for longer, because the security systems (which are also, in a sense, tech officers with AI) could not find and destroy it.
This could create conditions that encourage the diversity and uniqueness of different AGIs, so all neural beings (AGIs, and people with AI) could win some time to find new balances for using the atoms of the multiverse.
More borders mean more time and longer life for every human; even a win of two seconds for each of 8 billion people is worth it.
There are then more chances that different factions will find some kind of balance among AGIs, people with AGI, people under AGI, and other factions.
I remember autonomous poker AIs destroying weak ecosystems one by one, but now the industry is growing sustainably with separate actors, each of them using AI but in very different manners.
The more separate systems there are, the greater the chance that, while destroying them one by one, an AGI will at some point find a way to function without destroying its environment.
PS A separate approach: send spaceships carrying a prohibition on AGI (maybe only with life, no apes) as far as possible, so that when AGI happens on Earth, it cannot get all of them.
1.1. The adoption of such laws is a long path.
Usually, it is a centuries-long path: court decisions -> actual enforcement of decisions -> substantive law -> procedures -> codes -> declarations, then conventions -> codes.
Humanity does not have this much time; it is worth focusing on real results that people can actually see. It might be necessary to build some simulations to understand which behavior is irresponsible.
Where is the line between creating a concept of what is socially dangerous and what are the ways to escape responsibility?
As a legal analogy, I would like to draw attention to the criminal case of Tornado Cash.
https://uitspraken.rechtspraak.nl/details?id=ECLI:NL:RBOBR:2024:2069
The developer created and continued to improve an unstoppable program that may have changed the structure of public transactions forever. Look at where the line is drawn there. Can a similar system be devised for the projection of existential risks?
1.2. There is a difference between substantive law and actual law on the ground, especially in countries built on mysticism and manipulation. Each median group of voters creates its own irrational picture of the world within each country. You do not need to worry about floating goals.
There are enough people in the world in information bubbles different from yours, so you can be sure that there are actors with values opposite to yours.
1.3. Their research can be serious while their worldview is simplified and absurd. At the same time, their resources can be extensive enough for technical workers to perform their duties properly.
2.1. There is no possibility of ideologically influencing all people simultaneously and all systems.
2.2. If I understand you correctly, more than 10 countries can spend huge sums on creating AI to accelerate the solving of scientific problems. Many of these countries are constantly struggling for their integrity and security, solving national issues, re-electing leaders, gaining benefits, fulfilling the sacred desires of populations and classes, and acting on other speculative or even conspiratorial theories. Usually there are layers of dozens of theories at once.
2.3. Humanity stands on the brink of new searches for the philosopher's stone, and for this people are ready to spend enormous resources. For example, quantum decryption of old Satoshi wallets plus genome decryption can create the illusion that AGI can be used to pursue the main directions of any transhumanist alchemist's desires: the opportunity to defeat death within the lifetime of this generation or the next two. Why would a given billionaire and/or state leader refuse this?
Or, as proposed here, the creation of a new super-IQ population. Again, do not forget that some of these beliefs can be antagonistic.
Even now, from the perspective of AI, predicting the weather in 2100 is somehow easier than in 2040. Currently, there are about 3-4 countries that could create Wasteland-type weather; they come into partial confrontation approximately every five years. Each time, this is a tick towards a Wasteland with a probability of 1-5%. If this continues, the probability of Wasteland-type weather by 2040 will be:
1 - 0.99^3 = 0.029701
1 - 0.95^3 = 0.142625
By 2100, if nothing changes:
1 - 0.99^15 ≈ 0.1399
1 - 0.95^15 ≈ 0.5367
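These figures can be rechecked with a few lines of Python. The rule is the standard cumulative-risk formula P = 1 - (1 - p)^n, with roughly 3 five-year cycles until 2040 and 15 until 2100 (the cycle counts come from the five-year confrontation interval assumed above):

```python
# Cumulative probability of at least one "Wasteland" event over n
# independent 5-year cycles, each with per-cycle probability p.
def cumulative_risk(p_per_cycle: float, n_cycles: int) -> float:
    return 1 - (1 - p_per_cycle) ** n_cycles

for p in (0.01, 0.05):
    print(f"p={p}: by 2040 -> {cumulative_risk(p, 3):.6f}, "
          f"by 2100 -> {cumulative_risk(p, 15):.4f}")
# p=0.01: by 2040 -> 0.029701, by 2100 -> 0.1399
# p=0.05: by 2040 -> 0.142625, by 2100 -> 0.5367
```

The model treats each five-year cycle as independent with a constant per-cycle probability, which is of course a simplification.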
(A year ago my predictions were more pessimistic, as I was in an information field that presented arguments for the Wasteland scenario in the style of "we'll go to heaven, and the rest will just die." I have since turned that media off =) Now it seems the confrontations will be tied to presidential cycles and policy, occurring not every year but once every 5 years, as I mentioned earlier. Quite an optimistic forecast.)
Nevertheless, we have many apocalyptic scenarios: nuclear, pandemic, ecological (the last exacerbated by the AI problem, as it will become much easier to assemble structures and goals that are antagonistic in their aims).
3. Crisis of the Rule of Law
In world politics, there has been a rollback of legal institutions since 2016 (see UN analytics). This shows a crisis of common values. Even without the AI problem, this usually indicates either the construction of a new equilibrium or a fall into chaos. I am a pessimist here and believe that, in the absence of normalized common values, information bubbles become antagonistic due to the nature of hysteria (simply put, the wilder information flows win: the more emotional and irrational ones). But, vice versa, this is a moment when MIRI could inject the value that existential safety is very important. Especially now, because any contribution at this low point of our doom clock could create the impression that MIRI solved it.
4. Problems of Detecting AI Threats
4.1. AI problems are less noticeable than nuclear threats (how can these clusters be detected; are there any effective methods?).
4.2. Threat detection is more blurred, and identifying dangerous clusters is difficult. Consider the possibility of decentralized systems, like blockchains, and their impact on security. (Decentralized computing is rapidly developing, and there is progress in superconductors; is this a problem from the perspective of AI-security detection?)
5. Questions about the "Switch off" Technology
5.1. What should a program with a "switch" look like? What is its optimal structure:
a) Proprietary software (which blocks and functions are recommended to be closed from any distribution?)
b) Closed/open API (what functions can MIRI or other laboratories provide, but with the ability to turn them off at any moment, for example at enterprises like OpenAI?)
c) Open source with constant updates (open libraries, but ones that require daily updates, creating the possibility of remotely disabling research code)
d) Fully open code (there is an assumption that with open code there is less chance that AIs will come into conflict with other AIs, or AI users with other AI users; open code can provide additional chances that an equilibrium between the different actors will be found and they will not mutually annihilate each other, because they can better predict each other's behavior)
5.2. The possibility of using multi-signatures and other methods.
How should the button work? Should such a button and its design be open information? Should it use a different code structure, a different language, analogous tech?
Are there advantages or disadvantages to shutdown buttons? Are there recommendations such as "at least one out of N pressed"? Which scheme seems the most sustainable?
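As a concrete toy model of the "at least k out of N pressed" idea (my own sketch, not any existing MIRI or lab mechanism), a shutdown quorum could look like this:

```python
# Hypothetical k-of-N shutdown quorum: the system counts as halted once
# at least k distinct keyholders out of N have pressed their buttons.
class ShutdownQuorum:
    def __init__(self, keyholders, k):
        assert 1 <= k <= len(keyholders)
        self.keyholders = set(keyholders)
        self.k = k
        self.pressed = set()

    def press(self, who):
        # Unknown signers are ignored; each keyholder counts at most once.
        if who in self.keyholders:
            self.pressed.add(who)
        return self.triggered()

    def triggered(self):
        return len(self.pressed) >= self.k

quorum = ShutdownQuorum(["lab_a", "lab_b", "regulator"], k=2)
quorum.press("lab_a")             # 1 of the 2 required presses
quorum.press("lab_a")             # duplicate press, still only 1 counted
print(quorum.press("regulator"))  # reaches 2 of 2 -> True
```

The trade-off is the usual one: k=1 ("any one of N") minimizes the risk that nobody can stop the system but maximizes false shutdowns, while k=N does the reverse; an intermediate k, possibly combined with cryptographic multi-signatures so that presses can be verified, sits between these extremes.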
5.3. Which method is the most effective?
6. Benefits and Approval
6.1. What benefits will actors gain by following the recommendations? Leaders of most countries make decisions not only, and not so much, from their own perspective as from the irrational desires of their sources of power, built on dozens of other, usually non-contradictory but different, values.
6.2. Possible forms of approval and assistance in generating values. Help eco-activists defend against the energy crisis? (From my point of view, AI development will not take our atoms, but it will take our energy, water, sun, etc.)
6.3. Examples of large "switch off" projects for AI infrastructure with enough GPUs and electricity: analogous to nuclear power plants, but for AI. If you imagine such plants, what should the control rods for the reaction be, how would they be pulled out, and what "explosives" over which pits should be laid to dump it all into acid or some other method of safe destruction?
7.1. Questions of approval and material assistance for such enterprises. What are the advantages of developing such institutions under MIRI control, compared with
7.2. the hidden maintenance of gray areas on the international market? Why is maintaining the gray segment less profitable than cooperation with MIRI from the point of view of personal goals, freedom, local goals, and the like?
8. Trust and Bluff
8.1. How can one be sure of the honesty of MIRI's statements, that it is not playing a double game, and that these are not just declarative goals without any real actions? From my experience, I can say that neither in the poker-bot cases nor in the thefts of money using AI in the blockchain field did I get any feedback from the Future of Life Institute project. I did not even receive a single like on reposts on Twitter, and there were no automatic responses to emails, etc. In this, I agree with Matthew Barnett that there is a problem with effectiveness.
What should be presented to the public? What help can be provided? Help with UI analytics? Help in investigating specific cases of violations using AI?
For example, I have a consumer-protection problem where I need to raise half a million pounds against an AI that stole money through low-liquidity trading on Binance. How can I do this?
https://www.linkedin.com/posts/petr-andreev-841953198_crypto-and-ai-threat-summary-activity-7165511031920836608-K2nF?utm_source=share&utm_medium=member_desktop
https://www.linkedin.com/posts/petr-andreev-841953198_binances-changpeng-zhao-to-get-36-months-activity-7192633838877949952-3cmE?utm_source=share&utm_medium=member_desktop
I tried writing letters to the institute and to 80,000 Hours: zero responses.
The SEC, Binance, and a bunch of regulators at least write back that there are no licenses; okay, fine. But why does 80,000 Hours not respond at all? I do not understand.
8.2. Research on open-source technologies shows greater convergence of trust. Open-source programs can show greater convergence in cooperation due to the simpler idea of collaboration, solving the prisoner's dilemma not only through the past statistics of the other being but also through its open-to-collaboration structure. In any case, AGI will eventually appear; possibly, open mutual monitoring of each other's systems will allow AI users not to annihilate each other.
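A minimal illustration of why open code can help with the prisoner's dilemma (a toy model of my own construction, not a result from the literature): if one agent's strategy is published and it credibly mirrors its opponent, the opponent's best response flips from defection to cooperation.

```python
# One-shot prisoner's dilemma payoffs: (my_move, their_move) -> my payoff.
PAYOFFS = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def mirror(opponent_move):
    # Open-source commitment: play exactly what the opponent plays.
    return opponent_move

def always_defect(opponent_move):
    # An agent with no credible commitment: simply always defect.
    return "D"

def best_response(committed_strategy):
    # An opponent who can read the committed code picks the move that
    # maximizes their own payoff against it.
    return max("CD", key=lambda m: PAYOFFS[(m, committed_strategy(m))])

print(best_response(mirror))         # "C": cooperating yields 3 vs 1
print(best_response(always_defect))  # "D": against a defector, defect
```

The point is that transparency turns the commitment into something the other side can verify and best-respond to, rather than having to infer it from past behavior alone.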
8.3. Compare this with the game theory of the Soviet-Harvard school and the need for steps towards security. Current game theory is largely built on duel-like representations, where damage to the opponent is an automatic victory, and many systems at the local level continue to think in those terms.
Therefore, it is difficult for them to believe in the mutual benefit of systems, that it is about WIN-WIN cooperation and not empty talk or just a scam for the redistribution of influence and media manipulation.
9. AI Dangers
9.1. What poses a greater danger: multiple AIs, two powerful AIs, or one actor with a powerful AI?
9.2. Open-source developments in the blockchain field can be both safe and dangerous. Are there any reviews?
This is a nice Ethereum Foundation list of articles:
https://docs.google.com/spreadsheets/d/1POtuj3DtF3A-uwm4MtKvwNYtnl_PW6DPUYj6x7yJUIs/edit#gid=1299175463
What do you think about
"Open Problems in Cooperative AI", "Cooperative AI: machines must learn to find common ground", and similar articles?
9.3. Have you considered including the AI problem in the list of crimes under universal jurisdiction? https://en.wikipedia.org/wiki/Universal_jurisdiction
Currently, that list includes no AI problems nor, in general, existential crimes against humanity. Perhaps it is worth joining forces with opponents of eugenics, eco-activists, and nuclear alarmists to jointly draft and add crimes against existential safety (to prevent the irresponsible launch of projects that, with a probability of 0.01%+, can cause severe catastrophes; humanity avoided the Oppenheimer risk with the hydrogen bomb, but not with Chernobyl, and we do not want giga-projects to keep accepting probabilities of human extinction while treating them with neglect for the sake of local goals).
In any case, introducing the universal-jurisdiction nature of such crimes can help in finding the "off" button for a project that has already been launched, by reaching the creators of the particular dangerous object. This category allows states or international organizations to claim criminal jurisdiction over an accused person regardless of where the alleged crime was committed, and regardless of the accused's nationality, country of residence, or any other relation to the prosecuting entity.
9.4. And further, the idea of licensing: to force actors to go through a verification system on the one hand, and on the other, to ensure that any technology is refined and becomes publicly available.
https://uitspraken.rechtspraak.nl/details?id=ECLI:NL:RBOVE:2024:2078
https://uitspraken.rechtspraak.nl/details?id=ECLI:NL:RBOVE:2024:2079
A license is very important to defend a business, its CEO, and colleagues from liability. Near-worldwide monopolist operators should work more closely to defend the rights of their average consumers in order to prevent increased regulation. Industries should establish direct contracts with professional actors in their fields in a B2B manner to avoid compliance risks with consumers.
An organisation such as MIRI could provide strong experts to check AI companies for safety, especially those large enough to create existential risk, or, conversely, to impose penalties and refunds of all the sums that people accidentally lose to AI attacks that common defence frameworks are too weak to stop. People need to see a simple demonstration of their defence against AI and of help from MIRI, 80,000 Hours, and other effective altruists, especially against misaligned AI bad actors who have already gotten $100M+. That is enough to create, decentralized if necessary, if not now then within the next 10 years.
10. Examples and Suggestions
10.1. An analogy with the Tornado Cash criminal case. In the Netherlands, there was a trial of a software developer who created a system that allows decentralized, perfect, unstoppable crime. It specifically establishes this person's responsibility for his violation of the laws of the financial world. Please consider whether it can somehow be adapted for AI safety risks: where are the lines and the red flags?
10.2. Proposals for games/novels. What are the current simple learning paths? In my time it was HPMOR -> lesswrong.ru -> lesswrong.com.
At present, Harry Potter is already outdated for the new generation. What are the modern games/stories about AI safety, and how should people be directed further? How about an analogue of Khan Academy for schoolchildren, or MIT courses on this topic?
Thank you for your attention. I would appreciate it if you could point out any mistakes I have made and provide answers to any questions. While I am not sure if I can offer a prize for the best answer, I am willing to donate $100 to an effective fund of your choice for the best engagement response.
I respect and admire all of you for the great work you do for the sustainability of humanity!