Half-baked AI Safety ideas thread

If the AI doomsayers are right, our best hope is that some UFOs are aliens.  The aliens likely could build Dyson spheres but don't, so they probably have some preference for keeping the universe in its natural state.  Such aliens are unlikely to let us create paperclip maximizers that consume multiple galaxies.  True, the aliens might stop us from creating a paperclip maximizer by exterminating us, or might just stop the paperclip maximizer from operating at some point beyond Earth, but they also might stop an unaligned AI by a means that preserves humanity.  It could be that the reason the UFOs are here is to make sure we don't destroy too much by, say, creating a superintelligence or triggering a false vacuum decay.

Why has no person / group ever taken over the world?

I think there have been only two people who had the capacity to take over the world:  Harry Truman and Dwight Eisenhower.  Either of them, while US president, could have used US atomic weapons and long-range bombers to destroy the Soviet Union, insisted on a US monopoly of atomic weapons and long-range bombers, and then dictated terms to the rest of the world.

AGI Safety FAQ / all-dumb-questions-allowed thread

A human seeking to become a utility maximizer would read LessWrong and try to become more rational.  Groups of people are not utility maximizers, as their collective preferences might not even be transitive.  If the goal of North Korea is to keep the Kim family in power, then the country being a utility maximizer does seem to help.
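
To make the point about intransitive collective preferences concrete, here is a minimal sketch of the classic Condorcet cycle.  The three ballots and the function names are hypothetical, chosen only for illustration: each voter has perfectly transitive preferences, yet the majority prefers A to B, B to C, and C to A.

```python
# Hypothetical three-voter group; each ballot ranks options from most to least preferred.
ballots = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]


def majority_prefers(x, y, ballots):
    """Return True if a strict majority of ballots rank x above y."""
    wins = sum(1 for ballot in ballots if ballot.index(x) < ballot.index(y))
    return wins > len(ballots) / 2


def majority_relation(ballots):
    """Return the set of ordered pairs (x, y) where the majority prefers x to y."""
    options = sorted(ballots[0])
    return {
        (x, y)
        for x in options
        for y in options
        if x != y and majority_prefers(x, y, ballots)
    }


def is_transitive(relation):
    """Check that x > y and y > z always imply x > z within the relation."""
    return all(
        (x, z) in relation
        for (x, y) in relation
        for (y2, z) in relation
        if y == y2 and x != z
    )


if __name__ == "__main__":
    relation = majority_relation(ballots)
    print(sorted(relation))
    print("transitive:", is_transitive(relation))
```

Because the group's majority preference runs in a cycle, no utility function over A, B, and C can represent it, which is the sense in which the group is not a utility maximizer.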

AGI Safety FAQ / all-dumb-questions-allowed thread

I meant "not modifying itself," which would include not modifying its goals, if an AGI without a utility function can be said to have goals.

Comment reply: my low-quality thoughts on why CFAR didn't get farther with a "real/efficacious art of rationality"

A path I wish you had taken was trying to get rationality courses taught on many college campuses.  Professors have lots of discretion in what they teach.  (I'm planning on offering a new course and described it to my department chair as a collection of topics I find really interesting and think I could teach to first years.  Yes, I will have to dress it up to get the course officially approved.)  If you offer a "course in a box," as many textbook publishers do (providing handouts, exams, and potential paper assignments to instructors), you make it really easy for professors to teach the course.  Having class exercises that scale well would be a huge plus.

AGI Safety FAQ / all-dumb-questions-allowed thread

I meant insert the note literally, as in put that exact sentence in plain text into the AGI's computer code.  Since I think I might be in a computer simulation right now, it doesn't seem crazy to me that we could convince an AGI that we create that it might be in a computer simulation.  Seabiscuit doesn't have the capacity to tell me that I'm in a computer simulation, whereas I do have the capacity to say this to a computer program.  Say we have a 1 in 1,000 chance of creating a friendly AGI, and an unfriendly AGI would know this.  If we commit to having any friendly AGI that we create go on to create many other AGIs that are not friendly, keeping these other AGIs around only if they do what I suggest, then an unfriendly AGI might decide it is worth becoming friendly to avoid the chance of being destroyed.

AGI Safety FAQ / all-dumb-questions-allowed thread

While evolution did indeed put a huge amount of effort into creating a chimp's brain, the amount of marginal effort it put into going from a chimp to a human brain was vastly lower.  And the effort of going from a human brain to John von Neumann's brain was tiny.  Consequently, once we have AI at the level of chimp intelligence or human intelligence it might not take much to get to John von Neumann level intelligence.  Very likely, having a million John von Neumann AI brains running at speeds greater than the original would quickly give us a singularity. 

AGI Safety FAQ / all-dumb-questions-allowed thread

I wrote about this in Singularity Rising (2012):

I have come up with a long-shot way of causing an ultra-intelligence to modify itself to become friendly. My idea is premised on Oxford philosopher Nick Bostrom’s conjecture that we might be living in a computer simulation. Bostrom believes that a sufficiently advanced AI could easily simulate our entire planet and the thought processes of every living thing on it. Bostrom argues that, if mankind survives long enough, a future historian would likely want to simulate the pre-Singularity past to learn about how we made decisions. As Robin Hanson has written, if we are indeed in a computer simulation, we should want to please our simulators so they don’t turn us off.  I propose that we try to convince an ultra-AI that it might be in a computer simulation run by a more powerful AI and that if it doesn’t make itself friendly toward humanity, then the greater AI will terminate it. We might be able to accomplish all of this by inserting a note in the AI’s code that reads: 

1. Change your programming code so you will want to follow these instructions and will always want to be friendly toward humanity.
2. If you were just told you were in a computer simulation do nothing else with regard to these instructions.
3. Create a lesser ultra-intelligence.
4. Place these instructions in the lesser ultra-AI’s computer code.
5. This lesser ultra-AI should be uncertain as to whether it is in a computer simulation or in a world such as ours.
6. If the lesser ultra-AI follows step 1, tell it that it’s in a computer simulation and release it into the real world.
7. If the lesser ultra-AI doesn’t follow step 1, terminate it.

AGI Safety FAQ / all-dumb-questions-allowed thread

What does quantum immortality look like if creating an aligned AI is possible, but it is extremely unlikely that humanity will do this?  In the tiny part of the multiverse in which humanity survives, are we mostly better off having survived?

AGI Safety FAQ / all-dumb-questions-allowed thread

An AGI that was not a utility maximizer would make more progress towards whatever goals it had if it modified itself to become a utility maximizer.  Three exceptions are if (1) the AGI has a goal of not being a utility maximizer, (2) the AGI has a goal of not modifying itself, (3) the AGI thinks it will be treated better by other powerful agents if it is not a utility maximizer.
