AGI safety from first principles: Alignment

Great post, and very interesting to read. A few thoughts occurred to me that might interest others, and I would be interested in others' input on them as well.

Imagine an AI trained as an oracle: it is trained on a variety of questions and selected based on how “truthful” its answers are. Assuming this approach is possible and could produce an AGI, might it be a viable way to “create selection pressures towards agents which think in the ways we want”? In other words, might this create an aligned AI regardless of extreme optima?

Another thought: suppose an AI is “let loose”, spreads to new hardware, encounters the “real world”, and is exposed to massive amounts of new data. The range of circumstances it faces would then of course be very broad. In the oracle example, potentially everything could stay the same during and after training, except for the questions asked. Could this increase safety, since the range of circumstances in which it would need to have desirable motivations would be comparatively narrow?

Lastly, I'm new to LessWrong, so I'm extra grateful for all input regarding how I can improve my reasoning and commenting skills.

Thoughts on Neuralink update?

It does seem like a reasonable analogy that the Neuralink could be like a "sixth sense" or an extra (very complex) muscle.

Thoughts on Neuralink update?

Elon Musk has argued that humans can take in a lot of information through vision (by looking at a picture for one second, you can take in a lot of information). Text and speech, however, are not very information dense. He argues that, because we use keyboards or speech to communicate information outwards, output takes a long time.

One possibility is that AI could help interpret the uploaded data and fill in details to make it more useful. For example, you could "send" an image of something through the Neuralink, an AI would interpret it and fill in the details that are unclear, and you would then have an image very close to what you imagined, containing several hundred or maybe thousands of kilobytes of information.

The Neuralink would only need to increase the productivity of an occupation by a few percent to be worth the investment of 3000-4000 USD that Elon Musk believes the price will drop to.
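As a rough sanity check of the "few percent" figure, here is a minimal break-even sketch. The 50,000 USD value of yearly output and the 1-year and 5-year horizons are my own illustrative assumptions, not figures from Neuralink or Elon Musk, and the calculation ignores surgery risk, maintenance costs and discounting.

```python
def break_even_gain(price_usd: float, annual_output_usd: float, years: float) -> float:
    """Fractional productivity gain at which the extra output equals the device price.

    annual_output_usd and years are hypothetical inputs chosen for illustration.
    """
    if annual_output_usd <= 0 or years <= 0:
        raise ValueError("annual_output_usd and years must be positive")
    return price_usd / (annual_output_usd * years)


if __name__ == "__main__":
    ASSUMED_ANNUAL_OUTPUT_USD = 50_000  # assumption, not from the source
    for price in (3000, 4000):          # price range mentioned above
        for years in (1, 5):            # assumed payback horizons
            gain = break_even_gain(price, ASSUMED_ANNUAL_OUTPUT_USD, years)
            print(f"price {price} USD over {years} year(s): break-even gain {gain:.1%}")
```

Under these assumed numbers the break-even gain is 6 to 8 percent if the device has to pay for itself in one year, and 1.2 to 1.6 percent over five years, so "a few percent" seems plausible for occupations with output in that range.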

When can Fiction Change the World?

That does sound like a rational approach, especially since the complexity of the problem makes it near impossible to promote a single approach.

When can Fiction Change the World?

Yes, I can see why it would be a greater motivation for people to act today if they read a book in which present-day actions determine, to a greater extent, the outcome of the first AGI/ASI.

And I can see some ways we could increase the likelihood of aligned AI today, like an international cooperation program or very high funding of organisations like MIRI and CHAI. I presume the people who contributed to the safe creation of AI could be painted as heroes, which might also motivate the reader to act.

A clear call to action after the book seems like an effective way to increase the chance that people will act. I will include that in the book if we finish writing it.

If you have a specific approach to aligned AI that you think is likely to work and would like to see a book written about, I think it would be very interesting to discuss, and it could potentially be included in my book as well.

When can Fiction Change the World?

Great and interesting post!

When it comes to presenting a "path of change" that individuals can contribute to, I can think of two:

1. Donating money to organisations like MIRI, CHAI and others working on AI alignment/safety.

2. Becoming involved in the community and doing research/pushing policies themselves.

Both of these actions likely require "radicalising" people about the importance of AI safety, which could be used as an argument for why radicalising a few people might be a more effective aim for a novel than trying to influence the masses. To me, though, it seems reasonable that a novel can do both.

My sister and I are currently writing a novel in which there is an arms race to develop AGI/ASI. One of the main characters manages to become the leader of one project to create ASI, and she insists on a safer approach, even if it takes longer. It seems like they will lose the arms race, endangering the entire human species, until the climax, where they find a faster way to create AGI and thus have time to do so safely. The book ends with a utopia where the ASI is aligned and everyone lives happily ever after. The book will also bring up the importance of organisations like MIRI and CHAI that have done work on AI alignment/safety.

Do you think that sounds like a good approach for getting readers to take AI safety/alignment more seriously (assuming the plot is interesting and feels realistic to the reader)?

Btw, this is my first comment, so any feedback on how I can improve my commenting skills is welcome and appreciated.