AI Safety Reading Group

Søren Elverlin

11th Aug 2019

If you are interested in AI Safety, come visit the AI Safety Reading Group.

The AI Safety reading group meets on Skype Wednesdays at 18:45 UTC, discussing new and old articles on different aspects of AI Safety. We start with a presentation round, then a summary of the article is presented, followed by discussion both on the article and in general.

Sometimes we have guests. On Wednesday the 14th, Stuart Armstrong will be giving a presentation on his research agenda in the reading group:
https://www.alignmentforum.org/posts/CSEdLLEkap2pubjof/research-agenda-v0-9-synthesising-a-human-s-preferences-into

Join us by Skype, by adding ‘soeren.elverlin’.

Previous guests include Eric Drexler, Rohin Shah, Matthijs Maas, Scott Garrabrant, Robin Hanson, Roman Yampolskiy, Vadim Kosoy, Abram Demski and Paul Christiano. A full list of articles read can be found at https://aisafety.com/reading-group/


Vika (3y)

Is this reading group still running? I'm wondering whether to point people to it.

Søren Elverlin (3y)

Yes, we are still running, though on a bi-weekly schedule. We will discuss Paul Christiano's "Another (Outer) Alignment failure story" on the 8th of July.

Wei Dai (5y)

I'm sad to have missed Eric Drexler's recent Q&A session. The slides for that session don't seem to contain Eric's answers and there is no linked recording. Is there any chance someone kept notes, or can write a summary from their memory of Eric's answers?

Søren Elverlin (5y)

Eric Drexler requested that I not upload a recording to YouTube. Before the session, I compiled this document with most of the questions:

https://www.dropbox.com/s/i5oqix83wsfv1u5/Comprehensive_AI_Services_Q_A.pptx?dl=0

We did not get to pose the last few questions. Are there any questions from this list whose answers you would like me to try to remember?

Wei Dai (5y)

Do you have a recording of the session? If so, can you send it to me via PM or email?

I'm interested in answers to pretty much all of the questions. If no recording is available, any chance you could write up as many answers as you can remember? (If not, I'll try harder to narrow down my interest. :)

I'm also curious why Eric Drexler didn't want you to upload a recording to YouTube. If the answers contain info hazards, it seems like writing up the answers publicly would be bad too. If not, what could outweigh the obvious positive value of releasing the recording? If he's worried about something like not necessarily endorsing the answers that he gave on the spot, maybe someone could prepare a transcript of the session for him to edit and then post?

[anonymous] (5y)

I'm very interested in his responses to the following questions:

The question addressing Gwern's post about Tool AIs wanting to be Agent AIs.
The question addressing his optimism about progress without theoretical breakthroughs (related to NNs/DL).

Chris Cooper (5y)

* The question addressing Gwern's post about Tool AIs wanting to be Agent AIs.

When Søren posed the question, he identified the agent / tool contrast with the contrast between centralized and distributed processing, and Eric denied that they are the same contrast. He then went on to discuss the centralized / distributed contrast, which he regards as of no particular significance. In any system, even within a neural network, different processes are conditionally activated according to the task in hand and don't use the whole network. These different processes within the system can be construed as different services.

Although there is mixing and overlapping of processes within the human brain, in his view this is a design flaw rather than a desirable feature.

I thought there was some mutual misunderstanding here. I didn't find the tool / agent distinction being addressed in our discussion.

* The question addressing his optimism about progress without theoretical breakthroughs (related to NNs/DL).

Regarding breakthroughs versus incremental progress: Eric reiterated his belief that we are likely to see improvements in doing particular tasks, but that a system that is good at counting leaves on a tree (to use his examples) is not going to be good at navigating a Mars rover, even if both are produced by the same advanced learning algorithm. I couldn't identify any crisp arguments to support this.

[anonymous] (5y)

Thanks!
