Personal Blog
Call for cognitive science in AI safety

by Linda Linsefors
29th Sep 2017
12 comments, sorted by top scoring
Kaj_Sotala · 8y

I wrote about the same thing here: Cognitive science/psychology as a neglected approach to AI safety

Linda Linsefors · 8y

Recent talk by Stuart Armstrong related to this topic:

https://www.youtube.com/watch?v=19N4kjYbZD4

Ben Pace · 8y

[Note from the Sunshine Regiment] Hi Linda! At the current time, the frontpage is for discussion of ideas, and not for discussion of the community, coordination, or calls-to-action. As such, I've moved this post to your personal LW blog, and removed it from the front page.

Apologies for this not being fully clear so far; I've tried to point to this sort of thing in the current frontpage content guidelines, but it will be made more explicit by launch.

Raemon · 8y

Just to clarify: would a similar post that was framed more around "here are potential ideas relating to cognitive science in AI that I think are valuable, and here's why" (without making it an explicit call to action; more of a "if you buy this claim, then you'd probably end up doing something with it") be fine for the front page?

Ben Pace · 8y

Yup, arguing for epistemic conclusions about AI and/or cognitive science is the appropriate category of content for the frontpage - for example Linda's other post today.

Linda Linsefors · 8y

Basically, if I change the title, it can go on the front page?

ESRogs · 8y

Linda Linsefors

Alexander Appel

Holden Lee

somnulence logencia

I am confused by this signature -- the post uses "I", but with multiple signatories I would have expected "we".

Should I consider Linda to be the author, with Alexander, Holden, and somnulence simply expressing their assent?

Linda Linsefors · 8y

Yes, that is correct.
I wrote the text and asked people to cosign if they agreed, for signaling value.

Do you have a good idea on how to make this clearer?

magfrump · 8y

Maybe just write "Cosigned," above the names?

Linda Linsefors · 8y

Better?

magfrump · 8y

Yeah, I think that's pretty clear.

[anonymous] · 8y

Hi, I've been thinking along a related line. I wrote something today arguing that we should investigate IQ in the hope that it will help us predict takeoff. If anyone with a good psych background is reading this, I'd like some feedback.

I also think that engaging with psychology would be a sign that AI safety is maturing into a science that takes all sources of information seriously, and would increase its credibility with the general scientific community.

* * *

Epistemic status: High expected utility, but also very high variance

The more I realise that AI takeoff is something that might actually happen, the more I am pulled towards this problem:

  • What are human preferences really?

  • What is the generator of human preferences?

  • What are our preferences made of?

  • What is the structure behind it all?

Before we tell our brand new AI overlord to figure out our values and do whatever we want it to do, we really ought to have a clearer idea of what "values" and "want" mean.

I have a good idea of what my preferences are within the limited reach of my lived experience, and even a little bit beyond that. But to extrapolate from that into the vast distance of possible futures seems extremely dangerous.

My values are inconsistent, conflicting, and definitely not constant over time. On top of that, there is a big heap of unknown unknowns with respect to how the brain works.

I am convinced that to solve AI safety we need to have a good understanding of human values, and I know I don’t have this understanding. I am just a physics and math nerd. I don’t know this stuff. I don’t know if the questions I have are open research questions, or if this stuff is already well known and understood in some separate community somewhere. That is why we need psychology nerds to join the cause.

Another topic I would want AI-safety-oriented psychology research to address is something like a case study of friendliness in existing agents (humans, subsystems of the brain, organisations). What are the mechanisms in the human brain that make us care about others, and can they be replicated?

* * *

A problem I see is that only math and computer nerds are called upon to work on AI safety, while all the psychology nerds out there do not even know that they are needed. Or maybe the psychology research that I am looking for is already out there, and we just need to find each other to collaborate more.

I think it is important that technical AI safety research does not try to set the agenda for psychological AI safety research. Information and inspiration need to flow both ways. Both fields need to be free to follow their own curiosity, but we also need to collaborate to ground our work in each other's knowledge.

* * *

Linda Linsefors

Cosigning:

Alexander Appel

Holden Lee

somnulence logencia

Mentioned in: Extensive and Reflexive Personhood Definition