
AI Alignment Fieldbuilding, AI
Frontpage


No Clickbait - Misalignment Database

by Kabir Kumar
18th Feb 2024
sudhanshu_kasewa, 1y

It might be worth (someone) writing out what is meant by each kind of misalignment category, as used in the db. Objective misalignment, specific gaming, value misalignment all seem overlapping, and I'm not at all sure what physical misalignment is supposed to be pointing to.

Kabir Kumar, 1y

For sure. Right now it's just a Google Form and a Google Sheet. Would you be interested in taking charge of this?

sudhanshu_kasewa, 1y

No, this is not something I can undertake -- however, the effort itself need not be very complicated. You've already got a list of Misalignment types in the form: create a google doc with definitions/descriptions of each of these, and put a link to that doc in this question.

quetzal_rainbow, 1y

There is only a link to add a database entry, with no link to view the database itself.

Kabir Kumar, 1y

Ah, sorry, here's the link! https://docs.google.com/spreadsheets/d/1uXzWavy1mS0X-uQ21UPWHlAHjXFJoWWlN62EyKAoUmA/edit?usp=sharing 

Thank you for pointing that out, also added it to the post!

iva, 1y

I think you copy-pasted the wrong link: the first link leads to a form one can use to add an example, not to the list of examples.

Kabir Kumar, 1y

Thank you, I've labelled that as the form link now and added the DB link.

Tianyi (Alex) Qiu, 1y

There's also the goal misgeneralization database by DeepMind, in parallel to the misspecification one: blogpost, database.

Kabir Kumar, 1y

Thank you! I'll add those as well!

Kabir Kumar, 1y

Updated to 115.


This is a database of misalignment cases, classified by type of misalignment, type of AI, and so on.
Link to the form for adding more entries:
https://docs.google.com/forms/d/e/1FAIpQLSfE7ZeSV6W_YmKYrgy7BaiFKj90dBJ2qDUaYXzbpi_ILEs9sQ/viewform?usp=sf_link 

Link to the DB: https://docs.google.com/spreadsheets/d/1uXzWavy1mS0X-uQ21UPWHlAHjXFJoWWlN62EyKAoUmA/edit?usp=sharing 

Made it last week.
There are currently 115 entries, 62 of which come from the Specification Gaming database made by DeepMind: https://deepmindsafetyresearch.medium.com/specification-gaming-the-flip-side-of-ai-ingenuity-c85bdb0deeb4

For some reason, as far as I know, this is the first public database like this. 
The closest that I know of are the Specification Gaming database and the AI Incident Database (https://incidentdatabase.ai/).

For a community that's supposed to be full of science fans, I'm pretty baffled that something as basic as this didn't already exist, along with many, many other things.
If you know of any cases, please add them. 
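
For anyone who wants to look at the entries outside the spreadsheet UI, here is a rough Python sketch, not something that ships with the database. It assumes the sheet stays publicly readable, that Google's usual `/export?format=csv` link works for it, and that the classification column is headed "Type of Misalignment" (a guess based on the description above; check the actual header). The rows in `sample` are made-up placeholders, not real entries.

```python
import csv
import io
import re
from collections import Counter

DB_URL = ("https://docs.google.com/spreadsheets/d/"
          "1uXzWavy1mS0X-uQ21UPWHlAHjXFJoWWlN62EyKAoUmA/edit?usp=sharing")


def csv_export_url(sheet_url: str) -> str:
    """Turn a Google Sheets share link into a CSV export link."""
    match = re.search(r"/spreadsheets/d/([A-Za-z0-9_-]+)", sheet_url)
    if match is None:
        raise ValueError("not a Google Sheets URL")
    sheet_id = match.group(1)
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv"


def count_by_column(csv_text: str, column: str) -> Counter:
    """Count rows per value of `column`, skipping blank or missing cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(
        row[column].strip()
        for row in reader
        if (row.get(column) or "").strip()
    )


# Placeholder rows standing in for a download; the column name is an assumption.
sample = (
    "Title,Type of Misalignment\n"
    "Example A,Value misalignment\n"
    "Example B,Objective misalignment\n"
    "Example C,Value misalignment\n"
)
print(csv_export_url(DB_URL))
print(count_by_column(sample, "Type of Misalignment"))

# To run against the live sheet (needs network access):
# from urllib.request import urlopen
# csv_text = urlopen(csv_export_url(DB_URL)).read().decode("utf-8")
# print(count_by_column(csv_text, "Type of Misalignment"))
```

A tally like this is also a quick way to see which categories are thin and could use more submissions.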

Edits: 
Added link to the DB: https://docs.google.com/spreadsheets/d/1uXzWavy1mS0X-uQ21UPWHlAHjXFJoWWlN62EyKAoUmA/edit?usp=sharing 
Made it clearer which link is the DB and which is the form.
 
