Regarding the problem of differential progress in AI capability research vs AI Alignment research.

I intend to work in Artificial Intelligence, but I would prefer to pursue capability research over alignment research, and given the incentives in the two fields, so would most people. This means that safety research will lag behind capability research, which is a lose-lose situation for everyone: expecting the people who first develop Human-Level Machine Intelligence (HLMI) to hold off on deployment until safety research catches up is optimistic at best and outright naive at worst.

To do my own part in mitigating this problem, I've decided to commit to donating 10% of my income to AI Safety. If I decide to work full time in Safety, I will reduce this to 5%; if I instead choose to pursue capability research, I will increase it to 20% (modulo an acceptable standard of living and quality of life). This incentivises me to contribute to Safety research (or at least not worsen the problem), and if I do end up exacerbating the problem, I'll pay for it accordingly.

This is my own personal solution, and I make no claim that others should do something similar (though I would prefer that they do). I'm currently poor: I don't work and have an allowance of around $50 a month (PayPal didn't let me donate cents, or at least I didn't see how to), but I'm dependent on my parents, so I have an acceptable standard of living and quality of life. It is more important to me that I donate to AI Safety research at all than that I donate to the most effective AI Safety organisation, so to prevent myself from procrastinating under the guise of finding the most effective organisation, I've simply decided to donate to MIRI. I'm making this commitment public so that I'm less likely to weasel out of it: I should be creating these posts around once a month, and if 60 days go by without me posting about my donation towards AI Safety, feel free to send me a friendly reminder. ;)
