I am new to AI alignment, and I have recently been looking for a job in the AI governance field, but it seems hard to find a long-term, financially stable, full-time position.

To me, this basically means that only a tiny number of people consider AI alignment important enough to pay money to decrease P(doom).

It is important to get more people involved in the AI alignment field, but there is another good way to increase the number of people working toward similar goals: cooperating with other groups whose agendas are beneficial for AI alignment. One such group is the AI security community.

AI security

AI security is a branch of software security. Its goal is to ensure that current AI systems cannot be exploited by hackers, are free of backdoors, and generally do what they are intended to do, without doing anything unexpected or potentially harmful.

For example, there is a grocery store not far from my home that uses cameras with facial recognition algorithms to identify a buyer and automatically charge their credit card. Malicious actors might use some kind of makeup or a mask to trick the algorithm. To prevent this, the operator of the payment system might hire an AI security engineer to simulate attacks on the model, find its vulnerabilities, and fix them.
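
To make this red-teaming step concrete, here is a minimal sketch of the fast gradient sign method (FGSM), one of the simplest ways an engineer might probe an image classifier for adversarial weaknesses. It assumes a PyTorch classifier; the model, tensor shapes, and epsilon value are illustrative placeholders rather than details of any real system.

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, image, label, epsilon=0.03):
        # Fast Gradient Sign Method: nudge every pixel in the direction
        # that most increases the classifier's loss, yielding an
        # adversarial example that may change the predicted identity.
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image), label)
        loss.backward()
        # Step by epsilon along the sign of the input gradient, then
        # clamp back to the valid pixel range [0, 1].
        adversarial = image + epsilon * image.grad.sign()
        return adversarial.clamp(0.0, 1.0).detach()

If the model's prediction changes on the perturbed image, the engineer has found a concrete vulnerability to report and mitigate, for example with adversarial training.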

Why should this company care that its facial recognition system has no security issues? Because if someone hacks it and steals money from customers, it could become a huge scandal and lead to lawsuits. So it is relatively easy to convince top management to pay money for AI security: for them, it is just another branch of software security.

A similar logic applies to governments. I think it is feasible to convince governments to implement policies and guidelines for safe and robust AI systems, because we already have similar policies for other important software.

AI security has a lot in common with the task of aligning AGI. Here are a few of the commonalities:

  • It benefits from interpretable models. 
  • It requires educating major decision-makers about the risks of unsafe AI systems.
  • Government and enterprise policies might require thorough testing and certification of AI models; this would not only make systems safer but might also slow down the rate of AI progress, which is good for alignment.
  • AI security also requires systematic audits of AI models, some kind of alarm system in case of an attack, and mechanisms for containing an attack's consequences (a toy sketch of such an alarm follows this list).
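
As a hypothetical illustration of the alarm-system idea, here is a toy runtime monitor in Python. The class name, window size, and thresholds are my own placeholders; a real deployment would use far more sophisticated detectors.

    from collections import deque

    class ConfidenceAlarm:
        # Toy runtime monitor: track the model's recent top-class
        # confidences and raise an alarm when too many fall below a
        # threshold, which can signal adversarial inputs or a shift
        # in the input distribution.
        def __init__(self, window=100, threshold=0.5, max_low=20):
            self.recent = deque(maxlen=window)
            self.threshold = threshold
            self.max_low = max_low

        def observe(self, confidence):
            # Record one prediction's confidence; return True when the
            # recent window looks anomalous enough to alert on.
            self.recent.append(confidence)
            low = sum(1 for c in self.recent if c < self.threshold)
            return low > self.max_low

An operator could call observe() on every prediction and, when it returns True, quarantine the input and alert a human; this audit-and-alert loop is the same shape of mechanism one would want around more powerful systems.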

AI security researchers and organizations might help expose AI-associated risks to the general public. They might develop new secure and robust AI systems, and they might become allies in policymaking.

I further think that, as technology progresses, AI security's goal of making sure that important and powerful AI systems do what they are intended to do and do not harm people will converge with the AI alignment goal of making sure that AGI does what it is intended to do and does not harm people.

Comments

Hello Igor, I have some thoughts about this; let's discuss on Telegram? My account is t.me/leventov. Or, if you prefer other modes of communication, please let me know.