Upskilling, bridge-building, research on security/cryptography and AI safety

Allison Duettmann

[This is part 5 of a 5-part sequence on security and cryptography areas relevant for AI safety, published and linked here a few days apart.]

Summary

In this sequence, I highlighted a few areas ripe for exploration where a security and cryptography mindset and its approaches could aid AI safety:

Part 1 | AI infosec: first strikes, zero-day markets, hardware supply chains: The first part focused on infosec considerations for AI safety, including secure open processor design, hardware supply chain risks, zero-day vulnerabilities, high adoption barriers of secure systems, and AI potentially worsening the offense-defense problem.
Part 2 | AI safety and the security mindset: user interface design, red-teams, formal verification: The second part focused on parallels between security and AI alignment. It highlighted parallels between the AI safety and AI security mindset, parallels between existing alignment approaches such as Constitutional AI and AI Debate and long-standing security approaches, parallels in user interface design problems, and parallels between AI safety and security techniques such as red-teaming, formal verification, and fuzzing.
Part 3 | Boundaries-based security and AI safety approaches: The third section suggested that the principle of respecting boundaries across entities may hold universally for coordination. It works for object-oriented security techniques, for coordinating today’s human and computer entities, and may have useful lessons for aligning more advanced AI systems, such as in the Open Agency model.
Part 4 | Specific cryptographic and auxiliary approaches to consider for AI: The fourth section focused on specific cryptography and auxiliary approaches of relevance for AI safety. It highlighted techniques that could shield advanced AI systems, help with AI governance, and unlock local and private data provision in secure decentralized ways.

This post is very likely not comprehensive, and some points may be controversial, so I would be grateful for comments, criticisms, and pointers to further research. With that being said, here are three general paths I think would be worthwhile to pursue to explore these approaches further in relation to AI safety:

Upskilling and career opportunities in infosec for AI

Pointers for working on information security considerations for AI include:

In EA Infosec: skill up in or make a transition to infosec via this book club, Wim van der Schoot and Jason Clinton call for existing EAs to skill up in infosec for general EA cause areas (not only AI). They mention that “our community has become acutely aware of the need for skilled infosec folks to help out in all cause areas. The market conditions are that information security skilled individuals are in shorter supply than demand.” They are running an infosec book club with sign-ups here.

In Jobs that can help with the most important century, Holden Karnofsky points to information security careers in AI as a potentially important area to explore for new talent entering the field, suggesting that “It’s plausible to me that security is as important as alignment right now, in terms of how much one more good person working on it will help.” He points to this 80,000 Hours post on career opportunities in infosec, this post on Infosec Careers for GCR reduction, and this related Facebook group.

In AI Governance & Strategy: Priorities, Talent Gaps, & Opportunities, Akash identifies security as an underexplored area in AI safety, asking: “Can security professionals help AI labs avoid information security risks and generally cultivate a culture centered on security mindset?” He encourages people interested in this topic to contact him directly and notes that he can introduce people to security professionals working in relevant areas.

Bridge-build between security and cryptography communities and AI safety

More direct applications of security and cryptography approaches to AI safety problems are an extremely new space, but here are two opportunities I would like to highlight:

First, to encourage more general bridge-building between communities, AI safety researchers could attend relevant security conferences, such as BlackHat USA or DefCon (which a few EAs attended last year), or security and cryptography conferences, such as the IEEE Symposium on Security and Privacy or the Financial Cryptography Conference, to learn about new approaches and evaluate their applications for AI. In addition, more outreach to security researchers working on the techniques discussed in this sequence could help, for instance by inviting them to relevant AI safety workshops and discussions.

Second, more specific workshops, fellowships, and courses could aid in exploring the relevance of individual domains discussed in this sequence for AI safety. OpenMined is building a great open-source onboarding environment for people seeking to enter the field of privacy-preserving technology with a focus on AI. Foresight Institute’s 2023 Cryptography, Security, AI Workshop is focused on advancing opportunities for progress at this nascent intersection through project development and networking.

Research the relevance of the proposed approaches for AI safety, including their dissimilarities and risks

There is more work to be done on where security and cryptography mindsets and approaches are dissimilar to AI safety considerations. For instance, in Work on Security Instead of Friendliness?, Wei Dai is skeptical that cryptography and security approaches can scale to AI, both because the economics of security seems more unfavorable than in simple cryptography problems and because solving the problem of security at a sufficient level of generality requires understanding goals, which is essentially equivalent to solving friendliness.

There is also more research to be done to flesh out how the security and cryptography approaches mentioned throughout would actually apply to AI safety. For those deemed promising, much work needs to be done to solve the various technical, efficiency and adoption barriers that are holding these technologies back.

Finally, there is more research to be done on new risks that may be posed by some of the technologies discussed in relation to AI safety. For instance, to the extent that some of the approaches can lead to decentralization and proliferation, how much does this improve access and red-teaming, and how much does it upset AI safety tradeoffs?

It would be useful to analyze which of the approaches discussed have the highest promise of being preferentially safety-enhancing, and could be developed as differential technology approaches.