I would like to advertise a document, Systems of Services as a Paradigm for AI Alignment. I hope it can serve as a starting point for investigating AI alignment through the lens of systems of AI services. An alternative framing is that the text is a collection of pieces I have found particularly helpful for having productive conversations about the topic. In this post, I briefly recap the motivation behind the document and outline its contents. I also argue that service systems are a useful paradigm and mention my best guesses for promising future work.
As part of a recent collaboration, we wanted to look over technical problems in Comprehensive AI Services (CAIS), then pick one of them and study it in more detail. However, it soon turned out that a lot more conceptual work was needed before we could formalize any of our thoughts in a useful way. We did study Drexler's Reframing Superintelligence prior to embarking on this project, and the text indeed provides many useful insights into the nature of advanced systems of AI services. However, it puts less emphasis on how to use the service framework for technical research. So from our starting position, we weren't sure how to define the basic concepts, how to model problems in CAIS, or even what these problems are.
We also decided to adopt a slightly different framing, in two main ways. First, Reframing Superintelligence primarily discusses comprehensive AI services, which are expected to be on par with artificial general intelligence. Since we can reasonably expect AI X-risk to be associated with any AI that is radically transformative, we wanted to deal with a broader class of systems. Second, it is likely that even very advanced AI services will be deeply entangled with services that are "implemented on humans". (Especially since we can view human organizations and institutions as clusters of services.) For this reason, the document studies hybrid systems of services that consist of both AI services and human services. Even if our interest is primarily in AI services, studying the whole system (including its non-AI parts) should make it easier to fully understand their impact on the world.
Contents of the document
While we did eventually make some progress on more specific problems, I thought it might be useful to write down a separate "introductory" text summarizing the things-that-seem-essential about the framework of service systems. Here is the document's slightly annotated table of contents:
- Introduction (motivating the approach; skippable if you have read this post)
- Basic Concepts (explaining what we mean by services, tasks, and systems of services, defining several useful concepts, giving many examples)
- Modelling Approaches (describing different types of models we might later want to come up with for service systems; introducing a "simple abstract model" for this framework)
- Research Questions ((a) how to think about, and generate lists of, problems and other research tasks for this paradigm; (b) a list of research questions, not exhaustive but reasonably wide)
- Related Fields of Study (a list of existing research fields that seem particularly relevant)
- Research Suggestions (references for further reading, our guesses for what might be valuable to investigate)
Some of our conclusions
We think the paradigm of systems of services deserves further attention. In terms of technical alignment research, it might expose some new problems and provide a new point of view on existing ones. Moreover, AI currently takes the form of services, rather than agents. As a result, the service-system paradigm might be more suitable for communicating with wider audiences (e.g., the public, less technical fields such as AI policy, and AI researchers from outside the long-termist community).
We have no definitive answers for what work needs to be done in this area, but some of the useful directions seem to be:
- Technical problems and formalization. Formalizing, and making progress on, technical problems in service systems. As a by-product, we might also attempt to build more solid foundations for the paradigm (i.e., formalizing the basic concepts, building mathematical models).
- Building on top of Reframing Superintelligence: Drexler's text identifies and informally states many important hypotheses about the nature of AI-service systems, such as the claim that there will be no compelling incentives to replace comprehensive narrow services with an "agent-like" AGI (Section 12). We believe it would be valuable to (a) map out the different hypotheses and assumptions made in the report and (b) formalize specific hypotheses and explore them further.
- Tools or agents? Many people seem to believe that while an "agent-like" AGI and a "fully-automated comprehensive system of services" might have similar capabilities, there is some fundamental difference between the two types of AI. At the same time, there seems to be general confusion around this topic. Some relevant questions are:
- Can we find a framework in which these similarities and differences could be explained or dissolved?
- In particular, does there perhaps exist a formalization of “agency” that can differentiate between the two?
- If there are meaningful distinctions, how do they translate into what it means to “align” each type of AI?
- How does the effectiveness of narrow-purpose algorithms differ from that of general-purpose algorithms? Should we expect economic pressures towards generality (and maybe even "agent-like" AGI)?
- Identifying connections to existing fields. In many cases, problems with AI services will "merely" be automated versions of problems previously studied in other fields (e.g., automating the police might seemingly push some aspects of law under the umbrella of AI). For people who already have expertise in both AI risk and some relevant field, it might thus be valuable to clarify the connection between the two. On the one hand, clearly laying out how existing fields relate to service systems might prevent other alignment researchers from reinventing the wheel by ignoring the existing research. On the other hand, it is important to identify the problems that arise when the existing field gets applied to service systems and AI risk. We can then bring these new problems to the attention of the existing research communities, thus offloading some of the work (and possibly steering the field towards topics with more long-term impact). A prime example of a field whose utilization might be highly impactful is AI ethics. However, due to effects like idea inoculation, reputation costs (to AI risk), and the unilateralist's curse, we think this task should only be attempted by people who are experienced at communicating ideas on this level and well-positioned to do so.
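To illustrate the "formalizing the basic concepts" direction above, here is a minimal toy sketch (entirely hypothetical, not taken from the document) in which a service is a mapping from task payloads to results, and a system of services routes each task to the service registered for its label:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical toy formalization, for illustration only:
# a Task is a labelled piece of work, a service is a function
# from payloads to results, and a ServiceSystem dispatches each
# task to the service registered under its label.

@dataclass
class Task:
    label: str
    payload: object

@dataclass
class ServiceSystem:
    services: Dict[str, Callable[[object], object]] = field(default_factory=dict)

    def register(self, label: str, service: Callable[[object], object]) -> None:
        self.services[label] = service

    def perform(self, task: Task) -> object:
        # Route the task to the matching service, or fail loudly.
        if task.label not in self.services:
            raise KeyError(f"no service for task {task.label!r}")
        return self.services[task.label](task.payload)

# A tiny system with two narrow services.
system = ServiceSystem()
system.register("upper", lambda text: text.upper())
system.register("length", lambda text: len(text))

print(system.perform(Task("upper", "ai services")))  # AI SERVICES
```

Even a sketch this simple makes some of the research questions concrete, e.g. how task delegation between services (including human-implemented ones) should be modelled, or what distinguishes such a dispatch structure from an "agent-like" system.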
Acknowledgement: This document has been written by myself (Vojta), based on ideas I collected from a collaboration with Cara Selvarajah, Chris van Merwijk, Francisco Carvalho, Jan Kulveit, and Tushant Jha during the 2019/2020 AI Safety Research Program. While they gave a lot of feedback on the text and many of the ideas are originally theirs, they might not necessarily agree with all the arguments and framing presented here. All mistakes are mine.