The Compleat Cybornaut
A cluster of conceptual frameworks and research programmes has coalesced around a 2022 post by janus, which introduced the idea of language models as ‘simulators’ (of other types of AI, such as agents, oracles, or genies). One such agenda, cyborgism, was coined in a post by janus and Nicholas Kees and is being researched as part of the 2023 editions of AI Safety Camp and SERI MATS. The objective of this document is to provide an on-ramp to the topic, one that is hopefully accessible to people not hugely familiar with simulator theory or language models.

So what is cyborgism? Cyborgism proposes to use AIs, particularly language models (i.e. generative pretrained transformers, or GPTs), in ways that exploit their (increasingly) general-purpose intelligence, while retaining human control over the ‘dangerous bits’ of AI – i.e. agency, planning, and goal-formation. The overall objective is to leverage human cognitive ability while minimising the risks associated with agentic AI.

Aside from agency, a core assertion of cyborgism is that certain commonly used language models are not well suited to many of the tasks human users throw at them, but that humans, if appropriately trained and equipped, might use GPTs more effectively in ways that are ‘natural’ for the model, while dramatically increasing the productive and creative potential of the human. Specifically, some current systems, such as ChatGPT, are released or predominantly used in a ‘tuned’ version, which has a host of shortcomings.[1] One such tuning method, reinforcement learning from human feedback (RLHF), has a weakness specifically relevant to cyborgism: the tuning process severely limits, or collapses, a valuable aspect of the GPT, namely its wild, unconstrained creativity.

Superficially, the cyborgism approach may resemble a human-plus-oracle setup, but there is a subtle and important distinction: an oracle, it is argued, might ‘smuggle in’ some of the trappings of an agent.[2] In contrast, the human cyborg embeds the agentic functions (planning and goal-formation) in the human, rather than in the model.
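To make the ‘natural for the model’ framing concrete, the sketch below samples several high-temperature continuations of the same prompt from a base (untuned) model and leaves the choice of branch to the human. This is a minimal illustration only, assuming the OpenAI Python client (v1+) and `davinci-002` as a stand-in for any base completion model; it is not prescribed by the cyborgism agenda itself.

```python
# Minimal sketch: branching base-model completions with a human in the loop.
# Assumes an API key in OPENAI_API_KEY; 'davinci-002' is a stand-in for any
# base (non-RLHF-tuned) model served via a completions endpoint.
from openai import OpenAI

client = OpenAI()

def branch(prompt: str, n: int = 5, max_tokens: int = 64) -> list[str]:
    """Sample n divergent continuations of the same prompt.

    A high temperature preserves the variance (the 'wild creativity')
    that RLHF tuning tends to collapse into a single assistant-like voice.
    """
    response = client.completions.create(
        model="davinci-002",
        prompt=prompt,
        n=n,
        temperature=1.0,
        max_tokens=max_tokens,
    )
    return [choice.text for choice in response.choices]

# The human supplies the agency: inspect the branches, pick or edit one,
# append it to the prompt, and repeat.
for i, text in enumerate(branch("The first rule of working with a simulator is")):
    print(f"--- branch {i} ---\n{text}\n")
```

Note the division of labour: the model only ever performs next-token prediction, the task it was trained on, while planning and goal-formation remain with the human operator.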
