Here we briefly summarize the results so far from our U.S. nationally representative survey on Artificial Intelligence, Morality, and Sentience (AIMS), conducted in 2021 and 2023. The full reports are available on Sentience Institute’s website for the AIMS 2023 Supplemental Survey, AIMS 2023 Main Survey, and AIMS 2021 Main Survey....
In the wake of FLI’s AI 6-month pause letter and Eliezer Yudkowsky’s doom article in Time, there seems to be more high-level attention on the arguments for low and high estimates of AI existential risk. My estimate is currently ~30%, which is higher than that of most ML researchers but seems lower...
I think a potentially promising and undertheorized approach to AI safety, especially in short timelines, is natural language alignment (NLA), a form of AI-assisted alignment in which we leverage the model’s rich understanding of human language to help it develop and pursue a safe[1] notion of social value by bootstrapping...
Last revised on June 14, 2023. ChatGPT, Sydney, LaMDA, and other large language models (LLMs) have inspired an unprecedented wave of interest in artificial intelligence (AI). Many are asking whether these LLMs are merely problem-solving tools or whether they have mental faculties such as emotion, autonomy, socialization, meaning, language, and self-awareness....
Abstract Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge. We introduce Cicero, the first AI agent to achieve human-level performance in Diplomacy, a strategy game involving both cooperation and competition...
I read several shard theory posts and found the details interesting, but I couldn't quite see the big picture. I'm used to hearing "theory" refer to a falsifiable generalization of data, typically stated as a sentence, paragraph, list, mathematical expression, or diagram early in a scientific paper. Theories...