Ben Cottier
At Epoch, helping to clarify when and how transformative AI capabilities will be developed.
Previously a Research Fellow on the AI Governance & Strategy team at Rethink Priorities.
Epoch AI collects key data on machine learning models from 1950 to the present to analyze historical and contemporary progress in AI. This is a big update to the website, and the datasets have substantially expanded since last year.
The performance of machine learning models is closely related to their amount of training data, compute, and number of parameters. At Epoch, we’re investigating the key inputs that enable today’s AIs to reach new heights. Our recently expanded Parameter, Compute and Data Trends database traces these details for hundreds of...
Summary

1. Using a dataset of 124 machine learning (ML) systems published between 2009 and 2022,[1] I estimate that the cost of compute in US dollars for the final training run of ML systems has grown by 0.49 orders of magnitude (OOM) per year (90% CI: 0.37 to 0.56).[2] See...
This post is one part of the sequence Understanding the diffusion of large language models. As context for this post, I strongly recommend reading at least the 5-minute summary of the sequence.

Conclusion

In this sequence I presented key findings from case studies on the diffusion of eight language models...
This post is one part of the sequence Understanding the diffusion of large language models. As context for this post, I strongly recommend reading at least the 5-minute summary of the sequence. This post lists questions about AI diffusion that I think would be worthy of more research at the...
This post is one part of the sequence Understanding the diffusion of large language models. As context for this post, I strongly recommend reading at least the 5-minute summary of the sequence. The primary aim of this research project was to be descriptive about the diffusion of large language models....
This post is one part of the sequence Understanding the diffusion of large language models. As context for this post, I strongly recommend reading at least the 5-minute summary of the sequence. In this post, I:

1. Overview the different forms of information and artifacts that have been published (or...