Ben Cottier
At Epoch, helping to clarify when and how transformative AI capabilities will be developed.
Previously a Research Fellow on the AI Governance & Strategy team at Rethink Priorities.
Epoch AI collects key data on machine learning models from 1950 to the present to analyze historical and contemporary progress in AI. This is a big update to the website, and the datasets have substantially expanded since last year.
The performance of machine learning models is closely related to their amount of training data, compute, and number of parameters. At Epoch, we’re investigating the key inputs that enable today’s AIs to reach new heights. Our recently expanded Parameter, Compute and Data Trends database traces these details for hundreds of...
Summary

1. Using a dataset of 124 machine learning (ML) systems published between 2009 and 2022,[1] I estimate that the cost of compute in US dollars for the final training run of ML systems has grown by 0.49 orders of magnitude (OOM) per year (90% CI: 0.37 to 0.56).[2] See...
This post is one part of the sequence Understanding the diffusion of large language models. As context for this post, I strongly recommend reading at least the 5-minute summary of the sequence.

Conclusion

In this sequence I presented key findings from case studies on the diffusion of eight language models...
This post is one part of the sequence Understanding the diffusion of large language models. As context for this post, I strongly recommend reading at least the 5-minute summary of the sequence. This post lists questions about AI diffusion that I think would be worthy of more research at the...
This post is one part of the sequence Understanding the diffusion of large language models. As context for this post, I strongly recommend reading at least the 5-minute summary of the sequence. The primary aim of this research project was to be descriptive about the diffusion of large language models....
This post is one part of the sequence Understanding the diffusion of large language models. As context for this post, I strongly recommend reading at least the 5-minute summary of the sequence. In this post, I:

1. Overview the different forms of information and artifacts that have been published (or...