FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
FrontierMath presents hundreds of unpublished, expert-level mathematics problems that specialists spend days solving. It offers an ongoing measure of AI progress in complex mathematical reasoning. We’re introducing FrontierMath, a benchmark of hundreds of original, expert-crafted mathematics problems designed to evaluate advanced...
The performance of machine learning models is closely related to their amount of training data, compute, and number of parameters. At Epoch, we’re investigating the key inputs that enable today’s AIs to reach new heights. Our recently expanded Parameter, Compute and Data Trends database traces these details for hundreds of...
We develop a simple model that predicts progress in the performance of field-effect transistor-based GPUs under the assumption that transistors can no longer be miniaturized once they have been scaled down to roughly the size of a single silicon atom. We construct a composite model from a performance model (a model of how GPU...
How much progress in ML depends on algorithmic progress, scaling compute, or scaling relevant datasets is relatively poorly understood. In our paper, we make progress on this question by investigating algorithmic progress in image classification on ImageNet, perhaps the most well-known test bed for computer vision. Using a dataset of...
In short: Training runs of large Machine Learning systems are likely to last less than 14-15 months. This is because longer runs will be outcompeted by runs that start later and therefore use better hardware and better algorithms. [Edited 2022/09/22 to fix an error in the hardware improvements + rising...
Executive Summary
Using a dataset of 470 models of graphics processing units (GPUs) released between 2006 and 2021, we find that the amount of floating-point operations/second per $ (hereafter FLOP/s per $) doubles every ~2.5 years. For top GPUs, we find a slower rate of improvement (FLOP/s per $ doubles...
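To make the ~2.5-year doubling time concrete, here is a minimal sketch that converts a doubling time into an annual growth factor and a cumulative improvement. The function names are hypothetical, and the 15-year span is only an illustration based on the 2006 to 2021 range of the dataset; it assumes smooth exponential growth and is not the analysis used in the post.

```python
def annual_growth_factor(doubling_time_years: float) -> float:
    """Multiplicative growth per year implied by a given doubling time."""
    return 2.0 ** (1.0 / doubling_time_years)


def cumulative_improvement(years: float, doubling_time_years: float) -> float:
    """Total multiplicative improvement after `years` of steady doubling."""
    return 2.0 ** (years / doubling_time_years)


if __name__ == "__main__":
    doubling_time = 2.5  # years, the approximate figure quoted for FLOP/s per $
    growth = annual_growth_factor(doubling_time)
    total = cumulative_improvement(15, doubling_time)  # illustrative 15-year span
    print(f"Annual growth factor: {growth:.2f}x")
    print(f"Improvement over 15 years: {total:.0f}x")
```

Under these assumptions, a 2.5-year doubling time corresponds to roughly 32% growth per year, and six doublings over 15 years.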
Summary
* We are a new research organization working on investigating trends in Machine Learning and forecasting the development of Transformative Artificial Intelligence
* This work is done in close collaboration with other organizations, like Rethink Priorities, Open Philanthropy, and MIT CSAIL
* We will be hiring for 2-4 full-time...