Introduction

The scaling laws for neural language models showed that cross-entropy loss follows a power law in each of three factors:

1. Dataset size
2. Model parameters
3. Training compute (steps or epochs)

The lower the cross-entropy loss, the better the model's next-token prediction. Since prediction and compression are deeply linked (The...
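As an illustration of the parameter-count law, here is a minimal sketch of the power-law form L(N) = (N_c / N)^α from Kaplan et al. (2020); the constants below are approximate values reported there, used here purely for illustration:

```python
def loss_vs_params(n_params, n_c=8.8e13, alpha=0.076):
    """Approximate cross-entropy loss (nats/token) as a power law
    in non-embedding parameter count N, per Kaplan et al. (2020).
    Constants are illustrative, not exact fitted values."""
    return (n_c / n_params) ** alpha

# Loss falls smoothly as the model grows:
for n in (1e7, 1e8, 1e9, 1e10):
    print(f"N={n:.0e}  L≈{loss_vs_params(n):.3f}")
```

The same functional form, with different constants, applies to dataset size and compute.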
The Platonic Representation Hypothesis (PRH) suggests that models trained with different objectives and on various modalities can converge to a shared statistical understanding of reality. While this is an intriguing idea, the initial experiments in the paper focused on image-based models (like ViT) trained on the same pretraining dataset (ImageNet). This...
Recent work has demonstrated that transformer models can perform complex reasoning tasks using Chain-of-Thought (CoT) prompting, even when the CoT is replaced with filler characters. This post summarizes our investigation into methods for decoding these hidden computations, focusing on the 3SUM task.

Background

1. **Chain-of-Thought (CoT) Prompting**: A technique that...
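As background, the classic 3SUM decision problem asks whether any three elements of a list sum to zero (the exact variant used in the work summarized here may differ, e.g. modular sums); a minimal sketch:

```python
from itertools import combinations

def three_sum_exists(nums):
    """Return True if some triple of distinct-index elements sums to zero.
    Brute-force O(n^3) check over all index triples."""
    return any(a + b + c == 0 for a, b, c in combinations(nums, 3))

print(three_sum_exists([1, 2, -3]))  # a zero-sum triple exists
print(three_sum_exists([1, 2, 4]))   # no triple sums to zero
```

The brute-force version is deliberately simple; the interesting question is how a transformer computes this without writing out intermediate steps.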
Consider an adversarial training scheme for solving goal misgeneralization (here I consider Redwood Research's work on "Adversarial Training for High-Stakes Reliability"). Consider a model that was trained to perform a specific task. To guarantee worst-case performance for this model, we need bounds on its outputs on adversarial examples. However,...
The aim of the Hutter Prize is to compress the first 1GB of Wikipedia (enwik9) to the smallest possible size. From the AIXI standpoint, compression is equivalent to AI, and if we can compress this file to the ideal size (75MB according to Shannon's lower estimate), then the compression algorithm is equivalent...
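The 75MB figure can be reproduced with back-of-envelope arithmetic, assuming the low end (~0.6 bits per character) of Shannon's estimated entropy range for English text:

```python
# Back-of-envelope estimate of the ideal compressed size of 1GB of text,
# assuming ~0.6 bits/char (the low end of Shannon's entropy estimate
# for English, which is an assumption, not a measured value for enwik9).
bits_per_char = 0.6
n_chars = 1_000_000_000   # ~1e9 bytes of text
ideal_bytes = n_chars * bits_per_char / 8
print(f"{ideal_bytes / 1e6:.0f} MB")  # ≈ 75 MB
```

At the high end of Shannon's range (~1.3 bits/char) the same arithmetic gives roughly 160MB, so the 75MB target is an optimistic lower bound.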
Imagine that you are a trained mathematician and you have been assigned the job of testing an arbitrarily intelligent chatbot for its intelligence. Being knowledgeable about a fair amount of computer science theory, you won't test it with the likes of the Turing test or similar, since such a bot might not have...