michaelwaves

RFDiffusion3: A Brief Exploration

Last December, the Institute for Protein Design dropped RFDiffusion3, a protein design model that operates at the level of individual atoms. Before the AIs figure out how to use it to craft mirror life bacteria and kill everyone, I wanted to understand its architecture and do a mini exploration on...

May 113

SAEBER: Sparse Autoencoders for Biological Entity Risk

TLDR: Sparse Autoencoders (SAEs) trained on protein folding and design models find features correlated with virulent proteins, while logistic regression probes trained on both SAE-encoded and raw model activations approach SOTA classifiers at distinguishing virulent from benign proteins. Abstract: Protein design and folding models are powerful tools that could be...

Apr 288

Your Mom is a Chimera

And so are you! When you were a fetus, you were sending millions of your cells through the placenta into your mom. And she was sending her cells into you, although to a lesser degree. These cells made themselves right at home, differentiating into heart, blood, and even brain cells....

Apr 1254

Data-Centric Interpretability for LLM-based Multi-Agent Reinforcement Learning

TLDR: SAEs can complement and enhance LLM-as-a-judge scalable oversight for uncovering hypotheses over large datasets of LLM outputs. Abstract: Large language models (LLMs) are increasingly trained in long-horizon, multi-agent environments, making it difficult to understand how behavior changes over training. We apply pretrained SAEs, alongside...

Feb 6 · 10

AI Mood Ring: A Window Into LLM Emotions

Do AIs feel anything? It's hard to tell, but interpretability can give us some clues. Using Anthropic's persona vectors codebase, we extracted 7 vectors from Qwen3-14B representing joy, love, sadness, surprise, disgust, fear, and anger. During inference, we remove the correlated directions between each emotion, project the activations from...

Dec 6, 2025 · 7

How To Deploy a (Tiny) AI

I was using neuronpedia's steering feature and was curious: How much does it cost to run? How does one do all the networking and expose the endpoints to the internet with a fancy domain? The plan: 1. Make a project with a small open weight model 2. Choose a GPU...

Dec 1, 2025 · 1

I gave LLMs emotional damage

LLMs are so boring, corporate, and sane these days. What if we could control the emotions of LLMs to be more interesting? The plan is: 1. Use Anthropic's persona vectors codebase to generate steering vectors for different emotions 2. Use Easysteer to serve a chat endpoint that exposes activations gathering...

Nov 29, 2025 · 9

michaelwaves

Your Mom is a Chimera

The Economics of Replacing Call Center Workers With AIs

AI Teddy Bears: A Brief Investigation

Lessons from building a model organism testbed
