Noa Nabeshima's Shortform
Jan 15, 2025
Code: github.com/noanabeshima/matryoshka-saes. Abstract: Sparse autoencoders (SAEs)[1][2] break down neural network internals into components called latents. Smaller SAE latents seem to correspond to more abstract concepts while...
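As an illustration of the SAE idea mentioned in the abstract, here is a minimal sketch of a vanilla sparse autoencoder forward pass: activations are encoded into a wide, non-negative latent vector and then linearly reconstructed. This is not the matryoshka-SAE implementation from the linked repo; all names, sizes, and the L1 penalty coefficient below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_sae = 16, 64  # toy sizes; real SAEs use much wider dictionaries

# Randomly initialized weights, for illustration only (no training here).
W_enc = rng.normal(scale=0.1, size=(d_sae, d_model))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(scale=0.1, size=(d_model, d_sae))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode an activation vector into sparse latents, then reconstruct it."""
    latents = np.maximum(W_enc @ x + b_enc, 0.0)  # ReLU keeps latents non-negative
    recon = W_dec @ latents + b_dec
    return latents, recon

x = rng.normal(size=d_model)
latents, recon = sae_forward(x)

# Training would minimize reconstruction error plus an L1 sparsity penalty,
# which pushes most latents to zero so each one can pick out a concept.
l1_coeff = 1e-3
loss = np.mean((recon - x) ** 2) + l1_coeff * np.abs(latents).sum()
```

The matryoshka variant discussed in the post additionally nests dictionaries of different sizes, so the toy code above only shows the base SAE structure.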
Which variables are most important for predicting and influencing how AI goes? Here are some examples:
* Timelines: “When will crazy AI stuff start to happen?”
* Alignment tax: “How much more difficult will it be to create an aligned AI vs an unaligned AI when it becomes possible to...