Characterizing stable regions in the residual stream of LLMs
by Jett Janiak, jacek, Chatrik, Giorgi Giglemiani, nlpet, and StefanHex
This research was completed for London AI Safety Research (LASR) Labs 2024. The team was supervised by @Stefan Heimersheim (Apollo Research). Find out more about the program and express interest in upcoming iterations here. This video is a short overview of the project presented on the final day of the...
Sep 26, 2024